Introduction


HTML (HyperText Markup Language) was invented as a way of sharing academic information. HTML is a way of marking up textual information in order to help identify the meaning and relative importance of the parts of a document. This markup is then used by a browser to help it to display the information to best advantage.

The intention was to provide a means of sharing information that was completely independent of any particular manufacturer's software, or even the type of device involved in the process. Most people think of a web browser as a software application running on a desktop computer and displaying HTML pages on a computer screen. This is currently the most popular form of web browser, and offers the highest level of capability in terms of available processing power, screen area and colour, but there are many others in widespread use, and more due in the near future:

Special needs browsers: These may output to a braille reader, or even speak a web page.
Browsers on portable devices: Many high-end mobile phones and personal organisers have web browsers, but have poor processing power, limited memory and small - often monochrome - screens.
Browsers on TV set top boxes: These often link to internet access provided by cable companies. The set top boxes are really low-powered computers, and the TV screen has lower resolution than a computer monitor.
Browsers built into cars: Currently limited to the luxury end of the market, some car dashboard computers provide web access.

Marking up the meaning of a page - e.g. by identifying something as a heading, rather than saying it is in 20 point red Times bold - means that indexers can automatically identify the important elements of a page, produce summaries and so on, and search engines can seek out the results most appropriate for your search query. But more importantly, it means that any type of browser on any platform can do its best to convey the meaning - for example, a speaking browser might use inflection, volume and pauses to indicate a heading - just as you might do if you were reading a page aloud. It would not be able to do that if the markup had simply defined the font style rather than indicating that the phrase was a heading.
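As a rough illustration of the difference, here is the same phrase marked up in both ways. The heading text is made up for this example, and the font size shown is only an approximation of "20 point", since the old font tag used a relative scale rather than point sizes.

```html
<!-- Presentational markup: says only what the text should look like.
     A speaking browser or an indexer cannot tell that this is a heading. -->
<font face="Times" size="5" color="red"><b>Opening hours</b></font>

<!-- Semantic markup: says what the text is.
     Any browser can now decide how best to present a top-level heading. -->
<h1>Opening hours</h1>
```

With the second form, the red Times bold appearance can still be achieved, but it is specified separately rather than being mixed in with the content.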

Sadly, it did not take long before the sales and marketing teams from the big companies jumped in and started to mess things up. The chief culprits were Microsoft and Netscape. In the browser wars of the 1990s, each company tried to introduce as many bells and whistles as possible in order to make their browser better than the competition. Many of the new tags they added were not designed to be handled easily by browsers; they introduced exceptions to well-established rules. Most of the bells and whistles were ill-conceived, and led to the situation where certain web pages only looked correct in certain browsers - exactly the opposite of what the web is all about.

As you might expect, many of these 'enhancements' to HTML were designed to make web pages look pretty, and increasingly, details about page style and layout were mixed in with the information contained in HTML documents. At the same time, it seemed that every man and his dog was a web designer, and millions upon millions of web pages were produced by people who simply did not understand how to do it. These pages are full of mistakes: tags in the wrong order, missing tags, you name it - they did it. So web browsers had to be modified so that they could understand broken HTML and try to make a best guess at how to display the information.
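To give a flavour of the kind of mistakes involved, here is a small made-up example (not taken from any real page) showing tags closed in the wrong order and a missing closing tag, each followed by one possible correction.

```html
<!-- Broken: the <b> tag is closed before the <i> tag nested inside it -->
<p>This is <b>bold and <i>bold italic</b></i> text.</p>

<!-- One possible correction: the inner tag is closed first -->
<p>This is <b>bold and <i>bold italic</i></b> text.</p>

<!-- Broken: the closing </b> tag is missing altogether -->
<p>A <b>bold word in a sentence.</p>

<!-- One possible correction: the author's intent has to be guessed -->
<p>A <b>bold</b> word in a sentence.</p>
```

Note that in the last case there is no way to be sure where the bold text was meant to end, which is exactly the sort of guess a browser is forced to make.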

Most 'web designers' don't actually know the first thing about HTML and they use WYSIWYG software tools such as FrontPage to build their web pages. Sadly, these tools concentrate almost exclusively on style and layout, and if you examine the HTML they produce, you will find it messy and cluttered with lots of superfluous data.

And so browsers became the huge extravaganzas they are today - many megabytes of code designed to make the best of the mess that HTML and the web had become. Luckily, the World Wide Web Consortium or W3C (www.w3.org), headed by Tim Berners-Lee, the English scientist who invented the World Wide Web, are on the case and are tidying the mess up.

Their latest HTML standard, called XHTML, gets rid of all the silly tags which have crept in and insists that web pages are technically correct. All styling information has been moved out of the HTML document and into a separate file called a Cascading Style Sheet. This means that pages conforming to the standard can be viewed on all types of browser on all types of device, and browsers will not need enormous amounts of processing power in order to display the pages. This, combined with accessibility legislation in countries like the USA, means that we may see the end of 'best viewed with browser X' logos, and look forward to a web which can be accessed by anybody, anywhere and on any device.
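A minimal sketch of what such a page might look like is shown below, assuming XHTML 1.0 Strict. The page content and the style sheet file name style.css are invented for the example.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>Opening hours</title>
  <!-- All styling lives in a separate file (hypothetical name) -->
  <link rel="stylesheet" type="text/css" href="style.css" />
</head>
<body>
  <!-- The document itself contains only structure and content -->
  <h1>Opening hours</h1>
  <p>We are open every weekday.</p>
</body>
</html>
```

Every tag is in lower case, properly nested and explicitly closed, and nothing in the document says what the heading should look like. A rule in the style sheet, such as h1 { color: red; }, would take care of that.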

Designing web pages which are technically correct is important. It means that they will look great in your browser, and also in any other browser with similar capabilities. Your pages will degrade gracefully in less capable browsers and the information contained in the pages will be available to everyone - even those with special needs.

Note that this course was written in 2003, and things have moved on since then! You might like to check out this site for a more up-to-date HTML overview.

© Dial Solutions 2003
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; A copy of the license can be found at www.gnu.org/copyleft/fdl.html