There have been many different versions of HTML since the World Wide Web was invented in the early 90s by Tim Berners-Lee. The ‘rules’ for using each version are encapsulated in the standards published by the World Wide Web Consortium (W3C). The standards dictate the tags publishers are allowed to use (and in what order), and how those tags should be interpreted by browsers and other ‘user agents’. For example, text within heading tags is interpreted as a heading, and text within paragraph tags is interpreted as a paragraph.
How else would your web browser know that the <h1> tag around a particular piece of text means ‘display this as a heading’, if it didn’t have a set of rules to follow?
It is said (with irony) that ‘the great thing about standards is that there are so many to choose from’, and in this case it is quite true. However, if you are a beginner, you won’t go far wrong if you decide to use the latest – and final – version of HTML released by the W3C, i.e., HTML 4.01 Strict. This is a good standard to adopt as it will never change (giving you a solid, reliable way to mark up your pages), and web browsers will understand your pages for a good while yet.
The ‘Strict’ part of the name means that you should not use tags and attributes that are no longer part of the final HTML standard (the jargon used when referring to these non-standard tags and attributes is ‘deprecated’). Mostly these are tags and attributes that are related to altering page presentation, e.g., the <font> tag and the bgcolor attribute. Staying away from deprecated tags and attributes removes a few more potential barriers to accessibility for your visitors.
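As a rough illustration (the colour values, class name and text are made up for this example), the first fragment below leans on the deprecated <font> tag and bgcolor attribute, while the second is one way the same content could be written under HTML 4.01 Strict, leaving colours and font size to a style sheet:

```html
<!-- Fragment 1, deprecated: presentation is mixed into the markup -->
<body bgcolor="#ffffcc">
<p><font color="#990000" size="4">Opening hours have changed.</font></p>
</body>

<!-- Fragment 2, Strict: the markup describes structure only -->
<body>
<p class="notice">Opening hours have changed.</p>
</body>
```

In the second fragment the background colour and the look of the notice would be set by CSS rules for body and for the hypothetical "notice" class.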
If you are a more experienced coder and you have the tools to ensure that you don’t make mistakes when marking up documents, you may prefer to use XHTML 1 Strict. Adopting this more up-to-date standard will make you feel more virtuous, assist with accessibility, and help to future-proof your pages.
(If you read the excellent book, Designing With Web Standards by Jeffrey Zeldman, you will be convinced that you should be using XHTML 1 Transitional – but if you read the warnings by Ian Hickson about the dangers of sending XHTML as text/html you will be slightly less sure.)
Future developments in the area of ‘markup for the web’ are based around the use of Extensible Markup Language – XML for short. This is not the place for a long discussion about what XML is and what it can be used for; suffice it to say that it is a way of labelling and adding structure to data. In the case of XHTML – which is an application of XML – the data being labelled and structured is the content of a web page. XML is potentially a more flexible and ‘intelligent’ way of adding labels to your web-published documents, because it is designed to make it easier for computers to process and transform documents into different formats.
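To give a flavour of the difference, here is a minimal sketch of an XHTML 1.0 Strict page (the title and text are placeholders). Because XHTML follows XML rules, element names are lowercase, every element is explicitly closed, and attribute values are always quoted:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>A placeholder title</title>
</head>
<body>
<h1>First heading</h1>
<!-- Empty elements such as br are closed with a trailing slash -->
<p>First line of a paragraph<br />
and the second line.</p>
</body>
</html>
```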
All HTML pages must include a Document Type Declaration (often just called the DOCTYPE) as the first thing on the page; it names the Document Type Definition (DTD) that the page follows. For example, if you are using HTML 4.01 Transitional, that declaration will be:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
There are three DTDs for HTML 4.01, ‘Strict’, ‘Transitional’ and ‘Frameset’. Using ‘Transitional’ means you are allowed to use some of the ‘deprecated’ tags in your pages (i.e. tags that are no longer part of the standard). If you are interested in finding out more I suggest the article, Fixing Your Site With the Right DOCTYPE.
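For reference, these are the declarations for the three HTML 4.01 DTDs as given in the W3C HTML 4.01 Recommendation. The comments are only labels for this listing; a real page uses exactly one declaration, placed before anything else:

```html
<!-- Strict -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

<!-- Transitional -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

<!-- Frameset -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
```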
Standard HTML documents are ‘marked up’ in a way that makes it clear which parts are headings, which parts are paragraphs, which parts are images, lists, and so on.
Using HTML in a standard way means you have to put the correct labels (i.e. tags) around the appropriate document structures: headings should be marked up using heading tags, lists should be marked up using list tags, passages of quotation should be marked up with the blockquote element, and so on. In other words, the HTML tags should be used to label the various elements of your documents according to the rules of HTML. Using standard markup also means that you are using the tags in the correct places and in the correct order, e.g. you can’t put a heading tag inside an image tag.
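A small page marked up along these lines might look like the sketch below. The subject matter and wording are invented for the example; the point is that each piece of content sits inside the tag that describes what it is:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Growing tomatoes</title>
</head>
<body>
<!-- Headings are marked up with heading tags -->
<h1>Growing tomatoes</h1>
<p>Tomatoes need warmth, water and plenty of light.</p>
<h2>What you will need</h2>
<!-- Lists are marked up with list tags -->
<ul>
<li>Seeds</li>
<li>Compost</li>
<li>A sunny windowsill</li>
</ul>
<!-- Quotations are marked up with blockquote -->
<blockquote>
<p>Sow little and often.</p>
</blockquote>
</body>
</html>
```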
An example of a heading created using non-valid markup:
<b><font size=4>First heading</font></b>
The presentation tag <b> has been used to make the text look bold, and the <font> tag has been used to make it bigger than the default size.
The same heading marked up using standard HTML:
<h1>First heading</h1>
Having marked up the text as a heading, the presentation aspects can be altered using Cascading Style Sheets (CSS).
See the article HTML – looking down at it from a very great height for a short introduction to using standard HTML.
The W3C provide a free validation tool to check whether your document is valid HTML: http://validator.w3.org/. Your document must include a Document Type Declaration (DOCTYPE), so that the validator will know the HTML version used to mark up your page.
Once a document has been created using standard HTML you can alter the way it is presented by using different ‘style sheets’, in much the same way that you can use different styles to alter the look of the text in a Microsoft Word document.
Cascading Style Sheets contain information to set, among other things, the size and colour of headings, the justification of text, the layout of the page, and so on. In other words CSS should be used to provide information about how the page looks for visual users, and in more general terms, how it is presented to different types of user and ‘user agent’ (i.e., browsers).
HTML should ideally be used only to mark up the structure of your document (i.e. say what bits are headings, paragraphs, lists, quotes, and so on), with CSS being used to determine the presentational aspects of that document (e.g. how it looks). By separating structure from presentation you are creating more flexible pages. A given user can then apply their own style sheet so that the content is presented in a way that suits their needs.
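As a minimal sketch of this separation (the particular sizes and colours are arbitrary choices for the example), the page below keeps all presentational information in a style element in the head, while the body contains only structural markup:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<title>Style sheet example</title>
<!-- Presentation lives here, not in the body markup -->
<style type="text/css">
h1 { font-size: 150%; font-weight: bold; color: #003366; }
p { text-align: left; line-height: 1.5; }
</style>
</head>
<body>
<h1>First heading</h1>
<p>The markup says what this text is; the style sheet says how it looks.</p>
</body>
</html>
```

The same rules could instead be kept in a separate file and attached with a link element, for instance <link rel="stylesheet" type="text/css" href="styles.css"> (the filename is hypothetical), so that many pages share one style sheet and the look of all of them can be changed in one place.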
Using standard HTML and CSS will mean your pages are more flexible in the way they can be presented to the end users.
Using standard Hypertext Markup Language (HTML) also ensures that your pages will work on the widest range of hardware and software.
See the Web Standards Project (WaSP) for further information, and the advantages of using standard HTML.