Introduction
In your normal work as an HTML author, you'll usually start an HTML file either with a wizard or by copying from another HTML file. However, for a real understanding of HTML, it's a good idea to know how to build one from scratch. That's what this document shows you. Read on.
A Minimal HTML Hello World
This section showcases a four-line Hello World HTML file that renders as expected and is well formed XML, but produces an error and a warning upon HTML validation. Links for this minimal file example follow:
View in browser View source code
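The linked source isn't reproduced on this page, so here is a sketch of what a four-line file matching the description could look like. The exact lines are an assumption reconstructed from the text: a doctype, an "html" element with no "lang" attribute, and a single "p" element, with no "head", "title" or "body" tags.

```html
<!DOCTYPE html>
<html>
<p>Hello world!</p>
</html>
```

Every start tag here has a matching end tag, which is what keeps the file well formed XML even though it is incomplete as HTML.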
This file produces two messages, an error and a warning, when run through the W3C HTML Validator.
The error is that the "head" element has no "title" child. That isn't surprising: the file contains no "head" tags at all (the HTML parser supplies an implied, empty "head" element), and nothing in it specifies a title. The validator doesn't complain about it, but the file also lacks "body" tags. The content model of the "html" element is a "head" element followed by a "body" element, so you should write both explicitly. And in fact you always need a "head" element to contain your "title" element.
The warning says we really should have a "lang" attribute on the "html" element. All these problems are addressed in the next section.
A Completely Valid Hello World HTML File
The preceding section featured a 4 line HTML file that renders fine but throws an error and a warning. This is unacceptable if you want your page to render identically on all competent browsers.
I KNOW, I KNOW
Naysayers will tell you that "in the real world" validation errors aren't bad, and that most web pages have validation errors. They're right about most web pages having errors, but they're wrong that errors aren't bad. Most web pages on the Internet have validation errors in the double digits, and a heck of a lot of those render horribly on at least some mainstream browsers. Many have substantial extra code to pander to the different ways different browsers handle non-valid HTML. Once you know HTML, it's not hard to eliminate all errors and most of the warnings. Be one of the minority who does it right.
The previous 4 line HTML Hello World had the following problems:
- No "head" element.
- No "title" element contained in the "head" element.
- No "body" element (not pointed out by the validator).
- No "lang" attribute in the "html" element.
The following HTML file fixes all these problems:
View in browser View source code
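As a sketch of what such a corrected file could look like, the following adds the four missing pieces listed above. The title text "Hello World" is an assumption chosen for illustration, and the linked original may differ in detail.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<title>Hello World</title>
</head>

<body>
<p>Hello world!</p>
</body>
</html>
```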
Looking at this file's source, you see we did the following:
- Added a "lang" attribute to the "html" element, and gave this attribute the value "en" because this document is in English. You'd change the value to accommodate other languages.
- Added a "head" element containing a "title" element.
- Added a "body" element, and placed the "p" element inside it.
Although this is a perfectly valid HTML file, it's missing two things that are a must in modern HTML files:
- A viewport.
- CSS
But before discussing the preceding two additions, let's discuss why your HTML should also be well formed XML.
Well Formed XML
HTML5, which is the version you should use, allows but doesn't require the HTML to also be well formed XML. Well formed means conforming to the XML grammar, or syntax if you prefer. Always make your HTML well formed XML. Here's why.
- Well formed XML is several times easier to understand.
- Well formed XML is several times easier to debug.
- Well formed XML is several times easier to author.
XML Checker Program:
You can download and use my Free Software xmlchecker.py program to check for well formed XML almost instantly, even on very large HTML files. Note that it stops on the first XML error it finds, so you iterate between checking and fixing problems.
Making your HTML well formed XML is as simple as putting ending tags on all container elements and putting a slash before the closing angle bracket of non-container elements. For example:
Well formed | Not well formed |
<p>Pgf 1</p><p>Pgf 2</p> | <p>Pgf 1<p>Pgf 2 |
<hr/> | <hr> |
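To see the same two rules at a slightly larger scale, here is a hedged side-by-side sketch. The elements used ("br", "ul", "li") are chosen only for illustration and don't come from the example files in this document.

```html
<!-- Well formed XML: every container has an end tag,
     and every non-container element ends with a slash. -->
<p>First line<br/>second line</p>
<hr/>
<ul>
<li>Item one</li>
<li>Item two</li>
</ul>

<!-- Accepted by HTML5, but not well formed XML:
     end tags are omitted and there are no trailing slashes. -->
<p>First line<br>second line
<hr>
<ul>
<li>Item one
<li>Item two
</ul>
```

In the second version you have to know HTML's tag omission rules to tell where each "p" and "li" element ends. In the first version the markup tells you.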
Most of the example code in this document has no indentation because I want you to get used to matching open and close tags rather than depending on indentation. On actual websites, it's amazing how often indentation doesn't match what's really going on. What I recommend is the use of blank lines to make things clearer. Certainly there should be a blank line between the end of the "head" element and the beginning of the "body" element.
But Blank Lines Lengthen Download Time, Oh My!
You can use blank lines to your heart's content because each one costs a single character (a line feed) in a file with Linux/Unix line endings, or two characters (carriage return plus line feed) in a file with Windows line endings. You might find this surprising because many websites and tools such as Bootstrap remove all blank lines and a lot of space characters in order to save space. The explanation is that these sites and tools are so bloated that people insist on a reduction, even if it's as symbolic as removing blank lines. The real way to make smaller, quicker loading HTML and CSS is to quickly author something custom for your website or group of websites so your visitors needn't download everything plus a kitchen sink and marching band.
Adding a Viewport
Valid HTML doesn't require a viewport, but if you want your site to render properly on both full sized monitors and little smart phones, you need one. Adding a viewport requires exactly one line inside the "head" element, but it's a difficult-to-remember line, as shown in the following code snippet:
<meta name="viewport" content="width=device-width, initial-scale=1.00"/>
Remember to add the slash before the trailing right angle bracket in order to make it well formed XML. Also, be sure to make a file, in easy reach, to copy from and paste into new HTML files. After adding the viewport, the Hello World file looks like the following:
View in browser View source code
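A sketch of how the file might look with the viewport line in place follows. The title text and the position of the "meta" element relative to the "title" element are assumptions; the text only requires that the line sit inside the "head" element.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<title>Hello World</title>
<meta name="viewport" content="width=device-width, initial-scale=1.00"/>
</head>

<body>
<p>Hello world!</p>
</body>
</html>
```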
Adding CSS
CSS is the way you alter the appearance of specific categories of text, containers, and so much more. Given modern expectations of what a web page should look like, CSS is a must. This section presents a trivial CSS example that does nothing except make all text sans-serif by setting the "font-family" property of the "body" element to "sans-serif". A common view is that sans-serif fonts look much better on a computer or device screen than do the serif fonts used in books printed on paper.
To put CSS into your web page, you insert a "style" element inside your "head" element, and then inside the "style" element you insert CSS rules.
NOTE:
The "style" element needs no attributes. In particular, if you give it a type="text/css" attribute, the W3C HTML validator throws a warning. In times long ago the "type" attribute was required, and unfortunately many website editors and constructors still add it automatically. Out-of-date documentation also tells you to add it. Ignore such documentation's suggestion, and if you use an editor or constructor that adds the attribute, manually take it out.
In this section's example source code, the "style" element is surrounded by blank lines to make it easier for you to see at a glance. In this section's example browser rendering, you'll notice that unlike previous examples in this document, the font is sans-serif.
View in browser View source code
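Putting the pieces from this section together, the file could look like the following sketch. The title text, the ordering inside the "head" element and the exact spacing of the CSS rule are assumptions. The parts taken from the text are the attribute-free "style" element, the blank lines around it, and the sans-serif rule on "body".

```html
<!DOCTYPE html>
<html lang="en">
<head>
<title>Hello World</title>
<meta name="viewport" content="width=device-width, initial-scale=1.00"/>

<style>
body {font-family: sans-serif;}
</style>

</head>

<body>
<p>Hello world!</p>
</body>
</html>
```

Because the rule is applied to "body", every element inside the body inherits the sans-serif font unless a more specific rule overrides it.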
Wrapup
This document walked you through constructing a well formed XML style HTML document. It started with a four-line file that renders well but has validation errors, and then fixed the validation errors and warnings by adding a "lang" attribute to the "html" tag, including a "head" element followed by a "body" element, inserting the "title" element into the "head" element, and moving the "p" element with the text "Hello world!" into the "body" element.
It then showed how to insert a viewport into the "head" element so your web page can render well on screens of all sizes, and finally introduced CSS into the HTML file.
You should be able to create this simple HTML file from scratch so it doesn't seem like "magic". However, in real life, you'll probably create it from a template file, with an editor macro, or by copying an existing HTML file to start the new one.
Your HTML file should also be well formed XML in order to make creation, maintenance and debugging easier. Don't listen to folks saying that HTML shouldn't be XML: writing non-XML HTML requires a whole lot more HTML knowledge, attentiveness, practice and patience. All for maybe a 5% decrease in the size of the HTML part of your document (not the CSS), and bragging rights that they did "real HTML". But the "real HTML" argument doesn't pass muster because the HTML5 standard specifically allows you the option of making your HTML5 well formed XML. As for the 5% HTML-only size reduction, simply authoring your own HTML probably saves much more than that when compared to kitchen sink tools.
You're now ready to start the Web Workmanship HTML Primer.