Troubleshooters.Com and Code Corner Present

Javascript 101
Start today, avoid the landmines

Copyright (C) 2006 by Steve Litt
 

Introduction

Javascript solves one of the biggest problems in HTML -- bandwidth constraints. By working entirely on the client without the need to consult the server, it reduces dependency on the wire and also on the (probably overworked) server. Additionally, in certain situations Javascript solves another huge HTML problem -- statelessness. What happens on the client persists on the client. It doesn't eliminate the need for cookies and databases, but it can cut the overhead for minor updates. Javascript is a miracle!

Besides being a miracle, Javascript is also a scourge. It is by far the least debuggable language I've used, and I've written TSR's in Assembler. The precise attributes that make Javascript the perfect client side language also make it a black box. Although you code it somewhat like a CGI language, there's no way to view the "produced HTML code". Indeed, if you use your browser's "view source" facility, you'll see your Javascript verbatim, which of course doesn't help a bit. When using Javascript, debugging will be your top priority.

What do I mean by the phrase "debugging will be your top priority"? I mean that throughout your development, you will constantly work to avoid the situation where a missing set of parentheses somewhere causes your web page to be blank or fail, without the slightest trace of an error message. In any Javascript project, and I mean as little as 20 lines, your first step will be to set up a debugging facility. You will not proceed until that debugging facility works.

Once your debugging facility works, you will develop in small steps, testing frequently. Yes, that slows you down, but not half as much as needing to comment out your whole program and start uncommenting in order to debug. You will frequently save working versions so you can either drop back to them, or compare them against your current working version. The successful Javascript programmer accepts the debugging frustrations as just part of the job, and always pays attention to his attitude, taking mental steps to avoid anger or depression (yes, depression -- Javascript can be that bad). For tips on The Attitude, click here.

Javascript is so slow in debugging that nobody would use it, except for the exceptional role it plays in making the browser into an application, or at least a part of an application. Compare Javascript's debugging speed to that of other languages. Python, Ruby and Java almost debug themselves. Even the most careless coder can be super productive in those languages, always assuming he's good at program design. The various Basic interpreters and compilers, including VB, are fairly easy to debug, although their lack of structure makes good design difficult. Perl is difficult to debug because of its insistence on viewing everything as legitimate, even when you use strict. C is even more difficult to debug (runtime problems) because it allows the programmer to freely manipulate memory locations, with almost no checks and balances. C++ could conceivably correct some of that, but in practice it often simply adds a rather complex OOP syntax to the memory problems. As far as I know, Standard Template Libraries still don't work right on GNU compilers.

The Queen of debugging difficulty is Assembler, where simply coding a debugging print is difficult, and the entire program manipulates memory, with direct, indirect and doubly indirect addressing. I know of only one language more difficult to debug than Assembler: Javascript. Javascript is a frustrating black box with no facilities for seeing what your code does. You see no error messages. All you see is a finished product. Or not.

Javascript is fairly portable between browsers, so it's probably the best client side scripting language to use. If you need client side scripting, use Javascript. If you can get by without client side scripting, don't use Javascript for the fun of it -- it's not fun.

Hello World

As discussed in the introduction, Javascript is a bear to debug. Therefore, our "Hello World" exercise will be creating a debugging facility. In most "Hello World" proofs of concept I recommend coding it by hand, but in this case I recommend cutting and pasting. A single typo could make the web page blank out, without the slightest trace of what caused it, and without any kind of debugging facility to help you figure it out. Once you complete this "Hello World", you'll have a debugging facility.

First, code this Javascript file, and name it mydebugger.js:
<!-- //HIDE THIS SCRIPT FROM NON-JAVASCRIPT BROWSERS
function errmsg(string)
{
var p=document.createElement('p');
p.appendChild(document.createTextNode(string));
document.getElementById("errmsg").appendChild(p);
}
// END HIDING FROM NON-JAVASCRIPT BROWSERS -->

Next, code this HTML file, and name it mydebugger_test.html:
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">

<html>
<head>
<title>Test my debugger</title>
<script language='javascript' src='mydebugger.js'></script>
</head>

<body>
<small id='errmsg'>MY DEBUGGER</small>

<script language='javascript'>
<!--
errmsg('Steve was here');
errmsg('and now is gone');
errmsg('but left his name');
errmsg('to carry on!');
//-->
</script>

<p>Hello World</p>
</body>
</html>

Now browse mydebugger_test.html, using either http:// or file:///. Either way, you'll see something like this:
[Screenshot of hello world]

You've just created a web page in which embedded Javascript code (the calls to errmsg()) writes to the page in real time, without contacting the server. Now let's review the code:

mydebugger_test.html

As you remember, mydebugger_test.html looked like this:
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">

<html>
<head>
<title>Test my debugger</title>
<script language='javascript' src='mydebugger.js'></script>
</head>

<body>
<small id='errmsg'>MY DEBUGGER</small>

<script language='javascript'>
<!--
errmsg('Steve was here');
errmsg('and now is gone');
errmsg('but left his name');
errmsg('to carry on!');
//-->
</script>

<p>Hello World</p>
</body>
</html>

The Top Line

The top line is the document type line. Depending on the web server, getting it wrong can cause the page to fail with or without error messages. Often this line must be followed by a blank line or the page will fail (on some servers). The bottom line is, if you get errors on the server that you don't get when browsing the file directly on your hard disk, pay close attention to this line, or maybe even make a trivial HTML page using the top line (and its associated blank line if any), to see whether the problem is the top line.

The Include Line

This line does nothing but include the mydebugger.js Javascript file. It looks like this:
<script language='javascript' src='mydebugger.js'></script>
Whenever using include files, be VERY careful to include the ending tag (</script>). Otherwise, it will foul up the whole HTML page, you will not have a debugger, and it will be very hard to figure out. This is no big deal with a little Hello World program, but on a web page with a couple hundred lines of HTML and a couple hundred lines of Javascript in an include file, this is a VERY hard error to catch.

This include line must always be in the <head></head> section so that it is included before rendering, ready to use, and in scope everywhere.

The Error Box

The error box is where error messages go. The word "box" is used loosely, because in fact it can be almost any HTML element that can be assigned an ID and "contain" text, including <small>, <b>, <td>, <pre>, <div>, and several others. The Javascript code in errmsg() simply finds the HTML element by its ID, and then appends a paragraph of text as a child of that element, following all its other children. In that way, you get a running log. Here's what the error box HTML code looks like:
<small id='errmsg'>MY DEBUGGER</small>
The preceding code creates a <small> element whose ID is 'errmsg'. The element contains one child element -- a text node of value MY DEBUGGER. The first call to errmsg() will create a child paragraph containing text, and append that child after the MY DEBUGGER child.

The error box should be the first element in the web page's body. That's because it cannot be written to until it's rendered, so calls to errmsg() must either occur below the error box, or must occur after the entire page has rendered, in response to a user action (clicking something or changing focus).
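One way to soften the ordering rule above is to queue messages that arrive before the error box exists and write them out later. The following is only a sketch under assumptions: errmsgSafe(), flushErrmsg(), appendMessage() and the pending array are hypothetical names, not part of mydebugger.js, and the optional doc argument exists only so the functions can be exercised outside a browser.

```javascript
// Sketch only: hypothetical helpers, not part of the original mydebugger.js.
var pending = [];   // messages logged before the error box was rendered

// Append one paragraph of text to the error box, exactly as errmsg() does.
function appendMessage(doc, box, string) {
  var p = doc.createElement('p');
  p.appendChild(doc.createTextNode(string));
  box.appendChild(p);
}

// Log immediately if the error box exists, otherwise remember the message.
function errmsgSafe(string, doc) {
  doc = doc || document;
  var box = doc.getElementById('errmsg');
  if (box) {
    appendMessage(doc, box, string);
  } else {
    pending.push(string);
  }
}

// Write out any queued messages, oldest first, once the box is available.
function flushErrmsg(doc) {
  doc = doc || document;
  var box = doc.getElementById('errmsg');
  if (!box) { return; }
  while (pending.length > 0) {
    appendMessage(doc, box, pending.shift());
  }
}
```

With something like this, a call made from the head section is held until flushErrmsg() runs, for instance from the body's onload handler. The messages still end up in the same running log.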

In this example, for simplicity we make the containing element a <small> element. As you advance, you'll probably want to make your debugging box be a real box in order to completely separate the debugging log from genuine web content. For instance, the following debugging line (actually two lines) would make the debugging log small and bold, inside a box with an 11 pixel border:
<table border='11'><tr><td><b><small id='errmsg'>MY DEBUGGER</small></b></td></tr></table>
Notice that it's basically the same thing, with the element directly containing the text possessing an ID value of errmsg. Here's how the web page renders if you replace the original error box line with the preceding:
[Screenshot: Hello world with text in a table]

Now it's obvious which text is the debugging log and which text is the other web content.

The Inline Script Tag

Everything contained within tags <script> and </script> is a script. The scripting language is defined by the language attribute. Unlike the include line in the header, this <script> tag does not contain a src attribute, because in fact the source is contained between the tags. The opening tag looks like this:
<script language='javascript'>
The language to be used in interpreting everything up to the closing tag is Javascript.

The Comment Tags

You don't want your inline Javascript read as HTML by the non-Javascript browsers of the 20th century, or by a current browser whose owner has disabled Javascript. Therefore, you encase the entire script in HTML comment tags, <!-- and -->. Javascript ignores those lines, and thanks to the fact that they're HTML comment delimiters, non-Javascript browsers will also. Note that a Javascript aware browser would work just fine without these comment lines, but for portability's sake they're a good idea.

mydebugger.js

As a reminder, here's what mydebugger.js looked like:

<!-- //HIDE THIS SCRIPT FROM NON-JAVASCRIPT BROWSERS
function errmsg(string)
{
var p=document.createElement('p');
p.appendChild(document.createTextNode(string));
document.getElementById("errmsg").appendChild(p);
}
// END HIDING FROM NON-JAVASCRIPT BROWSERS -->

As mentioned in the previous section, the <!-- and --> tags bracketing the file make it invisible to non-Javascript browsers. In this case, I also included Javascript comments on those lines, and of course preceded them with Javascript's line commenter (//). As you can see, Javascript syntax looks a heck of a lot like C, except object oriented.

Inside the curly braces, the first line declares a variable called p, and initializes it to be an HTML paragraph element, using the createElement() method of the document object. This method is available by virtue of Javascript and your (Javascript aware) browser, so you needn't code it. The createElement() method always takes a string for an argument, that string being the string inside the equivalent HTML tag:
var myparagraph = document.createElement('p'); // create a <p></p>
var mytable = document.createElement('table'); // create a <table></table>
var myrow = document.createElement('tr'); // create a <tr></tr>
var mycell = document.createElement('td'); // create a <td></td>
var mybold = document.createElement('b'); // create a <b></b>
var mymonospace = document.createElement('tt'); // create a <tt></tt>
var mycode = document.createElement('pre'); // create a <pre></pre>
var mydiv = document.createElement('div'); // create a <div></div>
All of these elements can contain text, and many of them can contain the others (<td> being a great example). If you're confused why one uses these routines instead of just issuing HTML tags, that will be explained in the next article.

The second line appends a newly created text node as a child of the paragraph created in line 1. Note that text nodes are different from normal elements. For instance, in HTML they are expressed literally, and not as tags. Also, they cannot contain anything else. So it's no surprise that they are created with the document.createTextNode() method instead of document.createElement(). Once the new text node is created, it's appended as the last child of the paragraph element with the appendChild() method.

On the third line, the paragraph itself is appended as the last child of the element whose ID is 'errmsg'. That element is found via the document.getElementById() method, and then that element's appendChild() method appends the paragraph after the last child in the retrieved element, thereby extending the error log.

This was more complex than most Hello World type programs, but it's all for a good cause. Armed with your debugger, you will have a much easier time with all your Javascript activities.
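Once errmsg() works, one possible next step is to route runtime exceptions into the same log. This is a minimal sketch, assuming a hypothetical helper named runLogged() that is not part of the article's code. It takes the logging function as a parameter, so in a page you would pass errmsg.

```javascript
// Sketch only: runLogged is a hypothetical helper. The log argument is
// assumed to have the same signature as errmsg() from mydebugger.js.
function runLogged(label, fn, log) {
  try {
    return fn();
  } catch (e) {
    // Report the failure instead of letting the script die silently.
    log(label + ' failed: ' + (e && e.message ? e.message : String(e)));
    return undefined;
  }
}

// Possible use inside a page that has loaded mydebugger.js:
//   runLogged('build table', function () { buildMyTable(); }, errmsg);
// (buildMyTable is a placeholder for whatever code you are testing.)
```

Note that this only catches errors raised while the code runs. A syntax error, such as the missing parenthesis mentioned in the introduction, keeps the whole script block from being parsed, so small steps and frequent testing remain necessary.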

DOM

Consider three ways to represent a hierarchy. The first way is an outline:
Table
   Row
      Cell
         Bold
            Small
               'This is some text'
An outline is probably the most readable way to represent a hierarchy, as long as no elements are too long for the single line they've been allocated. A second way to represent a hierarchy is with begin and end tags, as seen in HTML:
<table><tr><td><b><small>This is some text</small></b></td></tr></table>
Begin and end tags are certainly concise, requiring very little "coding". Begin and end tags are, BY FAR, the easiest way to represent complex hierarchies with large elements in a disk file. However, to the human eye, they don't immediately look like a hierarchy. Worse, it's easy for a human to make mistakes coding such HTML, either by leaving out a tag, or worse yet by interleaving tags:
<small>This is <b>some</small> text</b>
The intent here was probably to make something like this:

This is some text

The first two words are small, the last word is bold, and the third word is both small and bold. The human concept is perfectly understandable. But it's not a hierarchy, and therefore isn't legitimate HTML. The small and bold sections are interlinked. Various browsers might render it the way you want, but there's no guarantee. The right way would be to represent it as a true hierarchy:
<small>This is </small><b><small>some</small></b><b> text</b>
An outline representation of the preceding would be this:
small
   'This is '
bold
   small
      'some'
bold
   ' text'
You might be thinking this is an awful lot of work to represent a concept extremely obvious to a human, and you'd be right. But computer programs aren't human, and if required to regularly calculate the state of each character, the program would be huge and buggy. Computers work much better with hierarchies, and if worst comes to worst, one can use a front end to type in a human-centric way, yet output a hierarchy.

Modern browsers operate on HTML describing a hierarchy. Some can "guess" at non-hierarchical HTML, but there are no guarantees.
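To make the difference between nested and interleaved tags concrete, here is a rough sketch of a nesting check that uses a stack of open tag names. The function name isProperlyNested() is hypothetical, and the check is deliberately simplistic: it assumes every tag is paired, and it ignores comments, unpaired tags such as <br>, and any > characters inside attribute values.

```javascript
// Sketch only: a hypothetical, simplified nesting check for paired tags.
function isProperlyNested(html) {
  var tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g;
  var open = [];      // stack of tag names currently open
  var match;
  while ((match = tagPattern.exec(html)) !== null) {
    var name = match[2].toLowerCase();
    if (match[1] === '/') {
      // A closing tag must match the most recently opened tag.
      if (open.length === 0 || open.pop() !== name) { return false; }
    } else {
      open.push(name);
    }
  }
  return open.length === 0;   // anything left open is also an error
}

// isProperlyNested('<small>This is <b>some</small> text</b>')   gives false
// isProperlyNested('<small>This is </small><b><small>some</small></b><b> text</b>')   gives true
```

The interleaved version fails because </small> arrives while <b> is the most recently opened tag, which is exactly what keeps it from being a hierarchy.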

So far we've described two ways of representing a hierarchy -- outlines and tags. A third way is a node tree, where each element has pointers to its parent, to its previous sibling, to its next sibling, to its first child and to its last child. Given those pointers, navigation and all sorts of hierarchy manipulation are trivially simple. This is the best way to represent a hierarchy within computer memory. One simple implementation of such nodes is my Node.pm and Node.rb tools. Another is DOM.
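As an illustration of how little code those pointers require, the following sketch walks a node tree and produces the outline form described earlier. The function name outlineOf() is hypothetical. It relies only on the standard DOM node properties nodeType, nodeName, nodeValue, firstChild and nextSibling.

```javascript
// Sketch only: outlineOf is a hypothetical helper that turns a node tree
// into an array of indented outline lines.
function outlineOf(node, depth, lines) {
  depth = depth || 0;
  lines = lines || [];
  var indent = new Array(depth + 1).join('  ');   // two spaces per level
  if (node.nodeType === 3) {                      // 3 means a text node
    lines.push(indent + "'" + node.nodeValue + "'");
  } else {
    lines.push(indent + node.nodeName.toLowerCase());
  }
  // Visit the children by following the first child, then next siblings.
  for (var child = node.firstChild; child; child = child.nextSibling) {
    outlineOf(child, depth + 1, lines);
  }
  return lines;
}
```

In a browser you might pass it an element found with document.getElementById() and feed the resulting lines to errmsg(). Leading spaces collapse in ordinary HTML text, so the indentation shows up only if the error box is something like a <pre> element.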

DOM stands for Document Object Model. It's a way of objectifying a tagged document such as HTML or XML. Using DOM, every HTML tag becomes an Element. Most elements can contain other elements, thereby making a hierarchy. Elements can have attributes. An attribute is a fact about the element. Attributes cannot contain anything else. Rather than give a big description, let's convert some HTML into DOM:
<table border='11'><tr><td><i><big>This is some text</big></i></td></tr></table>
The preceding HTML can be expressed in DOM code as follows:

var mytable = document.createElement('table');
mytable.setAttribute('border', '11');

var tr = document.createElement('tr');
mytable.appendChild(tr);

var td = document.createElement('td');
tr.appendChild(td);

var italics = document.createElement('i');
td.appendChild(italics);

var big = document.createElement('big');
italics.appendChild(big);

var text = document.createTextNode('This is some text');
big.appendChild(text);

document.getElementById('errmsg').appendChild(mytable);

The preceding code starts by creating a table element, then creates a row and appends it as a child of the table, and repeatedly descends, creating and appending as a child until the text is inserted. The last line is necessary to display the newly made table. It puts the table inside the element whose ID is errmsg so that it can be displayed. If you insert the preceding code immediately after the calls to errmsg() in the Hello World program, the output looks like this:
[DOM Example screenshot]

In the preceding, you see you built a table with an 11 pixel border (an attribute of the table set by the setAttribute(key,value) method). The text inside the single cell of the single row of the table is italic and large.
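The create-and-append pattern repeats at every level, so it can be folded into a loop. The following is a sketch only: buildNested() is a hypothetical helper, not something the DOM provides, and it takes the document object as its first argument purely so it can be tested outside a browser.

```javascript
// Sketch only: buildNested is a hypothetical helper. It creates one element
// per tag name, each the child of the previous one, puts the text in the
// innermost element, and returns the outermost element.
function buildNested(doc, tagNames, text) {
  var outermost = null;
  var current = null;
  for (var i = 0; i < tagNames.length; i++) {
    var element = doc.createElement(tagNames[i]);
    if (current) {
      current.appendChild(element);   // descend one level
    } else {
      outermost = element;            // first element is the root
    }
    current = element;
  }
  if (current) {
    current.appendChild(doc.createTextNode(text));
  }
  return outermost;
}

// Possible use in the Hello World page:
//   var mytable = buildNested(document, ['table', 'tr', 'td', 'i', 'big'],
//                             'This is some text');
//   mytable.setAttribute('border', '11');
//   document.getElementById('errmsg').appendChild(mytable);
```

This handles only a single chain of nested elements. A table with several rows or cells would still need individual appendChild() calls.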

The W3C Connection

At this point you might wonder why you can't simply insert the HTML code into an element, rather than creating all these objects. On complex hierarchies it could cut coding verbosity roughly fivefold.

You can! Microsoft created an object property, called innerHTML, which you can read or write just like any other OOP instance variable. When you read it, it gives you all the HTML contained by the element it belongs to. When you assign to it, it replaces any HTML contained by the element with the new HTML. In other words, using this property, you could have shortened this:

var mytable = document.createElement('table');
mytable.setAttribute('border', '11');

var tr = document.createElement('tr');
mytable.appendChild(tr);

var td = document.createElement('td');
tr.appendChild(td);

var italics = document.createElement('i');
td.appendChild(italics);

var big = document.createElement('big');
italics.appendChild(big);

var text = document.createTextNode('This is some text');
big.appendChild(text);

document.getElementById('errmsg').appendChild(mytable);

to this:

document.getElementById('errmsg').innerHTML = "<table border='11'><tr><td><i><big>This is some text</big></i></td></tr></table>";

The innerHTML technique is MUCH simpler and more reminiscent of HTML. It's supported by Microsoft's Internet Explorer, by Mozilla Firefox (at least my version 1.0.6), and probably by many other browsers. But I don't use it. Why not?

Because it's not part of the W3C standard.
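If you stay with the W3C DOM methods, replacing an element's contents (which an assignment to innerHTML does in one step) takes a short loop. The following is a sketch, assuming a hypothetical helper named replaceContents(). It uses only firstChild, removeChild() and appendChild().

```javascript
// Sketch only: replaceContents is a hypothetical helper. It removes every
// existing child of the element, then appends the new child.
function replaceContents(element, newChild) {
  while (element.firstChild) {
    element.removeChild(element.firstChild);
  }
  element.appendChild(newChild);
  return element;
}

// Possible use, with mytable built by the DOM code shown earlier:
//   replaceContents(document.getElementById('errmsg'), mytable);
```

Keep in mind that calling this on the error box wipes out the log accumulated so far, so for debugging you would normally point it at some other element.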


Additionally, before a computer program can deal with begin and end tags, it must parse them into a node hierarchy that looks more like an outline.

Outputting to Stdout

T

Outputting to Stderr

A

Environment Variables and Return Values

P

Branching

B

User Input

W

Looping

L

Subroutines

T

File I/O

T

Regular Expressions

R

String Manipulation

B

Quasi Object Programming

T
