Troubleshooting Professional Magazine
OOP Tomfoolery |
|
Perhaps most surprising, I knew better, and have known better for 15 years. What happened that caused me to forsake my tried and true design methods and behave like the programmers I lampoon? I was overconfident. Could it happen to you?
So this month's TPM will walk you step by step through the flawed development -- where I made mistakes, and how you can avoid those mistakes. Read it. Luckily, all I lost was five days of productive labor. If this work had been done for a client things would have gotten ugly.
Don't let this happen to you. Kick back and read this issue of Troubleshooting Professional Magazine. And remember, if you're a Troubleshooter, Open Source enthusiast or computer programmer, this is your magazine. Enjoy!
After understanding the big picture, I defined a language to quickly,
easily and readably create, extend or modify menus in any menuing system.
As mentioned, the language is called EMDL, standing for Easy Menu Definition
Language. The name tells you the main priority -- ease. In a tradeoff between
ease of authorship vs. ease of parsing, EMDL opts completely for ease of
authorship. There are no end tags. There is no spanning of lines. It's
completely indentation sensitive, eliminating the need for most punctuation,
and making it readable. Extraneous characters are kept to a minimum. In
the following is an example of EMDL, please note that the colors are to
illustrate the language:
E:::Example Menu Mouse Speedup param C: xset m 3 2 Office ::: Office Menu Abiword param C: abiword & Lyx param C: lyx & D: /data/lyxdocs ^Quit Information ::: Information Menu Ifconfig param C: ifconfig P: /sbin S: 1 Top param C: top ^Quit ^eXit |
The preceding EMDL file defines a menu hierarchy in which the main menu
(titled Example Menu) contains an executable choice (Mouse Speedup) and
two submenus (Office and Information), as well as an uplevel escape (^eXit).
Each of those two submenus contain two executable choices and an uplevel
escape. All executable choices have at least one parameter, a command,
which begins with C:. Here's a summary of all current possible parameters:
Param
Letter |
Param
Purpose |
C | Command to run |
D | Directory to switch to before running the command |
P | directory(s) to put on the front of the Path before running the command |
E | Environment variable to set before running the command |
S | Stop after running the command |
B | Run the command's wrapper script in the background |
V | driVe to switch to before running the command |
I | Icon file |
Lines whose first non-whitespace character is a pound sign (#) are comments and are ignored. Likewise, empty lines and lines with only whitespace are ignored. Menu lines are those containing a triple colon (:::) separating their text on their parent menu from the title of their menu. Parameters are information about a given choice, and take the form letter, colon, data, as described previously. More importantly, parameters are indented below a line containing only the word param, itself indented below the choice it pertains to. Choices are lines without triple colons that are not parameters or param designators.
Choices beginning with a carat (^) are uplevel escapes that take you to the parent menu, or on the main menu exit the menu program. It's essential to have an uplevel escape on every menu. The possible exception is the main menu -- in cases where you want the menu to be the user's only user interface.
EMDL is optimized for extremely quick and easy creation and editing of menus, and nothing else. Tagless, line and spacing dependent, it's not suitable for purposes other than menu definitions. There may be some menuing features it cannot accommodate. It's certainly not easy to parse (take that from me). But it's intuitively recognizable to anyone, and it's easy to edit with any text editor, especially an autoindenting editor like VI, and it's lightning fast to edit with VimOutliner.
As I said, it's not easy to parse. Read on...
The work on the outline to UMENU program began the morning of Friday, 2/22/2002. Within ten minutes I made my first and worst mistake.
Figuring an object design would take too long for a one day project, I designed with functional decomposition. (Yes, you can laugh, because you know what's coming). The program would require some objects to hold state and the like, but it would be primarily a top down design. And because the program needed to parse a hierarchy, I used a recursive function.
So let's take a look at this design. It had an input file reader object, and a bunch of procedural code. The main routine called handleMenu(), which recursed itself through the input file lines delivered by the reader. Because I needed to save state between recursions, I added a Linestate object, which kept states, determined line types (COMMENT, BLANK, MENU, CHOICE, PARAMSTART or PARAMETER). By Friday night the program was almost working program. After taking Saturday off, I got to work Sunday morning, figuring it would be two days instead of one. Ughhh!
Sunday afternoon the realization hit. If written properly, this program could be easily modified to convert the outline form of menu definition (which I now called EMDL -- Easy Menu Definition Language) to any type of menu system, including the Icewm/Linux menu system. I ripped out a bunch of code and tried to modulize the UMENU content. This effort continued into Monday. Class after class was added. A Choice class, a Menu class, a Menuarray class, and plenty more procedural code to handle synchronizing states of all the objects instantiated by various levels of recursion.
Monday drifted to Tuesday, and late Tuesday afternoon I declared the code unusable and unsalvageable. The code featured a ping-pong game between top-down and object programming, with various functionalities duplicated in both places. The combination of recursion and OOP made tracking state gruesome. It was 768 lines of the kind of code you can get fired for writing. With almost four days sunk in this supposedly one day project, I backed off and redesigned -- this time with a pure OOP design. I got a stack of paper and began to draw.
A couple hours was all it took to confirm that my briar patch of halfbreed code could be replaced by an incredibly simple, powerful and scalable design. It looked something like this:
The preceding is awfully simple, but after considerable reflection it rang true. Further elaboration on that design yielded the following refinements:
Wednesday morning I went to Burger King, where I discussed the design with fellow LEAP member Hale Pringle. He felt the design was reasonable, adding that it might be a good idea to get rid of the recursion, possibly by multiple passes through the input. I declined, feeling the recursive algorithm was close to finished. I dreaded rewriting that algorithm, and felt that if the recursion could be encapsulated in the Parser object instead of splattering itself all over the main routine, it would be OK.
But as I continued to transfer code from the procedural section to the appropriate classes, once again it looked like recursion, now entirely in the Parser object, would be a problem. Recursion is wonderful in simple algorithms, but as you need more variables to pass between recursions, the situation degenerates. Part of the problem is that while the input is a hierarchy of menus, the output is a flat array of menus. Hierarchical output is trivial with recursion, but a flat array of menus whose choices are in order, uninterupted by lower levels -- that's a different animal.
So I began investigating alternatives. For all I know there might be 100 alternatives to recursion, but the two most familiar to me are the DOM API method of getFirstChild/getNextSibling/getParent, and use of a stack to store what would be local to a recursion instance. A third alternative is object recursion, where an object instantiates one or more instances of its class, but I chose not to go there.
A few minutes of thought confirmed the fact that if each menu could be looked up with a specific ID, and that ID could be calculated during parsing, then information could be ringtossed to the proper menu in real time. So I used the poor man's stack -- the hierarchical string. The string's length is proportional to the menu's level. The root menu is "01", the root menu's first child is "0101", the first child of "0101" is "010101", and the second child of the root menu is "0102". The fact that each level is 2 digits means a given menu can have 99 submenus (only menu type items receive these IDs). Clearly no menu used on an ordinary monitor would have more than 99 submenus. This hierarchical string was accomplished by three methods in the parser:
If each menu contained nothing but submenus, deducing level changes would be trivial. But given that some items are choices, and the fact that a menu is a choice on its parent menu but also a menu in its own right, things get difficult and beyond the scope of this magazine. If you're interested you can download the code (download URL in the URL's section of this magazine).
But I'll give you one hint. The loop that does the parsing maintains two variables, $previousMenuID and $parentMenuID.
Choices, and the choice components of menus, have their info written to the menu object at $parentMenuID. But the menu component of menus are assigned a new ID, deduced from their level and from $previousMenuID.
The program instantiates exactly one Menutree object, which in turn instantiates its "has-a" contained class hierarchy of Menu and Choice classes, is straightforward -- a mighty relief from the complex Parser. Basically, if you get the right information into the Menutree object, writing that information is easy. But the writing is done by a separate object...
Perhaps more fascinating, if you write a Parser class for your menu system, and a Writer class for EMDL, you can create EMDL from your current menu structure. True round trip development.
And yet I designed this object program with functional decomposition. Why would I make such a stupid mistake? Could you fall into the same trap?
I speak OOP as a second language -- my mother tongue is top-down structured design. I figured I'd save a couple hours by doing the design in my mother tongue. If you began programming before 1994 your mother tongue might also be top-down structured design, and for a small OOP project you might figure you can get away with functional decomposition design techniques. And if you do, you might get burned the way I did. And if you're really unlucky it will be for a client who will curse your name for taking 6 days for a 1 day project.
So pick your best Object design technique, and use it -- even if it takes much more time than a more familiar structured technique. Objects mix poorly with a structured main routine and subroutines, leading to parallel code in each section, and a ping-pong game between the structured and object parts of your program.
For this project, the OOP design technique I used (once I used one at all :-) was to envision the entire program as a machine, and determine what "parts" the machine needs, and what sub parts each of those parts need to do its job, and so on.
#!/usr/bin/perl -w ### Copyright (C) 2002 by Steve Litt ### ### GPL LICENSE, NO WARRANTY ### use strict; use Menutree; use Emdlparser; use Umenuwriter; package main; sub usage() { print "Usage:\n"; print " emdl2umenu.pl emdlFilename\n"; print "Where emdlFilename is the name of the emdl file to be parsed.\n"; print "The output is a tree of UMENU .mnu files in the current directory.\n"; print "\n"; } sub main() { unless(@ARGV == 1) { usage(); exit; } my($filename) = $ARGV[0]; my($menuTreeObj) = Menutree->new(); my($parserObj) = Parser->new($filename,$menuTreeObj); my($writerObj) = Writer->new('arg_not_used',$menuTreeObj); $parserObj->parse(); $menuTreeObj->selftest(); if($menuTreeObj->isValid()) {$writerObj->write();} } main(); |
Just remember my tale of woe, and the next time you think you're such a hotshot that proper design isn't necessary, know that pride comes before the fall.
There are many recursion alternatives. Here are some I've used:
As shown in the preceding article, it works great for a hierarchy implemented
as a sequential list, such as an outline. For simplicity, let's say that
each node can have no more than 26 subnodes. Then the first node would
be "a", its first child would be "aa" and its second child would be "ab".
The first child of "ab" would be "aba". Below is an example showing indentation,
and the strings that would be assigned to the indented lines as the indented
hierarchy is read sequentially:
a aa aaa aab aaba aac ab b ba bb bba c ca |
The beauty is that these strings can be sorted and selected for the
desired output. For instance, a simple string sort will output them in
hierarchical order, with the level defined by the string length. But let's
say you want to output each parent with its direct children and nothing
else. Prepend the string length (zero filled to the left, of course), and
sort again, and look what you get:
001a 001b 001c 002aa 002ab 002ba 002bb 002ca 003aaa 003aab 003aac 003bba 004aaba |
The preceding sort can easily be converted to a list of parents and
direct children by deducing that each node's parent is the node's string
with the rightmost letter trimmed off. Using a little break logic your
output would look like this:
Root a b c a aa ab b ba bb c ca aa aaa aab aac bb bba aab aaba |
One more nice thing about hierarchy strings is that if your language offers hashes (key/value pair associative arrays), hierarchy strings offer lightning fast lookup of specific nodes.
level=0 ascending=0 filenode=getfirst(root) if defined(filenode) level=1 While level > 0 if not ascending process this filenode found=0 //first try for a child if filenode is a directory and not ascending push the filenode filenode=getfirst(current directory(filenode)) if defined(filenode) level++ found=1 ascending=0 else pop the filenode //else try for a sibling if not found filenode=getnext(filenode) if defined(filenode) found=1 ascending=0 //else get parent, and next iteration will go right or up if not found pop the filenode level-- ascending=1 |
Obviously this is a silly alternative for a simple recursive directory tree walk, but it illustrates how a stack can substitute for recursion in even in a fairly hardcore recursion domain. You could probably implement a backtracking algorithm with a stack too.
As long as you can store the current state on the stack, pop and use it later (in other words, no side effects on down, right or up), you can iteratively walk a tree exactly as if you had recursed it.
Multiple passes can be time consuming, and coding them isn't always easy. You must figure out how to store what you've read so far -- often with hierarchy strings :-).
Often you're better going with hierarchy strings than with multiple passes.
This is excellent because it not only facilitates walking the whole tree, but it facilitates fast sequential search for a given node. For instance, if you're looking for a node whose hierarchy string would be bfbc, you'd do the following:
One might argue the preceding question in the affirmative. As of March 7, UMENU still isn't useable because it must run in an xterm using the xterm -e syntax, and anything spawned by such a process dies when the terminal goes away. Depending on what alternatives I find, I might need to implement a socket server to spawn programs and a socket client to tell the server what command to run, thus decoupling the running of the command from the menu and its terminal, via an extremely thin interface.
Isn't it crazy to do this much work just to switch away from the menu system that comes with Linux? Isn't this proof that maybe I never should have switched away from Windows?
Quite the opposite! Much more than Windows, Linux gives you the opportunity to mold the user interface to your exact needs. A great example is VimOutliner -- an adaptation of the Vim editor that works as a hyper-productive outliner. It took a few days to program, and it's saved several multiples of that amount so far, and saves me more time every day. I even created a VimOutliner to LyX conversion program, thus jumpstarting my "Troubleshooting Techniques of the Successful Technologist" book. Try that with your favorite Windows outliner!
Linux is driven by utility, not by marketing or what Judge Jackson and the appeals court call monopolism. Therefore, Linux has less need to insert incompatibilities with past products or competitors. With Linux, your modifications today can be used for years. And of course, Linux provides the ultimate modification facility -- free access to the source code. Free access to all -- not just to priveleged vendors and developers.
It all boils down to this question: Do you want to mold your computing experience to your needs, or do you want to drift along with whatever your software vendors give you this year?
To state the question more succinctly, who owns your computer?
Crazy, they'd say, because 85% of the developer jobs are Windows based. Crazy, they'd say, because in the "real world" developers use RAD drag and drop development environments, not a text editor. Crazy, they'd say, because the student needs to get into the workforce fast, and VB can get you there faster than the source code based Java or C++ you'd learn on a Linux box.
They're right on all three counts, but I'm not crazy. I'm right.
If a school, training course, or book simply teaches the prospective developer how to assemble an app using drag and drop, they're creating a monster. As a guy who has cleaned up after many such monsters, I can tell you that the developer without design knowledge is an unintentional saboteur. His code will be ugly and buggy at birth, and every round of maintenance will make it much worse.
Unfortunately, many schools and training programs have become monster factories. How well I remember my CS student neighbor who was quite unimpressed with my Symptom Description Wizard (Java), until I used a batch file to compile it. Then his jaw dropped. With a look of worship on his face. "How did you do that? Were you using the command line? Where was your development environment? How did you know what to do? Can you tell me how to do that? How did you change that file? They never taught us any of that in school!" .
I told him about code, and design, and encapsulation, and the fact that objects are much more than a convenient drop target on your screen. With any luck I rehumanized the monster his school was on the way to creating.
A developer's #1 responsibility is design. Let me repeat that...
A developer's #1 responsibility is design! |
Of course it could be said that some developers are handed a fully designed class, whose every method is completely specified, and told to code it. Such a developer would not need design skills. But such a programmer would not advance beyond the "junior programmer" designation until he learned design skills. And a developer taking a more complete role in the development process needs to perform an excellent design.
Design is for the most part language independent and "development environment" independent. Classes must be mapped to the problem domain. In a program simulating a car you might have major classes called Engine, Transmission and Dashboard. In a business app major classes might be called Employee, Contractor, and Contact -- all subclasses of the Person class, as well as classes for Customer and maybe AccountingSystem. A file conversion program might have classes Parser, Menutree and Writer
But a design centered around classes called CustomerInputForm and EmployeeInputForm won't stand the test of time. And that introduces the reason Linux and its programming languages are best for teaching and learning design.
The priority for most "drag and drop development environments" is coding speed, plain and simple. Or, for the more skeptical among us, the illusion of coding speed -- a marketing ploy. They encourage creation of an app by creating input forms and output reports, and a menu system to navigate. That might be fast, but it doesn't teach design. And by the time the debugging phase of development is over, it might not even be fast.
A second priority of drag and drop development environments is, to put it delicately, access for the algorithmically challenged. The marketing cry for this priority goes something like this: "Skillful developers are rare and expensive. Less skilled developers can use product to create an excellent system".
So drag and drop development environments do not prioritize good design.
Every Linux computer comes with Java and C++, both incredibly capable object oriented languages. Java is especially good because once you know the basics of class creation, you can use pre-created classes from the JDK, as well as Javabeans, EJB, and a host of other things. Java is the ideal laboratory for learning design.
The GNU C++ that ships with every Linux box is for the most part ANSI C++, unencumbered with "frameworks", "devlopment environments" or other geegaws that obfuscate the language. For the most part, if a program compiles in GNU C++, it will compile in any ANSI C++ implementation.
The C++ and Java that come with Linux are free. You can install them on as many computers as you desire. You can set up a computer laboratory using ancient 32 Meg Pentium 75's, and the students in that laboratory will be well served. That's because the C++ and Java that come with Linux don't need a graphical user interface, and they can be edited with any text editor. The reason I bring up the 32 Meg Pentium 75 machines is that much of our society cannot afford proprietary licenses and hardware capable of running a GUI and a "development environment". Those people are the most needful of training, but perversely they're priced out of training. Increasingly, churches and other supporters of personal responsibility and personal improvement are setting up learning labs using hand-me-down Linux machines. Linux is a powerful weapon against the digital divide.
Perhaps a better way to set up a Linux computer lab is by using the K12 Linux Terminal Server Project. In such a scenario you can use ancient, beat up hand-me-down computers as diskless workstations, with the actual software running on a server of moderate horsepower. Recently St. Mary's Catholic School set up something similar, with the help of the Melbourne Linux User Group and I.D.E.A.L Technology (URL's in URLs section).
But what if you're like me, and can afford big horsepower. This magazine is being written on a 512 Meg dual Celeron 450 with two 7200 UltraDMA drives. Certainly I could run the fanciest development environment on the newest proprietary GUI operating system. But if I did so, my main learning would be specific to that development environment. Development environments and their associated frameworks and "foundation classes" typically obfuscate the principles of development.
The bottom line is this: Even though many prospective developers will end up using drag and drop development environments, to be effective at their jobs they must learn design skills. The most effective and economical laboratory for learning those design skills are the Java and C++ that come packaged with Linux. Once those design skills are learned, that lucrative drag and drop career requires only the learning of a slightly different language (or maybe Java or C++ will be the underlying language), and the knowledge of the tools the drag and drop development environment bestows, and how best to use those tools.
Learning system design is difficult, so learn that on a Linux box, which most effectively creates a laboratory for the acquisition of design skills. Armed with design skills, you can quickly and easily learn the drag and drop environment of your choice. After all, most drag and drop development environments were designed for the algorithmically challenged.
Any article submitted to Troubleshooting Professional Magazine must be licensed with the Open Publication License, which you can view at http://opencontent.org/openpub/. At your option you may elect the option to prohibit substantive modifications. However, in order to publish your article in Troubleshooting Professional Magazine, you must decline the option to prohibit commercial use, because Troubleshooting Professional Magazine is a commercial publication.
Obviously, you must be the copyright holder and must be legally able to so license the article. We do not currently pay for articles.
Troubleshooters.Com reserves the right to edit any submission for clarity or brevity, within the scope of the Open Publication License. If you elect to prohibit substantive modifications, we may elect to place editors notes outside of your material, or reject the submission, or send it back for modification. Any published article will include a two sentence description of the author, a hypertext link to his or her email, and a phone number if desired. Upon request, we will include a hypertext link, at the end of the magazine issue, to the author's website, providing that website meets the Troubleshooters.Com criteria for links and that the author's website first links to Troubleshooters.Com. Authors: please understand we can't place hyperlinks inside articles. If we did, only the first article would be read, and we can't place every article first.
Submissions should be emailed to Steve Litt's email address, with subject line Article Submission. The first paragraph of your message should read as follows (unless other arrangements are previously made in writing):
Copyright (c) 2001 by <your name>. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, version Draft v1.0, 8 June 1999 (Available at http://www.troubleshooters.com/openpub04.txt/ (wordwrapped for readability at http://www.troubleshooters.com/openpub04_wrapped.txt). The latest version is presently available at http://www.opencontent.org/openpub/).
Open Publication License Option A [ is | is not] elected, so this document [may | may not] be modified. Option B is not elected, so this material may be published for commercial purposes.
After that paragraph, write the title, text of the article, and a two sentence description of the author.