Troubleshooters.Com Presents

Troubleshooting Professional Magazine

Volume 3, Issue 1, January 1999
The Revolution Continues

Copyright (C) 1998 by Steve Litt. All rights reserved. Materials from guest authors copyrighted by them and licensed for perpetual use to Troubleshooting Professional Magazine. All rights reserved to the copyright holder, except for items specifically marked otherwise (certain free software source code, GNU/GPL, etc.). All material herein provided "As-Is". User assumes all risk and responsibility for any outcome.



Editor's Desk
Can you hear it in the wind?
If you're not part of the solution
Come the Revolution
The State of Troubleshooting Address
Linux Log
Auld Lang Syne
Letters to the Editor
How to Submit an Article
URLs Mentioned in this Issue

Editor's Desk

By Steve Litt
Troubleshooting Professional Magazine turns two today. And we're proud to have first announced The Troubleshooting Revolution in our premiere issue, January 1997. Most large corporations now recognize the death of intuitive Troubleshooting, and have budgeted Troubleshooting Process training. Our parent website, Troubleshooters.Com, went into the black this year. Other Troubleshooting Process consultants are doing well. And the opportunists we warned of in our premiere issue haven't appeared in quantity. Yes, it seems the Troubleshooting Revolution is all but won, and we're just doing mop-up operations now.

But don't declare victory just yet. There's still the matter of increasing complexity discussed in the January 1998 issue. And the fact that the third era of Troubleshooting is drawing rapidly to a close...

Steve Litt can be reached at Steve Litt's email address.

Can you hear it in the wind?

By Steve Litt
It's coming. Can you feel it? Can you sense the change? The new-style Troubleshooting has become old, and things will never be the same.


We Troubleshooting Process Patriots have come a long way, haven't we? Doesn't it seem like yesterday that hit-and-miss troubleshooting ruled the land? Remember that horrible phrase, "troubleshooting what?" And remember the dreaded blank stare when you said "anything"?

How different it is today! Today it's "We need you to train our people in system independent Troubleshooting. Please submit a proposal". It's in the budget of all major corporations, and today it's considered obvious.

We've achieved our strategic objectives. We've won. But back in the old days, we were so busy achieving process, we postponed dealing with the geometric increase in complexity. Now that we've cleared the process bottleneck, the next step is complexity. We can make systems more modular, and we can automate Troubleshooting Process. Both are being done.

We've crested the hill, and we've seen it's not the last. We're ragged and dusty. But we're not tired. We're stronger, more committed, and more knowledgeable than ever. Because we know productive Troubleshooting is nothing less than our destiny.

Steve Litt is president of American Troublebusters and Troubleshooters.Com, and editor of Troubleshooting Professional Magazine. He's also an application developer and technical writer. He can be reached at Steve Litt's email address.

If you're not part of the solution

By Steve Litt
Technology's complexity increases geometrically. Mostly that's good. Most of our machines and systems are safer, more reliable and more productive than ever before. Consider automotive gas mileage over the last 30 years, with no decrease in power. Consider the improvements in suspension and handling. And above all, consider the order-of-magnitude improvement in emissions.

But too much complexity is overkill. The best, but by no means the only, example is computers. Sure, they're easier to use than ever. But many crash hourly. Considering that 1980s-style minicomputers ran more than a week between crashes, that's not good news.

And how often does that $1500.00 family computer require hours of expensive consultant time to debug? Fact is, the modern Windows operating system and many of its applications are so non-modular and entangled that binary search troubleshooting is often impossible. While we talk of computers becoming commodities, the reality is that Joe Average must either accept a computer that performs "most of the time", or regularly use a technological wizard with an MCSE after his name and hourly rates to match. And isn't it common to see those wizards leave, scratching their heads in puzzlement? Or blaming another vendor?

So I issue this challenge to Microsoft: Build your products to the same standards as the car manufacturers. Nobody buys a car knowing it has several defects. Build your software modular, and publish the test points. If you can't insert features without non-modularity, don't insert the features. Get rid of the intermittents. Several years ago it was possible to sell buggy operating systems, but back then you were the only OS in town. Now there's Linux.
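To make the link between modularity, published test points, and binary search troubleshooting concrete, here is a minimal sketch in Python. It assumes an idealized system: a linear chain of stages, a test point after each stage, and a single fault, so that every test point upstream of the fault reads good and every one from the fault onward reads bad. The function name `first_bad_stage` and the probe callback are hypothetical names chosen for illustration, not anything from a real product.

```python
from typing import Callable, Sequence


def first_bad_stage(stages: Sequence[str],
                    signal_ok_after: Callable[[int], bool]) -> int:
    """Return the index of the first stage whose output test point reads bad.

    Assumption (illustrative only): a linear chain with one fault, so test
    points read good before the faulty stage and bad from it onward. The
    last stage's output is known bad, because that is the symptom.
    """
    lo, hi = 0, len(stages) - 1      # the fault lies somewhere in stages[lo..hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if signal_ok_after(mid):     # good reading: fault is downstream of mid
            lo = mid + 1
        else:                        # bad reading: fault is at mid or upstream
            hi = mid
    return lo
```

Each probe rules out half of the remaining stages, so eight stages need only three measurements. The search depends entirely on being able to take a meaningful reading between stages; in an entangled system with no usable test points there is nothing to probe, and the technique breaks down.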

And I issue this challenge to computer purchasers: Consider Linux. Linux is fairly modular, well documented, and ultra reliable. Up-times often exceed a month. It may not be ready for prime time on your desktop (although many are using it that way), but it certainly makes an ultra-reliable server: web server, naming services, email, file server, print server, Internet dialout server, database server. I'm not telling you you have to use Linux. But do yourself a favor and at least consider it.

During the past year Troubleshooting Professional Magazine has become a favorite of the Linux crowd. They're sick of jumping through hoops to make advertising hype come true, and have turned to Linux as a reliable alternative. Linux use grows geometrically as complexity for its own sake is seen hurting the bottom line.

The point is this. Times have changed. The Internet provides an alternative to Madison Avenue hype. The truth can no longer be bought or monopolized. In a world of free information, the marketplace will finally favor quality.

Steve Litt can be reached at Steve Litt's email address.

Come the Revolution

"We won't get fooled again". In their song of that name, The Who describe the futility of revolution. Futile because once the fighting is over, self-serving hacks replace the zealots and the new regime becomes as bad as the old. The plot of films and Twilight Zone adventures, it's true enough to have become part of our collective consciousness.

And it's an easy trap to fall into. If you're anything like me, it's tempting to celebrate victory and kick back. And maybe even resist change a little. But Troubleshooting waits for no man. We need to march right on into the fourth era of Troubleshooting...

The History of Troubleshooting

| Era | Name | Range | Description |
|---|---|---|---|
| 1 | Observational Troubleshooting | From invention of the bow and arrow until the invention of the steam engine (8000 BC to 1700's) | Observation only. Systems under repair have all components visible, so the problem is obvious. Little diagnosis needed. On the other hand, repair/replacement of component requires precision, one of a kind work. |
| 2 | Intuitive Troubleshooting | From invention of the steam engine until the 1970's | Observation and non-rigorous diagnostic process. Systems under repair still contain only a few components, though some aren't visible to the naked eye. Diagnosis required, but doesn't need to be rigorous. Replacement parts likely to be available from a vendor, but may be difficult to replace. |
| 3 | Process Troubleshooting | From 1970's until the present | Observation and rigorous diagnostic process. Systems under repair contain many (>10,000) components, most abstract or invisible to the naked eye. Non-rigorous diagnosis produces circular search and rework. Rigorous diagnosis required. Replacement parts available from a vendor, and due to modularity often easy to install. Software components are often replaced in five minutes with a few keystrokes. |
| 4 | Technologically Enhanced Troubleshooting | From now until the next era | Observation and rigorous diagnostic process, aided by context-relevant technology-served information (Troubleshooting process aware smart manuals). Systems under repair are now hugely complex, not always completely modular. Observation and rigorous diagnostic process alone takes too long, because no human can have the complete Mental Model, manual and diagnostic information in his or her head. Replacement parts are stock. |

1999 is the autumn of the third era of Troubleshooting. Sure, we can still solve problems with Troubleshooting Process alone. But only those folks with immense memory capacities and expensive (and continuous) training possess the necessary Mental Model. So if your computer network fails, you call in a $200/hour guy with a dozen initials after his name. He's a strongman muscling around a HUGE load of system specific information.

But carrying that information load is manual labor. It makes no more sense than using strongmen to dig a trench, instead of using a backhoe. Don't get me wrong. There will be occasional problems where the smart manuals won't help, and the strongman is needed. Just like there are some excavation tasks (digging near cable) where we must revert to hand shovels. But profitable outfits will use the Troubleshooting Process aware smart manuals for the bulk of the work.

Expert Systems: We Won't Get Fooled Again

Those long-time Troubleshooting Professional Magazine readers among you might see this as a reversal on my part. After all, in the February, 1997 issue, I ridiculed the use of expert systems as a solution replacing Troubleshooting Process.

And nothing's changed! Most expert systems still are marketed as solutions instead of tools, and still are marketed as a replacement for people skilled in Troubleshooting Process. Most expert systems are experts on the system and ignorant of Troubleshooting Process. We've been down that road before -- it's a budget busting dead end.

But there's now an expert system receiving the Troubleshooters.Com seal of approval. Read on...

The First Era 4 Product

It's unique. Unlike those over-hyped expert systems (which didn't function properly even in Era 3), this system **has a Troubleshooting Process as a foundation**. Even as you read this it's being used to fix Northstar System Cadillacs, and its use is spreading rapidly to the military.

To accomplish this, an inter-corporational team led by General Motors' Jim Roach created a Troubleshooting Process optimized for use in a smart manual. This process takes the equivalent of the Universal Troubleshooting Process Step 5 (General Maintenance), and elevates it to an artform, complete with pre-defined diagnostics, error code interpretation, and factory mods. This allows discovery of the root cause at Step 5 80% of the time, leaving only 20% to go to time-consuming Step 6 (narrow it down). Even when problems do go to Step 6, they arrive in a much narrower scope than they otherwise would have.

They then built a voice-activated machine to implement their Troubleshooting Process. The machine is much more than a thyroidal Step 5. It also covers Step 2 (symptom description), Step 3 (damage control plan), Step 4 (symptom reproduction) and Step 7 (repair or replace). This little machine contains the total knowledge of the system's engineers, delivered in a when-needed, whatever-needed manner. It's like having the design team and engineering standing right there while you Troubleshoot.

I've seen them market it, and they do it the right way. It's marketed as a tool, not a solution. There's no implication that you can fire your techs, hire clerks, and get a good result. Indeed, they tell everyone who will listen about the underlying Troubleshooting Process, and its importance. They market it as what it is -- a Troubleshooting Process aware tool to relieve the Troubleshooter from the manual labor of carrying thousands of facts in his head.

It's a first of breed product, so naturally it's not perfect. Add to this the fact that fewer than 1000 people today really understand Era 4, so it's difficult (and expensive) to find authors for the material.

The machine itself is still expensive, so don't expect to see it at your local computer store this year. It calls for an early decision by the technician as to which subsystem contains the flaw, thereby posing a risk, in the hands of an inexperienced technician, of the problem getting out of the box. And in an industry like software, with its present non-modularity and "push maintenance down to the user" mentality, it becomes a much greater challenge. But it's a quantum leap above anything that preceded it, and it's getting better all the time.

I've used it. It's nothing short of phenomenal.

Era 4: The Manufacturer's Responsibility

Irreparable products are nothing new. When I began repairing audio in 1979, there were automotive tapedecks that required complete disassembly to replace the main belt. There have been cars where replacing a spark plug required pulling the engine.

But today irreparability has risen to dizzying heights. Witness the typical desktop computer system, where crashes, bugs, and non-functional features are accepted as normal. Check out Windows 98's "troubleshooters" (contained in the help system) -- a group of pre-defined diagnostics falling significantly short of an Era 4 Troubleshooting product.

Microsoft's "troubleshooters" authors aren't at fault. The problem is the complexity, non-modularity, and sheer number of variables in the Windows operating system. Maybe one in ten thousand people can draw a detailed block diagram of the Windows operating system, and such a diagram would be entangled almost beyond recognition. So how can a smart manual be made?

An Era 4 product is not a replacement for good, clean design. Instead, good, clean design is a prerequisite for an adequate Era 4 product. This is the responsibility of the manufacturer.

Era 4 at Troubleshooters.Com

Using the product described above, along with extensive discussions with Jim Roach, has convinced me that pre-defined diagnostics are a vital tool. This might seem self-evident today. But given the crudity of yesteryear's predefined diagnostics (didn't it sometimes seem like everything came with a flowchart that didn't work?), it took some convincing.

I wrote my first pre-defined diagnostic a few weeks ago. It's an HTML based diagnostic to Troubleshoot network problems on networks with a Linux server and Microsoft clients. It works like a charm. I'm still trying to decide whether to sell it or put it up as free content on Troubleshooters.Com. I anticipate many more pre-defined diagnostics in the near future.
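For readers who have never seen one, a pre-defined diagnostic can be thought of as a tree of yes/no tests ending in conclusions. The following is a minimal Python sketch of that idea only. The questions, the conclusions, and the names `Question`, `Conclusion`, `NETWORK_TREE`, and `run_diagnostic` are all hypothetical illustrations; they are not taken from the HTML diagnostic described above, whose contents are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple, Union


@dataclass
class Conclusion:
    """A leaf: the suspected cause or the next area to examine."""
    text: str


@dataclass
class Question:
    """A yes/no test with a follow-up node for each answer."""
    text: str
    yes: "Node"
    no: "Node"


Node = Union[Question, Conclusion]

# Hypothetical tree: the questions and conclusions are illustrative only.
NETWORK_TREE = Question(
    "Can the client ping the server by IP address?",
    yes=Question(
        "Can the client ping the server by name?",
        yes=Conclusion("Connectivity and naming work; examine the file and print services."),
        no=Conclusion("Suspect name resolution."),
    ),
    no=Conclusion("Suspect cabling, the network card, or the client's IP settings."),
)


def run_diagnostic(node: Node,
                   answer: Callable[[str], bool]) -> Tuple[str, List[Tuple[str, bool]]]:
    """Walk the tree, asking each question, until a conclusion is reached.

    Returns the conclusion text plus the list of (question, answer) pairs,
    so the technician has a record of what was tested.
    """
    path: List[Tuple[str, bool]] = []
    while isinstance(node, Question):
        reply = answer(node.text)
        path.append((node.text, reply))
        node = node.yes if reply else node.no
    return node.text, path
```

An interactive session could pass a callback such as `lambda q: input(q + " (y/n) ").lower().startswith("y")`. Keeping the tree as plain data, separate from the code that walks it, is one possible design choice: the same tree could just as easily be rendered as a set of linked HTML pages.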

Troubleshooters.Com guest author Marc-Henri Poget (Generating Web Decision Graphs using Perl, November 1998 TPM) and I have discussed creation of an Era 4 tool for the software industry. It's a really tough job due to the lax standards in our industry. We indeed live in exciting times.

Picking Your Era 4 Tools

Right now the only real Era 4 tool I know of is the one from GM. But of course others will come. When picking the right tool, here's what to look for:

We Won't Get Fooled Again

The Era 3 revolution has been won. We, the former revolutionaries, are now in charge. Once we've mopped up the pockets of resistance, will we be progressive, or will we become the defenders of the status quo? Will we insist on valid smart manuals, or will we jump on the "business as usual", dumb-as-dirt "expert systems" bandwagon?

They say he who ignores history is bound to repeat it. The unemployment lines are filled with Era 2 troubleshooters.

Steve Litt became one of the first to document Era 3 Troubleshooting Process with the publication of his book "Troubleshooting: Tools, Tips and Techniques" in 1990.  His opinions and advice are actively sought by Era 4 pioneers. He can be reached at Steve Litt's email address.

The State of Troubleshooting Address

It's January 1999, and Era 3 Process Oriented Troubleshooting is near victory. While it may be true that the majority of Troubleshooting done today is still intuitive, those Era 2 types are on the run and they know it. It's just a matter of time -- and it won't be long.

But already the new challenge of complexity looms. Even while we were kicking the intuitive guys out of the palace, complexity was rendering our methods (by themselves) impractical. And already, complexity has spawned some interesting responses.

So, In Summary...

So here's my New Year's prediction. Y2K will suck up all Troubleshooters, good and bad. After that, Era 3 (Process) Troubleshooters will replace Era 2 (intuitive) Troubleshooters. Era 3 trainers will be sought after. But most sought after will be the Era 4 (technologically enhanced process) Troubleshooters, Trainers, and authors. As time goes on, manufacturers will discover that unless they work hand in hand with their Troubleshooters, an upper limit will be placed on their feature sets.

In short, times have never been better for Troubleshooters.

Steve Litt can be reached at Steve Litt's email address.

Linux Log

Linux Log is now a regular column in Troubleshooting Professional Magazine, authored by Steve Litt. Each month we'll explore a facet of Linux as it relates to that month's theme. Today we'll discuss Linux from the point of view of intermittence.

1979: Pacific Stereo Service Department

They called me "cherry picker", and it wasn't a compliment. Corporate culture: take any piece of junk the customer brought in, and fix in the order received. I continually violated policy, refusing junk where I could, and postponing it where I must.

And man, there was junk. Turntables that pushed the record UP the spindle! One-of-a-kind audio-cassette carousels. Car audio requiring complete disassembly to replace a stretched drive belt. And a certain, highly rated, tapedeck that may have sounded great, but with springs, pulleys, levers and notches making re-assembly a four-hour job. There was no shortage of jury-rigged, convoluted Rube Goldberg machines if that's the kind of thing you liked to work on.

"Hey cherry picker, can't you do the hard ones?"

I regularly violated policy, but they didn't have the heart to fire me. I made them 100% more money than their average technician.

1998: American Troublebusters

They called me "idealist", and it wasn't a compliment. Corporate culture: take any piece of junk Redmond brought in, and make it work the way Redmond advertised. I continually violated policy, using corporationally incorrect technology where I could, and putting in Microsoft decoys where I must. Linux was my OS of choice.

And man, there was junk. A set of foundation classes that mapped to the operating system instead of the problem domain. An operating system whose normal behavior was to crash several times a day. Black boxes called DLLs connected to multiple things, with no documentation, and different versions. Scripting languages with more exceptions than rules. There was no shortage of jury-rigged, convoluted Rube Goldberg machines if that's the kind of thing you liked to work on.

"Hey idealist, can't you do Microsoft?"

I regularly violated policy, but they didn't have the heart to fire me. My apps came in on time and under budget, worked flawlessly except under user error, in which case they were easy to troubleshoot. They satisfied both the users and the corporate strategy.

Some Time in the Near Future

They called me "genius", and it was a compliment. Corporate culture: technology must pay its way. I continually supported that policy with Linux and other formerly corporationally incorrect technology.

And man, that technology was good. Worked exactly as expected. Full source availability guaranteed we'd never get boxed in by a vendor and provided fallback documentation. Even huge multi-subnet Linux systems with naming services, email service, file/print/application server and database web apps could be readily troubleshot, on those rare occasions when they stopped working. The operating system was documented, modular, and made sense. A base of over a million "idealist" technologists, all helping each other online, guaranteed no problem was insoluble. Superior versions of apps and technologies formerly thought possible only in Windows became available in Linux.

"Hey genius, how do you perform these miracles?"

I regularly raised my rates, but they didn't have the heart to fire me. My apps came in on time and under budget, worked flawlessly except under user error, in which case they were easy to troubleshoot. They satisfied both the users and the corporate strategy.

The more things change, the more they stay the same.

Steve Litt can be reached at Steve Litt's email address.

Auld Lang Syne

Now I understand how politicians feel when nobody votes. Only 1 (yes, one) vote has been received for best issue, none for best article. This in spite of my statistics page, which tells me over 10,000 people visited the November 1998 (Linux) issue of Troubleshooting Professional Magazine. So the winner, by a 100% majority of 1, is:

March 1998 Troubleshooting Professional Magazine: Bottleneck Analysis.

So, in the absence of a large voter turnout, I'll once again pick what I believe to be the year's five best articles:
| # | Article | What It's About |
|---|---|---|
| 1 | The Man Who Banned General Maintenance (February 1998) | A Troubleshooting short story encompassing three generations, over 30 years, plots, intrigue, corporate takeover, and sweet victory. |
| 2 | Safety and Intermittents (December 1998) | A serious discussion of this too-often ignored subject. |
| 3 | GNU: An Idea Ahead of its Time (May 1998) | The true story of how one man, Richard Stallman, changed history with an idea and an ideal. |
| 4 | A Supercomputer in Every Kitchen (May 1998) | Souped up Dodge Darts and Linux Parallel Supercomputers combine to create the ultimate Boomer fantasy. |
| 5 | Post Doctorate: Database Enabled Web App (November 1998) | Pure geekiness puts this Linux Web App howto in the top five. |

A warm thank you for a tough job well done goes out to this year's guest authors:

Letters to the Editor

All letters become the property of the publisher (Steve Litt), and may be edited for clarity or brevity. We especially welcome additions, clarifications, corrections or flames from vendors whose products have been reviewed in this magazine. We reserve the right to not publish letters we deem in bad taste (bad language, obscenity, hate, lewdness, violence, etc.).
Submit letters to the editor to Steve Litt's email address, and be sure the subject reads "Letter to the Editor". We regret that we cannot return your letter, so please make a copy of it for future reference.

How to Submit an Article

We anticipate two to five articles per issue, with issues coming out monthly. We look for articles that pertain to the Troubleshooting Process. This can be done as an essay, with humor, with a case study, or some other literary device. A Troubleshooting poem would be nice. Submissions may mention a specific product, but must be useful without the purchase of that product. Content must greatly overpower advertising. Submissions should be between 250 and 2000 words long.

All submissions become the property of the publisher (Steve Litt), unless other arrangements are previously made in writing. We do not currently pay for articles. Troubleshooters.Com reserves the right to edit any submission for clarity or brevity. Any published article will include a two sentence description of the author, a hypertext link to his or her email, and a phone number if desired. Upon request, we will include a hypertext link, at the end of the magazine issue, to the author's website, providing that website meets the Troubleshooters.Com criteria for links and that the author's website first links to Troubleshooters.Com.

Submissions should be emailed to Steve Litt's email address, with subject line Article Submission. The first paragraph of your message should read as follows (unless other arrangements are previously made in writing):

I (your name), am submitting this article for possible publication in Troubleshooters.Com. I understand that this submission becomes the property of the publisher, Steve Litt, whether or not it is published, and that Steve Litt reserves the right to edit my submission for clarity or brevity. I certify that I wrote this submission and no part of it is owned by, written by or copyrighted by others.
After that paragraph, write the title, text of the article, and a two sentence description of the author.

URLs Mentioned in this Issue