Copyright (C) 2001-2005 by Steve Litt. All rights
reserved. This material was started in 2001, and completed in 2005.
Volume 7 Issue
Materials from guest authors copyrighted by them and licensed for
use to Troubleshooting Professional Magazine. All rights reserved to
copyright holder, except for items specifically marked otherwise
(free software source code, GNU/GPL, etc.). All material herein provided
User assumes all risk and responsibility for any outcome.
triumphantly through the gates, barely glancing at the old woman
about to cut the rope and spring shut the
trap. -- Mason Cooley
By Steve Litt
Riddle: The answer is
"speculation, guesswork and prayer". What's the question?
The question is: What is troubleshooting without process?
Absent valid troubleshooting process, troubleshooting becomes speculation
in the hands of experts, guesswork for those of average abilities, and
prayer in the hands of neophytes.
Speculation, guesswork and prayer. Loss of profit, morale, and
(for the company, department, and employee).
This issue of Troubleshooting Professional discusses the importance of
Troubleshooting Process, and the speculation, guesswork and prayer resulting
from its absence. We'll discuss how lack of process creates speculation,
guesswork and prayer, how to recognize the problem, the resulting cost
to your department or organization, and how you can eliminate the problem
with training in troubleshooting process.
So kick back, relax, and read this magazine. And remember, if you're a
Troubleshooter or Technologist, this is your magazine. Enjoy!
Yes, as a Matter of Fact
This Magazine is Late
By Steve Litt
The Summer 2003 issue of Troubleshooting Professional Magazine was
completed on October 6, 2005. It had been on the web before that,
presumably since the summer of 2003, but it had been incomplete.
Stranger still, this magazine was written primarily in 2001, but then
put aside for other content. When I rediscovered this content on
10/6/2005, I was very impressed with it, finished it, and put the
revisions on the web. I hope you enjoy it, and just remember this:
By Steve Litt
Give a symptom description to 12 experts in a room. Ideally you'd like to
see immediate, unanimous agreement on the root cause of the symptom. Of
course that's impossible.
A very realistic expectation is to hear each expert say "I don't know,
but I'll find out". But that's all too rare.
What you usually find when giving a symptom description to 12 experts
is 12 different snap judgements. Speculation. This poses a problem
because at most one of those 12 snap decisions is correct. The
other 11 embark on an
expensive journey to a dead end. The result for your department or
organization is slow solutions, failure to repair, or other problems.
Somehow society, and experts themselves, have developed an expectation
that a true expert can hear a symptom description and instantly fathom the
cause. To see how truly silly this is, imagine if court proceedings were
conducted with this expectation.
Effective Troubleshooting is Like a Courtroom
Who done it? What caused it? Same thing.
In court you need to prove the cause beyond a reasonable doubt. Until
that proof is concluded, any remediation is likely to do more harm than
good. No matter how expert the detective, the lawyer, the expert witness, it
is the duty of the jury to weigh all evidence.
In the courtroom, failure to prove before remediating allows the guilty
to go free, and the innocent to be imprisoned or executed. It results in
loss of confidence by the public. To see examples, Google search the terms
"death row" "Anthony Porter".
In the repair of machines, computers, networks and other systems, failure
to prove before remediating facilitates repeated failure to repair,
damage to the system, and loss of confidence in both the department and
organization by customers, management and co-workers.
Both Require a Process
To minimize likelihood of harmful actions, both courtroom activities and
troubleshooting require a process.
In the case of the courtroom, the chain of evidence must be maintained.
Both sides present witnesses, examine and cross examine the witnesses,
present exhibits, have witnesses testify about exhibits, and make and counter
arguments. The judge determines at each point whether valid courtroom process is
being followed. The jury is informed they must decide based ONLY on evidence
that has been properly and correctly presented in court. The jury reaches its
best decision, based ONLY on the evidence.
In the case of troubleshooting, the technologist acquires the symptom
description, reproduces the symptom, checks likely suspects, and performs tests
designed to rule out sections of the system. Eventually the area under suspicion
narrows to the point where the root cause is obvious, at which time the
technologist has reached his best decision based only on the evidence, and repairs
or replaces the defective component.
But What About Intuition
We all know troubleshooters use intuition, and we all believe this is a
good thing. The question is, at which point is intuition brought to bear?
Intuition is a wonderful tool in deciding what tests to perform. It's a
valuable component in guiding the course of the diagnosis. But when the
technologist assumes a cause based on intuition, and acts on that assumption, he's
speculating. The process leading to the solution should be more deductive than
intuitive. Unless the technologist performs verification tests before performing
the fix, he risks wasting an extraordinary amount of time and money on
a wrong fix. If the technologist fails to test, the fix may well be wrong, resulting
in serious problems.
Not only is intuition harmful at the root cause level, but excessive intuition
can also be harmful at the diagnostic level. If the technologist
speculates on specific components, he loses the benefit of ruling out large
chunks of the system.
The decision of which troubleshooting tests to perform next is based on
a quadruple tradeoff:
- Ease of performing the test
- Likelihood that the test will isolate the root cause into a small area
- Even divisions in ruling out the remainder of the system
- Safety concerns
Intuition and speculation are proper to the extent that they are used to
estimate the preceding four factors and draw the best educated guess. Beyond
that, intuition and speculation have no part in troubleshooting.
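To make the four-way tradeoff concrete, here is a minimal sketch of how a technologist's estimates might be combined to choose the next test. Everything in it is an illustrative assumption rather than part of any published process: the `CandidateTest` fields, the 0-to-1 scales, the equal weighting, and the sample tests are all hypothetical. The point is only that intuition supplies the estimates, while the choice of test follows from them.

```python
from dataclasses import dataclass


@dataclass
class CandidateTest:
    """A possible next diagnostic test; all scores are hypothetical 0..1 estimates."""
    name: str
    ease: float       # 1.0 = trivial to perform
    isolation: float  # 1.0 = likely to corner the root cause in a small area
    evenness: float   # 1.0 = splits the remaining system exactly in half
    safety: float     # 1.0 = no risk to people, data, or equipment


def score(test: CandidateTest) -> float:
    """Equal-weight average of the four factors (the weighting is an assumption)."""
    return (test.ease + test.isolation + test.evenness + test.safety) / 4


def pick_next_test(candidates: list[CandidateTest]) -> CandidateTest:
    """Return the candidate with the highest combined score."""
    return max(candidates, key=score)


# Made-up estimates for three made-up tests.
candidates = [
    CandidateTest("swap the power supply", ease=0.3, isolation=0.6, evenness=0.2, safety=0.5),
    CandidateTest("ping the gateway", ease=0.9, isolation=0.4, evenness=0.8, safety=1.0),
    CandidateTest("reinstall the OS", ease=0.1, isolation=0.2, evenness=0.1, safety=0.3),
]
print(pick_next_test(candidates).name)  # the easy, safe, evenly dividing test wins
```

In practice the weights would shift with the situation. Safety would dominate in a hazardous environment, for instance. Whatever the weights, the estimates only pick the test; they never pick the root cause.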
Talent is No Excuse for Speculation
The argument might be made that the speculation of an expert is more valid
than that of a non-expert. Such an argument might state that whereas a juror
is just a "peer" of the defendant, the network engineer is an expert who has
spent a career accumulating network knowledge. Certainly the network engineer's
speculation can be respected.
But in fact, the network engineer's speculation is no more valid
than a juror's would be. The most talented and knowledgeable network engineer
lacks X-ray vision and divining rod fingers. He cannot listen to a symptom
description, gaze upon the system, and divine the single defective component
out of a cast of hundreds or thousands.
Yet many talented people try for the speculative quick fix. They use a
diagnostic machine or software program, speculate that the diagnostic's findings
are true, forsaking all other areas of the machine. Or they hear a symptom
description identical to one they've heard before, and speculate that the
root cause is the same. Or they analyse the system, and speculate on which
part would produce an identical symptom.
Once again, there's nothing wrong with using speculation as a tool to
decide among several likely diagnostic tests, but when one assumes a root
cause, the result is replacement of the wrong component, circular and
argumentative troubleshooting, finger pointing, profit loss, and harm to the
reputation of all involved.
Speculation is a Product of Ego
Certainly a technologist considering himself a beginner or "average" would
have no problem "admitting" the need for diagnostic tests before drawing a
conclusion. It seems that those considering themselves "experts" are the ones
expecting themselves to provide an instant answer. Yet a little
reflection tells us that no sane person
would expect any human to conjure a root cause out of a system with thousands
of components. The sane person knows that only the process of elimination
leads to a quick and correct identification of the root cause.
So why do those considering themselves experts fall into the speculation
trap? They fail to recognize that troubleshooting consists of two components:
1. Knowledge of the system under scrutiny
2. Knowledge of troubleshooting process
The reason they don't know is that they've never been trained in #2. Very
few people have. As long as we expect our experts to get along without training
in valid troubleshooting process, we'll continue to experience speculation
and the problems it brings.
Some Experts Use Valid Troubleshooting Process
Nothing in this article is meant to imply that all experts are devoid of
troubleshooting process knowledge. Many experts have learned a valid
process either through courses or through the school of hard knocks.
Such expert troubleshooters usually do not speculate nor exhibit
Problems Caused by Speculative Troubleshooting
The problems caused by speculative troubleshooting can be grouped into
three categories:
- Individual problems
- Team problems
- Corporate problems
The individual technologist troubleshooting speculatively experiences productivity
loss caused by the circular troubleshooting forced upon him by his speculation.
This results in delayed solutions at best, and incorrect solutions if they
are not verified with conclusive tests. Such productivity loss cannot help
but torpedo the technologist's morale, which could result in burnout, stress
or blame shifting. Burnout, stress and blame shifting invariably steal
energy from the technologist's diagnostic efforts, creating a vicious cycle.
The final result could be disability leave, dismissal, or gravely
No troubleshooter works alone. There is usually a customer, internal or external,
who expects the problem fixed quickly and cleanly. In the case of a programmer
debugging his own code, there will be users relying on that code, and
who have gone out on a limb promising clean code delivered on time.
the troubleshooter is part of a team consisting of members from one
Speculative troubleshooting ruins teamwork. It facilitates argument and
finger pointing between employees, or with customers or vendors. The argument
can take the form of "who owns the problem", as in the classic "it's hardware,
it's software" gambit, or it can take a more personal form as in "Joe gave
me bad information (his erroneous speculation), and cost me many hours of
time scheduled for another project."
Combined with a lack of team awareness (good Troubleshooting Process courses
teach team troubleshooting), speculation results in alternating service
visits. The hardware guy comes out, pronounces the problem software, and leaves.
The software guy comes out, declares it to be hardware, and leaves. This
goes on three or four times, consuming a week or more, until
managers demand that both the hardware and software people show up at the
same time and stay there until the problem is solved.
All of this creates morale problems for the team. Team morale problems
tend to degenerate very fast and very hard, with grave results.
Morale problems also crop up in the corporation. Speculative troubleshooters
leave users and customers high and dry. Name calling contests often ensue.
All of this reduces the organization's productivity.
The decreased individual and team performance results in increased
salary expense, as more technologists are required to do the work.
The finger pointing between technologists, departments, and
caused by speculative troubleshooting can make the organization look like
a Keystone Kops film.
Opportunities are lost as lingering unsolved problems erode client confidence.
Those lingering problems also create risk of customer loss, litigation,
problems, and harm to the organization's reputation.
All these "screwups" result in remediation costs. Drawn out meetings and
phone calls are required to soothe nerves (usually followed by yet another
mistake). Profit-sapping customer discounts are often used to say "I'm sorry".
In the case of catastrophic mistakes, remedial advertising might be necessary.
Recognizing Troubleshooting by Speculation
Sometimes the hardest part of dealing with a problem is recognizing it in
the first place. How do you recognize speculative troubleshooting? To
best recognize speculative troubleshooting, be on the lookout for the following:
- Arrogance based diagnosis
- Agenda based diagnosis
- Rote repetitive diagnosis
- Slow or unsatisfactory solutions
- Dissatisfaction with the support department
Arrogance Based Diagnosis
Arrogance based diagnosis leaves clues. When you see these clues, investigate further.
"I'm the expert, I know how to fix it!"
Statements like the preceding should raise a red flag. Although it's true
that every troubleshooter must occasionally trot out this phrase to calm
an overly agitated user, if someone uses this phrase regularly it's probably
more out of arrogance than an attempt to calm. All too often statements like
the preceding are followed by misdiagnosis.
If the person making the statement has anything but a stellar record of
fast, accurate solutions, intervention may be called for.
"The diagnostic software says..."
Diagnostic tools and software are essential for diagnosis.
Unfortunately, the more complex and featureful they are, the more they can
be misused. Many diagnostic tools attempt to troubleshoot down to the
component on the basis of a symptom description, possibly plus an
array of pre-defined diagnostic tests. Many so-called experts are more expert
in the operation of their diagnostic tools than in the system under repair
or the process of troubleshooting. All too often, experts place too much
belief in the pronouncements of such diagnostic tools, leading to replacement
of incorrect components and failure to repair.
When a technologist says "The diagnostic software says...", if the next
sentence is "So let's start looking at that subsystem", everything's OK. But
if the next sentence is "So let's replace the...", expect trouble. It's the
height of arrogance to believe oneself so expert at diagnostic tools
that further investigation is unnecessary.
Again, if the person making the statement has anything but a stellar record
of fast, accurate solutions, intervention may be called for.
Maddeningly, some technologists consider themselves so important that they
simply fail to respond. Sometimes it's the result of management overcommitting
their time. But in many cases, it's arrogance.
Although foot dragging is not caused by speculation, often foot dragging
and speculation have the same cause: arrogance.
Agenda Based Diagnosis
Sometimes speculation takes the form of conflict of interest. Maybe the technologist
doesn't want to be bothered right now, so he manufactures a believable reason
why the root cause is in a subsystem whose responsible party is a different
person or organization. The old "It's a hardware problem" gambit.
Often the diagnosis is made in order to shift blame. Perhaps the technologist
wired the building with inferior network cabling, and now is trying to blame
the poor performance on "bad routers" and "packet collisions" rather than
admit he made a bad mistake.
Sometimes there's a more sinister intent. Two departments get in a
contest, and technologists from each department fabricate root causes
designed to place the cause in the enemy department.
All agenda based diagnosis shares one common trait: Root causes that
move according to the whims of people. Eventually the scoundrels are found
out, and people are angered. If the duped party is a customer, he might take
his business elsewhere. If the duped party is inside the organization, the
entire organization has been hurt by the slow diagnosis, a fact not overlooked
by upper management.
Good troubleshooting process training educates technologists to the fact
that sooner or later they will lose, and lose big, if they diagnose by agenda.
Rote Repetitive Diagnosis
Some experts are more memory experts than system experts or troubleshooting
process experts. An outstanding memory is an asset when used wisely and a
liability when used as a crutch. Once again, if memory serves to suggest
the diagnostic test most likely to isolate the root cause, that's a good
thing. But if memory is used to speculate the root cause, and that speculation
is used as the basis of a repair, many times it's a costly mistake.
We see it all the time. Last time the Windows crash was caused by a problem
with the database server, and you just got a blue screen, so you
reload the database server now. Unfortunately, this time it's a rogue .dll
file placed on the application server by a restore, so reloading the
database server just dumps all work past the last backup and costs many hours.
There are some who believe expertise is achieved when one has seen many
problems and remembers the solutions. Such people don't understand the value
of troubleshooting process, and doom themselves to a high percentage of failed
repairs. Such people need training in the troubleshooting process.
Non-cooperation is really an effect of diagnosis by arrogance or diagnosis
by agenda. If you see much non-cooperation during troubleshooting,
Slow or Unsatisfactory Solutions or Dissatisfaction with the Support Department
The ultimate result of speculative troubleshooting is slow or unsatisfactory
solutions. If you see these symptoms, investigate further.
Eliminating Troubleshooting by Speculation
Obviously, to eliminate speculative troubleshooting, you need to fix the
cause. As mentioned previously, possible causes include:
- Arrogance based diagnosis
- Agenda based diagnosis
- Rote repetitive diagnosis
One could define arrogance as unjustified certainty. The arrogant technologist
is certain his speculation is correct and there's no need for further investigation
or testing. In the absence of some sort of extrasensory powers, how can
he draw that conclusion?
It could be abject stupidity, but if he's smart enough to be considered
an expert, that's unlikely. What's more likely is that he is not familiar
with the role of process in troubleshooting. This is likely
because only a tiny percentage of technologists have been trained
in troubleshooting process, so the troubleshooting productivity of this technologist
might be on a par with his contemporaries. Quite often, lack of troubleshooting
process training is the root cause of arrogance, which is a cause of speculative
troubleshooting. Therefore, troubleshooting process training can
Rote repetitive diagnosis
What causes rote repetitive diagnosis? Certainly its practitioners know
it fails frequently, so why do they use it overwhelmingly? Could it be they
know of no alternative? Once again, given the lack of training in the troubleshooting
process, it's quite likely.
So in cases of speculation caused by arrogance or rote repetitive diagnosis,
training in a valid and correctly optimized troubleshooting process will
likely cure the speculation, and the host of problems it creates.
Diagnosis by agenda is not so simple. Here there's pressure on the technologist
to toss the problem to the other department or company like a hot potato.
Here the speculation is merely a symptom, with the root cause being a
business model pitting departments against each other, or occasionally the
company against its customers. Does that mean training in troubleshooting
process cannot help?
Yes and no. The process training won't eliminate the root cause of agenda
based diagnosis, but it can expose the root cause. Once one department learns
the process of diagnosis, they are prepared to find all root causes in
their area of responsibility, and they're prepared to document cases where
the other department hot-potatoes them. Armed with such evidence,
that department can report their findings to management. It's possible
that management will understand and fix the underlying business model
problem. If nothing else, the troubleshooting process training makes
the day to day
work environment more livable.
If speculative troubleshooting creates problems in your organization,
there's a high likelihood that problem can be cured or at least reduced
by training your technologists in diagnostic process.
By Steve Litt
What's the distinction between speculation and guesswork? Speculation
implies certainty, while guesswork does not.
Those defined as "experts" are often self-defined. They're certain of their
expertise. Those without such certainty might be relegated to the
Wouldn't a self-perceived "average" troubleshooter forgo speculation (and
its less certain cousin called guesswork) in favor of a valid troubleshooting
process? He would if only he knew a valid troubleshooting process.
Once again, only a tiny segment of the technology population receives
training in the process of diagnosis. The rest have to learn it in "the school
of hard knocks". Some learn it better than others. It's not an all-or-nothing
proposition, but instead a spectrum from the process clueless to those completely
at home with and aware of process, cause and effect, the process of elimination,
and prioritization of diagnostic testing.
We all guess from time to time when deciding what diagnostic test to perform
next. But all too often guessing transcends the role of tactical guide, and
becomes a strategy in and of itself. That's the time to give the technologist
an alternative: training in a valid troubleshooting process.
Those with less than total comfort and awareness of troubleshooting process
often resort to guesswork. Sometimes the guesswork takes the form of studying
the machine and guessing which root causes would cause the symptom. Sometimes
the guesswork is along the lines of probability. Guesswork can assume the
role of total belief in "expert systems", or of diagnosis by serial replacement,
or excessive reliance on escalation. The process-deficient technologist
guesses that the root cause is outside his job description, or that he has
insufficient skills to fix the problem (that's almost always a false assumption).
And sometimes guesswork is just random guessing. Whatever the form of
the guesswork, it's always a productivity thief.
If these symptoms appear among your technologists, it's necessary to recognize
that diagnosis by guesswork is occurring, and it's time for your technologists
to receive training in a valid troubleshooting process.
By Steve Litt
While the experts are busy with speculation, and average technologists
guess their way to a solution, the beginning technologist must pray for one.
Like the others, he's received no troubleshooting process training, but in
addition, he hasn't had sufficient time in "the school of hard knocks" to
learn even a little process. Making things worse, the beginning technologist
is often undertrained in the system under repair.
Barring a miracle or massive intervention by a more experienced technologist,
the beginning technologist stands little chance of solving a complex
problem. And given today's "lean and mean" business environment, there's
little hope of much help from senior technologists. So the beginning technologist
muddles through his first year or two trying not to alienate coworkers
and customers. He tries not to cause damage to systems he works on.
He fights job disappointment.
In the kinder and gentler days of decades gone by, the organization had
resources to help the new technologist through these difficult times. Today
many organizations expect the new technologist to sink or swim. The resulting
turnover is a problem, but they just don't have enough spare capacity when
it comes to more experienced technologists.
It doesn't have to be this way
There's an alternative to the sink or swim philosophy. New technologists
may be several months deficient in their systems training, but a two day
course can make them fully competent in troubleshooting process.
As any master technician can tell you, an excellent grasp of troubleshooting
process compensates quite nicely for a partial lack of systems knowledge.
In fact, the cause and effect deduced during the troubleshooting process
is one way we learn technological systems.
Ultimately, the organization that brings new technologists up to speed
fast experiences productivity gains, reduced staffing, retention and
acquisition costs, and a better team environment. Two days of troubleshooting
process training is a small price to pay for these advantages.
By Steve Litt
The bad news is that speculation, guesswork and prayer carry huge costs for
the organization, the department, and the individual employee. The good news
is that because they're caused by inadequate troubleshooting process training,
they're cured by a simple training course. Of course, you need to choose the
right troubleshooting process, the right course, and the right instructor.
The following are the properties of a well chosen troubleshooting process:
- Valid Troubleshooting Process
- Properly Optimized Troubleshooting Process
- Good for expert, average and beginning
- Yields significant productivity gains
This article also discusses the finer points of evaluating a troubleshooting
process and implementing a troubleshooting process training program.
Valid Troubleshooting Process
A valid troubleshooting process must, at the very least, accomplish the following:
- Recognize and exploit cause and effect
- Recognize and exploit the process of elimination
- Minimize jargon and fluff
- Accommodate the Work Habits of Human Beings
Recognize and Exploit Cause and Effect
The opposite of cause and effect is superstition. Nobody
wants diagnosis by tarot card.
But it's not enough not to be superstitious. Cause and effect must be
recognized and exploited. Many so-called "troubleshooting courses" are nothing
but yet another journey into the technical details of the system to be repaired,
with a few diagnostic tools thrown in. No cause and effect there.
Recognize and Exploit the Process of Elimination
If you count up all the electronic parts, jumper settings, BIOS settings,
operating system configuration parameters, and application settings
in a modern computer system, you'll find it has 50,000 or so components that
can act as root causes. Combine 10, 100 or 1000 such computers into a network
and the complexity is mind boggling. Nothing short of efficient use
of the process of elimination will foster quick solutions of such problems.
And yet there are all sorts of "Windows Troubleshooting" and
Troubleshooting" courses that act as little more than symptom/solution
with a few tests and diagnostic products thrown in.
Accommodate the Work Habits of Human Beings
Human beings have certain attributes. To the extent that a troubleshooting
process works with those attributes, it will be successful. To the extent
that a troubleshooting process fights those attributes, it will fail.
First and foremost, humans can concentrate on about 7 facts at a time.
Some more, some less, but 7 is a good number. Right off the bat, that means
that the process better not ask the person to try to simulate the system
in an effort to "figure out" what's wrong. Nobody can contemplate hundreds
or thousands of components at once. That's why mental simulation
doesn't work. That's why processes based on binary search through diagnostic
tests work marvelously.
Humans must trust before they can invest. A simple process whose value
is obvious will excite a person to learn, and to use after learning. On the
other hand, nobody will go to the effort to learn a process requiring
all sorts of
detailed actions on his part. Initially the trainee doesn't believe it will
work, and as such doesn't invest much learning energy.
That brings up the subject of the program of the month. Anyone
in the workforce more than a couple years has encountered at least one. If an
employee perceives something as a program of the month, he goes through the
motions but his mind is elsewhere. His attitude: Been there, done that!
The troubleshooting process MUST NOT be perceived as a program of the
month. What are some attributes of programs of the month?
- Upper management never bought in
- Insults the intelligence of employees
- Treasure hunts
- Reciting "we're entrepreneurial" while on salary during a salary freeze
- Warm fuzzy cartoons with animals, carrots and sticks, extolling the
virtues of teamwork
- Implying that the employees' best interests always parallel the company's
- Encrusted with jargon
So your troubleshooting process of choice should state its case in plain
English, it should take care not to insult the intelligence of those being
trained, and like any other company endeavor, it must have the support of management
at the highest level of its application and influence.
Properly Optimized Troubleshooting Process
Problem solving process isn't a "one size fits all" proposition. There are
generic problem solving methodologies optimized to solve problems in fuzzily
defined systems (business, political and relationship), and there are troubleshooting
processes optimized to solve problems in well defined systems (computers,
networks, software and machines of all sorts). There are methodologies optimized
for safety critical situations, and for events and extremely sparse intermittents.
There are methodologies optimized to find bottlenecks (where the symptom description
contains "too" or "insufficient"). Selecting a process not optimized for
your situation can cut productivity by an order of magnitude.
The Universal Troubleshooting Process is an example of a troubleshooting
process optimized for well defined systems, and works quite well under a
variety of such systems. Its ideal use is on reproducible or frequently occurring
intermittent problems in machines and computer systems,
in low, moderate and sometimes highly safety critical situations. In extremely
safety critical situations (nuclear power plants and the like) it's best combined
with a more safety optimized process such as Root Cause Analysis.
Because it incorporates Bottleneck Analysis, the Universal Troubleshooting
Process quickly solves problems of degree in well defined systems.
Inappropriate Uses of the Universal Troubleshooting Process
The Universal Troubleshooting Process is inappropriate for solving problems
in fuzzily defined systems such as businesses, relationships, and factory
floors (the work and parts flow). For such problems you'd use a generic problem
solving method like Kepner Tregoe, or as appropriate a bottleneck
optimized process such as the Theory of Constraints. The Theory of Constraints
has become quite famous on the factory floor.
The Universal Troubleshooting Process is insufficient in situations
where a large number of events or sparse intermittents occur. Such situations
require use of an event-optimized process such as Root Cause Analysis. As
mentioned previously, in extreme safety critical environments such as nuclear
power plants the Universal Troubleshooting Process should be combined
with something like Root Cause Analysis.
There will come a time (for many of you it's already come many times) when
someone will try to sell you on the idea of generic problem solving training
for your technologists, technicians and maintenance people in order to increase
their productivity fixing computerized systems and machines. Such use of a generic
problem solving process is inappropriate, time consuming, error prone, and
costly, because generic problem solving processes are not optimized for well
defined problems, and also because they generally waste time with questions
irrelevant to fixing technical problems, such as how to transition from the
current state to the desired state. In technical troubleshooting you just
repair or replace the bad component. Using a generic problem solving method
to solve technological problems can double, triple, or perhaps increase
tenfold the time, effort and cost of repair. And your employees will rebel
against such use of generic problem solving methods.
Good for expert, average and beginning
Many experts speculate, many average technologists guess, and many beginners
pray. The solution must be a troubleshooting process useful to expert, average
and beginner alike. It must work, it must be comprehensible, and it must be
believable. It must not wrap itself in buzzword fluff or fuzzy
Yields significant productivity gains
Speculation, guesswork and prayer waste time and money. The whole point
of their elimination is increased productivity. The chosen troubleshooting
process must truly increase productivity. Fortunately, this is easy, because
usually the missing ingredient to productivity is troubleshooting process.
Evaluating a Troubleshooting Process
How do you evaluate a troubleshooting process? How do you estimate its
potential? Here's a list of ideas:
- Optimized for speed and accuracy
- No extraneous time consuming frills
- Heavy coverage of the process
- Minimal jargon
- Respect for Troubleshooter's time and intelligence
- Immediately usable on the job
Optimized for speed and accuracy
In a high school electronics class years ago, I was taught to scope the input,
then the first stage, then the second, stage by stage until the output. The
first place I found a problem indicated the stage containing the root cause.
That's called serial search, a genuine troubleshooting process. It
might have even been sufficient for the five tube radios of the era.
Today's systems often have six
worth of components. Serial search would take weeks or months per
Serial search isn't optimized for complex systems.
An optimal process for today's systems would use binary search. The diagram
to the right shows how binary search works. Each diagnostic test rules out
half of the remaining root cause scope. Obviously, real world tests
can't exactly split the remaining root cause scope, but it's something you
must shoot for.
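As a rough illustration of diagnosis by halving, here is a sketch that models a system as a chain of stages with one testpoint after each. The function name `find_faulty_stage` and the `signal_ok_after` measurement are hypothetical, and the model assumes a single fault whose effect shows at every testpoint downstream of it. Real systems are rarely this tidy, so treat it as a picture of the idea rather than a procedure.

```python
def find_faulty_stage(num_stages, signal_ok_after):
    """Locate the first bad stage with binary search.

    signal_ok_after(i) is a hypothetical testpoint measurement: it returns True
    when the signal is still good at the output of stage i (stages are 0-based).
    Assumes exactly one faulty stage and that every testpoint after it reads bad.
    Returns (faulty_stage_index, number_of_measurements).
    """
    low, high = 0, num_stages - 1  # the fault lies somewhere in low..high
    measurements = 0
    while low < high:
        mid = (low + high) // 2
        measurements += 1
        if signal_ok_after(mid):
            low = mid + 1          # stages up to and including mid are ruled out
        else:
            high = mid             # the fault is at mid or earlier
    return low, measurements


# Simulated 1000-stage system with the fault planted at stage 700.
faulty = 700
stage, tests = find_faulty_stage(1000, lambda i: i < faulty)
print(stage, tests)
```

With 1000 stages the loop never needs more than 10 measurements, because each one discards about half of what remains. Serial search on the same system could need hundreds.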
Imagine a system with 16 million components. If each diagnostic test takes
only one minute, serial search can require 16 million minutes, which is roughly 30 years.
Now let's say you use pure binary search. You would need only 24
tests to find the root cause. Even if each diagnostic took an hour instead
of a minute, that's three 8 hour workdays.
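The arithmetic above can be checked with a short sketch. It assumes an idealized system: a single chain of stages, exactly one fault, and a test at any point that cleanly says "good" or "bad". The function name, the `signal_ok_after` test, and the fault location are made up for illustration, and real systems rarely divide this evenly.

```python
import math

MINUTES_PER_YEAR = 60 * 24 * 365


def find_faulty_stage(num_stages, signal_ok_after):
    """Binary search for the first stage whose output is bad.

    signal_ok_after(i) is a hypothetical diagnostic test: it returns True
    if the signal measured at the output of stage i is still good.
    Assumes exactly one fault, with everything downstream measuring bad.
    Returns (faulty_stage_index, number_of_tests_performed).
    """
    lo, hi = 0, num_stages - 1      # the fault lies somewhere in [lo, hi]
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if signal_ok_after(mid):
            lo = mid + 1            # good here, so the fault is downstream
        else:
            hi = mid                # bad here, so the fault is here or upstream
    return lo, tests


components = 16_000_000
faulty = 12_345_678                 # arbitrary, made-up fault location

stage, tests = find_faulty_stage(components, lambda i: i < faulty)

max_tests = math.ceil(math.log2(components))    # upper bound for binary search
serial_years = components / MINUTES_PER_YEAR    # worst case at one minute per test
binary_workdays = tests / 8                     # one hour per test, 8 hour workdays

print(f"binary search found stage {stage} in {tests} tests (at most {max_tests})")
print(f"serial search, worst case: about {serial_years:.1f} years")
print(f"binary search at an hour per test: {binary_workdays:.1f} workdays")
```

The point of the sketch is the growth rate, not the exact numbers: doubling the number of components adds one test to binary search, but doubles the worst case for serial search.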
One factor that hugely speeds diagnosis by even division is the
of numerous cheap and quick testpoints. With such testpoints, a
is only a measurement away. Without them, much mental simulation is
The Universal Troubleshooting Process is optimized to take
advantage of the numerous cheap and easy testpoints on most technology.
On the other hand, the methods described in Kepner and Tregoe's "The New Rational
Manager" do not optimize to this advantage, resulting in a slower
on most technology.
No extraneous time consuming frills
For technological troubleshooting, you want a troubleshooting process specific
to technology, with no time consuming frills. Specifically, methods to solve
fuzzily defined problems (business and interpersonal) are unneeded
in technology. If a technologist needs to solve business and interpersonal
problems as well as technology problems, he or she should be trained in the
best of breed for each, not a methodology that happens to include both.
Minimal jargon
Jargon is distracting and aggravating. Employees are smart, so they know
when a course author has used jargon words to obfuscate a beautifully simple
concept. They typically attribute the motivation for such obfuscation to
an attempt to make the material more complex than it need be, in order to justify
the material's price. Technological troubleshooting is simple enough to explain
in plain English. When creating the Universal Troubleshooting Process,
I used plain English for every concept except the term Mental Model, which is
explainable in 2 minutes.
Respect for Troubleshooter's time and intelligence
All too many employees are subjected to various programs
supposed to "get them on board", or "improve their potential".
Most are perceived as "bullfeathers". To be credible, troubleshooting
training must stick to the facts, and avoid any hint of "program of the month".
Immediately usable on the job
Timing is everything. Unless the employee can immediately use the material
on the job, the material is forgotten within a few days. Employees should
be encouraged to use the information immediately upon return to work, and
that encouragement should be based on the fact that the material is relevant
to their everyday jobs.
Implementing a TP training program
Decide on the goals
What is the basic problem? Is it failure of groups of employees to solve
problems in a timely and accurate way? If so, are the problems they're solving
business problems, factory floor problems, or technological problems
and system repair)?
If your employees need help solving business problems, consider a course
employing the methods described in "The New Rational Manager" by Kepner and
Tregoe. That methodology is great for analyzing and deciding upon solutions
for fuzzily defined systems such as businesses, departments, and the like.
However, because it doesn't take advantage of numerous cheap and safe testpoints
afforded by machines and systems, it's slow and cumbersome for solving
technological problems.
If your employees need help solving technological problems, be it
machines on the factory floor, local or wide area networks, computer hardware,
computer software, or electronics, consider the Universal Troubleshooting
Process course. Its optimization for abundant testpoints makes it a
quick way to diagnose and fix tech problems.
But resist the temptation to use the Universal Troubleshooting Process for
business and interpersonal problems. The lack of testpoints in those kinds
of systems makes the Universal Troubleshooting Process insufficient for business
and interpersonal problems.
When dealing with safety critical problems, you need to solve the
problem behind the technological problem. For instance, in a nuclear plant,
if lack of a written maintenance policy for air conditioners causes an air
conditioner to run out of freon, which causes 130 degree temperatures in
the parts storage room, which causes weak solder joints on stored circuit
boards, which causes those boards to fail early, which causes a rod
failure, which causes the reactor to trip, root cause must be traced all
the way back to the lack of written maintenance policy. In such a case, Root
Cause Analysis as described by Max Ammerman's book would be what is needed.
In many cases your people need instruction on more than one of these methodologies.
In that case, I suggest you give them courses in all necessary methodologies.
Resist the temptation to make one of the methodologies fit all the types of problems.
Commit the resources
Resources are always an open question. Here are some ideas.
Instruction costs vary widely. One of the least expensive is the Universal
Troubleshooting Process, which, if implemented by in-house instructors, costs
$45.00 per attendee plus what you pay the in-house instructors. If you choose
to have a Troubleshooters.Com instructor teach the course, it will
cost between $2800.00 and $7000.00 for a 2 day course, depending on
This can be very cost effective if you have more than 10 attendees at
Many other courses cost considerably more. When deciding on a course, take
into account the cost, the benefit, and how applicable the training is to
the problems you're trying to solve. Reserve the money from the budget.
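To compare the two pricing models quoted above, a rough calculation helps. The sketch below uses only the dollar figures from this article and reads the $2800.00 to $7000.00 figure as a price for the whole 2 day course, which is an assumption. The function names and the sample in-house instructor pay are made up for illustration; plug in your own numbers.

```python
IN_HOUSE_FEE_PER_ATTENDEE = 45.00   # figure quoted in this article
VENDOR_COURSE_LOW = 2800.00         # figure quoted in this article, 2 day course
VENDOR_COURSE_HIGH = 7000.00        # figure quoted in this article, 2 day course


def in_house_cost(attendees, instructor_pay):
    """Total cost when your own instructors teach the course.

    instructor_pay is whatever you pay your in-house instructors for the
    course. The article gives no figure for it, so you must supply one.
    """
    return IN_HOUSE_FEE_PER_ATTENDEE * attendees + instructor_pay


def vendor_cost_per_attendee(attendees):
    """Return the (low, high) per-attendee cost of a vendor-taught course."""
    return VENDOR_COURSE_LOW / attendees, VENDOR_COURSE_HIGH / attendees


HYPOTHETICAL_INSTRUCTOR_PAY = 1000.00   # made-up number, for illustration only

for attendees in (5, 10, 20):
    low, high = vendor_cost_per_attendee(attendees)
    own = in_house_cost(attendees, HYPOTHETICAL_INSTRUCTOR_PAY) / attendees
    print(f"{attendees:2d} attendees: vendor ${low:.2f} to ${high:.2f} each, "
          f"in-house ${own:.2f} each")
```

The vendor's per-attendee cost falls as the class grows, which is why class size matters so much when choosing between the two.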
The employees can't work while they're receiving instruction, and they can't
receive instruction while they're working. As with any course, you must commit
enough resources so employees can take the course. This might involve having
employees take the course in shifts. However you work it, have it
Post-course break-in period
Commit the resources for a post-course break-in period. In other words, for
a period of a week or so, expect employee productivity to go down, not up.
Why? Because your employees have just changed their habits. What they once did
by rote, they now need to think about. That takes time. I've heard it said
that it takes 21 days to change a habit. That sounds reasonable. If so,
it means that productivity gain will occur somewhere between 7 and 14 days.
Before 7 days, the new habits are so new that the employee must take extra
time to remember to do them. After 14 days, the habits are almost in place,
and the efficiencies of the new methods overcome the employee's extra effort
to remember. From then on, it's all gravy.
Sports analogies are ubiquitous. One I like is the
speedskating analogy. In 1980 every competitive speedskater was on the
skates with the four wheels forming a rectangle. The few people on inline
skates were recreational skaters and were ignored. By 1983 there were a few
fast inline skaters, but none rose to the level of regional competition.
By 1986 inline skaters participated in races and a few actually won. This
caused others to try them, and those trying them invariably skated slower,
not faster. Some gave up and went back to rectangular skates. But others
continued practicing on inlines, eventually beating their best rectangular
skate times. By 1992, every competitive speedskater used inlines. Nobody on
rectangular skates could win or even come close.
But there was a break-in period of several years.
Today it's absolutely clear that inline skates are much faster than the rectangular
skates. And yet, every oldstyle skater went slower the first few times on inlines.
What I'm trying to say is this. You must support your employees in learning
to use the better technique. This means that for 1 to 3 weeks after training,
do not demand improved productivity, and do not make a big deal out of reduced
productivity. After 3 weeks, you have the rest of the employee's tenure to
profit from their training.
Decide on the process
Choose the troubleshooting process best suited toward your goal, and
optimized for your purposes. Different goals and purposes were discussed in
the Decide on the goals section.
Decide on the course and instructor
Some troubleshooting processes are taught by a variety of vendors, while others
are taught by just one. If the course you want is taught by a variety of
vendors, pick the best one based on reputation, price and "fit".
Some vendors give you the choice of teaching by the vendor's instructors
or licensing the course for teaching by your own in-house instructors. There
are pros and cons of each. Some vendors also give the option of a "train the
trainer" course so that in-house trainers fully understand what they're teaching
and how to teach it.
In-house instructors
In-house instructors are great because they have better knowledge of the attendees'
work lives. That means they can use examples and exercises more suited to
the audience's day to day work, thus gaining credibility. Some vendors, such
as Troubleshooters.Com, provide the inhouse trainers with self
instructor materials. Beyond that, in a large training project, it's
valuable to have the vendor provide "train the trainer" training to the in-house instructors.
Depending on the vendor, use of in-house instructors can be very economical.
For instance, Troubleshooters.Com charges only $45.00 per attendee for
the Universal Troubleshooting Process course given by in-house instructors.
With all the advantages of in-house instructors, why would anyone use
the vendor's instructors?
There's no substitute for the vendor's instructors if you want the utmost in
troubleshooting process knowledge. Also, the vendor's instructors
truly believe in the power of their troubleshooting process, so there's
no need to get "buy in" from the trainers themselves. Last but not
least, vendor supplied instructors are knowledgeable enough to conduct
"train the trainer" sessions for the in-house instructors.
In a few cases, the organization's politics create a situation where employees don't find the organization's trainers credible, and will much more easily believe an "expert from afar". Or, perhaps, they associate in-house trainers with propaganda. Such cases might justify bringing in the vendor's trainer.
Just remember that, in spite of all their advantages, vendor supplied
instructors have less knowledge of your employees and their work, so
the supplied examples and exercises won't be as specific to your
Decide on the attendees
Who gets trained? Obviously, for technical troubleshooting your
technical people would receive the training. The courseware vendor can
tell you the optimal class size, so you can decide how many to send,
and if you should conduct more than one class.
Some employees are more accommodating to new ideas than others. When
all other things are equal, send those most likely to buy in to the new
process. When their productivity is improved, use that fact to petition
both upper management and other employees to obtain troubleshooting training.
Letters to the Editor
All letters become the property of the publisher (Steve Litt), and may
be edited for clarity or brevity. We especially welcome additions,
corrections or flames from vendors whose products have been reviewed in this
magazine. We reserve the right to not publish letters we deem in bad taste
(bad language, obscenity, hate, lewd, violence, etc.).
Submit letters to the editor to Steve Litt's email address, and be sure
the subject reads "Letter to the Editor". We regret that we cannot return
your letter, so please make a copy of it for future reference.
How to Submit an Article
We anticipate two to five articles per issue, with issues coming out
We look for articles that pertain to the Troubleshooting Process, or articles
on tools, equipment or systems with a Troubleshooting slant. This can be
done as an essay, with humor, with a case study, or some other literary device.
A Troubleshooting poem would be nice. Submissions may mention a specific product,
but must be useful without the purchase of that product. Content must greatly
overpower advertising. Submissions should be between 250 and 2000 words long.
By submitting content, you give Troubleshooters.Com the
perpetual right to publish it on Troubleshooters.Com or any A3B3 website. Other than that, you retain
the copyright and sole right to sell or give it away elsewhere. Troubleshooters.Com
will acknowledge you as the author and, if you request, will display your
copyright notice and/or a "reprinted by permission of author" notice.
You must be the copyright holder and must be legally able to grant us this
perpetual right. We do not currently pay for articles.
Troubleshooters.Com reserves the right to edit any submission for clarity
or brevity. Any published article will include a two sentence description
of the author, a hypertext link to his or her email, and a phone number if
desired. Upon request, we will include a hypertext link, at the end of the
magazine issue, to the author's website, providing that website meets the
Troubleshooters.Com criteria for links and
that the author's website first links to Troubleshooters.Com. Authors: please
understand we can't place hyperlinks inside articles. If we did, only the
first article would be read, and we can't place every article first.
Submissions should be emailed to Steve Litt's email address, with subject
line Article Submission. The first paragraph of your message should read
as follows (unless other arrangements are previously made in writing):
I (your name), am submitting this article for possible publication on
Troubleshooters.Com. I understand that by submitting this article I am giving
the publisher, Steve Litt, perpetual license to publish this article on Troubleshooters.Com
or any other A3B3 website. Other than the preceding sentence, I understand
that I retain the copyright and full, complete and exclusive right to sell
or give away this article. I acknowledge that Steve Litt reserves the right
to edit my submission for clarity or brevity. I certify that I wrote this
submission and no part of it is owned by, written by or copyrighted by others.
After that paragraph, write the title, text of the article, and a two sentence
description of the author.
URLs Mentioned in this Issue
- Microsoft Licensing
You cannot reimage a computer with an OEM Windows installation. You have
to pay Microsoft for the privilege.
"Microsoft retools corporate software licensing program".
"Microsoft asks PC builders to help stem 'naked' system order". This article
documents Microsoft's rewards for system builders to turn in customers
ordering PC's without operating systems. This article also says that Windows
XP and Office XP will have a "forced registration system".
"Microsoft licensing shift creates uncertainty for user". Microsoft
must do expensive audits.
"Microsoft Pitches XP to Corporate Users": More on Windows XP.
"User queries prompt new Microsoft attack on open source": Describes
Microsoft's Craig Mundie says that the open-source movement could
in "product instability" and "inherent security risks" for software
- Windows to Linux Conversion Stories
- Miscellaneous URL's