Dr. Dobb's Journal June 2002
Dear DDJ,
I worked for IBM for my whole career, maintaining both hardware and software. An important rule was that any problem discovered should be fixed before returning the equipment to the customer for continued use. The logic was based partly on not wanting to keep a list of known problems on every machine, partly to avoid needing to negotiate time to repair equipment that the customer did not perceive to be malfunctioning, and partly to make it easier to find the defect that caused us to be called.
When I got to the section of Eric McRae's article "Tracking Down Killer Bugs" (DDJ, April 2002) entitled "Building a Test Platform," my immediate thought was that Eric should have issued the fix for the problem he knew about. If he had, he would have saved three weeks, as well as the cost and hassles of a new testing environment. I think that one of the "Lessons Learned" should have been, "Fix any bug you know about as soon as possible." It usually saves you resources if you do.
Peter M. Guy
peteguy@attglobal.net
Eric's response: Pete, thanks for your suggestion for a lesson learned. In some cases, I would agree. However, the combination of extremely complicated microcode and extremely rare symptoms caused me to delay releasing the fix I knew about. The change to the microcode required for the fix would have changed the timing of the microcode actions by a couple of clock cycles. While small, such a change is perfectly capable of modifying the conditions necessary to reproduce the symptom. Further, based on the information I had at the time, the bug I knew about could not have produced the symptoms.
In this case, yes, had I released the fix, it would have cured the problem. I would have festered over it in my own time and eventually discovered what had actually happened. However, had the problem been somewhere else, my fix might have pushed its occurrence rate out to once every 10 hours instead of every three. In that case, the problem might have slipped through testing and into production, later to surface as a recall candidate, thus costing my client millions of dollars.
My rule is that if a client is experiencing a problem, I avoid altering any of their code until I fully understand the conditions that are causing the problem. I don't always hold fast to this rule, but in this particular case, I'm glad I did. My client learned that the correct information is critical to my debug process. They fortunately understood and acknowledged that without any prodding from me. Again, I appreciate your taking the time to comment on the article.
Dear DDJ,
Regarding Jonathan Erickson's "Editorial" (DDJ, February 2002), the Boeing strike (as well as labor unrest at other places, such as Microsoft) is in fact a repeat of the events leading to the founding of the CIO during the Great Depression. Over and over again, in the various mass-production industries, the founding strike was spearheaded by the skilled workers. In those days, the skilled workers were not usually university-trained professionals, but apprenticeship-trained craftsmen, such as machinists, millwrights, tool and die men, and so on. Management was commonly recruited from these workers. A notable labor leader such as Walter Reuther of the United Auto Workers was apt to be sociologically indistinguishable from a corporate manager of all but the highest level. Reuther, in fact, had spent a couple of years in the Soviet Union trying to launch an automobile industry under extremely unfavorable conditions before returning home to become a labor leader. Labor militancy was closely bound up with craft skill. One of the stories that is told about Reuther is that, in his capacity as a tool and die man, his stamping dies tended not to need debugging (with a file) like those of ordinary mortals. Sound familiar?
Here is another instance. In the electrical industry, the employers, faced with a declining market, laid off the easily replaced operatives, but tried to hang onto the skilled men by employing them in whatever capacity possible. Typically, this meant putting them on the assembly line, and this proved to be the final straw leading to the rise of the United Electrical Workers. The skilled workers in this case were the machinists who designed and built giant custom steam turbines for electric power plants and ships.
Of course, history never repeats itself exactly, and conditions have shifted somewhat. In 1929, the machinist typically owned his hand tools (files, micrometers, and so on), but not the machine tools themselves, and still less the punch presses that replicated his work in mass production. This meant that the skilled men were obliged to stand and fight in their original workplace, rather than simply withdrawing their labor to greener pastures. The limits to labor power were perhaps defined by the 1940 "Reuther Plan" for defense mobilization. Reuther had the human resources: he could recruit machinists with detailed knowledge of all the machine tools available and what they could be made to do, and so was able to do detailed production planning. However, in the last analysis, the UAW did not own the machine tools, and the plan was squelched.
In this respect, modern programmers are perhaps more like construction tradesmen. As Herbert Applebaum pointed out in Royal Blue, union construction craftsmen have the option of simply going into business for themselves as small contractors if conditions justify it. This is because they own all their tools except those that are available for rental on the open market. Likewise, a programmer can easily afford to own a complete set of tools, up to and including his own web site and/or Beowulf cluster.
One implication of this freedom is that the open-source movement is likely to be a major beneficiary of labor militancy, particularly if large numbers of people are doing fairly routine jobs (web-page design, for instance), which pay the bills, but do not exactly stretch their mental horizons. Open source has delivered impressive results so far, with tens of thousands of participants in a booming economy with a tight labor market. In a depression, the number of participants might easily rise into the hundreds of thousands or the millions.
Andrew D. Todd
U46A8@WVNVM.WVNET.EDU
Dear DDJ,
Jonathan Erickson's February 2002 "Editorial" is making the rounds here at Boeing. A few comments. Typical raises were 4 to 5 percent every year before the strike, so the 17 percent isn't as impressive as it sounds. The raises are now distributed almost equally; nonperformers who don't contribute get the same raise. The $2500 bonus didn't offset the loss of pay. The proposed copay was minimal; the agency fees that people are forced to pay in a now closed shop equal that. Voice in company decisions? I see engineering work going to Moscow. Just a different perspective.
Curt Adalbert
curt.n.adalbert@boeing.com
DDJ