Dear DDJ,
The lawyers have finally given me the green light to describe why the MS-DOS detection code discussed in the article "Examining the Windows AARD Detection Code" by Andrew Schulman (DDJ, September 1993) was in the Christmas beta. I hope you will keep an open mind, listen to the truth, and accept it. It may not make such good press, but sometimes the truth is like that.
It has never been a practice of this company to deliberately create incompatibilities between Microsoft system software and the system software of other OS publishers. I am not aware of any instance where Microsoft intentionally created an incompatibility between Windows and DR DOS. Windows is tightly coupled to the underlying MS-DOS operating system. It relies on a number of very precise behavioral characteristics of MS-DOS which have nothing to do with the Int 21h API. Because of this tight coupling, an MS-DOS imitation must have exactly the proper behavior, or all sorts of subtle and not-so-subtle problems will occur, including data loss.
Microsoft does not test Windows on anything other than Microsoft's MS-DOS. We don't have the development or testing resources, nor do we consider it our job to test Windows on other systems. If you're the developer of an MS-DOS imitation, you shouldn't expect your main competitor to do your work for you. If Windows works on your imitation, it works; if it doesn't, it's your problem to fix. That may not give you, Andrew, the warm and fuzzies, but this is business, not a giveaway.
During the development of Win 3.1, a great deal of thought was given to ways to reduce the high support burden associated with Windows. During the betas, we got a few bug reports about Windows not working correctly on some of the MS-DOS imitations. So it seemed like a very small portion of the market might have problems running Win 3.1 on something other than genuine MS-DOS. In order to be fair and up-front with them, we considered that it might be a good idea to let them know--before they encountered problems or even data loss--that they were running Win 3.1 on a system we hadn't tested. The intended purpose of this disclosure message was to protect the customer and reduce the product-support burden arising from the use of Windows on untested systems. The plan was to include an "off switch" in the commercial release that the end user could use to prevent the message from being redisplayed every time Windows was run.
In order to preserve the option of putting a disclosure message in the commercial release of Win 3.1, some MS-DOS detection code was implemented and inserted into the relevant modules of the "Christmas" beta. This code only detected the presence of MS-DOS; it did not detect any competing OS.
The wording of the message that was displayed if something other than MS-DOS was detected in the Christmas beta has been the subject of accusatory speculation. Our intention for the final release was to warn the user that Windows (and that includes all Windows components) is being run on a system we have not tested. The message in the beta, however, was carefully crafted to produce a desired effect. Since this code was inserted very late in the development schedule, we were very concerned about making sure it worked properly, and especially that it did not have "false positives," i.e., that it did not "misfire" when there really was genuine MS-DOS underneath. As a result, we wanted to make sure that anytime it triggered, the beta tester would call us so we could follow up and confirm whether the code was reliably detecting MS-DOS or was instead returning false positives. In fact, the message says to contact the Win 3.1 beta support.
The language of the message was not alarming; it mentioned neither the nature of the "nonfatal error" nor the name of any competitor. Moreover, the message disappeared either in a matter of seconds or with a single keystroke. Nor did the message stop Windows from running.
Of course the code was concealed. This should not be surprising at all. If it can be easily circumvented by an imitation (which I remind you we haven't tested against), then its purpose has been defeated.
Neither the detection and concealment code nor the nonfatal-error message created any incompatibility with DR DOS.
Prior to the March 9, 1992 RTM date for Win 3.1, we decided not to include the disclosure message in the commercial release of the product because we didn't want to run the risk that it would be misinterpreted and thus divert attention from the new features of Windows 3.1. We were in a tough competitive battle with OS/2 and wanted the attention focused on the great new features of Win 3.1, rather than artificial "controversy" whipped up by the press or our competitors.
In fact, the planned disclosure message was never coded into the product. Because this decision was made so late in the development cycle, and we didn't want to risk introducing instability into the product, we left the detection and concealment code and the nonfatal-error message in the product, but disabled it from printing onscreen. As a technical person, Andrew, you know that a NO-OP is a NO-OP. Even though the code remains in Win 3.1 in a "quiescent" state, the fact remains that no messages are printed. You insinuate that we could somehow, sometime "turn it on." How? ESP? Remote control? If we could get people to execute a patch that would turn the code on, we could certainly figure out a way to patch the whole thing in.
Finally, the detection and concealment code and the nonfatal-error message code have been stripped out of the versions of Windows currently under development. That's the story. Surely not as interesting or controversial as you or others would have people believe, but it's what really happened.
Brad Silverberg, Vice President
Microsoft Corp.
Redmond, Washington
Andrew responds: Thanks for your thoughtful explanation of the AARD code, Brad. As you've also told me in person since the article appeared, very late in the beta, Microsoft decided against displaying a message when running Windows on a non-Microsoft DOS. At this late stage, Aaron Reynolds (author of the AARD code) needed to produce the smallest-possible binary "diff," so he cleared a control byte rather than removing the code.
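The idea of clearing a control byte instead of removing code can be pictured with a small sketch. The Python below is illustrative only and is not a reconstruction of the actual AARD code: the names `control`, `detect_non_ms_dos`, and `startup_check` are hypothetical, the detection step is reduced to a flag supplied by the caller, and the message text is a placeholder. It shows how detection logic can stay in a shipped product while a single byte keeps its message from ever being displayed.

```python
from typing import Optional

# Illustrative sketch only; every name here is hypothetical.

# Stand-in for the control byte: 0 means the message is suppressed.
control = bytearray([0])


def detect_non_ms_dos(passes_checks: bool) -> bool:
    """Placeholder for the detection step; True means the checks failed."""
    return not passes_checks


def startup_check(passes_checks: bool) -> Optional[str]:
    """Run detection, but return a message only if the control byte is set."""
    if not detect_non_ms_dos(passes_checks):
        return None  # checks passed, nothing to report
    if control[0] == 0:
        return None  # detection still ran, but the message is disabled
    return "placeholder warning text"
```

With `control[0]` left at 0, `startup_check` returns `None` whatever the detection result, which is the "quiescent" state described in the letter; changing that one byte is the only difference between the silent and the visible behavior.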
It's also noteworthy that you point out that the wording of the beta message differed from that intended for the Windows 3.1 retail. The beta message was intended to test Aaron's tricky code and to weed out what you, Brad, call "false positives." (In retrospect, this code couldn't have worked 100 percent of the time, as an April 1993 Microsoft KnowledgeBase article, "Replace Case Mapping Function with Proprietary Version," notes that a DOS program can hook INT 21h AH=38h and replace the built-in case mapping function.)
But even if I was off-base about why the code was left in the retail version and the wording of the error message in the beta, this doesn't change anything substantial, and in some ways your explanations are more damning than my original article. You say that Microsoft intended to put into Windows 3.1 a warning to the user that running Windows on a non-Microsoft DOS was "untested" and might cause data loss. The intended error message would have looked something like the following:
WARNING: This Microsoft product has been tested and certified for use only with the MS-DOS and PC-DOS operating systems. Your use of this product with another operating system may void valuable warranty protection provided by Microsoft on this product.
This is not a hypothetical error message, but one that several versions of Microsoft C produce when running on non-Microsoft versions of DOS. (This error message and the code that produces it are discussed in Undocumented DOS, second edition, Chapter 4.) From your letter, it appears that Microsoft intended to hook the AARD code up to a similar message in Windows 3.1.
But while such a message seems benign to you, the word "untested" could have a chilling effect on the typical user. As long as Windows and DOS are sold as separate products, it would be a classic tying arrangement for Windows to scare users of other DOSs with blanket statements that something might be wrong. Novell rightly characterizes such manufactured error messages as "product disparagement." If Windows has some special expectations of the underlying DOS, then the Windows group should publish a specification saying what those expectations are. If DR DOS isn't up to spec, then so be it. It all comes back to the issue of documentation: If Windows has special needs from DOS, Microsoft should specify what those needs are. When I recently asked you this, you replied that Microsoft can't afford to spend time on such documentation--there are only 300 people working on DOS and Windows. I'm not sympathetic to this argument.
Microsoft says that Windows and MS-DOS are integrally tied, that they're designed to work as a "seamless" team, and so on. Remember, though, that these products are sold separately. For this discussion, Windows is just another DOS utility, like DesqView. Windows' reliance on undocumented DOS functionality should be viewed in the same light as the use of undocumented functions in Excel and WinWord that Microsoft denied so strenuously and for so long.
Yet here, you're not denying that Windows exploits features of DOS that Microsoft refuses to document for third parties. In fact, it is the entire premise for your argument: Windows "relies on a number of very precise behavioral characteristics of MS-DOS which have nothing to do with the Int 21h API." In other words, Windows uses undocumented DOS!
Let me see if I can summarize Microsoft's argument: Windows relies on undocumented DOS. Microsoft can't guarantee that a non-Microsoft DOS will support these undocumented DOS features, so it came up with a test that it knew non-Microsoft DOSs would fail, and that, by encryption, it hoped the vendors of these DOSs would never figure out. Thus, it could detect non-Microsoft versions of DOS and put up a message telling the user that maybe Windows might possibly not work on these untested environments. We're supposed to feel better about this?
The good news is that the forthcoming "Chicago" operating system solves the problems posed by the AARD code. As you say, the AARD code has been removed. More important, Windows 4 will not be sold separately from the underlying operating system. The Windows component of Chicago requires DOS 7.0 or higher, which is the other component of Chicago. This will make life difficult for Microsoft's competitors, but this is an intrinsic difficulty, not a contrived one. Contrast the simplicity of Chicago's up-front check for a required DOS version number with the AARD code's encrypted and obfuscated test for arbitrary aspects of undocumented DOS, and you will see that there is a right way and a wrong way for a product to have system requirements.
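An up-front version requirement of the kind described here amounts to a plain comparison. As a rough sketch, assuming the DOS version is available as a major/minor pair (on real DOS, INT 21h function 30h reports it this way), a requirement of 7.0 or higher could be tested as follows. The names `REQUIRED_DOS` and `meets_dos_requirement` are hypothetical, and this is not Chicago's actual code.

```python
# Illustrative sketch only; names are hypothetical.

# Minimum DOS version as (major, minor), per the "DOS 7.0 or higher" requirement.
REQUIRED_DOS = (7, 0)


def meets_dos_requirement(major: int, minor: int, required=REQUIRED_DOS) -> bool:
    """Return True if the reported DOS version is at least the required one."""
    # Tuples compare element by element, so (7, 10) >= (7, 0) and (6, 22) < (7, 0).
    return (major, minor) >= required
```

Unlike an obscured test, a requirement written this way is stated openly and can be read directly from the code.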
Dear DDJ,
In his article "Examining the Windows AARD Detection Code" (DDJ, September 1993), Andrew Schulman graciously credits me with having "unraveled" part of the AARD code. Although I'm certain that Andrew analyzed the rest of the code independently, I should like to claim prior discovery of the workings of the whole code. In fact, on June 7--8, 1992, I posted a rough but comprehensive description to the Windows/development conference of the British-based bulletin board CIX, identifying all tests and (briefly) raising many of the points taken up in Andrew's article.
Of course, Andrew deserves applause for bringing the AARD code to a wide audience, but since he has relied on some of my work, propriety demands at least a note that my analysis predated his, perhaps with the explanation that my findings had not been as widely disseminated.
I contend personally that Microsoft deserves strong condemnation for the mere existence of the encrypted detection code and a disguised, misleading error message. This is one reason why I told Andrew and others of it soon after its discovery--before knowing that DR DOS ran foul of the tests. In this light, Andrew's omission gives the impression that the AARD code is important only because Novell was inconvenienced.
Geoff Chappell
London, England
Andrew responds: I'm sorry that Geoff feels my article did not sufficiently stress his priority. The article twice stated that I never would have figured out the crucial redirector/case map/FCB test if Geoff hadn't explained it to me. The article gave Geoff's e-mail address. Unfortunately, an additional reference to Geoff as "a master of disassembly" and a plug for his forthcoming book were removed during editing. The second edition of Undocumented DOS, from which portions of the article were extracted, refers constantly to Geoff.
While Geoff told me about the code in April 1992, it seemed like he was telling me about some obscure aspect of WIN.COM and HIMEM.SYS that did not sound very important. The reason I eventually looked at the AARD code was that I had heard vague reports of incompatibilities between Windows and DR DOS, and these didn't make any sense to me. A reporter, Wendy Goldman Rohm, informed me that the FTC was looking into something having to do with Windows 3.1 betas running on DR DOS. I dug out all my beta copies, found the message, and worked backwards from there. I wanted to independently confirm or deny what I knew the FTC was already looking at.
I doubt Novell ever figured out the crucial test involving the redirector, default-case map routine, and location of the first System FCB. The most recent beta of Novell DOS 7.0 I examined still did not contain the necessary minor adjustments to pass the AARD test. Thus, I suspect Novell first learned of the crucial test from the DDJ article. And the article made quite clear that this information came from Geoff.
What I said in the article is accurate: Geoff uncovered the crucial part of the code. Surely it was Novell who, for what it's worth, can claim priority to uncovering the noncrucial parts.
Copyright © 1994, Dr. Dobb's Journal