Matt, author of Windows Internals, is a programmer at Nu-Mega specializing in debuggers and file formats. He can be reached on CompuServe at 71774,362.
Third-party tools designed to replace the standard development environment tools can greatly enhance the productivity of DOS and Windows programmers. Yet there can be pitfalls when you replace standard machinery. For example, if you're a Borland C++ developer using the standard Turbo Debugger for Windows (TDW), you can reasonably expect full technical support from Borland when a debugging problem arises. However, if you've replaced TDW with, say, Symantec's Multiscope debugger, whom do you call in the event of a problem? At best, the product support will be fragmented. At worst, both companies may point fingers at each other, leaving you stuck in the middle.
Also, the executable file format or debug specification that's in vogue today may be obsolete tomorrow. If you commit to using a third-party tool that doesn't keep up with the latest industry standards, you're stuck. Therefore, there has to be a compelling reason for a user to switch to a new set of tools. A speed increase of 10 percent or a file size reduction of 2 percent may not be enough to convince you to give up the security of the programs you already use. Instead, a third-party tool not only has to provide compatibility with your current tools, it also needs to offer significant advantages. I've put OPTLINK for Windows version 4.01, from SLR Systems, to the test to see if it meets these criteria.
OPTLINK for Windows is intended as a drop-in replacement for Microsoft's LINK.EXE and Borland's TLINK.EXE. OPTLINK runs from the DOS command line, and generates DOS executables, as well as DLLs and executables for Windows and OS/2 1.x. It does not generate OS/2 2.x LX format files, nor the PE format files used by Win32 operating systems such as Windows NT. However, SLR has indicated that it intends to support PE format files soon.
OPTLINK performs all the standard optimizations that LINK and TLINK perform, including far call translation, segment packing, and fixup chaining. Far call translation occurs when the linker sees a far call instruction to a procedure that's in the same code segment. For example, given a call of the form call far ptr xxxx:yyyy where xxxx is the same as the current code segment, the linker can replace that one instruction with:
    NOP
    PUSH CS
    CALL NEAR PTR YYYY
This second sequence is faster to execute because it avoids a costly segment register load. It also eliminates the need for a fixup in the .EXE or .DLL file, which shrinks the file and speeds up load time.
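To make the byte-level effect concrete, here is a minimal Python sketch of the rewrite. It assumes the segment half of the far pointer is already available as a plain value that can be compared with the current segment; a real linker works from fixup records in the .OBJ instead. The function name translate_far_call is hypothetical and is not part of any linker.

```python
import struct

FAR_CALL = 0x9A    # CALL FAR ptr16:16 (opcode, offset word, segment word)
NOP = 0x90
PUSH_CS = 0x0E
NEAR_CALL = 0xE8   # CALL rel16 (opcode, signed 16-bit displacement)


def translate_far_call(code: bytearray, at: int, current_segment: int) -> bool:
    """Rewrite an intra-segment far call at code[at] in place.

    Returns True if the 5-byte far call was replaced by the 5-byte
    NOP / PUSH CS / CALL NEAR sequence, False if it was left alone.
    """
    if code[at] != FAR_CALL:
        return False
    offset, segment = struct.unpack_from("<HH", code, at + 1)
    if segment != current_segment:
        return False  # target lives in another segment; keep the far call
    # A near call is relative to the address of the following instruction.
    next_ip = at + 5
    displacement = (offset - next_ip) & 0xFFFF
    code[at:at + 5] = bytes([NOP, PUSH_CS, NEAR_CALL]) + struct.pack("<H", displacement)
    return True
```

Both sequences are five bytes long, so nothing else in the segment has to move. PUSH CS leaves the same return segment on the stack that the far call would have pushed, so the callee's far return still works.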
Segment packing occurs when the linker takes segments of the same class and concatenates them together. For instance, if you were using the medium or large memory models, and had files A.C, B.C, and C.C, the resulting code segments in the .OBJs would be A_TEXT, B_TEXT, and C_TEXT. Without segment packing, the linker would produce three separate code segments in the .EXE. While not really a problem for DOS executables, in Windows this wastes space in the file and forces Windows to use more selectors when it loads the program. In addition, segment packing affords the linker additional opportunities to perform far call translations, saving even more space.
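The grouping step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: segments are described by hypothetical (name, class, size) tuples, the only constraint is the 64K ceiling of a 16-bit segment, and alignment between members is ignored. It is not a description of how OPTLINK, LINK, or TLINK implement packing.

```python
SEGMENT_LIMIT = 0x10000  # a 16-bit segment can hold at most 64K


def pack_segments(segments, limit=SEGMENT_LIMIT):
    """Concatenate same-class segments, in link order, into packed groups.

    segments: list of (name, class_name, size) tuples.
    Returns a list of dicts with keys "class", "names", and "size".
    """
    packed = []
    open_group = {}  # class name -> group still accepting members
    for name, class_name, size in segments:
        group = open_group.get(class_name)
        if group is None or group["size"] + size > limit:
            # Start a new physical segment when the class is new
            # or the current group would overflow the limit.
            group = {"class": class_name, "names": [], "size": 0}
            packed.append(group)
            open_group[class_name] = group
        group["names"].append(name)
        group["size"] += size
    return packed
```

Fed A_TEXT, B_TEXT, and C_TEXT as three CODE-class segments, the sketch returns a single group, which corresponds to one code segment and one selector instead of three.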
Fixup chaining is a method of compressing the load-time relocation information in NE format files. (NE files are Windows and OS/2 1.x files.) To give an example, consider a program that makes 20 calls to the Windows BeginPaint() API. Without fixup chaining there would be 20 fixups referring to BeginPaint() in the .EXE. Each fixup is eight bytes in length, so the total space used for relocations is 160 bytes. A linker that does fixup chaining (such as OPTLINK) can get away with only putting one fixup record in the file. How's this? The NE format has a clever method of letting fixups be applied in a linked-list fashion. The head of the list is pointed to by the single relocation record. At the spot in the segment where the address of BeginPaint() will be plugged in is a 16-bit offset to another place where BeginPaint()'s address also needs to be applied. When the operating system loader brings the file into memory, it just visits each node of the chain and leaves behind a copy of the necessary information (the target address). Not only does fixup chaining save space by eliminating redundant fixup records, it can also speed up load times significantly. For more information on fixup chaining (as well as segment packing), see my article "Liposuction Your Corpulent Executables and Remove Excess Fat" (Microsoft Systems Journal, July 1993).
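The loader's walk over such a chain can be sketched as follows. The sketch makes several assumptions that are not spelled out above: the fixup patches a 16-bit value, each link word holds the segment offset of the next site, and the chain ends with the value 0xFFFF. The name apply_chained_fixup is hypothetical.

```python
import struct

CHAIN_END = 0xFFFF  # assumed end-of-chain marker


def apply_chained_fixup(segment: bytearray, head: int, target: int) -> int:
    """Patch every site on a fixup chain with target; return the site count.

    head is the offset stored in the single relocation record. Each site
    initially holds the offset of the next site that needs the same value.
    """
    count = 0
    at = head
    while at != CHAIN_END:
        if count > len(segment):
            raise ValueError("fixup chain appears to loop")
        (next_at,) = struct.unpack_from("<H", segment, at)  # read the link
        struct.pack_into("<H", segment, at, target)         # leave the address behind
        at = next_at
        count += 1
    return count
```

With 20 call sites on one chain, the file carries a single eight-byte record rather than 160 bytes of records, and the loader resolves the import once instead of 20 times.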
In addition to the main program (OPTLINKS.EXE), the package comes with a few other programs. OPTIMP is a superset of the IMPLIBs shipped with the Borland and Microsoft development environments. STRIPDEB removes the debug information from the end of an executable, similar to Borland's TDSTRIP and Microsoft's CVPACK /STRIP. FIXLIB accepts a Borland-produced .LIB file format and modifies the dictionary so that LINK, TLINK, and OPTLINK can all use it. According to the SLR folks, the dictionary in Borland .LIBs is incorrect at times, and both LINK and OPTLINK are unable to use it. I personally love FIXLIB because I can now use Borland's IMPORT.LIB with LINK and OPTLINK. IMPORT.LIB has all the exported Windows functions, not just those documented in Microsoft's LIBW.LIB.
Before Borland became a major presence in the C/C++ market, OPTLINK was targeted at users of Microsoft C who wanted smaller .EXEs and faster linking. However, SLR now appears to be targeting users of Borland's TLINK. The reason can be summarized in two words: debug capacity. As all too many users of Borland's TLINK 5.x know, when building a program with debugging information, TLINK can run out of memory amazingly early. This is especially true with C++ programs. The use of class hierarchies leads to much more debugging information than the equivalent C code would produce. Borland users who have stuck with TLINK are getting increasingly frustrated with turning on debugging information in just select modules to prevent TLINK from running out of memory. OPTLINK has a much greater capacity when processing Borland's debugging information, so it has a major inroad with Borland's customer base. In fact, Borland representatives have themselves recommended OPTLINK when pressured about TLINK's capacity problems.
One of TLINK's attempts to deal with the sheer volume of debugging information was to introduce symbol table compression (using the /Vt switch). A compressed symbol table is in the same format as a non-compressed symbol table. The compression that occurs is more a matter of eliminating duplicate type information. For instance, if you defined a struct in an .H file and included that file in three separate .CPP files, the type information describing the structure will show up three times in an uncompressed symbol table. By using /Vt with TLINK, there would only be one copy of the struct's type information.
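The idea of merging duplicate type information can be illustrated with a small Python sketch. It does not model Borland's actual record layout: each type is represented here by a hypothetical hashable description, whereas real type records refer to one another by index and have to be compared structurally. The function name merge_type_tables is likewise an assumption.

```python
def merge_type_tables(modules):
    """Merge per-module type lists, keeping one copy of each distinct type.

    modules: list of lists of hashable type descriptions.
    Returns (merged, remaps), where remaps[m][i] is the index in merged
    of type i from module m, so symbol records can be repointed.
    """
    merged = []
    index_of = {}
    remaps = []
    for types in modules:
        remap = []
        for type_record in types:
            if type_record not in index_of:
                index_of[type_record] = len(merged)
                merged.append(type_record)
            remap.append(index_of[type_record])
        remaps.append(remap)
    return merged, remaps
```

For a struct defined in one .H file and included by three .CPP files, the three per-module entries all map to the same merged index, and only one copy of the description is written.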
OPTLINK performs debug information compression implicitly as part of the link process. In fact, OPTLINK does a better job of eliminating redundant information than TLINK /Vt does. I determined this by linking a couple of programs with both TLINK /Vt and OPTLINK. To see the resulting debug information, I ran TDUMP -v -ex on the two executable files. I then compared each debug information subsection table in the two .EXEs. The detailed results are breathtakingly dull, so I'll spare you a recitation of them here. The short summary is that OPTLINK was more aggressive in eliminating types, member definitions, class definitions, and so on. Table 1 shows the debug information sizes for the two files. With one minor exception, I noted that OPTLINK fully supports the Borland debug specification, down to inclusion of the browser symbols and code coverage tables. The minor exception is that OPTLINK doesn't output browser information for local symbols.
Another compelling reason for Windows programmers to consider OPTLINK is that it produces significantly smaller executable files and DLLs for Windows. The primary size reduction comes from OPTLINK's ability to chain fixups as noted earlier. In linking the OWL WCHESS.EXE sample program, OPTLINK produces 503 fixups as compared to 3412 by TLINK. At eight bytes per fixup, that's a savings of over 22K, and more than 10 percent of the .EXE's size. Needless to say, the OPTLINK version will load faster as well.
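The arithmetic behind those figures is easy to check with a short Python sketch. The fixup counts and the eight-byte record size come from the text. The .EXE size used for the percentage is the TLINK no-debug figure from Table 1, on the assumption that Table 1 describes the same WCHESS build (the matching fixup counts suggest it does).

```python
FIXUP_RECORD_SIZE = 8  # bytes per relocation record


def fixup_savings(fixups_before: int, fixups_after: int) -> int:
    """Bytes saved in the file by emitting fewer relocation records."""
    return (fixups_before - fixups_after) * FIXUP_RECORD_SIZE


saved_bytes = fixup_savings(3412, 503)
saved_kb = saved_bytes / 1024
# Assumption: 189046 is the size of the TLINK-built WCHESS.EXE without debug info.
saved_percent = 100.0 * saved_bytes / 189046
```

Under those assumptions the savings come to 23,272 bytes, a little under 23K and roughly 12 percent of the TLINK-built file.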
Since OPTLINK defaults to producing Windows files that will only run in protected mode, it makes all entries in the NE entry table FIXED, even if the function is in a MOVEABLE segment. By using FIXED entries instead of MOVEABLE, OPTLINK can eliminate three bytes of overhead per entry. TLINK also defaults to PROTMODE operation, but generates MOVEABLE entries if the function is in a MOVEABLE segment. Another space savings offered by OPTLINK includes a smaller DOS stub if you let it provide a default stub.
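The three-byte difference follows from the two entry layouts. The Python sketch below encodes one entry of each kind using the commonly documented NE layout, which should be treated as an assumption here: a FIXED entry is a flags byte plus a 16-bit offset, while a MOVEABLE entry also carries an INT 3Fh instruction and a segment number. Bundle headers are omitted, and the function names are hypothetical.

```python
import struct

INT_3F = b"\xCD\x3F"  # INT 3Fh, embedded in every MOVEABLE entry


def fixed_entry(flags: int, offset: int) -> bytes:
    """Encode a FIXED entry: flags byte, then offset word (3 bytes)."""
    return struct.pack("<BH", flags, offset)


def moveable_entry(flags: int, segment: int, offset: int) -> bytes:
    """Encode a MOVEABLE entry: flags, INT 3Fh, segment number, offset (6 bytes)."""
    return struct.pack("<B", flags) + INT_3F + struct.pack("<BH", segment, offset)


def bytes_saved(entry_count: int) -> int:
    """Space saved by emitting FIXED rather than MOVEABLE entries."""
    per_entry = len(moveable_entry(1, 1, 0)) - len(fixed_entry(1, 0))
    return entry_count * per_entry
```

The savings are modest for a typical program, but a DLL that exports several hundred functions gives up about a kilobyte to the larger entries.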
Despite all the benefits OPTLINK offers, there are a few rough edges if you're a TLINK user. OPTLINK was originally developed as a Microsoft LINK replacement; Borland support was added later. As such, it doesn't appear that OPTLINK has been "burned in" as much for TLINK replacement as it has for LINK replacement. For example, in a linker response file, it's legal for a program to specify only the base file name for the target to be built (for instance, "FOO", rather than "FOO.EXE"). When I passed OPTLINK such a response file and told it to build a DLL, it created the file with a .EXE extension, rather than .DLL. The bit indicating that the file was a DLL was set inside the NE file, but the file's extension was wrong. TLINK handles this situation correctly.
Another quirk is OPTLINK's response file handling. I'm in the habit of invoking Borland's command-line compiler (BCC.EXE) with just a .C or .CPP file, and letting it supply the defaults when invoking TLINK. To make BCC work with OPTLINK, I made a copy of OPTLINKS.EXE called TLINK.EXE, and supplied an appropriate /TLINK mode OPTLINKS.CFG file. For a test, I ran BCC A.C, where A.C was a minimal DOS program. When using Borland's TLINK, the linker accepted the output from BCC without a peep. When using the renamed OPTLINKS, it prompted me for both library files and a .DEF file (à la LINK). Pressing the Enter key at each prompt yielded an .EXE file, but this prompting is annoying when it happens on every build. Since the program was a DOS program, OPTLINK shouldn't have asked for a .DEF file (TLINK doesn't).
Another problem I encountered with TLINK compatibility had to do with default .DEF files for Windows .EXEs. If you don't specify a .DEF file when using TLINK, it uses a set of defaults, including a 5K program stack. While OPTLINK will also use defaults, it has a nasty habit of not specifying any stack at all for the generated .EXE. To circumvent this problem, I tried putting a /STACK:5120 directive in the OPTLINKS.CFG file. While this worked for Windows programs, it also gave DOS programs a 5K stack. Borland-produced DOS programs start out with an initial small stack, and at run time switch the SS:SP to a larger stack. Creating a DOS .EXE with an initial 5K stack was certainly not the behavior I desired from OPTLINK. The point of all this is that although SLR has put on a snazzy coat of TLINK paint, some areas appear to be lightly tested. In addition, OPTLINK seems to want to revert to LINK compatibility mode whenever it gets a chance.
In the past, OPTLINK's primary target audience was Microsoft C and MASM developers who needed faster link times and increased capacity. With LINK 5.50 from the Visual C++ package, Microsoft has significantly narrowed both gaps. However, OPTLINK still holds some advantages for Microsoft users.
To a certain extent, debug information capacity is less of a problem with Microsoft tools than with the corresponding tools offered by Borland. The reason is that the linker doesn't have to do all the work of massaging the debug information into its final form. When producing CodeView-style information, OPTLINK emits a preliminary version of the debug information that's relatively easy for the linker to process. Afterwards, OPTLINK invokes CVPACK.EXE, which takes care of merging all the debug information into one unit and eliminating duplicate information. Interestingly, OPTLINK doesn't complain if it can't execute CVPACK. If you have older tools that only recognize the CodeView 3.0 debug specification, OPTLINK can produce this format as well as the default CodeView 4.0 debug information.
In the speed category, OPTLINK was just slightly faster than LINK on my test executable, but not enough to get excited about; see Table 2. In all fairness, the test .EXE wasn't large enough to test the virtual memory systems of either OPTLINK or LINK. On large industrial-grade applications, SLR claims some users see performance gains of up to 50 percent over LINK.
Regarding the parts of the .EXE used by the operating system, OPTLINK produces NE files that aren't dramatically different from what LINK produces. LINK chains fixups, so you won't see the dramatic space savings that you would when comparing OPTLINK to TLINK. In fact, OPTLINK appears to produce fixups identical to LINK's, although in a different order. Two other NE tables where there's a difference between the two linkers are the resident and non-resident names tables (where the names of your exported functions live). LINK puts entries in these tables in a seemingly random order, while OPTLINK sorts the names in the reverse order of the entry table (for example, 15, 14, 13, and so on).
Other differences between OPTLINK and LINK-produced Windows executables include the entry table. Like TLINK, LINK defaults to PROTMODE, yet still generates MOVEABLE entries where appropriate. OPTLINK always appears to generate the smaller FIXED entries, thereby saving three bytes per entry. In addition, some segments in NE files are a few bytes larger in the OPTLINK-created executable than in the LINK-produced .EXE. While this may just be an effect of rounding up segment sizes, it could potentially be the source of different behaviors when comparing the two linkers. For example, you might have a fence-post error and try to read one byte past the end of a data structure at the end of a segment. The LINK-produced program could GP fault, while the OPTLINK-produced program might not.
Because SLR Systems is an underdog in a market dominated by the likes of Borland and Microsoft, OPTLINK has added some unique features to distance itself from the pack. One such feature, resource binding at link time, performs the .RES binding and flag setting operations that you normally use RC.EXE for. However, since OPTLINK doesn't actually compile .RC files, you can't get rid of RC just yet. Presumably the reason for integrating the resource binding into the linker is increased build speed. Your MAKE program only needs to invoke one program when building the executable target, rather than two. Also, it's conceivable that OPTLINK could gain additional speed by writing the segments and resources into the executable in their final positions. When using RC after the link step, the executable's segments could be written to the file twice: once by the linker, and again later by RC. As a final note on resource binding, OPTLINK has the somewhat odd (but harmless) habit of looking in the LIB= directory for the .RES file when it doesn't find it in the default directory.
Another advantage of OPTLINK over TLINK or LINK isn't really a feature at all. When linking Windows files, OPTLINK defaults to 16-byte alignment, while LINK and TLINK default to 512-byte alignment. All segments and resources in an NE file start at a file offset that's a multiple of the alignment value (with 512-byte alignment: 512, 1024, 1536, and so on). When the size of a segment or resource isn't a multiple of the alignment value, the linker needs to pad the file with wasted space until it gets to the next alignment value multiple. For a more detailed description, see the previously mentioned Microsoft Systems Journal article. In short, when using the default settings, Windows .EXEs and .DLLs linked with OPTLINK are often significantly smaller than when linked with either LINK or TLINK. For example, BCW.EXE from Borland C++ 3.1 would lose around 115K in wasted file space if linked with OPTLINK (not counting additional savings from fixup chaining). Quattro Pro for Windows 1.0 would lose around 145K in the same manner. Although you can get the same effect with LINK or TLINK, OPTLINK's choice of default behaviors and values seems more finely tuned.
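The cost of a coarse alignment value is simple to estimate. The Python sketch below rounds each segment or resource up to the next alignment multiple and totals the padding. The sizes in the usage example are hypothetical and are not measurements of any real .EXE.

```python
def padded_size(size: int, alignment: int) -> int:
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


def wasted_bytes(sizes, alignment: int) -> int:
    """Total padding the linker must write for the given item sizes."""
    return sum(padded_size(size, alignment) - size for size in sizes)


# Hypothetical sizes, in bytes, of three segments or resources.
example_sizes = [1000, 300, 5000]
waste_at_512 = wasted_bytes(example_sizes, 512)
waste_at_16 = wasted_bytes(example_sizes, 16)
```

On average each item wastes about half the alignment value, so a file with many small resources, such as icons and string tables, pays the most at 512-byte alignment.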
OPTLINK also has a smattering of smaller features that distinguish it from its competition. The /FIXDS option tells OPTLINK to modify the prologues of exported functions to load DS from SS upon entry. Microsoft has had this option (/GA) in its compiler since Version 7.0, and Borland has always had "smart callbacks," so this feature is probably only of use to users of Microsoft C 6.0 and earlier. The /XREF switch tells OPTLINK to generate a cross-reference of public symbols in the .MAP file. Each line of the cross reference shows what source module the symbol was defined in, and what modules reference the symbol. While this is a nice feature, I did encounter a probable six-legged creature. If you initialize a variable as part of its declaration (for example, HWND HMainWnd=0), you won't see the declaring module in the list of referencing modules. The /REORDERSEGMENTS option gives OPTLINK the leeway to rearrange segments in order to try to combine more segments into one (segment packing).
Developers who use the Borland Integrated Development Environment won't be able to use OPTLINK without reverting to make files or using the transfer system. This may change in Borland C++ 4.0, however. OPTLINK will integrate into the Visual C++ IDE by simply renaming OPTLINKS.EXE to LINK.EXE.
Another thing to watch for is shifting debug formats. Both Borland and Microsoft have gone through at least three major changes to the debug specification in the last couple of years. Rumor has it that another change to Borland's 16-bit debug specification is in the works. If you lock yourself into using a special feature of OPTLINK, and a new compiler comes out afterwards, you're at the mercy of SLR to get an update out quickly. Fortunately, SLR seems to be good about keeping OPTLINK up-to-date with the latest compiler offerings.
If you're in the Microsoft camp, and you work on small- to medium-sized projects and don't need any of the unique features of OPTLINK, it may not be particularly beneficial. However, if you work on larger projects or can use some of OPTLINK's unique features, OPTLINK is probably worth the money.
For Borland users, the decision is a little easier. For a small investment in time to set it up, you'll get more debug capacity, and smaller executables. The only people who it may not be suitable for are diehard IDE users and programmers who cower at the sight of a command-line switch (OPTLINK features eleven). If you have the need for a high-performance, high-end linker, OPTLINK may be just what you're looking for.
Table 1: OPTLINK versus TLINK.
                        OPTLINK 4.01    TLINK 5.1
File size (w/debug)     297570          421547
File size (no debug)    151632          189046
Link time (w/debug)     6.5 sec         9.1 sec
Number of fixups        503             3412
Table 2: OPTLINK versus LINK.
                        OPTLINK 4.01    LINK 5.1
File size (w/debug)     389296          390484
File size (no debug)    161776          162697
Link time (w/debug)     5.7 sec         6.3 sec
Number of fixups        446             446
OPTLINK for Windows SLR Systems 1622 North Main Street Butler, PA 16001 412-282-0864 $350.00
Copyright © 1993, Dr. Dobb's Journal