Columns


Questions & Answers

More On Storing Data In .EXEs

Ken Pugh


Kenneth Pugh, a principal in Pugh-Killeen Associates, teaches C language courses for corporations. He is the author of C Language for Programmers and All On C, and was a member on the ANSI C committee. He also does custom C programming for communications, graphics, image databases, and hypertext. His address is 4201 University Dr., Suite 102, Durham, NC 27707. You may fax questions for Ken to (919) 493-4390. When you hear the answering message, press the * button on your telephone. Ken also receives email at kpugh@dukemvs.ac.duke.edu (Internet).

Q

Can you explain to me (or at least give me a book reference) how to define and integrate into a program the function keys (F1, F2, ...), as well as "CTRL + KEY" and "ALT + KEY"? I looked up many books on C, and found only one (The Waite Group's Microsoft C Programming for the IBM PC) briefly explaining how to define these keys, but not how to integrate them and make them work inside a program.

Richard R. Jodoin, M.D.
Quebec, Canada

A

The book reference for the key values is the IBM Technical Reference Manual, Section Five, System and BIOS. You use the getch (or getche functions of Microsoft, rather than getchar (see Listing 1) . For "special" keys, these functions return a 0 value followed by another value. IBM calls these "extended codes."

Table 1 shows a brief excerpt from the table. You can look at the output of the program for any other keys you might want to use. It's interesting to note that ctrl-left-cursor and ctrl-right-cursor have extended codes, but not ctrl-up-cursor and ctrl-down-cursor, although the extended codes have plenty of room left for those and other omitted key combinations). IBM listed some guidelines for key usage in he manual. It did not provide extended codes for those keys not in the list.

Q

The writer is a comparative tyro in "C", so it may not be surprising when he meets up with a problem. However, here it is:

The program Setamstr.Com (Listing 2) is to run under MS-DOS v3.0, and is to set the AMSTRAD DMP3160 printer. Compiled with Mix CC v2.5.3, and linked with LINKER \s switch, it runs successfully. Compiled and linked with Microsoft Quick C 1.01, it throws the error message and exits. This is a great puzzle to me, and a disappointment.

Can you help?

A. E. Molony,
Dunedin, NZ

A

This is a simple one. Change the open call from:

if ((fp = fopen("PRN:", "W"))
   == NUL)
to:

if ((fp = fopen("PRN", "w"))
   == NUL)
The mode values are case sensitive. "W" is not for writing; "w" is. Also, the colon is not needed in the filename.

Reader's Replies

Comment Style

I would like to remark about a letter sent by Kim Tsang of Hong Kong to your CUJ Q?/A! column (December 1990, pp. 91). A product I stumbled across advertised in CUJ adds end comments to closing functions and statements during reformatting. (The product is C-Clearly.) The comment added consists of the function/statement keyword for the block followed by the first identifier on the line of the opening statement. Assume the following code:

if (switch_flag_on)
{
   code goes here
}
After running it through the reformatter the code appears as:

if (switch_flag_on)
{
   formatted code goes here
}/*if switch_flag_on */
Note the added end comment. After end comments are inserted, I precede them with the word END as in the following:

} /* END if switch_flag_on */
When I use the vi editor, the command for global replacement to add the word END to all end comments is:

:s/}\/\*/} \/\* END/g
Some would consider the end-comment utility trivial, but it comes in handy when you have cascades of closing braces.

I would also like to put forth the following observation regarding compiler updates. I am using Turbo C++ 1.0. Recently I saw a blip on a local BBS regarding an update for this compiler. I called Borland and questioned tech support. I was told there was indeed an update going around on BBSs, but I could also get the update on disk from Borland. I asked if there were any differences in the updates and was told yes. Before I could query about the differences I was forwarded to customer support where I dutifully handed over $15 and received eight new disks. What really burns me is that whenever Borland introduces a new product I get six trees worth of mail asking me to buy, yet for bug fixes - not a word.

Phil Pistone
Chicago, IL

Storing Data in .EXE Files

This letter is in response to Ken Yerves' question about storing data in an .EXE file in the December 1990 issue. There is an alternative, and easier, method to store data in an .EXE file rather than in a static data area. Try storing your data at the end of the .EXE file. The second and third integers of the .EXE header give the information needed to find the data at the end of the file. The third integer gives the number of 512-byte blocks used by the .EXE file, the second integer gives the actual number of bytes in the last block. With this information you can get the position of the data with the equation

DataPosition = 512*(header[2]-1
+ Header [1]+1)
The functions in
Listing 3 are used to write and read the data stored at the end of the file.

To attach the Data Segment to the .EXE after compilation, use the DOS copy command

C0PY exefile.exe/B+exefile.dat
 newexe.exe
or just write it there on the first execution.

This should solve your problem, but be very careful to write only after the end of the executable code. For more information on this technique see the May 1990 edition of Inside Turbo C, from which this code is adapted.

Also, when compiling the program make sure to turn off all debugging options, as they also tack information onto the end of the .EXE and can cause problems.

Jeff Jones
San Diego, CA

I'd like to make a suggeston to Mr. Yerves, who wrote in the December 1990 issue of CUJ asking for suggestions to help him locate the data segment in an executable file. Since all he's trying to do is store configuration information in the .EXE file, why not just store it at the end of the file where you always know where it is? When an .EXE file is loaded, information in the header tells the loader what parts of the file are used. If you add to the end of the file and don't change the header, MS-DOS will ignore the additional information when it loads the program. Here's the way I store configuration information in .EXE files:

I define a struct called config to contain the configuration information. I also define a unique (random) string of bytes, eight bytes long, which is used as a "magic word" to tell me that the configuration information is at the end of a file. To write configuration information to the file:

1. Fill out the config struct.

2. Open the executable file in binary mode for reading and writing.

3. lseek to the end of the file. I don't have to know how long the file is because lseek can set the file position relative to the end.

4. Read in the last eight bytes of the file and compare them to the magic word for the file. If they are the same, I know there is configuration information in the file, so I use lseek to set the file position to ((sizeof struct config) + 8) relative to the end of the file. This sets the file position so the next write will be at the first byte of the existing configuration information. If the magic words did not match, I know there is no configuration information in the .EXE file, so I set the file position to the first byte after the end of the executable. New configuration information will be appended to the file.

5. I then write the new configuration information, followed by the eight bytes of magic word. The new information will either overwrite the old or be appended to the file if there was no old information.

To read the configuration information back, the process is exactly the same, except that the file need be opened only for reading. In step 4 I return a "no configuration information" if the magic words don't match.

I don't know whether this technique would work if you employ one of those .EXE file compression utilities like LZH, and I don't think configuration information would survive a trip through EXEPACK, but one can't have everything.

I'm sure that Mr. Yerves will find this a simple and elegant solution to the problem of storing configuration information in .EXE files. I hope that your readers can make use of the technique.

Brad Brown
Weston, Ontario

I enjoy your columns in CUJ. I noted Ken Yerves' question in the December 1990 issue concerning .EXE file segments.

Inside Microsoft C, May 1990 had an .EXE configuration program. You might pass it along to Mr. Yerves.

George W. Smith
Chatham, NJ

The referenced article is "Storing Data in an .EXE File". The technique is to simply copy the .EXE file and the data file using the/B switch as:

COPY MYPROG.EXE/B+MYDATA.DAT
NEWPROG. EXE
The third word in the .EXE file tells how many 512-byte pages there are. The second word gives the length of the last page. Simply seek to:

 (third_word - 1) * 512 + second_word + 1
Jeff Jones's letter gives his version for routines that will allow you to store and retrieve the data.

The article also notes that virus checking programs may detect that you are altering an .EXE file. You could use the technique to keep a checksum to see if the .EXE file has been altered by a virus.

The October issue of TECH Specialist had an article by Hugh Staley entitled "Incorporating SetUp Params in .EXEs". The article suggests putting a string into a static variable that can be located using a simple search. Following this variable you should declare variables you wish to be settable. You can hardwire that position into your configuration program. (KP)

char and long Pointers

In response to Hans-Gabriel Ridder's question, the editor's note had:

((long*) &ptr) ++
This should be

(*((long**)&ptr))++
/* get **char, convert to **long,
   deref to *long (lvalue), then
   inc. */
This could be put into a macro:

#define ptr_type(type, ptr) \
(*((type**)&ptr))
See
Listing 4 for an example.

Stephen D. Williams
SDW Systems

Portable Structure Data

In response to R. Palmer Benedict's question, I ran into this problem when developing an application interface builder system. The following method illustrates how the alignment bytes can be detected and compensated for. It can be used to pack and unpack structures in a safe way on any machine (any normal machine anyway). This algorithm works by testing the alignment characteristics of the built-in scalar types. It does this just once and stores the results. Routines are then used to access the elements of the structure in an array-like manner. The only requirement is that the structure and the code that analyzes padding be compiled with the same alignment. Note that I communicate the format of the structure by a simple string containing one character for each field representing a data type along with an optional count for arrays (see Listing 5) .

As for putting structures on disk, I have in the past used the above routines to pack individual elements of a structure in and out of a character buffer, which is read/written in one call. For machine portability, the usual method is to choose one format that is the standard. Portable code checks its own characteristics and converts accordingly. This may involve either compile or runtime checking. The format chosen may coincide with a particular machine, common with migrating applications (DOS/Intel to UNIX/Motorola). It may also be chosen as a greatest/least common denominator. Good examples of this would be Oracle's Number format, Internet/Sun's XDR, and OSI's ASN.1.

Of course the other alternative, which is much easier to debug, is to convert to some form of text, fixed or otherwise. If we had an absolutely optimum text<->number conversion function(s), this would be much more viable. (Any fantastic hacks out there?)

Notice that these methods allow the creation of library routines that do not have knowledge of specific structure formats. These methods allow datadriven code for screen generators, general database systems, language writing and interfacing, etc. I have used them for all of the above.

Stephen D. Williams
SDW Systems

Thanks for your input. That's a good way to solve the incompatible data representation problem.

It's interesting to note that the UNIX machines running network software have a series of functions that permit transparent translation of integer values that are in TCP/IP message headers. On some machines, these are simple assignments; on others, they perform the byte swapping. (KP)