May 1990/Executable Strings

Features

Executable Strings

James A. Kuzdrall

James A. Kuzdrall has been programming digital computers since 1960 and designing them since 1970. He is an MIT and Northeastern University graduate, and enjoys creating efficient algorithms and using computers extensively in engineering analysis and instrumentation control. He may be reached at Intrel Service Co., Box 1247, Nashua, NH 03061 (603) 883-4815.
Programmers who use C for hardware control soon discover they can't do everything from C. For example, C doesn't provide control over interrupts or access to the processor's internal registers. When processor-dependent facilities are required, programmers must create assembly language modules and link them to the C program. The executable string technique explained here lets you write short assembly programs without learning the details of your C compiler's assembler and linker.
C strings are sequential arrays of bytes (characters) that reside somewhere in memory and are identified by a pointer containing the address of their first byte. It is quite simple in C to make a pointer point to a function rather than a string. Creating a string of bytes that represents a real program isn't that difficult either.
The executable string technique is practical for programs of up to 20 bytes or so. Creating an executable string does require a knowledge of machine language. But since most of these simple programs need only register-to-register transfers and the most elementary addressing modes, only a superficial knowledge of the processor's instruction set is necessary.
For example, consider an executable string that masks (prevents) 6809 hardware interrupts by setting a bit in the condition code (CC) register of the microprocessor. Listing 1 gives the assembly code and its numeric equivalent.
The two-line assembly program is manually translated to the equivalent three bytes of machine code using lookup tables for the 6809 microprocessor. Most lookup tables give machine code in hexadecimal, whereas C strings traditionally accept octal character constants. (You may wish to generate a hex-to-octal lookup table using printf().) The executable string is coded in C as follows:
char *x_str;
x_str= "\032\020\071";
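The hex-to-octal step can also be handed to the computer. The following is a minimal sketch, not part of the original listings: the helper name octal_escape and its interface are hypothetical, and an ANSI compiler is assumed. It takes the machine-code bytes as they appear in a hexadecimal lookup table and writes the text of the corresponding octal escapes.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: write the n bytes in code[] as C octal escapes
   (for example \032\020\071) into out, which must have room for
   4*n + 1 characters. Returns the number of characters written,
   not counting the terminating NUL. */
size_t octal_escape(const unsigned char *code, size_t n, char *out)
{
    size_t i;
    size_t len = 0;

    out[0] = '\0';                      /* covers the n == 0 case */
    for (i = 0; i < n; i++) {
        /* %03o pads to three octal digits, enough for 0..255 */
        len += (size_t)sprintf(out + len, "\\%03o", (unsigned)code[i]);
    }
    return len;
}
```

Given the three bytes 0x1A, 0x10, and 0x39, the helper produces the text \032\020\071, which can be pasted between double quotes as the executable string.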
Calling the string as a function requires a cast. C must be told that x_str is a pointer to a function returning an integer or, for better documentation, a function returning nothing (void). The cast function call is:
/* for non-ANSI compilers */
#define void int
/* set interrupt mask */
(*((void (*)())x_str))();
Since the string is short and won't be used elsewhere, the call can be simplified by eliminating the step of assigning the string address to x_str:
/* set interrupt mask */
(*((void (*)())"\032\020\071"))();
For those who wish to believe C is inscrutable, this function call may well serve as the final proof. You should at least include the code-equivalent comment with executable strings, as shown in the listings.
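One way to tame the cast, offered here as a sketch rather than as part of the original listings, is to hide it behind a typedef and a small wrapper. The names code_fn, mask_irq, and set_irq_mask are hypothetical, and an ANSI compiler is assumed. The mnemonics in the comment are a reading of the three bytes against a 6809 opcode table; Listing 1 remains the authority. The wrapper is meaningful only on a 6809 target whose compiler accepts a data-to-function pointer cast, which the C standard itself does not guarantee.

```c
/* Hypothetical name: pointer to a function that takes no
   arguments and returns nothing. */
typedef void (*code_fn)(void);

/* 6809 machine code, as read from an opcode table:
     1A 10   ORCC #$10   set the IRQ mask bit in CC
     39      RTS         return to the caller          */
static const char mask_irq[] = "\032\020\071";

/* Run the bytes above as a function. This assumes a 6809 target and
   a compiler that permits converting a data pointer to a function
   pointer; do not call it on any other processor. */
void set_irq_mask(void)
{
    ((code_fn)mask_irq)();
}
```

The rest of the program then calls set_irq_mask() like any other C function, and the cast appears in exactly one place, next to its code-equivalent comment.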
This example is one of very few functions that require no parameter exchange with the C program. Where such exchanges are necessary, you must know how your compiler passes single parameters back and forth to functions. Most compilers pass the parameter to the function on the stack just above the return address. Some compilers pass single parameters in one of the processor's registers. A simple function return value is most likely passed in a processor register. Your compiler manual will have this information in a section on assembly language interface.
Executable strings also provide an efficient way to call system functions that require parameters to be passed in registers different from those used by the compiler. The executable string program can shuffle the parameters into the correct registers, coming and going. The system calls seldom involve more than a single pointer or integer parameter.
Consider the drive-select function common to CP/M, FLEX, and MS-DOS. The function requires an input parameter specifying which drive to select and returns the status of that drive. For example, FLEX requires the address of a file control block (fcb) containing the desired drive code. FLEX wants the fcb address in the X register whereas the INTROL-C compiler delivers it in the D register. On return, FLEX sets the condition code register (CC) carry bit to one if there was an error, whereas the INTROL-C compiler would like to see a value (-1 for ERROR) in the D register. Listing 2 shows an executable string that does the swaps.
Executable strings work with any microprocessor or computer that has a linear (continuous) address space. These include the 65xx, 8085, Z80, 68xx, 68xxx, 32xxx, and most minis or mainframes. This trick won't work with the 8086 family, which uses a segmented addressing scheme, unless the code is compiled using a "large" memory model (four-byte pointers).