Features


A Flexible dprintf Function

Arkin Asaf


Arkin Asaf has eight years of programming experience, the last two exclusive to the C language. He gets his kicks from writing small but useful utilities, and can be reached at 47 Berri St., Herzlya, Israel 46456.

About dprintf

Many applications could benefit from a specialized version of printf. Unfortunately, printf is too large and complex to re-invent, and it's often easier to get the source to a complex 3-D graphics library than to the pedestrian functions in a compiler's standard library.

This article presents source code for dprintf, a clone of the printf function. With minimal effort you can modify and adapt dprintf to suit your needs. You will find dprintf mostly portable and expandable: it extends easily to accommodate newly devised formats, it can print to almost any output destination, and it follows the ANSI standard.

Defining dprintf

dprintf parallels printf in calling convention and operation except that it uses a pointer to a function as its first parameter. This pointer designates a function, resembling putchar(), which performs all output. Specifying the output function gives dprintf unlimited choice of output destinations. The pointer's definition and function prototypes are:

#include <stdarg.h>   /* for va_list */

typedef int (*dprintf_fp)(int);

int dprintf(dprintf_fp Func, const char *Format, ...);
int vdprintf(dprintf_fp Func, const char *Format, va_list Args);

Variations On dprintf

Like most printf implementations, dprintf accepts a variable-length argument list and passes a pointer to this list, along with pointers to the output function and format string, to a subordinate function, in this case vdprintf. Having so little to do, dprintf tends to be rather small: as short as four statements.
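The wrapper just described can be sketched in a few lines. This is only an illustration, not Listing 1 or Listing 2: the functions are named my_dprintf and my_vdprintf (hypothetical names, chosen because many current systems already declare a different dprintf and vdprintf in <stdio.h>), and my_vdprintf is a stand-in that borrows the standard vsnprintf so that the sketch runs on its own. CaptureChar is a hypothetical output function that appends to a fixed buffer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

typedef int (*dprintf_fp)(int);

/* Stand-in for the article's vdprintf (assumption): formats with the
   standard vsnprintf, then feeds each character to Func. Output longer
   than the local buffer is truncated. */
int my_vdprintf(dprintf_fp Func, const char *Format, va_list Args)
{
    char Buffer[256];
    int  Length, Index;

    assert(Func != NULL && Format != NULL);
    Length = vsnprintf(Buffer, sizeof Buffer, Format, Args);
    if (Length < 0)
        return EOF;
    if (Length >= (int)sizeof Buffer)
        Length = (int)sizeof Buffer - 1;
    for (Index = 0; Index < Length; ++Index)
        if (Func((unsigned char)Buffer[Index]) == EOF)
            return EOF;
    return Length;
}

/* The wrapper: collect the variable arguments and hand them on. */
int my_dprintf(dprintf_fp Func, const char *Format, ...)
{
    va_list Args;
    int     Count;

    va_start(Args, Format);
    Count = my_vdprintf(Func, Format, Args);
    va_end(Args);
    return Count;
}

/* Hypothetical output function: appends to a fixed buffer and
   reports EOF when the buffer is full. */
static char   Captured[64];
static size_t CapturedLen = 0;

int CaptureChar(int Ch)
{
    if (CapturedLen + 1 >= sizeof Captured)
        return EOF;
    Captured[CapturedLen++] = (char)Ch;
    Captured[CapturedLen] = '\0';
    return Ch;
}
```

Because putchar has the same shape as dprintf_fp, a call such as my_dprintf(putchar, "x = %d\n", 5) sends the same output to the screen instead.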

Since vdprintf accepts fixed arguments and is quite long, I advise that dprintf absorb all non-standard arguments, setting them for vdprintf's convenience. For example, you could create a function aprintf which allocates a memory block in which to store output by revising dprintf as in Listing 1. Note that vdprintf remains unchanged.

vdprintf cannot be insulated from all changes: new formats will require changes to vdprintf. You can easily create relatively portable versions of dprintf that support binary and Roman numerals, file and path names, and printer control codes. At a minimum, you must modify vdprintf to suit the %p pointer format to your system, since this format varies considerably between system architectures.

The Workings Of vdprintf

This section and the following two describe the internal workings of vdprintf (Listing 2) step by step.

vdprintf outputs characters using a programmer-supplied function designated by a pointer. This function returns EOF upon output error. Rather than passing a pointer and return value through three levels of functions, a static pointer (OutFunc) and a longjmp buffer (dputc_buf, for quick return) are defined. Consequently, vdprintf's first actions involve assigning OutFunc a pointer and initializing dputc_buf.
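A minimal sketch of this arrangement follows. OutFunc and dputc_buf are the names used above; everything else (dputc, PutString, EmitString, OutCnt as a plain static int, and the two demonstration output functions) is assumed for illustration and need not match Listing 2.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>

typedef int (*dprintf_fp)(int);

static dprintf_fp OutFunc;    /* output function for the current call */
static jmp_buf    dputc_buf;  /* return point for output errors       */
static int        OutCnt;     /* characters printed so far            */

/* Innermost output routine: on error, jump straight back to the
   top level instead of returning EOF through every caller. */
static void dputc(int Ch)
{
    if (OutFunc(Ch) == EOF)
        longjmp(dputc_buf, 1);
    ++OutCnt;
}

/* A middle level that never needs to check for errors. */
static void PutString(const char *Str)
{
    while (*Str != '\0')
        dputc((unsigned char)*Str++);
}

/* Top level: set up the statics, then print. */
int EmitString(dprintf_fp Func, const char *Str)
{
    assert(Func != NULL && Str != NULL);
    OutFunc = Func;
    OutCnt  = 0;
    if (setjmp(dputc_buf) != 0)
        return EOF;           /* reached only through longjmp */
    PutString(Str);
    return OutCnt;
}

/* Hypothetical output functions for demonstration. */
int AlwaysFail(int Ch) { (void)Ch; return EOF; }
int Discard(int Ch)    { return Ch; }
```

The price of this convenience is that the statics make the function non-reentrant: an output function must not itself call back into the same printing code.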

With that done, the printing process begins: vdprintf scans the format string a character at a time, interpreting % specifications and echoing all other characters to the output.

Following the % format sign come the flags: zero or more from a set of five (-, +, space, 0 and #). Successive flags are parsed from the format string one by one. strchr() matches each potential flag against FlagsList, returning either NULL (not a flag) or a pointer to the flag in FlagsList. Simple pointer subtraction and bit shifting then produce a bit mask, which is ORed onto Flags. Later on, vdprintf will AND Flags with Mask macros to establish whether or not certain flags have been specified.
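The flag scan can be sketched as follows. FlagsList and Flags are the names used above; the FLAG_ macro names, the order of the characters in FlagsList, and the ParseFlags and FlagsOf functions are assumptions made for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mask macros, one bit per position in FlagsList. */
#define FLAG_MINUS (1u << 0)
#define FLAG_PLUS  (1u << 1)
#define FLAG_SPACE (1u << 2)
#define FLAG_ZERO  (1u << 3)
#define FLAG_HASH  (1u << 4)

static const char FlagsList[] = "-+ 0#";

/* Reads the flags at *Format, advances *Format past them, and
   returns them as a bit mask. */
unsigned ParseFlags(const char **Format)
{
    unsigned    Flags = 0;
    const char *Match;

    assert(Format != NULL && *Format != NULL);
    /* strchr also matches the terminating '\0', so test for it first. */
    while (**Format != '\0' &&
           (Match = strchr(FlagsList, **Format)) != NULL) {
        /* The flag's offset in FlagsList selects its bit. */
        Flags |= 1u << (Match - FlagsList);
        ++*Format;
    }
    return Flags;
}

/* Convenience wrapper for experiments: the flags that begin Spec. */
unsigned FlagsOf(const char *Spec)
{
    return ParseFlags(&Spec);
}
```

A later test such as (Flags & FLAG_MINUS) then tells whether left justification was requested.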

With all flags read, vdprintf gathers the width and precision parameters. The width is processed first (zero is assumed if it is absent): it is either a number built from the digits in the format string or, if an asterisk replaces the digits, an int consumed from the argument list. A leading zero is treated as a flag rather than as part of the width. The precision follows a period, so it may begin with a zero, but is otherwise read in the same way. Note that giving the period without a value (zero assumed by default) differs from omitting the precision and period altogether (minus one assumed). For example, "%5.s" implies a zero-length string, whereas "%5s" implies a string of five or more characters.

Before the format letter comes the argument size: default, short (h), long (l), or long double (L). The long size applies only to integers, the long double size only to floating point values. The default may be int, double or any specialized type, such as char for the %c format. The short type serves only to maintain some compatibility with scanf(), since the compiler automatically promotes short arguments to int and float arguments to double. In effect, the h specifier is meaningless.
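The effect of the promotions is easiest to see in the code that fetches an argument. The sketch below uses a hypothetical function, FetchSigned, which takes the size letter (or 0 for the default) followed by the value. Note that a short must be fetched as an int, because that is how the compiler passed it.

```c
#include <assert.h>
#include <stdarg.h>

/* Fetches one signed integer argument according to the size letter:
   'l' for long, 'h' for short, 0 for the default int. */
long FetchSigned(int Size, ...)
{
    va_list Args;
    long    Value;

    assert(Size == 0 || Size == 'h' || Size == 'l');
    va_start(Args, Size);
    switch (Size) {
    case 'l':
        Value = va_arg(Args, long);
        break;
    case 'h':
        /* A short arrives as an int; va_arg(Args, short) would be
           an error. */
        Value = (short)va_arg(Args, int);
        break;
    default:
        Value = va_arg(Args, int);
        break;
    }
    va_end(Args);
    return Value;
}
```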

Finally, vdprintf reads the format letter, which determines how to generate the output. Most formats are provided by auxiliary functions in order to keep vdprintf short. If an output error or incorrect format specification is encountered at any point, vdprintf returns EOF; if all goes well, vdprintf returns the number of characters successfully printed.

vdprintf's Auxiliary Functions

Five auxiliary functions assist vdprintf: PrintDecimal, PrintRadix, PrintFloat, ToInteger and Print. The first three transform long ints, long unsigneds, and long doubles, respectively, into printable strings of digits. ToInteger performs the digit conversion on their behalf, and Print does the actual printing.

PrintDecimal (%d or %i formats) produces signed decimals. It dissociates the received long int into prefix and value, the prefix holding the sign. Once ToInteger stringizes the value, Print outputs both the prefix and the value.

PrintRadix yields long unsigned decimals (%u format), octals (%o), hexadecimals (%x or %X) and pointers (%p). Since these values are always positive, the prefix, obtained in the variant format (# flag present), denotes the value's type: nothing for decimals, 0 for octals and 0x for hexadecimals. (Note that hexadecimal letters are in the same case as the format letter.)

As presented in Listing 2, vdprintf's PrintRadix prints pointers as eight-digit hexadecimals (upper case letters), prefixed with @ in the variant format. Different system architectures impose different pointer representations, both in memory and in writing. You may need to modify not only PrintRadix, but also vdprintf's switch construct, which assumes pointers remain intact when cast to long unsigneds. Not all systems guarantee this.
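The choice of radix and prefix described in the last two paragraphs reduces to a pair of small lookups. RadixOf, RadixPrefix and FLAG_HASH are hypothetical names, and the 0X prefix for %X is an assumption borrowed from standard printf.

```c
#include <assert.h>
#include <string.h>

#define FLAG_HASH (1u << 4)   /* hypothetical mask for the # flag */

/* Radix for each unsigned format letter. */
int RadixOf(int Letter)
{
    assert(Letter != '\0' && strchr("uoxXp", Letter) != NULL);
    switch (Letter) {
    case 'o':
        return 8;
    case 'x':
    case 'X':
    case 'p':
        return 16;
    default:
        return 10;
    }
}

/* Prefix for the variant format; empty when # is absent. */
const char *RadixPrefix(int Letter, unsigned Flags)
{
    if (!(Flags & FLAG_HASH))
        return "";
    switch (Letter) {
    case 'o': return "0";
    case 'x': return "0x";
    case 'X': return "0X";   /* assumption: follows the letter's case */
    case 'p': return "@";
    default:  return "";     /* decimals take no prefix */
    }
}
```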

Before printing, numeric values must be converted into characters. ToInteger turns long unsigneds into null-terminated strings of digits in a given radix. A numeral must have no fewer than precision digits; if necessary, zeros precede the value. ToInteger stores the string in a malloc'ed memory block, whose address is returned by reference through the formal parameter char **Buffer. The string's length is returned by value (terminating null excluded).
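A conversion along these lines is sketched below. Only the char **Buffer parameter and the returned length come from the description above; the remaining parameters, the treatment of a negative precision as one, and the helper DigitsEqual are assumptions made for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Converts Value to digits in Radix, padded with zeros to at least
   Precision digits. Stores a malloc'ed string in *Buffer and returns
   its length, or -1 if no memory is available. */
int ToInteger(unsigned long Value, int Radix, int Precision,
              int UpperCase, char **Buffer)
{
    static const char Lower[] = "0123456789abcdef";
    static const char Upper[] = "0123456789ABCDEF";
    const char *Digits = UpperCase ? Upper : Lower;
    char Temp[sizeof(unsigned long) * CHAR_BIT]; /* enough for radix 2 */
    int  Count = 0, Length, Index;

    assert(Radix >= 2 && Radix <= 16 && Buffer != NULL);
    if (Precision < 0)
        Precision = 1;        /* assumption: unspecified means one digit */
    while (Value != 0) {      /* least significant digit first */
        Temp[Count++] = Digits[Value % (unsigned long)Radix];
        Value /= (unsigned long)Radix;
    }
    Length = Count > Precision ? Count : Precision;
    *Buffer = malloc((size_t)Length + 1);
    if (*Buffer == NULL)
        return -1;
    memset(*Buffer, '0', (size_t)(Length - Count)); /* leading zeros */
    for (Index = 0; Index < Count; ++Index)
        (*Buffer)[Length - 1 - Index] = Temp[Index];
    (*Buffer)[Length] = '\0';
    return Length;
}

/* Helper for experiments: converts, compares and frees the block. */
int DigitsEqual(unsigned long Value, int Radix, int Precision,
                const char *Expected)
{
    char *Text;
    int   Length = ToInteger(Value, Radix, Precision, 0, &Text);
    int   Same;

    if (Length < 0)
        return 0;
    Same = (size_t)Length == strlen(Expected) &&
           strcmp(Text, Expected) == 0;
    free(Text);
    return Same;
}
```

The caller owns the block and must free it once the digits have been printed, as DigitsEqual does.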

Print completes the auxiliary functions, printing the prefix and value in accordance with Flags. Normally, spaces are inserted before the prefix, right-justifying it and the value within their field (the width parameter sets the field). The 0 flag states that zeros come between the prefix and value to fill the field whole. The - flag appends spaces at the end, left-justifying the prefix and value. Regardless of the style used, no more than Maximum number of characters are printed (ignoring negative maximums). Finally, OutCnt increments by the total number of characters printed.
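The three justification styles can be sketched as follows. The parameter list, the FLAG_ macros, and the Capture and Demo helpers are hypothetical. The sketch assumes that Maximum limits the characters taken from the value, as a precision does for %s, and it omits error handling and the OutCnt update in order to keep the padding logic in view.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef int (*dprintf_fp)(int);

#define FLAG_MINUS (1u << 0)   /* hypothetical mask macros */
#define FLAG_ZERO  (1u << 3)

static void Repeat(dprintf_fp Out, int Ch, int Count)
{
    while (Count-- > 0)
        Out(Ch);
}

/* Prints Prefix and Value in a field of Width characters and
   returns the number of characters sent to Out. */
int Print(dprintf_fp Out, const char *Prefix, const char *Value,
          int Width, int Maximum, unsigned Flags)
{
    int PrefixLen = (int)strlen(Prefix);
    int ValueLen  = (int)strlen(Value);
    int Pad, Index;

    assert(Out != NULL);
    if (Maximum >= 0 && ValueLen > Maximum)
        ValueLen = Maximum;              /* negative maximums ignored */
    Pad = Width - PrefixLen - ValueLen;
    if (Pad < 0)
        Pad = 0;

    if (!(Flags & (FLAG_MINUS | FLAG_ZERO)))
        Repeat(Out, ' ', Pad);           /* default: right-justify */
    for (Index = 0; Index < PrefixLen; ++Index)
        Out((unsigned char)Prefix[Index]);
    if ((Flags & FLAG_ZERO) && !(Flags & FLAG_MINUS))
        Repeat(Out, '0', Pad);           /* zeros after the prefix */
    for (Index = 0; Index < ValueLen; ++Index)
        Out((unsigned char)Value[Index]);
    if (Flags & FLAG_MINUS)
        Repeat(Out, ' ', Pad);           /* left-justify */
    return PrefixLen + ValueLen + Pad;
}

/* Hypothetical capture buffer and wrapper for experiments. */
static char Line[64];
static int  LineLen;

static int Capture(int Ch)
{
    if (LineLen >= (int)sizeof Line - 1)
        return EOF;
    Line[LineLen++] = (char)Ch;
    Line[LineLen] = '\0';
    return Ch;
}

const char *Demo(const char *Prefix, const char *Value,
                 int Width, int Maximum, unsigned Flags)
{
    LineLen = 0;
    Line[0] = '\0';
    Print(Capture, Prefix, Value, Width, Maximum, Flags);
    return Line;
}
```

When both the - and 0 flags are present, this sketch lets - win, as standard printf does.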

Outputting Floating Point

PrintFloat starts with the prefix: negative values have a - prefix; positive values have either nothing (default), a space (space flag present) or a + prefix (plus flag present, space flag present or not).

At the far right of the representation, all formats but %f require an exponent. The exponent results from dividing and multiplying the floating point number by ten until the resulting value is between zero and one. Divisions add to the exponent count; multiplications subtract from it. The %g format forces vdprintf to choose between the standard (%f) and exponential (%e) formats, whichever is shorter. If the exponent lies between -3 and precision (inclusive), the standard format governs (the exponent is cleared after it has been deducted from the precision). Otherwise, vdprintf selects the exponential format. Either way, the precision loses one digit off its end.

In the algorithm's trickiest manipulation, vdprintf must split a floating point number into integer and fractional parts. The integer part is obtained by casting the floating point number to an integer and then back to floating point. This is done twice. The first cast yields the integer part. For the second, the part of the fraction to be printed is moved left of the decimal point, and the result is again cast to an integer and back. Both the integer and fraction parts can then be stringized.

The integer part prints first. In the %g format, trailing zeros are removed from the fraction. In some cases (e.g., zero precision) there may be no fraction. If a fraction follows or the # flag appears, a period follows the integer. The exponent, where the format calls for one, comes last. An e (in the same case as the format letter) precedes the exponent, which always contains a sign and at least two digits.

During the process a long double was cast to long int and back. On some systems, long floating point numbers may fail to convert properly, if at all (they may even raise an exception). A simple but imperfect solution uses the floor function to obtain the integer part of a floating point number. This fix also requires changes to parts of PrintFloat. (See Listing 3.) The solution is imperfect because floor acts only on doubles, not long doubles. Depending on your system and your demands on it, either version of PrintFloat should work.