Columns


Questions & Answers

Function Return Values

Kenneth Pugh


Kenneth Pugh, a principal in Pugh-Killeen Associates, teaches C and C++ language courses for corporations. He is the author of C Language for Programmers and All On C, and was a member of the ANSI C committee. He also does custom C programming for communications, graphics, image databases, and hypertext. His address is 4201 University Dr., Suite 102, Durham, NC 27707. You may fax questions for Ken to (919) 489-5239. Ken also receives email at kpugh@dukemvs.ac.duke.edu (Internet) and on Compuserve 70125,1142.

Q

First of all I want to thank you for a wonderful magazine and informative answers to a lot of good programming questions. I have two easy ones for you.

Why do most of the Standard C library string functions (strcpy, strcat) return a char * when they already modify the string through formal parameters? I have seen:

s2 = strcpy(s2, s1);
a lot less than I have seen:

strcpy(s2, s1);
I remember your saying something about return values screwing up the stack if they are not captured, and this seems like a major offender.

The second question is geared towards C++. I have never seen a constructor or destructor declared with a return value. Are we assuming they return void or int, either of which would be better style to specifically declare?

Thanks in advance for your time and useful information, and keep up the great column.

Andrew Tucker
Seattle Pacific University

A

Let's take your questions in order. I can't remember ever saying that an unused return value causes problems. The value usually gets passed back in a register, and if you do not use it, it simply disappears when the register is reused.

One reason for the char * return value is to allow the functions to be nested. You could specify a series of operations as:

strcpy(a,"");
strcat(a, b);
strcat(a, c);
or as:

strcat(strcat(strcpy (a,""), b), c);
I cannot say I particularly prefer the latter. Of course, if you were using C++ and a String class, the point becomes moot. String classes overload the operator + to mean concatenation. The assignment operator is also usually overloaded to mean string copying. So the set of operations can be expressed as:

a = "" + b + c;
or

a = b + c;
Functions such as fgets also return a character pointer, probably for the same reason. Although this is consistent with the str__ functions, it is inconsistent with the return type of most of the other file functions.
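
The nested form works because strcpy and strcat each return their first argument, the destination pointer. Here is a minimal sketch in C. The helper name build_nested and the caller-supplied buffer are assumptions made for illustration and do not come from the question:

```c
#include <string.h>

/* Hypothetical helper: builds b followed by c in dest with one nested
   expression. The caller must make dest large enough to hold both
   strings plus the terminating null character. */
char *build_nested(char *dest, const char *b, const char *c)
{
    /* strcpy returns dest, which becomes the first argument of the
       inner strcat; that call returns dest again for the outer one. */
    return strcat(strcat(strcpy(dest, ""), b), c);
}
```

Calling build_nested(buf, "abc", "def") with a sufficiently large buf leaves "abcdef" in buf and returns buf itself, which is exactly what you would get from the three separate statements.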

As for your second question, constructors and destructors do not return any value. They are implicitly called in declarations, so there is no opportunity for them to return a value. You might call this a void return value, but that implies to me that the function actually is called normally.

For example, suppose you coded:

#include "string.hpp"
a_function()
    {
    String a_string;
    ...
    }
The constructor String(void) is implicitly called when a_string is declared. At the terminating brace (when a_string goes out of scope), the destructor ~String(void) is implicitly called.

The lack of a return value relates to the problem of what to do if there is an error in the constructor or destructor. For example, the initialization values might be out of range. You cannot return an error code. You have several choices. They include aborting the program, issuing a message to the standard error output, or simply substituting valid values. Just to be complete, let me note that a constructor can be called explicitly as a normal function. For example, if you had a_string = String("abc"), the constructor String(char *) is called and creates an object of class String. However, there is still no way to return an error code. A destructor can also be called explicitly, but the use for that is fairly obscure. (KP)

Check Digits

This is in partial response to your answer to a question by Brendan O'Haire in the November 1992 issue of The C Users Journal. I came across this in reading sometime early in my career, 1958-1964 perhaps. If you would like proofs of the mathematical assertions, I will supply them on request. I remember the source where I read this claimed it was a technique widely-used in the business world. I cannot argue for or against that nor can I cite the original source.

This procedure inserts a check "digit" into a number in such a way that all single transpositions of adjacent digits can be detected. It can actually detect any single transposition of two digits an odd distance apart, but the adjacent transposition is the more probable keying error. The mathematical algorithm used to compute the check digit is actually an algorithm for computing the remainder upon division by 11.

I'll use an example. Suppose the number to be encoded is 34781 and you wish to make the check digit the third digit from the right. Let Y represent the check digit. The new number will have the form 347Y81. Form a sum alternately adding and subtracting the digits in the number:

1 - 8 + Y - 7 + 4 - 3 = -13 + Y
If the numeric part of this, -13 in this case, is not between -10 and 10, repeat the process on its digits:

-13 + Y -> -(3 - 1) + Y = -2 + Y
At this point pick a single digit for Y so the sum is either zero or eleven. Y is 2 in this case. It is possible Y might be 10 in some cases. The Roman numeral X is used as the "digit" in this case. The original number with the check digit inserted is now 347281. When the alternate addition and subtraction is performed:

1 - 8 + 2 - 7 + 4 - 3 = -11 -> -(1 - 1) = 0
then the ultimate value should be zero. If two digits are transposed, 342781 for example, the algorithm produces -1 for the example. A non-zero value indicates the number is not correct.
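
The procedure above can be sketched in C. This is only an illustration under stated assumptions: the function names alt_sum_mod11, check_ok, and check_char are made up, the 64-byte working buffer is an arbitrary limit, and the % operator takes the place of repeating the alternating sum by hand (both yield the same remainder modulo 11):

```c
#include <string.h>

/* Alternating sum of the digits of s, starting with + at the rightmost
   digit, reduced to the range 0..10. The character 'X' stands for the
   "digit" ten. Returns -1 if s contains any other character. */
int alt_sum_mod11(const char *s)
{
    int sum = 0;
    int sign = 1;
    size_t i = strlen(s);

    while (i-- > 0) {
        int d;
        if (s[i] >= '0' && s[i] <= '9')
            d = s[i] - '0';
        else if (s[i] == 'X')
            d = 10;
        else
            return -1;
        sum += sign * d;
        sign = -sign;
    }
    return ((sum % 11) + 11) % 11;   /* force a non-negative remainder */
}

/* A number with its check digit in place is accepted when the
   alternating sum is a multiple of 11. */
int check_ok(const char *s)
{
    return alt_sum_mod11(s) == 0;
}

/* Returns the check character ('0'..'9' or 'X') that must be inserted
   into s so that it becomes the pos-th character from the right
   (pos = 1 is the rightmost place). Returns '\0' on bad input. */
char check_char(const char *s, size_t pos)
{
    char tmp[64];                    /* arbitrary limit for this sketch */
    size_t len = strlen(s);
    size_t at;
    int r, y;

    if (pos < 1 || pos > len + 1 || len + 2 > sizeof tmp)
        return '\0';
    at = len + 1 - pos;              /* index from the left in the new string */
    memcpy(tmp, s, at);
    tmp[at] = '0';                   /* placeholder for the check digit */
    memcpy(tmp + at + 1, s + at, len - at + 1);   /* copies the '\0' too */
    r = alt_sum_mod11(tmp);
    if (r < 0)
        return '\0';
    /* Odd places are added and even places are subtracted, so solve
       r + y = 0 or r - y = 0 (mod 11) accordingly. */
    y = (pos % 2 == 1) ? (11 - r) % 11 : r;
    return (y == 10) ? 'X' : (char)('0' + y);
}
```

For the example in the text, check_char("34781", 3) yields '2', check_ok("347281") is true, and the transposed number 342781 gives an alternating sum of 10, which is -1 modulo 11, so it is rejected.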

The mathematical argument depends on the fact that any integer, in decimal notation, can be represented as a sum of two terms, A+B, where A is the alternating sum and difference of the digits and B is an exact multiple of 11. This latter fact is derived from two assertions which can be proved by mathematical induction.

(10 ** (2k + 1)) + 1 and (10 ** (2k)) - 1, k = 0, 1, 2, ...
are each divisible by 11. All the terms in B have a factor of one or the other of these forms.

Walter Beck

I read with some interest the letter from Brendan O'Haire in your column in the November CUJ regarding check digits. I'm no mathematician nor a theorist, so I don't know the reasons why these things are done the way they are. However, I have recently run into a couple of algorithms for calculation of check digits in real-world programs, which differ significantly from the simple algorithm you gave in your response.

The first one I tripped over was in doing some database conversion for a client. Part of the data were bank routing numbers, used for Electronic Funds Transfers (these numbers are, apparently, assigned by the Federal Reserve Bank). They are eight-digit numbers with a ninth digit appended, which is the check digit. The algorithm I was given to use for calculating these numbers is shown in the code in Listing 1.

Also, I am currently involved in attempting to understand the specifications from HL-7. (HL-7 is a standard for data messages to be used in Health Care settings.) Some of the data types specified in the standard are short text strings with a check digit calculated either as mod 10 or mod 11. After some digging I was given the following description of the algorithm:

Assume you have an identifier equal to 12345. Take the digits in the odd positions, counting from the right, i.e., 531, and multiply this number by 2 to get 1062. Take the digits in the even positions, again counting from the right, i.e., 42, and append these to the 1062 to get 421062. Add all of these six digits together to get 15. Subtract this number from the next highest multiple of 10, i.e., 20 - 15, to get 5. The Mod10 check digit is 5. The Mod10 check digit for 401 is 0, for 9999, it's 4, for 9999999, it's 7.
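
The description above can be followed step by step in C. The sketch below is written independently of Listing 2 and makes its own assumptions: the function name mod10_check_digit is invented, the identifier is passed as a string of decimal digits, and identifiers with more than 18 odd-position digits are rejected so that the doubled number fits in an unsigned long long:

```c
#include <string.h>

/* Returns the Mod10 check digit (0..9) for the decimal string id, or -1
   if id is empty, too long for this sketch, or contains a non-digit. */
int mod10_check_digit(const char *id)
{
    unsigned long long odd = 0;   /* odd-position digits, read right to left */
    int sum = 0;                  /* running sum of digits */
    int odd_count = 0;
    size_t len = strlen(id);
    size_t i;

    if (len == 0)
        return -1;
    for (i = 0; i < len; i++) {
        char ch = id[len - 1 - i];        /* i = 0 is the rightmost digit */
        if (ch < '0' || ch > '9')
            return -1;
        if (i % 2 == 0) {                 /* positions 1, 3, 5, ... */
            if (++odd_count > 18)
                return -1;                /* doubling could overflow */
            odd = odd * 10 + (unsigned long long)(ch - '0');
        } else {
            sum += ch - '0';              /* positions 2, 4, 6, ... */
        }
    }
    odd *= 2;                             /* e.g., 531 becomes 1062 */
    while (odd > 0) {                     /* add the digits of the product */
        sum += (int)(odd % 10);
        odd /= 10;
    }
    /* Distance to the next multiple of 10; 0 if sum is already one. */
    return (10 - sum % 10) % 10;
}
```

With this sketch, the four identifiers quoted in the description (12345, 401, 9999, and 9999999) produce 5, 0, 4, and 7 respectively.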

Listing 2 is my (possibly simplistic, and certainly not terribly efficient) implementation of this algorithm.

I always enjoy reading your column. Keep up the good work!

Fred Smith

Thank you both for your contributions. I had an encoding course about twenty years ago, but the algorithms I recall were mainly for correction of binary values. They help in detecting and/or correcting binary digit errors (1 for 0 or 0 for 1). The parity bit on memory bytes is an example of a single error detection code. It cannot detect a transposition error (switching two binary digits), since the number of 1 bits stays the same. Cyclic redundancy codes (CRC) can be used to detect, and in some schemes correct, multiple errors, such as might be found on magnetic disks.

A single check digit for a decimal number can be used to detect single transposition errors or a single digit error. Since the digit is usually included as part of the number, it would be interesting to see if any of these algorithms fail if the check digit is transposed with a normal digit. When I used the check digit, it was set off as a separate character with a hyphen, so that mistake was less likely. Too bad telephone numbers don't include a check digit. The old rotary switches couldn't handle them, but computerized switches should be able to. I wonder if the saving in the amount of time spent dialing wrong numbers would compensate for the time spent dialing an extra digit. (KP)

gotos

In response to the letter from Raymond Lutz in the November CUJ, I would like to point out the following useful feature of goto. I use gotos to consolidate the error-handling code near the bottom of the routine. This is particularly useful when the normal path includes most (or all) of the error-handling code (as in the case of cleanup code). See Listing 3 for an example.

If I had used return instead of goto, I would have had three copies of the free code for buf1, two copies of the free code for buf2, and four different exit points in the routine. This way I have only one copy of the free code and one exit point, which makes the code easier to maintain.
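
For readers without the listing at hand, the pattern can be sketched as follows. This is not Listing 3. The routine name process, its parameters, and the "work" it performs are assumptions chosen only to give two buffers and three places where the routine can fail:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical routine that needs two scratch buffers of n bytes each.
   Returns 0 on success and -1 on any failure. */
int process(const char *input, size_t n)
{
    char *buf1 = NULL;
    char *buf2 = NULL;
    int status = -1;              /* assume failure until the work is done */

    buf1 = malloc(n);
    if (buf1 == NULL)
        goto done;

    buf2 = malloc(n);
    if (buf2 == NULL)
        goto done;

    if (strlen(input) >= n)       /* a third thing that can go wrong */
        goto done;

    strcpy(buf1, input);          /* stand-in for the real work */
    strcpy(buf2, buf1);
    status = 0;

done:
    /* The only copy of the cleanup code; free(NULL) does nothing, so
       buffers that were never allocated need no special case. */
    free(buf2);
    free(buf1);
    return status;
}
```

Written with early returns, the same routine would need free(buf1) in three places and free(buf2) in two, which matches the count given in the letter.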

In brief, I believe that gotos, just like handguns, can be misused/abused but that is not a reason to avoid them altogether. You just have to make sure you use the right tool for the right job.

James Brown
Orem, UT

I agree with your sentiments regarding gotos, as I have stated before in this column. Your example of memory allocation is of particular interest. A major program that I am helping out with has rather large dynamic memory requirements. It also needs to exit gracefully if memory runs out. The compiler vendor's allocation algorithms have created some interesting problems in this application. Memory was getting fragmented more than expected, so some tracing capabilities were needed.

Instead of using malloc and free, I made up two wrapper functions called memory_allocate(unsigned int size, char * name) and memory_free(void *address, char * name). The name parameter is a string that describes the variable that is being allocated or freed. It is used for debugging and error reporting purposes. I can't give you the exact code, as it is for a proprietary project, but the pseudo-code looks something like Figure 1. (KP)
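
A minimal sketch of such wrappers in C might look like the following. Only the two function signatures come from the text above. Everything else is an assumption made for illustration and is not the project's code: the trace format, the use of stderr, the outstanding counter, the helper memory_outstanding, and the decision to call exit when memory runs out:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: blocks allocated but not yet freed. */
static long outstanding = 0;

/* Allocates size bytes and traces the request. On failure it reports
   which variable could not be allocated and exits. A request for zero
   bytes may also be reported as a failure on some implementations. */
void *memory_allocate(unsigned int size, char *name)
{
    void *address = malloc(size);

    if (address == NULL) {
        fprintf(stderr, "Out of memory: %u bytes for %s\n", size, name);
        exit(EXIT_FAILURE);   /* a real program would shut down more carefully */
    }
    outstanding++;
    fprintf(stderr, "allocate %p: %u bytes for %s\n", address, size, name);
    return address;
}

/* Frees a block obtained from memory_allocate and traces the call. */
void memory_free(void *address, char *name)
{
    if (address == NULL) {
        fprintf(stderr, "Attempt to free null pointer for %s\n", name);
        return;
    }
    fprintf(stderr, "free %p: %s\n", address, name);
    free(address);
    outstanding--;
}

/* Hypothetical helper: a nonzero result at exit suggests a leak. */
long memory_outstanding(void)
{
    return outstanding;
}
```

Because every allocation carries a name, the trace shows which variables are live when memory runs out, and a nonzero count from memory_outstanding at shutdown points to a block that was never freed.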