Know Thy Machine
More for amusement than anything else, I occasionally teach a 10-week Java-programming class for the University of California Extension. Since I give out a lot of homework, cruel taskmaster that I am, I end up looking at a lot of code some good, some not. All of my students are working professionals, some with decades of programming experience. Lately Ive noticed a disturbing phenomena, illustrated with the following example.
In one of last semesters homework assignments, I told the class that they should constrain the size of a collection to the maximum value of an int. I saw the following code several times:
int value; //... if( value <= Integer.MAX_VALUE ) // do somethingThe problem with this statement is that it doesnt work its always true. The difficulty is in the hardware, which uses twos-complement arithmetic. (If it doesnt, Java requires the JVM to emulate twos-complement arithmetic, so everything Im about to say still applies.) In a twos-complement system, the most significant bit of a positive number is always zero, and the most significant bit of a negative number is always one. Given the 32-bit int mandated by the Java specification, this means that Integer.MAX_VALUE has the value 0x7fffffff. If you add one to this value, youll end up with 0x80000000. (All bits except the high bit are zero.) This number is, interestingly enough, Integer.MIN_VALUE. The foregoing if statement never fails because all negative numbers are certainly less than Integer.MAX_VALUE. The worst case of this problem is:
for( int i = 0; i <= Integer.MAX_VALUE, ++i ) // do somethingwhich is an infinite loop.
The point of this discussion is actually not to look at the odd behavior of arithmetic in a computer, but to address a more fundamental programmers (not programming) issue. In spite of some programmers beliefs, Java is not the Alfred E. Neuman programming environment. (MAD Magazine readers are familiar with Alfreds motto: What, me worry?) Dont think about the machine, the VM will hide it from you. Thats nonsense, of course. Our programs are running on real hardware, which has real limitations in how it works, and we have to be aware of these limitations to use the language effectively. Remember, a computer program is really a bunch of electrons moving across a sheet of carefully dirtied sand and aluminum. If you dont know how the machine works at the binary level, then you are bound to get bitten eventually by an expression that is intuitively correct but which yields nonsense on a real computer.
If youve never done it (or havent done it lately), you should read a good algorithms book and a good book that discusses how machines work at the hardware level. Programming is a craft, and unless you know the fundamentals of your craft, you cant do the best possible work.
Allen Holub
Senior Editor