Dr. Dobb's Journal September 2001
When Myst came out, I was hooked, which surprised me because I don't play many computer games. But I've been to all the worlds, solved all the puzzles, and delivered the white page to Atrus. When the sequel, Riven, was released, I wasted countless hours visiting the islands, freeing Katherine, and finding and returning the linking book to Atrus.
Myst III has been released, and the only place I visited with it was the Wal-Mart service desk to return it. I installed Myst III on four computers: three running Windows 98 and one running Windows ME. (I dual-boot most of the machines here so I can run things that Linux won't do. Like Myst.) All my machines have the minimum required hardware and software. One machine blows up during installation. Two others can't render the sound. One can't sense when the CD-ROM is inserted.
The lady at Wal-Mart's service desk knows about the problems. She's been taking them back as fast as they sell them.
This is a clear case of someone releasing a product without testing it. Maybe some computer configuration somewhere installs and runs Myst III (probably a Macintosh), and maybe the Myst testers all have that machine, but I don't. Neither do most of Wal-Mart's customers.
Why is Version 1.0 of everything always fatally flawed? Why do users have to waste time learning that the thing is broken and will remain so until they waste more time downloading repairs? Why should we be treated that way when we cough up our hard-earned dough for what ought to be a working product?
Where's the lesson? I think it's here. Open-source developers and distributors of free software need to change their tune. Forget all the arm-waving evangelism about free societies, communities, and all that. Open-source software is routinely and intentionally released with big bugs in it. Its users get it for free and fix the bugs. That's part of the game. That's the puzzle to be solved.
You like to play games? You like to solve puzzles? Forget Myst. Become a Linux programmer.
Unlike other platforms, Linux offers no choices to a C++ programmer when it comes to compilers; you use gcc (the GNU Compiler Collection) or you don't write C++ code, simple as that.
Let me be clear about one thing. Gcc is not a product of the Linux open-source community. It is developed and distributed by the Free Software Foundation, whose GNU project also produced most of the command-line tools. The FSF also manages the development of GNOME, one of the graphical user interface APIs and desktops available to Linux users and programmers. These programs are not specifically for Linux; they run on multiple UNIX-like operating systems. Linux depends on open-source program modules from other places, too; the X11 and XFree86 libraries are two examples. Linux itself is only the kernel of an operating system.
On June 18, 2001, the Free Software Foundation released gcc Version 3.0, a major release intended to bring the C++ compiler components in line with the ANSI/ISO standard. This announcement reminded me of the old saying often seen hanging on the cubicle walls of overworked programmers. "Good, fast, cheap: Pick two." Let's discuss those attributes as they apply to gcc.
The FSF tells us the word "free" in their philosophy means "freedom" as much as or more than it means "at no cost," in which case, they chose the wrong word. Software is not free in that sense of the word. Software does not have rights and freedoms. People do. So-called free software only empowers those freedoms. (The open-source movement chose the word "open" because, I'm told, they did not want to distance Linux from companies that wanted to use it to generate revenues, and they wanted users to get used to the notion of paying for software.)
Gcc is definitely cheap. It's free in the sense of the word that means I didn't have to pay anything for it. As a famous columnist, I rarely have to pay for software (except at Wal-Mart, where they don't know me), but this time I didn't even have to ask for it. I just went online and downloaded it. No one else has to pay for it either. Consequently, gcc meets one of the three criteria for a project posted on our walls. It's cheap.
The cubicle bromide's "fast" could have two meanings, too: fast as in speedy and fast as in timely. The saying refers to the timely nature of a project. Whether it's speedy is a part of whether it's good, which comes later.
Some open sourcers fervently believe that their development model will eventually displace closed-source proprietary development, and they rarely miss an opportunity to preach their gospel to the uninformed masses. But they always ignore an important part of any development effort: the schedule, which determines whether you can have it "fast."
A largely volunteer, spare-time, unpaid, unorganized staff can rarely deliver a complex software product in a timely fashion. It's difficult enough when the programmers are being paid and properly managed. That notwithstanding, the C++ Standards committee approved and published the C++ Standard document in July 1998. Commercial compiler developers who tracked the progress of the committee had released reasonably compliant compilers and libraries even before the publication of the specification. Commercial compilers were, indeed, fast. They were finished before it was time to start. It has taken three years for the open-source development model to do what the proprietary vendors did in negative time.
The gcc Standard C++ project was, therefore, not fast, which means that to deliver our choice of two of those cubicle qualities, it had better be good.
Gcc has a long-standing reputation as being one of the better C compilers available. I am more interested in its performance as a Standard C++ compiler. Some time ago I offered my opinion that gcc's template performance might be less than optimal. I came to that opinion after observing the differences in compile times and binary sizes between programs compiled with the legacy library and those that used the then-experimental Standard C++ Library, which comprises mostly template classes. I have been awaiting the official release of gcc 3.0 to see if improvements in the compiler and library have narrowed those differences. It was released yesterday, and I downloaded it and installed it.
From my simple and small benchmark, which consists of a "Hello World" program, I conclude that the original performance problem still exists. Earlier I suggested that programmers might want to avoid the Standard C++ Library and use the legacy iostream.h library and its brethren. That, unfortunately, is not an option when you upgrade to gcc 3.0. Whereas gcc 2.95.2 emulated the Standard C++ Library by pretending to recognize <iostream> and the other standard headers and by ignoring references to std::, gcc 3.0 does just the opposite. Not only does it implement the Standard Library, it lets you think you are using the legacy library: you can include <iostream.h> et al. and omit the std:: namespace qualifier (probably via a using directive), and the compiler quietly substitutes the Standard C++ Library classes for their simpler ancestors.
Like it or not, you get the new library with gcc 3.0.
Example 1 is the program I used for the benchmark. Unable to use gcc 3.0 to build this program with the legacy iostream libraries, I used the older compiler, which is still on my computer, and then gcc 3.0 to build the benchmarks. Here is the command line I used both times, except that for the older compiler I specified the path to gcc ahead of the command name:
gcc -o tester tester.cpp -lstdc++
I'm running a faster processor these days, so the build-time difference is not as dramatic as what I reported when I ran the experiment on a slower PC with the mingw32 port of gcc. There is, however, a noticeable difference in the time it takes the two compilers to build this tiny program. The old compiler snaps back to the command-line prompt instantly. The new compiler takes about three seconds to complete the build. Binary sizes are significantly different, too. The old compiler builds a 14,423-byte binary. The new one builds a behemoth of 123,400 bytes.
Consequently, I joined the gcc bug-reporting mailing list and am waiting to be accepted into their membership so I can report my experience.
So, is gcc 3.0 good? Well, it's better than the competition. It might not be better than its predecessor, though. Time will tell. I'll stick with 3.0 until something makes me want to change, at which time I'll retreat to my previous platform, gcc 2.95.2 and its legacy libraries, which are all three: good, fast, and cheap. Kind of makes you Mysty-eyed, doesn't it?
If you are going to develop GUI applications for Linux (or FreeBSD, or UNIX itself), you probably want them to run under KDE and to have its look and feel. KDE is one of two competing GUI desktop environments, the other being GNOME. KDE is winning the competition, in my opinion, because it looks better, feels better, is based on a C++ API rather than a C API, and seems to be progressing faster and more smoothly.
Applications built with the KDE development API run properly on the GNOME desktop if the GNOME user has the QT run-time libraries installed. GNOME applications run under KDE, too, if the GTK libraries are installed. About the only difference between the two run-time environments is the look and feel of the applications' title bars, menus, and so on. KDE-developed applications seem to have better standard dialog boxes, but that is the kind of thing that changes as each player plays catch-up with the other.
Time was, the open-source community denigrated KDE because it is built upon QT, a C++ application framework that was not free to developers. For a while, Red Hat refused to distribute KDE with its Linux distribution because of the QT license. (The Red Hat distribution itself is free if you want to download that much stuff, but to get support, a CD-ROM, and a pretty red box, you have to buy and register the distribution.) GNOME itself was a reaction to KDE's unfree development situation. Red Hat's anti-KDE position launched Mandrake as a spin-off distribution for those users who want more choices.
The QT controversy kind of went away when its vendor, Trolltech, put the library under the GPL for noncommercial applications. If you don't charge for your Linux program, you don't have to pay a license fee to develop and release it under QT. Some folks don't think this is free enough, but Red Hat yielded to consumer pressure and added KDE to its distribution some time ago.
I know all this history is accurate, because I read it on the Internet.
To develop KDE applications, you must understand the QT class library. There is a KDE applications class library, but it is only a thin wrapper around QT. There are several books about programming with QT, and the complete specification and a few tutorials are online at the vendor's web site (http://www.trolltech.com/). One of the books is Programming With QT, by Matthias Kalle Dalheimer (O'Reilly & Associates, 1999). It's a good book and I recommend it. I've gone through it and played with its example programs. The book is pro-QT, of course, pointing out that the QT class library is rugged, proven, and highly regarded by KDE applications developers. It's time, however, to discuss some of QT's faults, too.
First, why can't they use namespaces? I have heard, and don't buy, the argument that they have to support platforms that do not implement namespaces well. KDE uses lots of compile-time conditionals to configure the code for various platforms. Q_AMPERSAND is one example; it accommodates the fact that some C++ compilers require the ampersand operator to take a function's address whereas other compilers do not. That's what the configure script is all about.
If they had argued that the namespace feature was hastily added to the language specification, was not well thought out, was standardized with no real experience in its use, and had poorly designed features, I might agree with them. But the platform argument is lame.
QT uses an odd mechanism to implement signals and slots, which are messages and message handling functions in the parlance of MFC programmers. The mechanism associates member functions (slots) of a widget-derived class with the messages sent by events (signals).
At first you might think this mechanism could be implemented by virtual functions. It would seem natural to use C++'s polymorphism feature to implement something that clearly constitutes polymorphism in the window (widget) class hierarchy. In the early days of Windows programming, developers of C++ application framework classes needed to find a better way. If all the messages a window could receive were represented by virtual functions, every window class would be heavily laden by a huge vtable because there are many such messages and many kinds of windows (widgets). MFC developers invented the MESSAGE_MAP as an alternative approach. They used preprocessor macros to define the presence and initialization of a table of message numbers and function addresses that the message dispatcher could use.
The MESSAGE_MAP technique was and still is quite elegant and intuitive, its detractors notwithstanding. I think, however, that a well-defined class library could distribute messages to windows by using well-placed virtual functions, given that not every kind of window gets every kind of message, and contemporary programs have a lot more memory to use, anyway.
QT takes a somewhat different approach to implementing a message map, er, slot table. QT uses the Q_OBJECT macro to generate one line of code with a bunch of member declarations for derived members common to all slot-processing classes, but with specializations. The code generated by Q_OBJECT is all on one line to maintain line-number integrity for the debugger. You never see it unless you look at the preprocessor output, which is what I did to see what was going on.
QT calls the slot table "meta data" and invokes a metaobject compiler (MOC) to build it. QT requires the "slots" pseudo-keyword (actually a macro that #defines to nothing) that you must provide in the class definition like this: private slots:. This access specifier identifies member functions that are slots (message handlers) and whose declarations follow. MOC processes this code in the header file to generate another source-code file with lots of code that sets each slot function's address and access specifier into two dynamically allocated message-handling function tables.
Then the program itself (your code) has to associate each slot member function in the tables with its associated signal (message) by calling the QObject::connect function. Lots of initialization code that you have to write runs every time you start the program, to achieve the same thing that MFC does with compile-time macros.
The MOC program also generates the common member functions declared by the Q_OBJECT macro. Most of these functions exist to operate on a string holding the name of the class. An MOC-generated function named className returns the name of the class as a string literal. Other functions use their own copies of string literals for the same data. It is not clear why the MOC-generated functions don't use the MOC-generated className function, or why the className function itself does not use C++ RTTI.
I'm also not sure why all this custom RTTI is not controlled by NDEBUG. The class name is usually meaningful only to a programmer in a debugging environment. Why is all that stuff included in production code? Is it because we expect to be delivering buggy applications? Myst-ifying, indeed.
You might expect QT to implement document and view classes the way MFC does. I thought it did because I use the KDevelop IDE to build and test applications. One of the choices on the KDE Application Wizard dialog is KDE 2 MDI, which builds a dummy application that implements a multiple document interface with a document/view architecture. Finding no document/view base classes in QT, I figured the KDE wrapper classes must have them, but the KDE programming documentation doesn't mention the idiom, either. When in doubt, read the code. It turns out that KDevelop itself builds the MDI interface when it generates code by deriving your document class from QObject, your view class from QWidget, and using QT container classes to contain document objects in the application class and view objects in the document class.
QT includes its own container and string classes. I want to use Standard C++ container and string classes wherever possible, and I want that usage to be transparent, which means I need implicit conversions from the standard classes to the QT classes and vice versa. Many QT functions expect and return objects of their QT container classes.
I started by looking at the QString class. To even begin to get what I want, I needed a conversion constructor with the signature QString::QString(const std::string&) in the QString class. I could have called std::string::c_str() wherever a QString is called for, because QString includes an implicit conversion from const char*, but that would be lame. C++ provides ways to do this, and programmers ought to be able to use them.
My first idea was to add an inline conversion constructor to the QString header file, thus eliminating the need to rebuild the library while I tested. My plan was that after the inline function worked, I would put in the .cpp file, recompile the file, and rebuild the library. I'm not sure whether the QT license permits this, but it's what I decided to do. To add a constructor, one must see what the other constructors do. I read qstring.cpp and discovered the kind of code that an open-source developer should never publish to be scrutinized by one's peers.
The QString constructors use static non-member utility functions and macros declared in the .cpp file. I can't call them from inline functions in the header because they are not visible there. I have to modify qstring.cpp and rebuild the library just to add my simple conversion constructor. I can understand putting macros in the .cpp file if you are worried about polluting the global namespace, but namespace pollution is not one of their concerns. I cannot, however, understand using a static non-member function to do anything at all in the name of a class object. That's a C idiom, one that C++ programmers learned long ago to avoid. The function can be static, certainly, but it ought to be a private member function, if only to avoid offending the sensibilities of programmers who believe in writing good code.
QT supports the development of nice applications, but at the expense of substandard code, at least in the view of this C++ programmer. The library represents an impressive body of work, and it would be foolish to try to duplicate what they've achieved only to have prettier code. In my opinion, QT needs a good C++ wrapper around it, not just the thin wrapper that the KDE application classes provide, but one that hides as much as possible the unattractive way QT does things.
To my generation the name Linus evokes images of a little guy sitting at a toy grand piano with a bust of Beethoven nearby and a pesky sister named Lucy. I've loved that little guy ever since I was a little guy playing the piano, too.
Today we have a new Linus icon, not such a little guy and not a piano player, but with a pesky sister. How do I know about her? Linus Torvalds and David Diamond just published Linus's autobiography, titled Just For Fun (HarperBusiness, 2001). This entertaining book traces Linus from his childhood in Finland, sitting on his grandfather's lap hacking away at a Commodore home computer, through his decision as a university student to improve on a UNIX knockoff named "Minix" by writing his own kernel, the unexpected wave of Linux users who jumped aboard his project, the unintentional launch of a massive development project under what would be called "open source," his immigration to the U.S. to take a job with a Silicon Valley chip maker, and the fame and fortune that his worldwide project and a couple of timely stock options made possible.
Linus reveals a lot about himself and shares his philosophy of the meaning of life, which has three purposes: survival, social acceptance, and entertainment. He also dismisses the image the press has given him of a self-effacing person uncaring about money and other worldly things. (We journalists are "scum" in his view.) He strongly advocates the open-source philosophy and believes that Linux is on the verge of dominating the desktops of the world.
The most unlikely and unexpected quote in the entire book comes when Linus is about to deliver his first major speech to members of the software-development community. He has a laptop set up to show his briefing slides, which he prepared with PowerPoint. Linus Torvalds, the poster child for the open-source movement, the standard bearer for all that is antiestablishment in software development, says, "Thank God for Microsoft."
There ought to be a way Microsoft can use that quote in its promotional literature. Maybe even in its Justice Department appeal briefs.
DDJ