Professionals have a right to be picky about the tools they use. The author suggests a few criteria for what makes a good tool and contributes a couple of his own tools designed to fit the bill.
Introduction
Every professional needs a toolkit, and software engineers are no different. An extensive, thoroughly tested toolkit can make the difference between delivering a working product on time, and failure to deliver anything at all. It makes sense, then, to evaluate the process of selecting candidate tools, while at the same time making an ongoing effort to collect tools. In this article I discuss my criteria for identifying good tools, and then provide two examples of items I believe meet those criteria: a C++ class generator program, and a command-line parser class.
What Makes a Good Software Tool?
In my experience, software tools, and especially C++ tools, should have the following properties:
- policy-free
- reliable
- freely available
- supplied in source code form
I will expand on these terms in a moment. Note that I am not saying that everything developers use has to fit those criteria; merely that in order for a piece of software to be considered a tool, or of tool quality, it must meet these criteria. The above list is partly for classification, and partly for guidance in searching for tools. If I am searching for a piece of software, my first choice is a tool-quality item. If a tool-quality item is not available, then I work my way down the list of what is available.
Policy-free tools are tools that do not force or even encourage developers to adopt certain programming policies, such as the policy of wrapping all C++ code in a namespace. Policy-free tools can be used in any project, regardless of the team's development methodology. Dogma tends to give indigestion to people who like to reason things through from first principles, a characteristic shared by many of the top programmers I've worked with.
Reliable tools allow the programmer to think about the task at hand, rather than troubleshooting his tools. A tool should only be considered reliable after a considerable amount of use in the field, by real programmers.
Freely available tools are more likely to become de facto software standards, which in turn adds to the attention, support, and testing that they receive. Freely available tools can travel with the programmer, giving him assurance that his time spent learning the tool will be well spent, even if he changes jobs. There is also the side benefit of not having to negotiate a budget for the tool; this can be especially helpful, given the remarkable abundance these days of technical managers who are not technically qualified to change stations on the radio, much less decide whether a given tool is worth the money.
Tools supplied in source code form avoid a huge number of problems. They guarantee that their behavior is open for inspection: no hidden trapdoors or hacks can exist for long without the world finding out. Such tools may be ported to new platforms by anyone. And of course, their internal classes and architectures may be studied and reused.
The Dearth of Good Software Tools
The best toolkit available for C++ is the Standard C++ library. In some shops, I've seen remarkable gains simply from training the programmers about the goodies that are tucked away in the Standard library, but beyond that, things begin to get a bit fuzzy. Even in this age of Linux, GNU, copylefts and public domain software [1], there is a serious shortage of no-strings-attached, high quality, reusable code. There are just not that many places to obtain high-quality items for your toolkit.
As a result, programmers everywhere find themselves creating the sorts of modules that really ought to be part of the professional software engineering mind share. This has been bothering me for some time, and so I've been refining two items from my own toolkit over the past year, to the point where I think they now qualify as tools. These items are a code generator program, and a command-line parser class.
All of the source code provided here is portable and has been tested on Windows, Linux, and Solaris. The code relies only on the existence of a standards-compliant C++ compiler and library, or even a somewhat less-compliant compiler, such as MSVC++ 6.0 [2].
The Class Creator Program
I refer to the C++ class generator compiled program as fcc, a.k.a. Fast Class Creator [3]. If used without modifications, fcc will generate C++ class files to your specifications. As for why such a thing is necessary: there is a world of difference between code that works, and code that is appropriate for use within a commercial project. For example, in an article such as this, I could write:
    class SocketWrench
    {
    public:
        SocketWrench();
        virtual ~SocketWrench();
        // ...
    };

but to produce something similar in a commercial project, a professional programmer might have to write 30-50 lines (counting comments and white space) in the header file alone. Production code requires comment headers; built-in testability hooks; class invariant test hooks; possible explicit declaration of support (or lack of it) for copying and assignment; support for automatic documentation generators, unit tests, and namespaces; and more. When run with default settings, fcc generates over 150 lines for a single class, distributed across five files. From the point of view of an experienced C++ coder, all of this is boilerplate; for a novice, it represents a certain amount of C++ lore, distilled, ready for immediate use, and shipped without any of the quirky personality disorders that are occasionally found in the cubicle of the local reigning guru.
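To make the contrast concrete, here is a rough sketch of what a production-style version of that class might look like. It is not fcc's actual output, which appears in the listings. The namespace name, the data member, and the method names CheckValid and DumpDiagnostics are assumptions, loosely inferred from the switch names discussed later in this article.

```cpp
#include <cassert>
#include <iostream>

namespace tools {  // hypothetical namespace name

/**
 * SocketWrench: illustrative class showing typical production boilerplate.
 * This layout is an assumption, not a copy of what fcc generates.
 */
class SocketWrench {
public:
    /// Default constructor; verifies the class invariant before returning.
    SocketWrench() : m_sizeMm(0) { assert(CheckValid()); }

    /// Virtual destructor, so the class can safely serve as a base class.
    virtual ~SocketWrench() {}

    /// Class invariant test hook: true if the object is in a valid state.
    bool CheckValid() const { return m_sizeMm >= 0; }

    /// Built-in testability hook: writes the object's state to a stream.
    void DumpDiagnostics(std::ostream& os) const {
        os << "SocketWrench: m_sizeMm=" << m_sizeMm << '\n';
    }

private:
    // Explicit declaration of (lack of) support for copying and assignment:
    // both are declared private and intentionally left undefined.
    SocketWrench(const SocketWrench&);
    SocketWrench& operator=(const SocketWrench&);

    /// Hypothetical data member: socket size in millimeters.
    int m_sizeMm;
};

}  // namespace tools
```

Even this trimmed-down version runs to several times the length of the article-style declaration, and it still omits the file comment header, unit test, and makefile.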
In order to meet the policy-free criterion, I have parameterized almost every aspect of this simple code generator. The tool itself comprises only about a thousand lines of source code, but it supports (so far) 26 different configuration settings to control the generated code. Thus, the program can probably be used without modification at most sites. Furthermore, all of the settings have reasonable defaults, so the program should be easy to use right from the start.
One last feature: fcc includes the ability to create multiple classes per file. Some people really need this kind of flexibility, and others, I suspect, simply enjoy the ability to gratuitously differentiate themselves from the Java coders.
Using fcc
fcc is invoked from the command line; Figure 1 shows a complete listing of the available options. Listings 1 through 3 show some sample classes generated by fcc. There are a few key concepts to note in using fcc from the command line:
- Each command-line argument has a name. An argument is allowed to have a value, or to exist without an associated value.
- Two argument names are required: -class and -author. The -namespace argument was actually required in earlier versions of fcc. The -namespace feature causes the generated code to reside within a C++ namespace. To provide for a more general tool, I decided to make this feature optional. (Originally, the fcc program was designed to provide both a tool plus a policy; now, it provides a tool plus a mechanism, the settings files, for assisting with enforcing policy.)
- Arguments may be supplied in any order, but name-value pairs must be kept together.
- If the -settings_file <filename> argument pair is supplied, the program will attempt to open the supplied filename, and read additional arguments from that file. The format within the file is identical to that used on the command line, except that newlines are permitted within the file. (This helps legibility when using long lists of arguments.)
- To generate multiple classes per file, supply a space-delimited list of class names, like this: -class "Apple Orange Pear". The program will use the name of the first class as the base for generating file names, unless you also specify a base filename, like this: -base_filename Fruits. I vacillated between two approaches for this interface. The other choice would have been to allow multiple command-line arguments, so that you could specify an unlimited number of -class <class_name> argument pairs. However, that would have required considerably more typing for the user, as well as making it more difficult to check for errors with other arguments that should not have multiple values (such as -author).
- Duplicate argument names will generate an error, as alluded to above.
- Generated code includes doc++ [4] comments, which are of the form /// (for single-line comments), or /** ... */ (for multi-line comments). This gives you a code generator and a documentation generator, all in one. It's rather impressive to be able to generate class files, and then execute a trivial unit test, and then browse the HTML documentation, all created from a few commands (one command to fcc, to build the files; another command to the make utility, to compile the program and the documentation).
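The multiple-class rule above can be sketched in a few lines of C++. This is only an illustration of the behavior described, not code taken from fcc; the function names SplitClassNames and ChooseBaseFilename are hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Splits a space-delimited -class value such as "Apple Orange Pear"
// into individual class names.
std::vector<std::string> SplitClassNames(const std::string& classArg) {
    std::istringstream in(classArg);
    std::vector<std::string> names;
    std::string name;
    while (in >> name) {
        names.push_back(name);
    }
    return names;
}

// Chooses the base for generated file names: the -base_filename value if
// one was supplied (non-empty), otherwise the name of the first class.
std::string ChooseBaseFilename(const std::vector<std::string>& names,
                               const std::string& baseFilenameArg) {
    assert(!names.empty());  // -class is a required argument
    return baseFilenameArg.empty() ? names.front() : baseFilenameArg;
}
```

With -class "Apple Orange Pear" and no -base_filename, this sketch picks Apple as the base; adding -base_filename Fruits overrides that choice.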
Subtleties of fcc
The settings files allow a slightly higher level of abstraction when working with fcc, as I've tried to demonstrate in Listings 2 and 3: note that the names of the settings files represent overall types of classes, such as interface and data classes. This approach allows you to have both precise control of the generated code, and access to a simpler interface to the code generator. In fact, although I'm officially policy-free for this article, I'd recommend that a Unix site (and Windows sites as well, if you can come up with a viable stand-in for Unix aliases) try out the following approach, which has worked well for me:
1) Place programmer-specific settings in an alias, like this:
    alias create_class='fcc -author "Sam Smith" -copyright "(C) 2000, etc."'

2) Place concept-specific data in a number of appropriately-named settings files, such as the interface_class.ini and data_class.ini files that were used to generate Listings 2 and 3, respectively.
3) Supply only the class name(s), namespace name, and other immediate needs on the command line.
Using this approach, you can write something like this:
    create_class -class Hammer -namespace tools -settings_file interface_class.ini

and you get everything you want, without a lot of typing.
To explore this concept just a bit further, let me briefly discuss the contents of the interface_class.ini file, specified at the top of Listing 3. A C++ interface class, by most definitions, is one whose methods are all pure virtual. The class is intended not for direct use, but only for use as a base class in an inheritance hierarchy. Now, the fcc program is set up to have reasonable defaults for general use, which means that it normally creates general-purpose sorts of classes. In order to generate an interface class, you must direct fcc as follows:
1) Do not create a copy constructor or assignment operator. The class has no data members, so these would just be misleading code. Switches: -no_copy_ctor, -no_op=.
2) Do not create class invariant test methods, nor built-in testing methods. Switches: -no_dump_diagnostics, -no_check_valid.
3) Do not create a unit test or makefile. Again, these will not be used in a class that is never directly instantiated. Switches: -no_unit_test, -no_makefile.
4) Do not use an icc file. This file, included from the class header file, is used to avoid cluttering the header file with implementation details when using template or inline methods. Neither template nor inline methods are used in an interface class, however, so this is just one more unnecessary file to avoid. Switch: -no_icc.
Rather than poking through the 26 command-line switches provided by fcc, it makes a lot of sense to figure this out once, and then capture that knowledge in an appropriately named file: interface_class.ini, for example.
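For readers less familiar with the idiom, the following sketch shows the kind of interface class those settings are aimed at. It is hand-written for illustration and is not fcc output; the names Tool, Hammer, and Describe are hypothetical. The destructor is the one method that is virtual but not pure, a common concession that lets derived objects be destroyed through a base-class pointer.

```cpp
#include <cassert>
#include <string>

namespace tools {

/// Interface class: no data members, and every operation is pure virtual.
class Tool {
public:
    virtual ~Tool() {}

    /// Returns a human-readable name for the tool.
    virtual std::string Name() const = 0;

    /// Returns true if the tool can be used on a part of the given size.
    virtual bool Fits(int sizeMm) const = 0;
};

/// A concrete class derived from the interface.
class Hammer : public Tool {
public:
    virtual std::string Name() const { return "Hammer"; }
    virtual bool Fits(int) const { return true; }  // a hammer fits anything
};

/// Client code depends only on the interface, never on a concrete class.
std::string Describe(const Tool& tool) {
    assert(!tool.Name().empty());  // implementations must supply a name
    return "Tool: " + tool.Name();
}

}  // namespace tools
```

Because Tool has no state and cannot be instantiated, a copy constructor, an assignment operator, invariant checks, and a unit test would all be dead weight, which is why the switches listed above turn them off.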
Internal Components: the CommandLine Class
The command-line parser is valuable not because it employs any special innovations in parsing a command line (there isn't a whole lot of room for new ideas there); it is valuable because of its unusually concise interface, along with the fact that it allows you to instantly transition your program from C style to C++ style, right at the beginning.
By concise, I mean that the CommandLine class allows you to write code like this:
    int main (int argc, char* argv[])
    {
        try
        {
            tools::CommandLine commands(argc, argv);
            if (commands.Exists("author"))
            {
                std::string strAuthor = commands.GetByName("author");
                // ...etc...

This is far more accessible and maintainable than the usual C-style parsing code found in most command-line programs.
Listing 4 shows the implementation of each of CommandLine's constructors. This excerpt should show how settings files are used in combination with actual command-line arguments. You can specify a settings file, and/or you can simply list the arguments directly, on the command line. The CommandLine class collects all of the arguments in an internal map (actually, a std::map<std::string, std::string>), and makes them available via the Exists and GetByName methods.
One particularly effective approach to creating console programs with CommandLine is to derive from it. This enables you to interpret the commands and encapsulate your program's unique view of what the commands mean. In the Class Creator system, there is a class called DigestedCommands that does just that. The policy for deciding what sort of code to generate resides entirely within DigestedCommands; the mechanics to support parsing the user's input are completely invisible to everything except for the CommandLine class.
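For readers without the listing at hand, the following is a minimal sketch of a class with the same two-method interface. It is not the implementation from Listing 4. The sketch namespace, the Parse helper, the exception types, and the rule that any token beginning with a dash is an argument name are all assumptions made to keep the example short. It also does not handle quoted values inside a settings file.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

namespace sketch {  // hypothetical namespace; the article's class lives in tools

class CommandLine {
public:
    CommandLine(int argc, char* argv[]) {
        assert(argv != 0);
        std::vector<std::string> tokens;
        for (int i = 1; i < argc; ++i) tokens.push_back(argv[i]);  // skip program name
        Parse(tokens);
        if (Exists("settings_file")) {
            std::ifstream file(GetByName("settings_file").c_str());
            if (!file) throw std::runtime_error("cannot open settings file");
            std::vector<std::string> fileTokens;
            std::string token;
            while (file >> token) fileTokens.push_back(token);  // newlines are just whitespace
            Parse(fileTokens);
        }
    }

    /// True if the named argument was supplied (name given without the dash).
    bool Exists(const std::string& name) const {
        return m_args.find(name) != m_args.end();
    }

    /// Returns the argument's value (empty for a bare switch); throws if absent.
    std::string GetByName(const std::string& name) const {
        std::map<std::string, std::string>::const_iterator it = m_args.find(name);
        if (it == m_args.end()) throw std::runtime_error("missing argument: " + name);
        return it->second;
    }

private:
    // Reads "-name" or "-name value" groups; a duplicate name is an error.
    void Parse(const std::vector<std::string>& tokens) {
        for (std::size_t i = 0; i < tokens.size(); ++i) {
            const std::string& token = tokens[i];
            if (token.size() < 2 || token[0] != '-')
                throw std::runtime_error("expected an argument name: " + token);
            const std::string name = token.substr(1);
            std::string value;
            // Assumption: a following token is a value unless it starts with '-'.
            if (i + 1 < tokens.size() && tokens[i + 1][0] != '-') value = tokens[++i];
            if (!m_args.insert(std::make_pair(name, value)).second)
                throw std::runtime_error("duplicate argument: " + name);
        }
    }

    std::map<std::string, std::string> m_args;
};

}  // namespace sketch
```

Keeping the map private means callers see only Exists and GetByName, so the storage and the parsing rules can change without touching client code.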
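The derivation approach might look roughly like the sketch below. The real DigestedCommands interface is not shown in this article, so the method names here (WantsCopyCtor, WantsUnitTest, UsesNamespace, ClassName) are hypothetical, and the base class is a deliberately simplified stand-in for the real parser so that the example is self-contained.

```cpp
#include <cassert>
#include <map>
#include <string>

namespace sketch {  // hypothetical namespace

// Simplified stand-in for the parser: just enough to support the derived class.
class CommandLine {
public:
    CommandLine(int argc, char* argv[]) {
        assert(argv != 0);
        std::string current;  // name of the argument most recently seen
        for (int i = 1; i < argc; ++i) {
            const std::string token = argv[i];
            if (!token.empty() && token[0] == '-') {
                current = token.substr(1);
                m_args[current];           // record the name with an empty value
            } else if (!current.empty()) {
                m_args[current] = token;   // attach the value to the preceding name
            }
        }
    }
    virtual ~CommandLine() {}

    bool Exists(const std::string& name) const { return m_args.count(name) != 0; }

    std::string GetByName(const std::string& name) const {
        std::map<std::string, std::string>::const_iterator it = m_args.find(name);
        return it == m_args.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> m_args;
};

// Interprets the raw arguments as code-generation decisions, so the rest
// of the program never looks at switch names directly.
class DigestedCommands : public CommandLine {
public:
    DigestedCommands(int argc, char* argv[]) : CommandLine(argc, argv) {}

    bool WantsCopyCtor() const { return !Exists("no_copy_ctor"); }
    bool WantsUnitTest() const { return !Exists("no_unit_test"); }
    bool UsesNamespace() const { return Exists("namespace"); }
    std::string ClassName() const { return GetByName("class"); }
};

}  // namespace sketch
```

The generator code can then ask cmds.WantsUnitTest() instead of testing for the -no_unit_test string, which keeps the policy for interpreting switches in one class.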
Acknowledgement
I would like to thank my company, ATD, for giving their consent to publish this work. The tools described here were created on their time, and were originally designed for use within their project teams.
Notes and References
[1] For more information on public domain software, copylefts, and the Free Software Foundation, see the GNU site at www.gnu.org.
[2] For a list of known areas where MSVC++ does not conform to the current C++ Standard, check out the Microsoft Developer Network article Q243451, "INFO: C++ Standard Noncompliance Issues with Visual C++ 6.0."
[3] This follows directly from the fact that the original version of this program was written in Java, was called Class Creator, and no one ever considered using the word "fast" when referring to it.
[4] doc++ is a freely available, high-quality documentation generator created by Malte Zockler and Roland Wunderling. doc++ relies upon the presence of special /// and /** ... */ comments to guide it. See www.zib.de/Visual/software/doc++/index.html.
John F. Hubbard spent eight years as a nuclear submarine line officer, logging thousands of hours of submerged operation before finally succumbing to the lure of civilian computer technology. He currently works as a Senior Software Engineer for ATD (Azad Technology Development Corporation), a software outsourcing company that specializes in real-time programming, embedded systems, and factory automation. Mr. Hubbard holds a BS in Electrical Engineering from Utah State University. He may be reached at jhubbard@azadtech.com.