C/C++ Users Journal November, 2004
Command-line options represent the "face" of a program, and tools for option parsing should be convenient and easy to use. For me, writing awkward getopt loops was always a tedious job, especially in shell scripts. Frankly, I never understood why the getopt function was not designed in the manner of getenv. While libraries such as getopt, getopt_long, popt, and argp are powerful and rich, they don't provide utilities for parsing command options in shell scripts.
With this in mind, I wrote my own getopt library, which I call "mgetopt." I tried to implement a solution that is equally easy to use in C and shell programs. Mgetopt supports short options (see Listing 1), as well as GNU-style long and longshort options. It also has a number of convenient features, such as:
Mgetopt's distribution is available at http://www.cuj.com/code/ and http://www.inch.com/~esurman. Mgetopt is written in C and is available under the GNU license.
In general, you need only the functions mgetopt_parse and mgetopt to get the job done. And most important, you don't need to write getopt loops in C or shell programs anymore.
As you have probably guessed, I use a hash table to hold and retrieve options. In fact, I use the standard UNIX hsearch library, or more precisely its GNU version. For non-Linux platforms, I include the file hsearch_r.c from GNU glibc in the mgetopt distribution.
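To make the hash-table idea concrete, here is a minimal sketch of a getenv-like store built on the portable POSIX hcreate/hsearch calls rather than the GNU hsearch_r variant. The helper names opt_table_init, opt_put, and opt_get are hypothetical and are not part of mgetopt; the sketch only illustrates the technique, not the library's actual internals.

```c
#include <search.h>
#include <stddef.h>

/* Hypothetical helpers (not mgetopt's API): a getenv-like option store
   on top of the single global table provided by POSIX hsearch. */

/* Create the table; returns nonzero on success. */
int opt_table_init(size_t capacity)
{
    return hcreate(capacity);
}

/* Store name -> value. The strings are not copied, so both must
   outlive the table. Returns 1 on success, 0 if the table is full. */
int opt_put(const char *name, const char *value)
{
    ENTRY item;
    ENTRY *slot;

    item.key = (char *)name;
    item.data = (void *)value;
    slot = hsearch(item, ENTER);
    if (slot == NULL)
        return 0;
    /* ENTER returns the existing entry for a duplicate key,
       so overwrite its data explicitly. */
    slot->data = (void *)value;
    return 1;
}

/* Look up name; returns the stored value, or NULL if absent. */
const char *opt_get(const char *name)
{
    ENTRY item;
    ENTRY *found;

    item.key = (char *)name;
    item.data = NULL;
    found = hsearch(item, FIND);
    return found != NULL ? (const char *)found->data : NULL;
}
```

A caller would run opt_table_init once, call opt_put for each parsed option, and release the table with hdestroy when done. Because hsearch manages one global table per process, a library that needs several tables would have to use the reentrant GNU variant instead.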
There is also an mgetopt utility for use in shell scripts. It is simply an executable wrapper around the mgetopt library. The shell's eval command evaluates the output of mgetopt into shell variables of the following format:
$opt_<name> (like: $opt_a, $opt_d)
As Listing 2 illustrates, I use a Perl-style naming convention for the shell's option variables. All you need to do is check whether an option variable exists.
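For readers curious how a wrapper could produce text that is safe to pass to eval, the following sketch formats one assignment of the form opt_<name>='<value>'. Everything here is an assumption for illustration: the function name format_shell_assignment is hypothetical, the exact output of the real mgetopt utility is not shown in this article, and the mapping of characters such as "-" to "_" in variable names is only one plausible choice.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Append n bytes of s to buf at *pos, keeping buf NUL-terminated.
   Returns 1 on success, 0 if buf is too small. */
static int put(char *buf, size_t size, size_t *pos, const char *s, size_t n)
{
    if (*pos + n + 1 > size)
        return 0;
    memcpy(buf + *pos, s, n);
    *pos += n;
    buf[*pos] = '\0';
    return 1;
}

/* Hypothetical helper: write  opt_<name>='<value>'  into buf, quoted for
   a POSIX shell. Characters in name that are not valid in a shell
   identifier are replaced by '_' (an assumption, not mgetopt's rule). */
int format_shell_assignment(const char *name, const char *value,
                            char *buf, size_t size)
{
    size_t pos = 0;
    const char *p;

    if (!put(buf, size, &pos, "opt_", 4))
        return 0;
    for (p = name; *p != '\0'; p++) {
        char c = (isalnum((unsigned char)*p) || *p == '_') ? *p : '_';
        if (!put(buf, size, &pos, &c, 1))
            return 0;
    }
    if (!put(buf, size, &pos, "='", 2))
        return 0;
    for (p = value; *p != '\0'; p++) {
        if (*p == '\'') {
            /* Inside single quotes, a quote must be written as '\'' */
            if (!put(buf, size, &pos, "'\\''", 4))
                return 0;
        } else if (!put(buf, size, &pos, p, 1)) {
            return 0;
        }
    }
    return put(buf, size, &pos, "'", 1);
}
```

Single-quoting the value keeps the shell from expanding anything inside it when eval runs, which matters because option arguments come straight from the user.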
In C, argument parsing is managed by the mgetopt_parse function:
int mgetopt_parse( const char* shortopts,
                   const char* longopts,
                   const char* longshortopts,
                   int argc, char** argv);
mgetopt_parse returns 1 if a parse is successful. Parsing stops as soon as the first nonoption argument is encountered (POSIX style). mgetopt_parse prints an error message on the standard error and returns 0 when the parse fails.
The first parameter of mgetopt_parse is shortopts, a string of short option letters. The format of the short option string is the same as in the standard getopt. If a letter is followed by the flag ":", the option is expected to have an argument. I also implemented an additional flag, "@", for numeric arguments. If a letter is followed by "@", the option is expected to have an argument with a numeric value. mgetopt_parse prints an error message on the standard error and returns 0 when the argument of an "@" option is not numeric.
An example of a short option string definition is:
mgetopt_parse( "abc:d@", 0, 0, argc, argv);
where a,b are short options with no arguments, c: is the short option with a string argument, and d@ is the short option with a numeric argument.
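The numeric check behind the "@" flag can be sketched with strtol. This is an illustration only: the helper name is_numeric_arg is hypothetical, and treating "numeric" as "a decimal integer that fits in a long" is an assumption, since the article does not say whether mgetopt also accepts floating-point or hexadecimal values.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not mgetopt's API): returns 1 if text consists
   entirely of a decimal integer that fits in a long, 0 otherwise.
   Like strtol itself, it tolerates leading whitespace and a sign. */
int is_numeric_arg(const char *text)
{
    char *end;

    if (text == NULL || *text == '\0')
        return 0;
    errno = 0;
    (void)strtol(text, &end, 10);
    if (errno == ERANGE)
        return 0;               /* value out of range for long */
    /* At least one digit consumed, and nothing left over. */
    return end != text && *end == '\0';
}
```

With a check like this, an invocation such as -d 15 would pass while -d 15s would be rejected, which matches the error behavior described for "@" options.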
The second parameter, longopts, is a string of long option names separated by commas, ":", or "@" characters. The flags ":" and "@" placed after an option name indicate an option with a string or numeric argument, respectively. Space characters are ignored.
An example of a long option string definition is:
mgetopt_parse( 0, "ignore-case, file: delay@", 0, argc, argv);
where ignore-case is a long option with no argument, file: is a long option with a string argument, and delay@ is a long option with a numeric argument.
The third parameter, longshortopts, is a string of long option names, where the first letter of each name automatically becomes a short option. Names are separated by commas, ":", or "@" characters. You may also explicitly define a short option letter by placing it in parentheses, "(x)", after the long option name.
An example of a longshort option string definition is:
mgetopt_parse( 0, 0, "ignore-case, input-file(f):", argc, argv);
where ignore-case is a longshort option with no argument, and input-file is a longshort option with a string argument.
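The longshort rules described above (first letter by default, "(x)" to override, ":" or "@" as a trailing flag) can be sketched as a small tokenizer. The names next_longshort, struct longshort_opt, and enum arg_kind are hypothetical, and the 63-character name limit is an arbitrary choice for the sketch; mgetopt's real parser may differ in both structure and error handling.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum arg_kind { ARG_NONE, ARG_STRING, ARG_NUMERIC };

struct longshort_opt {
    char name[64];          /* long option name */
    char short_letter;      /* derived or explicit short option */
    enum arg_kind kind;     /* none, ":" string, or "@" numeric */
};

/* Hypothetical helper: parse one entry starting at *cursor and advance
   *cursor past it. Returns 1 if an entry was read, 0 at the end of the
   string or on a malformed entry. */
int next_longshort(const char **cursor, struct longshort_opt *out)
{
    const char *p = *cursor;
    size_t n = 0;

    /* Skip separators and ignored spaces. */
    while (*p == ',' || isspace((unsigned char)*p))
        p++;
    if (*p == '\0') {
        *cursor = p;
        return 0;
    }

    /* The name runs up to the next delimiter. */
    while (*p != '\0' && strchr(",:@(", *p) == NULL
           && !isspace((unsigned char)*p)) {
        if (n + 1 >= sizeof out->name)
            return 0;
        out->name[n++] = *p++;
    }
    if (n == 0)
        return 0;
    out->name[n] = '\0';
    out->short_letter = out->name[0];   /* default: first letter */
    out->kind = ARG_NONE;

    /* Optional explicit short letter, written as "(x)". */
    if (*p == '(') {
        if (p[1] == '\0' || p[2] != ')')
            return 0;
        out->short_letter = p[1];
        p += 3;
    }

    /* Optional argument flag. */
    if (*p == ':') {
        out->kind = ARG_STRING;
        p++;
    } else if (*p == '@') {
        out->kind = ARG_NUMERIC;
        p++;
    }

    *cursor = p;
    return 1;
}
```

Called repeatedly on "ignore-case, input-file(f):", this yields ignore-case with short letter i and no argument, then input-file with short letter f and a string argument. The explicit "(f)" form is what lets two names that begin with the same letter coexist.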
After mgetopt_parse successfully completes, it saves all command-line options in a hash table. The option values are then accessible by the mgetopt function:
const char* mgetopt( const char* opt_name);
The mgetopt function returns a pointer to the option value if an option is found in a hash table; otherwise, it returns NULL.
There are also a number of useful, predefined option names: NAME, the name of the program invoked (argv[0]); BAD, a list of "bad" option names detected; IND, an index of the next nonoption argument to be processed; SHIFT, the number of options processed, which could be used as an argument in shell's shift command; and HELP, which holds a help text if it is defined.
My solution for help text is to embed text inside of the option string by placing it into curly braces. mgetopt_parse removes braces and text from option strings and saves them into the hash entry HELP. This approach lets me implement the help facility both for C and shell programs.
An example of a short option string definition with help text is:
mgetopt_parse( "a { -a print all} s { -s silent} h { -h help}",
0, 0, argc, argv);
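Separating help text from an option string is a simple scan for brace pairs. The sketch below shows one way it could be done; the function name split_help is hypothetical, and joining the help fragments with newlines and dropping spaces outside braces are assumptions, since the article does not describe how mgetopt stores the HELP entry internally. Nested braces are not handled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not mgetopt's API): copy spec into opts without
   its "{...}" groups, and append the text of each group to help, one
   group per line. Spaces outside braces are dropped.
   Returns 1 on success, 0 on unbalanced braces or a too-small buffer. */
int split_help(const char *spec, char *opts, size_t opts_size,
               char *help, size_t help_size)
{
    size_t o = 0, h = 0;

    if (opts_size == 0 || help_size == 0)
        return 0;

    while (*spec != '\0') {
        if (*spec == '{') {
            const char *close = strchr(spec, '}');
            size_t len;

            if (close == NULL)
                return 0;                   /* missing closing brace */
            len = (size_t)(close - spec - 1);
            if (h + len + 2 > help_size)    /* text + '\n' + '\0' */
                return 0;
            memcpy(help + h, spec + 1, len);
            h += len;
            help[h++] = '\n';
            spec = close + 1;
        } else if (*spec == '}') {
            return 0;                       /* stray closing brace */
        } else {
            if (*spec != ' ') {
                if (o + 2 > opts_size)      /* char + '\0' */
                    return 0;
                opts[o++] = *spec;
            }
            spec++;
        }
    }
    opts[o] = '\0';
    help[h] = '\0';
    return 1;
}
```

Applied to the option string in the example above, this leaves "ash" as the short option string and collects three help lines. Keeping the help next to each option letter means the two cannot drift apart when an option is added or removed.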
There are two additional, convenient functions that are available only for C programs:
Listing 3 illustrates most of mgetopt's features just presented, while Listing 4 is an example of the shell script.
The mgetopt library I presented here utilizes a hash table to simplify the user interface and eliminate parsing loops. Most of mgetopt's features are available both for C and shell programs.