Steven K. Graham has worked for Hewlett-Packard, served on the Faculty at UMKC, and is currently a Senior Engineer with CSI.
This month's CUG releases include two public-domain, genetic-algorithm tools, Genitor and GATool. These tools support exploration of genetic algorithms, and produce programs that could be used for applications.
Though Genitor and GATool are both designed for UNIX environments, they rely on C (Genitor) or C and C++ (GATool) and are distributed with source, so it should be possible to port them to other systems. GATool provides a menudriven interface based on curses, so the interface may be more difficult to port. Both systems provide code that manipulates the populations of strings and performs the basic operations. To use either system, you have to provide a fitness function (as C code) that drives selection of parents and survival of individuals. Documentation is sparse for both systems and neither is particularly user friendly. Genitor and GATool are both the products of a research environment, and as such, expect knowledgeable users.
The Genitor Package
The Genitor package is the more venerable of the two. An early version was first described in a technical report in 1987. Genitor has been distributed via ftp for some time, so presumably few bugs remain. Genitor distinguishes itself by using fitness rank to select parent strings rather than a string's fitness as a proportion of the population's total fitness (the sum of all individual's fitness values). Genitor normally produces offspring one at a time, rather than the more typical generational approach. Genitor provides reduced surrogate crossover and adaptive mutation as operators, as well as edge recombination, a particularly successful crossover operator for the Traveling Salesman Problem (TSP).Genitor comes with C source and various make files. The first step in using Genitor is to build the three libraries used for creating GA programs. These libraries differ in whether the data manipulated will be treated as integer, floating point, or binary. The Counting 1s example (see page 76) would use binary, a TSP problem would use integer, and a problem using Genitor to optimize weights in a neural net would use floating point. Genitor includes extensively-commented example files for traditional binary optimization (Counting 1s), Traveling Salesman Problem, and a neural net example that uses the genetic algorithm code to solve a two-bit adder problem.
To apply Genitor to a problem there are a number of steps. First, determine an encoding for the critical data of the problem to be solved, and select the corresponding data type. Then, write an evaluation function that accepts a string and a string-length and returns a floating-point value that represent the fitness of the string. Next, write a main program which uses the Genitor library routines to perform the mechanics of the genetic algorithm. In general, you would modify one of the examples rather than create these functions from scratch. Compile the evaluation function and main program and link them with the Genitor library to create an executable GA program specifically for your problem. Finally, experiment by running the executable program with various command-line parameters (or use a configuration file), to choose parameters that yield the best performance on your problem. Parameters include such things as population size, maximum number of trials, probability of mutation, and so on.
Genitor comes with three documentation files, one (ga_code) describes the organization of Genitor's functional modules and its data structures. A second (ga_startup) describes the initial steps for using Genitor. A third (ga_cmmd_line) describes the command line options for a Genitor program. Three required options are the string length, the population size, and the number of trials. The -s option for setting the status interval is useful, for viewing interim values while the program is running.
I was able to run the examples with little trouble. The make files must be modified to reflect the directory where you have stored the Genitor system, and a typo in one of these lines will cause the make to fail, and the reason may not be apparent. I had to make one change to a source file, ga_param.c, to get the program to run. Genitor uses a call
RandomSeed = (long) time()to generate a seed for its random number generator. On my system (a SPARCstation 2), this caused an error until I changed the line to
RandomSeed = (long) time(&RandomSeed)The GATool Package
The GATool system is more flexible, but less thoroughly tested. GATool functions similarly to Genitor, with various data types and operators, but GATool uses C++ and inheritance to allow a broader range of options (and even mixtures of options). GATool also comes with a user interface written in C that uses curses to provide a simple menubased code generator for creating the data files that describe input parameters and data types. GATool provides no command-line options and always requires a configuration file for execution, but automates creation of the configuration file. GATool automatically produces a template file, with much of the code needed for the fitness function when a user is working on a new problem, so the main program need not be changed.GATool is designed to be extensible, and programmers can readily add additional operators or selection techniques. There are seven main components to the GATool system: chromosome, reproduction, selection, replacement, crossover, mutation, and statistics. Chromosomes can be binary or gray coded and genes within a chromosome may be floating point or integer (binary is a special case of integer). Reproduction can be generational or individual (Genitor-style). Various selection options (including proportional and rank-based) are provided and replacement options include DeJong gap and elitism. Crossover options include traditional, uniform, and reduced surrogate. GATool comes with a suite of 14 example problems, including DeJong's F1-F5 and a partially deceptive problem.
GATool provides many options, but this complicates the program. It's necessary to learn substantially more to use GATool than is needed to use Genitor. The only documentation included with GATool is a directory of *.hlp files that should provide online help. On my system, they didn't, though I have seen the help running on another machine. There is a well written, 60-page manual for GATool that is not included with the code, but should be available as a technical report from the University of Missouri-Kansas City (Computer Science, 5100 Rockhill Rd., Kansas City, MO 64110). While the menu-driven interface provides some guidance, it also introduces compatibility problems. The interface would not run from the shell I normally use from within emacs (it did exit gracefully), nor would it run in an xconsole (it crashed). It would run within an xterm window, and it's usable with a dumb terminal. I ran into one bug in an input routine that requested a printf format code for outputting a gene the routine only accepted codes for floating-point values, and I wanted to output an integral value.
Running An Example
For comparison purposes, I ran the Counting 1s problem in both systems with various sizes of populations, different string lengths, and different numbers of trials. At first it appeared that Genitor was faster by a factor of three, but when I increased the number of trials, GATool appeared to be faster. This was not the case, rather GATool has a "spin" parameter that permits termination of a run when a specified number of generations has passed without changes to the population. With some additional testing it became clear that GATool is slower than Genitor, though it's difficult to establish an exact factor because it's difficult to make the systems perform exactly the same functions. Their statistics recording and termination tests are different, as are the structures they impose on the "chromosomes" that they process.For the record, Genitor was able to process 100,000 trials of a 256-bit string in less than two minutes, or, roughly one second per 1,000 trials. GATool, operating on a chromosome of 200 1-bit genes, required four seconds per 1,000 trials. So Genitor seems to have about a 4x speed advantage. However, GATool was adept at abandoning pointless trials. So, though it ran slower per trial, if would often finish a given run quicker. One note, while running the tests, GATool's interface and its filename conventions made multiple tests with varied parameters (and GA system components such as selection and reproduction) quite easy.
Note, GATool was implemented by Sara Lienau while I was serving as her advisor. The testing described here was performed on a remote site, with exactly the package Ms. Lienau had prepared for distribution.
Evaluation
Genitor is simpler, faster, and easier to begin using. GATool provides far more options and a convenient interface. I would suggest Genitor as the first tool to try in learning about GAs. If it is important to experiment with a wide variety of configurations, then GATool is the choice. GATool would be difficult to use without the manual and I'll try to get an ASCII version of the GATool manual to include with the CUG release of the GATool code. Both tools are definitely useful for experimenting with GAs, and I would not want to do without either one.
Going Further
The best introduction to genetic algorithms is D.E. Goldberg's Genetic Algorithms in Search, Optimization, and Machine Learning, from Addison-Wesley, 1989. Darrell Whitley and others at CSU have various technical reports describing work with Genitor (such as Genitor: A Different Genetic Algorithm, CS-88-101). Sara Lienau's M.S. Thesis (Effective Investigation of Genetic Algorithms, UMKC 1992) would be a valuable resource for working with GATool. For advanced work in genetic algorithms, proceedings from the International Conferences on Genetic Algorithms (there have been four) have been published by Lawrence Erlbaum (first and second) and Morgan Kaufmann (third and fourth).Product Information:
Product: Genitor (CUG 369)
GATooL (CUG 370)
Price: CUG 369 - $12
CUG 370 - $8System Requirements: UNIX SVR4 or later, C compiler (Genitor), C++ compiler (GATool)
Sidebar: "Overview of Genetic Algorithms"