January 1993/Hiding ISAM Function Libraries with OOP

Hiding ISAM Function Libraries with OOP

Thomas Murphy

Thomas J. Murphy has a bachelor's degree in Physics from St. Ambrose College, Davenport, IA, and a Ph.D. in Biophysics from the University of Illinois. He has worked in the software-development field for 13 years and has owned his own computer-consulting and custom software business for over seven years. His major area of expertise is database systems. He can be reached at Computer Management Consultants, Ltd, Box 132, RR #10, Oswego, NY 13126.
Function libraries each have a unique set of functions, unique lists of arguments, and a unique user manual. Quality of documentation ranges from excellent to so indecipherable that you must analyze the source code to successfully use the product. On a project of any size, project programmers have to surmount the learning curve, not only for the application and hardware/operating system environment but for the function libraries in use.
In addition, you may be stuck with your first choice of libraries or face some rather nasty problems if you want to switch to another library. Should you convert the code for all your past clients so your current staff of programmers (even if that is only you) can continue to maintain it? Should you keep knowledgeable about every library you've ever used? Or should you just stick with your original choice?
This article presents an example programmer interface to a special-purpose function library designed to ease these problems for the case of a B+Tree, Indexed Sequential Access Method (ISAM) data-handling function library. The interface uses the standard Object-Oriented Programming (OOP) approach to protecting the application programmer and the function library from each other. It handles all the gory details of dealing with the function library by inserting a layer of function calls between the application programmer and the library itself. The functions called by the application programmer do not depend on the particular library product in use. They depend only on what the application programmer wants to do.
OOP is not a language; it is an approach to programming. The class presented in this article is written in C++, but it does not use any of the flashier bells and whistles most often talked about when one is discussing C++. There is no operator overloading, no inheritance, no polymorphism. All this class has is a little bit of data and function hiding and an OOP approach to doing business. The code could be rewritten in Standard C by throwing all the class-member data into a structure, declaring instances of the structure, and passing as an argument the address of the declared structure to what are presented here as member functions.
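To make the class-versus-structure point concrete, here is a minimal sketch of the same small record buffer written both ways. It is not the Isam class from Listing 1; the names RecBuf, recbuf, recbuf_init, recbuf_put, and recbuf_get, and the limits MAXREC and RECLEN, are all hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstring>

const int MAXREC = 4;    // illustrative limits, not taken from the article
const int RECLEN = 32;

// Class version: the data is private and the functions are members.
class RecBuf {
public:
    explicit RecBuf(int elements) : elements_(elements) {
        assert(elements > 0 && elements <= MAXREC);
        clear();
    }
    void clear() { std::memset(rec_, 0, sizeof rec_); }
    void put(int i, const char *s) {
        assert(i >= 0 && i < elements_);
        std::strncpy(rec_[i], s, RECLEN - 1);
        rec_[i][RECLEN - 1] = '\0';   // always leave the record terminated
    }
    const char *get(int i) const {
        assert(i >= 0 && i < elements_);
        return rec_[i];
    }
private:
    char rec_[MAXREC][RECLEN];
    int  elements_;
};

// Standard C style: the same data sits in a structure, and every
// function takes the address of that structure as its first argument.
struct recbuf {
    char rec[MAXREC][RECLEN];
    int  elements;
};

void recbuf_init(recbuf *rb, int elements) {
    assert(rb != nullptr && elements > 0 && elements <= MAXREC);
    rb->elements = elements;
    std::memset(rb->rec, 0, sizeof rb->rec);
}

void recbuf_put(recbuf *rb, int i, const char *s) {
    assert(rb != nullptr && i >= 0 && i < rb->elements);
    std::strncpy(rb->rec[i], s, RECLEN - 1);
    rb->rec[i][RECLEN - 1] = '\0';
}

const char *recbuf_get(const recbuf *rb, int i) {
    assert(rb != nullptr && i >= 0 && i < rb->elements);
    return rb->rec[i];
}
```

The two versions do the same work. The class version simply lets the compiler pass the structure address for you and keeps callers away from the data.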
The examples in this article are for a specific product (CBTREE v3.0 by Peacock Systems, Inc.). Without changing the calling parameters or the names of the functions called by the application programmer, however, you could rewrite the body of these functions for any of the several ISAM function libraries I have used, and for many (no doubt) that I have not. The application code that uses the class is no longer dependent on what data handler is in use. The details of the data handler are hidden from the application programmer.

What the Class Does
Before looking at the code, look at what it was designed to do. Class Isam is a programmer interface to a data handler that uses ISAM files. After the application programmer/analyst has set up the ISAM files in the application (designed the record layouts, decided on index keys, created the files, etc.), I want him/her to be as free as possible from the nitty-gritty details of working with the data files themselves. The application programmer:

needs to be able to gain access to the data and index files with a single, simple statement. If the files are already opened in another part of the program, they should not be opened again; the programmer should just have access to them.

after gaining access to a file, needs to have space automatically allocated for the records he/she will be reading from the file. If the programmer is going to be dealing with 20 records at once, space must be allocated for 20 records.

if modifying, deleting, and adding records, needs to be able to put the new set of records in an array and write them. The programmer should not have to keep track of which records have been worked on or what has been done to them. The programmer should not have to keep track of which keys have to be removed and added to the btree indices. In other words, the programmer needs to be able to use indices without having to think about them.

should not have to write programs to re-index files. The class should do most of the work. If the programmer has to re-index a file because he/she added another index, none of the remaining code should need to be modified.

should have the freedom to write a function to create a key that manipulates the data as desired.

needs to be able to address the indices by name instead of by index number.

How the Class Does It
Listing 1 contains the header file defining class Isam. Notice that one of the member data items and one of the member functions is marked as specific to CBTREE. The rest of the member data and functions are generic and would be present regardless of the library being used. The code for the member functions is shown in Listing 2.
The class contains ten public functions, eight of which are routinely called by the application programmer. The destructor, ~Isam, is typically called only automatically, when the instance of the class goes out of scope. Isam::reindex is called to recover from (one hopes) infrequent disasters. Public data consists of one string array, rec, which will contain records read from the ISAM file.

The Constructor
The programmer gains access to an ISAM file by declaring an instance of the class, such as

Isam employeefile ("employee");
If the programmer will be dealing with more than one employee record at a time, such as all the employees in a department (for a small company), the maximum number can be specified in the declaration. The constructor, Isam, will allocate space for that number, as in

Isam employeefile ("employee", 20);
The first time the Isam constructor is called in a program, it calls a static function, isam_init, which loads a static array of structures, btparms[], with file and index parameters for all files in the application. This information is read from a setup file for the application. As it happens, CBTREE already uses such a file in its data-file and b-tree creation utilities and, optionally, in its runtime library. I found that CBTREE's parameter file, btparms.btr, contained almost all the information needed for the Isam class. Moreover, CBTREE provided a utility to display and modify these parameters. The two parameters missing from the file were kstart, the starting character position in the data record for a key, and keygen, the name of an application function that will generate a key if specified.
Rather than re-invent the wheel, I modified the data entry and display utility for the file provided by CBTREE to include these two parameters. Because the data entry and display utility is proprietary, the modified code is not presented here; but modification was completely straightforward. Adding the parameters on the end of each record in the file had no ill effects on CBTREE's use of the file. If you are not using CBTREE, you may have to create your own parameter file.
In addition to file and index parameters, the btparms array also contains a count (int count) of how many instances of class Isam are currently active that access each data file. If the count is greater than zero for a particular data file, btparms[] contains the file handles for the data and index files (datafd and indxfd). Thus, multiple instances of Isam for the same data file (in the same program) can share the file handles. The class constructor and destructor (Isam and ~Isam) manage all bookkeeping involved.
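The sharing scheme described above can be sketched as a reference count kept in a static table. This is an illustration under stated assumptions, not the code from Listing 2: the class SharedFile, the structure FileEntry, and the stand-in functions fake_open and fake_close are hypothetical, and a real version would call the library's own open and close routines and track the index handles as well.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins for the data handler's open and close calls.
static int g_next_fd = 3;
static int g_open_calls = 0;
static int g_close_calls = 0;

int fake_open(const std::string &) { ++g_open_calls; return g_next_fd++; }
void fake_close(int) { ++g_close_calls; }

// One entry per data file, playing the role of an element of btparms[].
struct FileEntry {
    int count = 0;     // number of active instances using this file
    int datafd = -1;   // shared handle, valid only while count > 0
};

class SharedFile {
public:
    explicit SharedFile(const std::string &name) : name_(name) {
        FileEntry &e = table()[name_];
        if (e.count == 0)              // first user opens the file
            e.datafd = fake_open(name_);
        ++e.count;
    }
    ~SharedFile() {
        FileEntry &e = table()[name_];
        assert(e.count > 0);
        if (--e.count == 0) {          // last user closes the file
            fake_close(e.datafd);
            e.datafd = -1;
        }
    }
    SharedFile(const SharedFile &) = delete;
    SharedFile &operator=(const SharedFile &) = delete;

    int fd() const { return table()[name_].datafd; }

private:
    // Static table shared by every instance in the program.
    static std::map<std::string, FileEntry> &table() {
        static std::map<std::string, FileEntry> t;
        return t;
    }
    std::string name_;
};
```

Two instances declared for "employee" in the same program would see the same handle, and the file would be closed only when the second of them goes out of scope.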
The constructor, based on the parameters in btparms[] and on the caller's specification for the number of records to be handled at a time, allocates all required space, thus satisfying the requirements for easy access and automatically-allocated space.

Read and Write Functions
Isam::read and Isam::write work as a pair for reading records to be modified and writing them back to the file. You can retrieve a record, modify it, and write it back to the file using the get... functions (see the first if statement in Isam::write), but the result would be the addition of the new version of the record, not the replacement of the old version.
Isam::read asks the caller for a key (the only required parameter). Optionally, the caller can specify how many records are to be read (default is zero), what index to use (default is the first one), and where in the rec array to start storing records (default is the beginning). The function returns the number of records found matching the specified (perhaps partial) key, regardless of how many records were specified to be read. Many would have set the default number of records to be read at one; but I tend to do a lot of "authority checking," reading a cross-reference file to determine whether or not there is a record with the key entered by the user in a data entry program. In this instance, I don't even care about the record; I just want to know whether or not it is there. If you'd like the default to be one, simply change the declaration in Listing 1.
The maximum number of records to be read plus the starting location in the rec array must not exceed Isam::elements. (This check is made in the code.) If you don't know the number of records to expect and you want all matches, you must create an instance of Isam with room for one record, make a test read, and create another instance of Isam with room for the number of records found (integer returned by Isam::read).
Isam::read places each record read in public array, rec. It makes a second copy in private array, oldrec. Isam::write uses the copy in oldrec to compare the original version of the record as read with the new version to be written, and to generate values for old keys to be removed from the b-tree(s).
A record can be added by calling Isam::clear, placing the new record in the array Isam::rec, and calling Isam::write. A record can also be added after a failed attempt to read a record using a user-entered value for a key. Records can be deleted by reading them using Isam::read, setting the appropriate array element of rec to a null string, and calling Isam::write. Modifying a record involves reading it using Isam::read, changing the data in rec, and calling Isam::write. A typical calling sequence (using non-Isam function names now) might look like
Isam invoice ("invoice");
Isam lineitm ("lineitm", 20);
while (1) {
   paint_invoice_screen();
   if (!get_invnum (inv_num))
      break;
   invoice.clear();
   lineitm.clear();
   if (invoice.read (inv_num,1)){
      get_inv_flds(invoice.rec[0]);
      lineitm.read(inv_num, 20);
      get_lin_flds(lineitm.rec);
      fill_inv_screen();
   }
   if (inv_edit()){
      package_inv(invoice.rec[0]);
      package_lin(lineitm.rec);
      invoice.write();
      lineitm.write();
   }
   else
      break;
}
In this code segment, get_inv_flds extracts the fields from the record, get_lin_flds extracts fields from the whole array of line item records, and package_inv and package_lin put the (perhaps modified) fields back into the records. These functions are still the responsibility of the application programmer.
Application functions get_inv_flds, get_lin_flds, and inv_edit may have their own instances of the Isam class (e.g., for data generation from customer and product files based on customer and product numbers in the invoice and lineitem records).
Once Isam::read has been called, the class should be cleared (Isam::clear) in order to forget what it read if the next write is not related to the records read. Notice in Listing 2 that Isam::write calls Isam::clear when its work is done.
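The way a write can decide what to do with each array element, given only the old and new copies of the records, can be sketched as follows. This is a simplified illustration, not the logic of Listing 2: the names Action, classify, and plan_write are hypothetical, and records are modeled as std::string with an empty string standing for "no record".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Action { None, Add, Remove, Modify };

// Compare the record as read (oldrec) with the record to be written (rec).
Action classify(const std::string &oldrec, const std::string &rec) {
    if (oldrec.empty() && rec.empty()) return Action::None;   // slot unused
    if (oldrec.empty()) return Action::Add;     // nothing was read: new record
    if (rec.empty())    return Action::Remove;  // caller blanked it: delete
    return oldrec == rec ? Action::None : Action::Modify;
}

// Build the list of actions for a whole array of records.
std::vector<Action> plan_write(const std::vector<std::string> &oldrec,
                               const std::vector<std::string> &rec) {
    assert(oldrec.size() == rec.size());
    std::vector<Action> plan;
    for (std::size_t i = 0; i < rec.size(); ++i)
        plan.push_back(classify(oldrec[i], rec[i]));
    return plan;
}
```

A cleared class has every old copy empty, so everything placed in the array is treated as an addition. That is why forgetting to clear between unrelated reads and writes would matter.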
All key generation is done in Isam::write. The function generates both keys to add to the b-tree(s) and keys to be removed from the b-tree(s). If a function is specified (btparms[].keygen), that function is called to generate the key. If not, a strnncpy from rec[] or oldrec[] is executed using btparms[].kstart and btparms[].keylen. (Function strnncpy came in the CBTREE library. It works exactly like strncpy except that it null-terminates the copied string.) Before any keys are added to the database, all characters in the keys are converted to upper case. Before any keys are used to retrieve records (in Isam::read or Isam::getge), the same case conversion is executed automatically.
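The default (no keygen function) case amounts to copying a fixed slice of the record and folding it to upper case. A minimal sketch follows, assuming that kstart is a zero-based offset (the article does not say) and using std::string in place of the CBTREE strnncpy call. The function name make_key is hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Copy up to keylen characters starting at kstart, then upper-case them.
// Assumption: kstart counts from zero. A record shorter than
// kstart + keylen simply yields a shorter (possibly empty) key.
std::string make_key(const std::string &record,
                     std::size_t kstart, std::size_t keylen) {
    assert(keylen > 0);
    std::string key;
    if (kstart < record.size())
        key = record.substr(kstart, keylen);
    for (char &c : key)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return key;
}
```

Because the same conversion is applied to search keys, a user who types "smith" finds a record stored from "Smith".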
Should an error occur during execution of Isam::write (like disk full), Isam::backout is called to reverse any changes to the index or data file made for the record before the error occurred. Only the record being operated on at the time of the error is backed out.
Listing 3 shows the source file for the catalog and cataloged programs used to generate keys. The application programmer is assumed to have access to this file (though not necessarily to the source file shown in Listing 2). In order to set up a function to generate keys, the function name must be entered in the parameter file (btparms.btr), the function prototype and code must be added to Listing 3, and finally the name and address must be added to the Catolog[] array in Listing 3. This process is referred to as cataloging the function. Listing 3 shows a sample function, descwds, that returns a list of words used in a description type field, ready for insertion into the b-tree.
The application programmer must ensure that the strings returned by cataloged functions are the correct length. The correct length can be any multiple of btparms[].keylen. Multiples larger than one indicate multiple values in the b-tree for the same record. Note how these are processed in the key sections of Isam::write (Listing 2).
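The cataloging idea can be sketched as a table that pairs function names with function addresses, plus a helper that cuts a returned string into keylen-sized keys. This is not Listing 3: the names CatalogEntry, catalog, find_keygen, split_keys, and sample_words are hypothetical, the KEYLEN of 4 is arbitrary, and sample_words is only a stand-in for a word-list generator, not a reproduction of descwds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

const std::size_t KEYLEN = 4;   // illustrative key length

typedef std::string (*KeyGen)(const std::string &record);

// Hypothetical generator: one key per blank-separated word,
// each truncated or blank-padded to exactly KEYLEN characters.
std::string sample_words(const std::string &record) {
    std::string out, word;
    for (std::size_t i = 0; i <= record.size(); ++i) {
        if (i < record.size() && record[i] != ' ') {
            word += record[i];
            continue;
        }
        if (!word.empty()) {          // end of a word: emit one key
            word.resize(KEYLEN, ' ');
            out += word;
            word.clear();
        }
    }
    return out;
}

// The catalog: a name from the parameter file maps to a function address.
struct CatalogEntry { const char *name; KeyGen func; };
static const CatalogEntry catalog[] = {
    { "sample_words", sample_words },
};

KeyGen find_keygen(const char *name) {
    for (const CatalogEntry &e : catalog)
        if (std::strcmp(e.name, name) == 0)
            return e.func;
    return nullptr;                   // not cataloged
}

// A returned string must be a whole multiple of keylen; each slice is a key.
std::vector<std::string> split_keys(const std::string &s, std::size_t keylen) {
    assert(keylen > 0 && s.size() % keylen == 0);
    std::vector<std::string> keys;
    for (std::size_t i = 0; i < s.size(); i += keylen)
        keys.push_back(s.substr(i, keylen));
    return keys;
}
```

A description of three words would produce a string three key lengths long, and therefore three b-tree entries pointing at the same record.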

Re-Indexing Function
Once we've put key generation totally in the hands of Isam::write, it is a relatively easy job to create a generic re-indexing function. (See the last function in Listing 2.) Listing 4 shows a sample program that uses this function. Note in Listing 4 that not only are the parent ("invoice") and child ("line item") files re-indexed, but all orphan line items (lineitem records with an inv_num that corresponds to no record in the parent file) are eliminated.

Index Keys
For performance considerations, the read function (described above) and the getfirst and getnext functions (described below) require an integer argument specifying what index to use. Some addition and deletion of indices can take place for a file without the requirement for change to all code that uses the file. Thus, the programmer may not know what the current or future value for the integer corresponding to a particular index is. The Isam::keynum function returns the integer corresponding to the argument btname. Note in Listing 2 that I've chosen to make the name comparison case insensitive (stricmp).
It takes time to compare a specified name with the list of indices (stored in Isam::btnames). Isam::keynum improves performance because it is typically called once and the returned integer used several times in an application function.
If the function fails to find the specified btname, there is obviously a problem. The function executes a fatal-error exit (eprintf). Conceivably, however, the function could be called in response to entry of an index name specified by an end-user. If this type of interaction with the end-user is included in the code, the function should be written to return a negative integer to indicate failure rather than fatally exiting.
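A minimal sketch of the non-fatal variant suggested above: a lookup that returns a negative integer on failure, with a separate checked wrapper for names hard-coded by the programmer. The names find_keynum, require_keynum, and equal_nocase are hypothetical, and the portable comparison loop stands in for stricmp, which is not part of standard C++.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Case-insensitive equality, a portable stand-in for stricmp.
bool equal_nocase(const std::string &a, const std::string &b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (std::toupper(static_cast<unsigned char>(a[i])) !=
            std::toupper(static_cast<unsigned char>(b[i])))
            return false;
    return true;
}

// Returns the index number for btname, or -1 if there is no such index.
// Suitable for names typed in by an end-user.
int find_keynum(const std::vector<std::string> &btnames,
                const std::string &btname) {
    for (std::size_t i = 0; i < btnames.size(); ++i)
        if (equal_nocase(btnames[i], btname))
            return static_cast<int>(i);
    return -1;
}

// For names written into the program, a miss is a programming error.
int require_keynum(const std::vector<std::string> &btnames,
                   const std::string &btname) {
    int n = find_keynum(btnames, btname);
    assert(n >= 0 && "unknown index name");
    return n;
}
```

Calling the lookup once and reusing the integer keeps the string comparisons out of the read loop.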

Report-Creation Functions
The functions Isam::getfirst, Isam::getge, and Isam::getnext are designed for application procedures that go through the file in the order of an index (reports and data display programs). Isam::getfirst puts into Isam::rec[0] the first record in the file (as viewed through the specified index). Isam::getge puts into Isam::rec[0] the first record with a key greater than or equal to the specified key. Isam::getnext gets the record following the one read by the last call to a function starting with get. These functions are the bare-bones minimum for putting together reasonable reports. All of the ISAM data-handling libraries have more functions of this type (getlast, getprevious, etc.). Add them to the class if you use them. A typical code segment using the get functions might look like

Isam inv("invoice");
int idx = inv.keynum("DATE_INVNUM");
inv.getge(idx, begin_date);
get_inv_flds(inv.rec[0]);
print_inv_summary_header();
while (inv_date <= end_date) {
   print_inv_summary_line();
   if (!inv.getnext(idx))
      break;
   get_inv_flds(inv.rec[0]);
}

Utility Functions
There are three non-class-member functions used and shown in Listing 2. Function nospace removes all spaces from a character string. Function ToUpper converts all characters in a string to upper case. (It's a big version of toupper.) Function eprintf, as it is shown in Listing 2, works like members of the printf family, except that, after printing the error message, it exits the program. My own version of eprintf uses the window class in my toolbox to print the error message in a bright red window, sound an error buzzer, and wait for the user to write down the error message and press a key before exiting.
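For readers without Listing 2 at hand, the three utilities might be sketched as follows. The signatures in the listing are not reproduced here, so the names nospace_sketch, toupper_sketch, and eprintf_sketch and the use of std::string are assumptions. The error function writes to stderr and exits, which is the plain behavior described above, not the windowed version.

```cpp
#include <cassert>
#include <cctype>
#include <cstdarg>
#include <cstdio>
#include <cstdlib>
#include <string>

// Return a copy of s with every space character removed.
std::string nospace_sketch(const std::string &s) {
    std::string out;
    for (char c : s)
        if (c != ' ')
            out += c;
    return out;
}

// Return a copy of s with every character converted to upper case.
std::string toupper_sketch(const std::string &s) {
    std::string out = s;
    for (char &c : out)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return out;
}

// printf-style error report that does not return: it prints the
// formatted message to stderr and exits with a failure status.
void eprintf_sketch(const char *fmt, ...) {
    assert(fmt != nullptr);
    va_list args;
    va_start(args, fmt);
    std::vfprintf(stderr, fmt, args);
    va_end(args);
    std::exit(EXIT_FAILURE);
}
```

A call such as eprintf_sketch("No index named %s\n", name) would report the problem and stop the program, so it belongs only where continuing makes no sense.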

Conclusion
The Isam class presented in this article is an example of how OOP can be used to set up a function-library interface. With the interface, application code can be independent of the function library. More importantly, with the interface, a function library written by somebody else can work the way you decide it should work.

Sidebar: "Using CBTREE with C++"