Andrew is a lecturer in the department of computer science at the University of Melbourne, Australia. He can be reached at ad@cs.mu.oz.au.
The World Wide Web (WWW) is a hypertext-based system that allows users to "surf" the Internet, accessing information on topics as diverse as astronomy, the Marx brothers, and kite making. The most common WWW browser is Mosaic, a graphical tool from the National Center for Supercomputing Applications (NCSA) that's been ported to most operating systems. In addition, a plethora of new browsers are currently being released, offering similar (usually extended) capabilities for Windows, Macintosh, and the X Window System. (There are even text-based browsers such as Lynx.)
Underpinning the WWW is a page-description language called the "hypertext markup language" (HTML), derived from the "standard generalized markup language" (SGML). Essentially, HTML offers a small set of commands which, when embedded in a text file, allow a browser to display the text, replete with fancy fonts, graphics and, most importantly, hypertext links to other hypertext documents.
One feature of HTML is its simplicity: An HTML document can be produced in a few minutes, as Douglas McArthur demonstrated in his article, "World Wide Web and HTML" (DDJ, December 1994). However, HTML lacks support for writing documents which interact with the user. Interaction in most documents consists of the user deciding which hypertext link to follow next.
Fortunately, HTML is still evolving. The current specification of the language is Document Type Definition (DTD) level 2, which includes "forms" that allow a document to include text-entry fields, radio boxes, selection lists, check boxes, and buttons. These can be used to gather information for an application "behind" the document, to guide what is offered to the user next. Some typical forms documents include a movie database (see http://www.cm.cf.ac.uk/Movies/moviequery.html), weather-map order form (http://rs560.cl.msu.edu/weather/getmegif.html), questionnaires, surveys, and Pizza Hut's famous PizzaNet (http://www.pizzahut.com/). A problem with forms is that many older WWW browsers (such as the most-common Macintosh version of Mosaic) do not support them, although this problem is rapidly disappearing as browsers are updated.
In this article, I'll detail the steps in writing forms-based applications. Although I'll use NCSA Mosaic for X Windows 2.0, the approach is applicable to all WWW browsers with forms capabilities.
There are three basic stages in creating a forms-based document:
Besides POST, the other method for sending data is GET. With GET, the string arrives at the server in the environment variable QUERY_STRING, and it may be truncated if it exceeds the shell's command-line length. For that reason, GET should be avoided.
A form can contain six types of data-entry fields: single-line text-entry fields, as in Figure 1(a); check boxes, radio boxes, and selection lists, as in Figure 1(b); multiline text-entry fields; and submit and reset buttons, as in Figure 1(c). Single-line text-entry fields, check boxes, and radio boxes are specified using the same basic HTML syntax: <INPUT TYPE="field-type" NAME="Name of field" VALUE="default value" >, where field-type can be one of text, checkbox, radio, hidden, or password.
For a check box or radio box, the VALUE field specifies the value of the field when it is checked; unchecked check boxes are disregarded when name=value substrings are posted to the application.
If several radio boxes have the same name, they act as a one-of-many selection: Only one of them can be switched "on" and have its value paired with the name. A hidden text field does not appear on the form, but it can have a default value which will be sent to the application. A password text field will echo asterisks (*) when a value is typed into it. A selection list is specified using the code in Example 1(a). The option chosen will become the value associated with the selection list's name. It is also possible to include the attribute MULTIPLE after the NAME string to allow multiple selections. This maps to multiple name=value substrings, each with the same name. A multiline text-entry field has the form of Example 1(b).
The submit button causes the document to collect the data from the various form fields, pair it with the names of the fields, and post it to the application. The reset button resets the fields to their default values. Example 1(c) illustrates button syntax. All of these form constructs are illustrated on Douglas McArthur's form at http://www.biodata.com/douglas/form.html.
The form's HTML code is shown in Listing One. Figure 1 shows how part of the form looks on an X-terminal running Mosaic for X Windows. Thirteen other form examples are accessible through http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/fill-out-forms/overview.html (overview.html also contains more details on the syntax of form fields).
Stylists recommend that a form be separated from the rest of a document by a horizontal rule (<HR>). Rules are also useful for subdividing logical subcomponents within a form. The submit button should always be placed at the end of the form.
When the submit button is clicked, the POST method causes a string to be sent to the application. The string consists of a series of name=value substrings, separated by ampersands (&). An added complication is that name and value are encoded so that spaces are changed into plus signs (+) and some characters are encoded as hexadecimals. Fortunately, form-application programmers have written routines for handling these coded strings.
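To make the encoding concrete, here is a minimal sketch of decoding one posted value in place. The function name url_decode() and its handling of a malformed "%" are assumptions made for illustration; the utilities in Listing Three do the same job in separate steps.

```c
#include <ctype.h>
#include <stdlib.h>

/* Decode a URL-encoded string in place: '+' becomes a space and
   "%XX" (two hex digits) becomes the byte with that value.
   A '%' that is not followed by two hex digits is copied unchanged. */
void url_decode(char *s)
{
    char *out = s;

    while (*s) {
        if (*s == '+') {
            *out++ = ' ';
            s++;
        } else if (*s == '%' && isxdigit((unsigned char) s[1])
                             && isxdigit((unsigned char) s[2])) {
            char hex[3];
            hex[0] = s[1];
            hex[1] = s[2];
            hex[2] = '\0';
            *out++ = (char) strtol(hex, NULL, 16);
            s += 3;
        } else {
            *out++ = *s++;
        }
    }
    *out = '\0';
}
```

Decoding is best done after the string has been split at the ampersands and equals signs, since a decoded value may itself contain either character.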
The POST method means that the form application will receive the string on its standard input. This protocol is defined by the Common Gateway Interface (CGI) specification, which also states that an application can respond by generating suitable code on its standard output. Details on the CGI specification can be found at http://hoohoo.ncsa.uiuc.edu/cgi/interface.html.
The CGI specification permits an application to output many different types of documents (for example, an image, audio code, plain text, HTML, or references to other documents). The application determines the output type by writing a header string to standard output, of the form: Content-type: type/subtype, where type and subtype must be MIME (Multipurpose Internet Mail Extensions) types; two common types are text/html for HTML output and text/plain for ASCII text. The header must be followed by a blank line (that is, the linefeed that ends the header plus one more), and then the data can begin. For instance, an application (coded in C) could output Example 2. More details on the CGI output protocol can be found at http://hoohoo.ncsa.uiuc.edu/cgi/primer.html, while documentation on MIME begins at http://info.cern.ch/hypertext/WWW/Protocols/rfc1341/0_Abstract.html.
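The header rule can be wrapped in a small helper so that the blank line is never forgotten. This is only a sketch; the name write_cgi_header() is an assumption and does not appear in the listings.

```c
#include <stdio.h>

/* Write a CGI response header for the given MIME type to 'out'.
   The first linefeed ends the header line; the second produces the
   blank line that separates the header from the data.
   Returns the number of characters written, as fprintf() does. */
int write_cgi_header(FILE *out, const char *mime_type)
{
    return fprintf(out, "Content-type: %s\n\n", mime_type);
}
```

An application would typically call write_cgi_header(stdout, "text/html") once, before printing any other output.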
I'll now turn to a complete example--an "echoer" application. This example's input document consists of a form with five single-line text-entry fields. The application processes it by outputting an HTML document containing the text entered in the fields. In other words, the user's input is echoed.
Figure 2 shows the input document. The form is quite simple: five text-entry fields, plus submit and reset buttons labeled Start Search and Clear, respectively. Listing Two is the HTML code for the document (it is also available at http://www.cs.mu.oz.au/~ad/code/form-gp.html).
The text-field constructs include extra attributes to limit the size of both the input and the boxes drawn on the screen. The fields are named pat1 through pat5, although these are not displayed as part of the input document. When the terms "John" and "uk" are input and the Start Search button is clicked, the application returns Figure 3.
In form-gp.html, the name of the application is given in the FORM ACTION attribute as http://www.cs.mu.oz.au/cgi-bin/qgp, where qgp's actual location on the server depends on the configuration file for the httpd daemon (called httpd.conf). The relevant line in that file is Exec /cgi-bin/* /local/dept/wwwd/scripts/*. In other words, qgp must be placed in /local/dept/wwwd/scripts for the form to invoke it. This step in linking the input HTML document to the application varies from system to system.
Listing Three, qgp.c (which can also be found at http://www.cs.mu.oz.au/~ad/code/qgp.c), consists mostly of utility functions for processing name=value substrings; the same functions appear in almost every form application. The functions were written by Rob McCool and can be accessed at http://hoohoo.ncsa.uiuc.edu/cgi/forms.html. Also available from that page are similar utilities for writing applications in the Bourne Shell, Perl, and Tcl, along with several excellent small programs showing how the utilities can be used.
The qgp.c program uses five utility functions: makeword(), fmakeword(), unescape_url(), x2c(), and plustospace(). makeword() builds a word by extracting characters from a larger string up to a stopping character (or the end of the longer string). fmakeword() performs a similar operation but reads from a file and is also supplied with the length of the string left unread in the file. unescape_url() converts hexadecimal characters in a string into ordinary characters, by calling x2c(). plustospace() converts the plus signs (+) in a string into spaces.
main() begins by outputting the header line for the reply document, an HTML document in this case. The If tests perform standard error checking: The first determines whether the delivery METHOD is something other than POST; the second checks the encoding strategy for the name=value substrings. In fact, the only encoding supported by Mosaic for X Windows 2.0 is x-www-form-urlencoded, but this may not be the case for other browsers.
The If tests and the use of CONTENT_LENGTH illustrate the importance of environment variables for conveying information from the input document to the application. A complete list of environment variables supported by the CGI specification can be found at http://hoohoo.ncsa.uiuc.edu/cgi/env.html.
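Because an environment variable may be missing or malformed, it can help to validate it before use. The following sketch parses a CONTENT_LENGTH value defensively; the function name parse_content_length() and the convention of returning -1 on error are assumptions for illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Convert the text of CONTENT_LENGTH to a non-negative number.
   Returns -1 if the text is NULL (variable unset), empty, negative,
   out of range, or followed by anything other than digits. */
long parse_content_length(const char *text)
{
    char *end;
    long value;

    if (text == NULL || *text == '\0')
        return -1;
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value < 0)
        return -1;
    return value;
}
```

A caller would pass it the result of getenv("CONTENT_LENGTH") directly, since getenv() returns NULL for an unset variable.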
CONTENT_LENGTH contains the length of the string sent to the application and is used by the For loop to build the entries array. Each name=value substring is extracted by a call to fmakeword(). The pluses (+) and hexadecimal URL encodings are replaced, and then the name part of the substring is removed, leaving only the value. More output to the HTML reply document follows, then the final For loop cycles through the entries array and prints the name and value strings.
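The splitting step can also be sketched on a string that has already been read into memory, rather than on standard input. The type pair and the function split_pairs() are hypothetical names used only for this illustration; Listing Three uses fmakeword() and makeword() instead.

```c
#include <string.h>

typedef struct {
    char *name;
    char *val;
} pair;

/* Split "n1=v1&n2=v2..." in place into at most 'max' pairs and return
   the number stored. The pairs point into 'data', which is modified.
   Names and values are left URL-encoded. */
int split_pairs(char *data, pair pairs[], int max)
{
    int count = 0;
    char *p = data;

    while (p != NULL && *p != '\0' && count < max) {
        char *amp = strchr(p, '&');
        char *eq;

        if (amp != NULL)
            *amp = '\0';                      /* end this substring */
        eq = strchr(p, '=');
        pairs[count].name = p;
        if (eq != NULL) {
            *eq = '\0';
            pairs[count].val = eq + 1;
        } else {
            pairs[count].val = p + strlen(p); /* no '=': empty value */
        }
        count++;
        p = (amp != NULL) ? amp + 1 : NULL;
    }
    return count;
}
```

The 'max' argument plays the role of MAX_ENTRIES: fields beyond the size of the array are ignored rather than written past its end.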
The next example, a file-searcher application, uses the input document in Figure 2, but the application now searches through a text file holding a membership list. It looks for lines containing the strings entered in the text-entry fields of the form. A maximum of ten matching lines are printed, together with the total number of matching lines.
Querying again for "John" and "uk" results in the HTML document in Figure 4 being generated by the application. It can also produce an error document if no strings are entered before a search is initiated.
Listing Four is the C code for qdir.c (it can also be found at http://www.cs.mu.oz.au/~ad/code/qdir.c). The compiled version, qdir, is in /local/dept/wwwd/scripts, and form-gp.html now uses http://www.cs.mu.oz.au/cgi-bin/qdir as the URL in the FORM ACTION attribute.
main() begins with the same preliminaries as qgp.c and uses the same utility functions. The call to record_details() logs information about the user in a file and has no effect on the subsequent code. get_pats() searches through the entries array and copies the nonempty strings into a patterns array. If there are no strings in entries, then get_pats() outputs an HTML error document.
build_re() translates the strings in the patterns array into part of a UNIX command. The idea is to translate a single search string (such as "John") into the command fgrep 'John' search-file > temp-file. Multiple search strings like "John," "uk," and "LPA" would be utilized in the command fgrep 'John' search-file | fgrep 'uk' | fgrep 'LPA' > temp-file. The trick is to pipe the matching lines of one call to fgrep into another call which further filters the selection.
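Since the patterns come from the user and end up inside a shell command, it is worth building the command with length checks and refusing any pattern that contains a single quote. The sketch below assembles the whole pipeline in one function; build_fgrep_cmd() and its parameters are assumptions for illustration and differ from the build_re() in Listing Four.

```c
#include <stdio.h>
#include <string.h>

/* Build "fgrep 'p0' file | fgrep 'p1' ... > outfile" in 'cmd'.
   Returns 0 on success, or -1 if there are no patterns, a pattern
   contains a single quote, or the command does not fit in 'size' bytes.
   'file' and 'outfile' are assumed to be trusted names. */
int build_fgrep_cmd(char *cmd, size_t size, const char *patterns[], int count,
                    const char *file, const char *outfile)
{
    size_t used = 0;
    int i, n;

    if (count < 1)
        return -1;
    for (i = 0; i < count; i++) {
        if (strchr(patterns[i], '\'') != NULL)
            return -1;          /* would end the shell quoting early */
        if (i == 0)
            n = snprintf(cmd + used, size - used, "fgrep '%s' %s",
                         patterns[i], file);
        else
            n = snprintf(cmd + used, size - used, " | fgrep '%s'",
                         patterns[i]);
        if (n < 0 || (size_t) n >= size - used)
            return -1;          /* command would be truncated */
        used += (size_t) n;
    }
    n = snprintf(cmd + used, size - used, " > %s", outfile);
    if (n < 0 || (size_t) n >= size - used)
        return -1;
    return 0;
}
```

The sketch does not guard against a pattern that begins with a hyphen, which fgrep would read as an option.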
The matching lines are printed by a While loop which reads at most ten lines from the temporary file. The total number of lines in the temporary file is counted by the UNIX wc command (wc -l temp-file > second-temp-file). The value is read in from the second temporary file and printed to the reply document.
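The "print at most ten, but count them all" logic can be seen in isolation in the sketch below, which replaces fgrep and wc with strstr() and a counter so that no temporary files are needed. This is a substitute technique for illustration, not the method used in Listing Four, and the function names are assumptions.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if 'line' contains every one of the 'count' patterns. */
int line_matches_all(const char *line, const char *patterns[], int count)
{
    int i;

    for (i = 0; i < count; i++)
        if (strstr(line, patterns[i]) == NULL)
            return 0;
    return 1;
}

/* Copy at most 'limit' matching lines from 'in' to 'out' and return
   the total number of matching lines. A line longer than the buffer
   is treated as several shorter lines, a simplification of this sketch. */
int search_lines(FILE *in, FILE *out, const char *patterns[], int count,
                 int limit)
{
    char line[240];
    int total = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        if (line_matches_all(line, patterns, count)) {
            if (total < limit)
                fputs(line, out);
            total++;
        }
    }
    return total;
}
```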
This approach demonstrates how to utilize UNIX as part of an application. UNIX features are preferable in this case because of the size of the file being searched and the potentially large number of matching lines that need to be manipulated. UNIX can also be employed to create forms that edit files, send mail, read news, or monitor the network, for example.
You'll find it useful in the early stages of form design to test the form without having to write the accompanying application. For instance, early form testing involves checking what default values are posted if the user immediately presses the "submit" button.
One possibility is to set the FORM ACTION to point to qgp (or a program like it), which returns name/value pairs. An alternative is to set ACTION to http://hoohoo.ncsa.uiuc.edu/htbin-post/post-query. This program does much the same thing as qgp. The drawbacks are longer network-access time and the inability to modify the application to test specific form features.
Testing is also a problem with form applications, since it is not possible to run their user interface (for example, the input form and a browser) inside a source-level debugger. In normal circumstances, if the application fails, the browser returns a cryptic message. The easiest way to avoid this problem is to test a modified version of the application that reads name and value pairs from the keyboard. Example 3, a fragment of code that illustrates this for qdir.c, reads strings straight into the val fields of the entries array. Since the name fields are not used in this application, they are not assigned values. Output can be sent to the screen and is quite readable even when mixed with HTML formatting instructions.
Remember that data may already be available in an access-log file that records all browser accesses to the server and is set up through the httpd configuration file. For many applications, however, such general-purpose logging may not capture all the information required. For instance, a common reason for recording accesses is to have the application offer different facilities to different users. Thus, in a video-ordering service, it might be useful to record the types of films that a user likes, so that similar films can be pointed out when that user next makes an order. For such specific information, it is better to have the application carry out the necessary logging.
In the logging version of qdir.c, access details are collected from four environment variables: REMOTE_USER, REMOTE_IDENT, REMOTE_HOST, and REMOTE_ADDR. In addition, the local time on the server is recorded.
The extra code is wrapped up inside the record_details() function called early in qdir.c. The code for record_details() is included in Listing Four. One drawback with the first three environment variables is that they are not guaranteed to have values. REMOTE_USER is only bound if the client and server support user authentication, and REMOTE_IDENT relies on support for RFC 931 identification. REMOTE_HOST may not be bound, but the IP address equivalent will be assigned to REMOTE_ADDR.
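Because these variables may be unset, a logging routine should not pass the result of getenv() straight to a string function. One way to handle this is a small lookup helper with a fallback; env_or() is a hypothetical name used only for this sketch.

```c
#include <stdlib.h>

/* Return the value of environment variable 'name', or 'fallback'
   when the variable is unset or empty. */
const char *env_or(const char *name, const char *fallback)
{
    const char *value = getenv(name);

    if (value == NULL || *value == '\0')
        return fallback;
    return value;
}
```

For example, env_or("REMOTE_USER", "no_ruser") yields a string that is always safe to write to the log file.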
Forms are an extremely useful mechanism, since they transform HTML from a hypertext page-description language into a tool for creating interactive documents. Forms and their associated programs are straightforward to write, due to the availability of examples, utilities, and documentation accessible through the WWW.
Figure 2 Typical input document.
Figure 3 HTML document generated by echoing example application.
Figure 4 HTML document generated by the file-search application.
Example 1
(a)
<SELECT NAME="list title">
<OPTION>first option
<OPTION>second option
:
</SELECT>
(b)
<TEXTAREA NAME="text area name" ROWS=no-of-rows COLS=no-of-columns >
Default text goes here
</TEXTAREA>
(c)
<INPUT TYPE="submit" VALUE="text on button" >
<INPUT TYPE="reset" VALUE="text on button" >
Example 2
printf("Content-type: text/html%c%c",10,10); /* 10 is a linefeed */
printf("<H1>Search String Error!</H1>");
printf("<BR>Must specify at least 1 pattern<p>");
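Example 3
char line[LINELEN];
etnum = 0;
while (etnum < PATNO) {
    printf("Enter pattern %d: ", etnum+1);
    /* fgets() limits the input to the size of line[], unlike gets() */
    if (fgets(line, LINELEN, stdin) == NULL) /* input terminated? */
        break;
    line[strcspn(line, "\n")] = '\0'; /* drop the trailing newline */
    entries[etnum].val = (char *) malloc(sizeof(char)*(strlen(line)+1));
    if (entries[etnum].val == NULL) /* out of memory */
        break;
    strcpy(entries[etnum].val, line);
    etnum++;
}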
Listing One
<HTML>
<HEAD>
<!-- ------------------------------------------------------------------- -->
<!-- http://www.biodata.com/douglas/form.html - Modified 9/20/94 -=DCM=- -->
<!-- ------------------------------------------------------------------- -->
<TITLE>Prototypical HTML Forms</TITLE>
</HEAD>
<H1>Prototypical HTML Forms</H1>
This document displays the various form gadgets currently supported.
<P>
<FORM ACTION="http://hoohoo.ncsa.uiuc.edu/htbin-post/post-query" METHOD="POST">
<HR>
<H1>Text Fields</H1>
Basic text entry field:
<INPUT TYPE="text" NAME="entry1" VALUE="">
<P>
Text entry field with default value:
<INPUT TYPE="text" NAME="entry2" VALUE="This is the default.">
<P>
Text entry field of 40 characters:
<INPUT TYPE="text" NAME="entry3" SIZE=40 VALUE="">
<P>
Text entry field of 5 characters, maximum:
<INPUT TYPE="text" NAME="entry5" SIZE=5 MAXLENGTH=5 VALUE="">
<P>
Password entry field (*'s are echoed):
<INPUT TYPE="password" NAME="password" SIZE=8 MAXLENGTH=8 VALUE="">
<HR>
<H1>Textareas</H1>
A 60x3 scrollable textarea:
<P>
<TEXTAREA NAME="textarea" COLS=60 ROWS=3>NOTE:
Default text can be entered here.
</TEXTAREA>
<HR>
<H1>Checkboxes</H1>
Here is a checkbox
<INPUT TYPE="checkbox" NAME="Checkbox1" VALUE="TRUE">,
and a checked checkbox
<INPUT TYPE="checkbox" NAME="Checkbox2" VALUE="TRUE" CHECKED>
.
<HR>
<H1>Radio Buttons</H1>
Radio buttons (one-of-many selection):
<OL>
<LI>
<INPUT TYPE="radio" NAME="radio1" VALUE="value1">
First choice.
<LI>
<INPUT TYPE="radio" NAME="radio1" VALUE="value2" CHECKED>
Second choice. (Default CHECKED.)
<LI>
<INPUT TYPE="radio" NAME="radio1" VALUE="value3">
Third choice.
</OL>
<HR>
<H1>Option Menus</H1>
One-of-many (Third Option selected by default):
<SELECT NAME="first-menu">
<OPTION>First Option
<OPTION>Second Option
<OPTION SELECTED>Third Option
<OPTION>Fourth Option
<OPTION>Last Option
</SELECT>
<P>
Many-of-many (First and Third selected by default):
<SELECT NAME="second-menu" MULTIPLE>
<OPTION SELECTED>First Option
<OPTION>Second Option
<OPTION SELECTED>Third Option
<OPTION>Fourth Option
<OPTION>Last Option
</SELECT>
<P>
<B>NOTE: Hold down CTRL and click to multiple-select.</B>
<!-- You can also assign VALUEs using TYPE="hidden" -->
<INPUT TYPE="hidden" NAME="hidden" VALUE="invisible">
<HR>
<H1>Special Buttons</H1>
Submit button (mandatory):
<INPUT TYPE="submit" VALUE="Submit Form">
<P>
Reset button (optional):
<INPUT TYPE="reset" VALUE="Clear Values">
<P>
</FORM>
<HR>
<H1>References</H1>
Here's a link to
<A HREF="http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/fill-out-forms/overview.html">
a handy HTML forms reference
</A>.
<P>
<HR>
<ADDRESS>
Prototypical HTML Form /
<A HREF="http://www.biodata.com/douglas/people/douglas.html">
douglas@BioData.COM
</A>
</ADDRESS>
</HTML>
Listing Two
<HTML>
<HEAD>
<TITLE>ALP Membership Search</TITLE>
</HEAD>
<BODY>
<H1><img src="../alp/symbol.gif"> ALP Membership Search</H1>
<BR>
<ul>
<li>Enter at most 15 characters in a box (e.g. <code>Melbo</code>).
<p>
<li>At least one box should contain something.
<p>
<li>Matches are lines in the membership list which contain all the box entries.
<p>
<li>The first 10 matches will be returned, together with the total number of matches.
<p>
<li>Click the <b>Start Search</b> button to start the search.
<p>
<li>All the boxes can be cleared by clicking on the <b>Clear</b> button.
<p>
</ul>
<BR>
<H2><img src="gball.gif"> Search Boxes</H2>
<FORM ACTION="http://www.cs.mu.oz.au/cgi-bin/qgp" METHOD="POST">
<INPUT TYPE="text" NAME="pat1" SIZE="15" MAXLENGTH="15" VALUE="">
<INPUT TYPE="text" NAME="pat2" SIZE="15" MAXLENGTH="15" VALUE="">
<INPUT TYPE="text" NAME="pat3" SIZE="15" MAXLENGTH="15" VALUE="">
<INPUT TYPE="text" NAME="pat4" SIZE="15" MAXLENGTH="15" VALUE="">
<INPUT TYPE="text" NAME="pat5" SIZE="15" MAXLENGTH="15" VALUE="">
<P>
<BR>
<INPUT TYPE="submit" VALUE="Start Search">
<INPUT TYPE="reset" VALUE="Clear">
<P>
</FORM>
<HR>
<br>
<img src="gball.gif"> USE OF THIS MEMBERSHIP LIST FOR COMMERCIAL OR PROMOTIONAL PURPOSES IS PROHIBITED.
<p>
<img src="gball.gif"> If you have any problems using this service, contact
<a href="http://www.cs.mu.oz.au/~ad">Andrew Davison</a>.
<p>
<img src="gball.gif"> If you would like changes made to the membership list, contact the ALP Administrative Secretary.
<p>
<HR>
<ADDRESS>
<a href="../alp/alp-news/dir.html">To Membership List Info</a>
</ADDRESS>
</BODY>
</HTML>
Listing Three
/* Echo name=value substrings posted from Form */
/* HTML utilities written by Rob McCool */
#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* strcmp(), strlen() */
#define LF 10
#define CR 13
#define MAX_ENTRIES 5 /* number of input fields */
typedef struct {
char *name;
char *val;
} entry;
char *makeword(char *line, char stop);
char *fmakeword(FILE *f, char stop, int *len);
void unescape_url(char *url);
char x2c(char *what);
void plustospace(char *str);
int main(void) {
    entry entries[MAX_ENTRIES]; /* HTML name-val pairs */
    int x, cl, etnum;
    char *method, *ctype, *clen;

    printf("Content-type: text/html%c%c",LF,LF);

    /* getenv() returns NULL for an unset variable, so test before comparing */
    method = getenv("REQUEST_METHOD");
    if(method == NULL || strcmp(method,"POST")) {
        printf("This script should be referenced with a METHOD of POST.\n");
        printf("If you don't understand this, read ");
        printf("<A HREF=\"http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/"
               "Docs/fill-out-forms/overview.html\">forms overview</A>.%c",LF);
        exit(1);
    }
    ctype = getenv("CONTENT_TYPE");
    if(ctype == NULL || strcmp(ctype,"application/x-www-form-urlencoded")) {
        printf("This script can only be used to decode form results. \n");
        exit(1);
    }
    clen = getenv("CONTENT_LENGTH");
    cl = (clen != NULL) ? atoi(clen) : 0;
    etnum = 0;
    /* stop at MAX_ENTRIES so that extra fields cannot overrun entries[] */
    for(x=0; x < MAX_ENTRIES && cl > 0 && (!feof(stdin)); x++) {
        entries[x].val = fmakeword(stdin,'&',&cl);
        plustospace(entries[x].val);
        unescape_url(entries[x].val);
        entries[x].name = makeword(entries[x].val,'=');
        etnum++;
    }
    printf("<H1>Query Results</H1>");
    printf("You submitted the following name/value pairs:<p>%c",LF);
    printf("<ul>%c",LF);
    for(x=0; x < etnum; x++)
        printf("<li> <code>%s = %s</code>%c",entries[x].name,entries[x].val,LF);
    printf("</ul>%c",LF);
    return 0;
}
/* HTML utilities */
char *makeword(char *line, char stop) {
int x = 0,y;
char *word = (char *) malloc(sizeof(char) * (strlen(line) + 1));
for(x=0;((line[x]) && (line[x] != stop));x++)
word[x] = line[x];
word[x] = '\0';
if(line[x]) ++x;
y=0;
while(line[y++] = line[x++]);
return word;
}
char *fmakeword(FILE *f, char stop, int *cl) {
    int wsize;
    char *word;
    int ll;
    int c; /* int, so that EOF can be told apart from a character */

    wsize = 102400;
    ll=0;
    word = (char *) malloc(sizeof(char) * (wsize + 1));
    if(word == NULL) exit(1);
    while(1) {
        c = fgetc(f);
        if(ll==wsize) { /* buffer full: grow it before storing */
            wsize+=102400;
            word = (char *)realloc(word,sizeof(char)*(wsize+1));
            if(word == NULL) exit(1);
        }
        word[ll] = (char)c;
        --(*cl);
        if((c == stop) || (c == EOF) || (!(*cl))) {
            /* keep the last character unless it is the stop or EOF */
            if(c != stop && c != EOF) ll++;
            word[ll] = '\0';
            return word;
        }
        ++ll;
    }
}
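void unescape_url(char *url) {
    register int x,y;
    for(x=0,y=0;url[y];++x,++y) {
        /* decode only when two characters follow the '%', so that a
           stray '%' at the end cannot push y past the terminator */
        if((url[x] = url[y]) == '%' && url[y+1] && url[y+2]) {
            url[x] = x2c(&url[y+1]);
            y+=2;
        }
    }
    url[x] = '\0';
}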
char x2c(char *what) {
register char digit;
digit = (what[0] >= 'A' ? ((what[0] & 0xdf) - 'A')+10 : (what[0] - '0'));
digit *= 16;
digit += (what[1] >= 'A' ? ((what[1] & 0xdf) - 'A')+10 : (what[1] - '0'));
return(digit);
}
void plustospace(char *str) {
register int x;
for(x=0;str[x];x++)
if(str[x] == '+') str[x] = ' ';
}
Listing Four
/* Search file FNM via an HTML form. This version logs user in file RFNM */
/* Executable is in /local/dept/wwwd/scripts/qdir */
/* HTML utilities written by Rob McCool */
/* The rest by Andrew Davison (ad@cs.mu.oz.au), December 1994 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>
#define LF 10
#define CR 13
#define MAX_ENTRIES 5 /* number of input fields */
#define PATNO 5 /* max number of patterns */
#define RESNO 10 /* max number of matching lines */
#define CMDLEN 200 /* max length of cmd */
#define LINELEN 240 /* max length of input line */
#define NAMELEN 40 /* max length of a file name */
#define FNM "/home/staff/ad/www_public/code/dir.alp" /* file searched */
#define RFNM "/home/staff/ad/www_public/code/people.txt" /* log file */
typedef struct {
char *name;
char *val;
} entry;
void get_pats(entry entries[], int etnum, char pat[][LINELEN], int *pno);
char *build_re(char pat[][LINELEN], int tot, char re[]);
void back_to_form(void);
void record_details(void);
char *makeword(char *line, char stop);
char *fmakeword(FILE *f, char stop, int *len);
void unescape_url(char *url);
char x2c(char *what);
void plustospace(char *str);
int main(void)
{
    char gcmd[CMDLEN]; /* fgrep command string */
    char restexpr[CMDLEN]; /* part of fgrep string */
    char wcmd[CMDLEN]; /* line count cmd string */
    char result[LINELEN]; /* matching line */
    char patterns[PATNO][LINELEN]; /* patterns */
    char tmp_gfname[NAMELEN], tmp_wfname[NAMELEN];
    FILE *gtfp, *wtfp; /* temp file ptrs */
    int rno, pno, nlines;
    entry entries[MAX_ENTRIES]; /* HTML name-val pairs */
    int x, cl, etnum;
    char *method, *ctype, *clen;

    printf("Content-type: text/html%c%c",LF,LF);

    /* getenv() returns NULL for an unset variable, so test before comparing */
    method = getenv("REQUEST_METHOD");
    if(method == NULL || strcmp(method,"POST")) {
        printf("This script should be referenced with a METHOD of POST.\n");
        printf("If you don't understand this, read ");
        printf("<A HREF=\"http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/"
               "Docs/fill-out-forms/overview.html\">forms overview</A>.%c",LF);
        exit(1);
    }
    ctype = getenv("CONTENT_TYPE");
    if(ctype == NULL || strcmp(ctype,"application/x-www-form-urlencoded")) {
        printf("This script can only be used to decode form results. \n");
        exit(1);
    }
    clen = getenv("CONTENT_LENGTH");
    cl = (clen != NULL) ? atoi(clen) : 0;
    etnum = 0;
    /* stop at MAX_ENTRIES so that extra fields cannot overrun entries[] */
    for(x=0; x < MAX_ENTRIES && cl > 0 && (!feof(stdin)); x++) {
        entries[x].val = fmakeword(stdin,'&',&cl);
        plustospace(entries[x].val);
        unescape_url(entries[x].val);
        entries[x].name = makeword(entries[x].val,'=');
        etnum++;
    }
    record_details(); /* log the user */

    /* collect non-empty strings into patterns[] */
    get_pats(entries, etnum, patterns, &pno);

    /* a single quote would end the shell quoting used in the fgrep command */
    for(x=0; x < pno; x++) {
        if (strchr(patterns[x], '\'') != NULL) {
            printf("<H1>Search String Error!</H1>");
            printf("<BR>Patterns may not contain a single quote<p>");
            back_to_form();
            exit(1);
        }
    }
    printf("<H1>Search Results</H1>");
    printf("<BR>Maximum of 10 matching lines are shown for any search.<p>");
    printf("<BR>The following strings are being used for the search:<p>%c",LF);
    printf("<ul>%c",LF);
    for(x=0; x < pno; x++)
        printf("<li> %s%c",patterns[x],LF);
    printf("</ul>%c",LF);

    /* get at most RESNO matching lines */
    tmpnam(tmp_gfname);
    build_re(patterns,pno,restexpr);
    /* snprintf() reports the length needed; refuse a truncated command */
    if (snprintf(gcmd, sizeof gcmd, "fgrep '%s' %s %s > %s",
                 patterns[0], FNM, restexpr, tmp_gfname) >= (int) sizeof gcmd) {
        printf("<BR>The search strings are too long.<p>");
        back_to_form();
        exit(1);
    }
    system(gcmd);
    printf("<BR><b>The lines found are:</b><P>");
    printf("<ul>%c",LF);
    rno = 0;
    gtfp = fopen(tmp_gfname,"r");
    if (gtfp != NULL) {
        while (rno < RESNO) {
            if (fgets(result, LINELEN, gtfp) == NULL)
                break;
            printf("<li> "); puts(result); printf("<P>");
            rno++;
        }
        fclose(gtfp);
    }
    printf("</ul>%c",LF);

    /* count the total number of matching lines */
    nlines = 0;
    tmpnam(tmp_wfname);
    sprintf(wcmd, "wc -l %s > %s", tmp_gfname, tmp_wfname);
    system(wcmd);
    wtfp = fopen(tmp_wfname,"r");
    if (wtfp != NULL) {
        if (fscanf(wtfp,"%d", &nlines) != 1)
            nlines = 0;
        fclose(wtfp);
    }
    if (nlines > RESNO)
        printf("<BR><b>%d lines printed from a total of %d.</b><P>",RESNO,nlines);
    else if (rno == 0)
        printf("<BR><b>No matching lines.</b><P>");
    else
        printf("<BR><b>%d line(s) printed.</b><P>", rno);
    back_to_form();
    remove(tmp_gfname);
    remove(tmp_wfname);
    return 0;
}
void get_pats(entry entries[], int etnum, char pat[][LINELEN], int *pno)
{
    int x;

    *pno = 0;
    for (x=0; x < etnum && *pno < PATNO; x++) {
        if (entries[x].val[0] != '\0') {
            /* copy at most LINELEN-1 characters so pat[] cannot overflow */
            strncpy(pat[*pno], entries[x].val, LINELEN-1);
            pat[*pno][LINELEN-1] = '\0';
            (*pno)++;
        }
    }
    if (*pno == 0) {
        printf("<H1>Search String Error!</H1>");
        printf("<BR>Must specify at least 1 pattern<p>");
        back_to_form();
        exit(1);
    }
}
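char *build_re(char pat[][LINELEN], int total, char re[])
{
    /* re[] is assumed to hold CMDLEN bytes, as it does in main() */
    size_t used = 0;
    int idx, n;

    re[0]='\0';
    for (idx=1; idx<total; idx++){
        n = snprintf(re + used, CMDLEN - used, " | fgrep '%s'", pat[idx]);
        if (n < 0 || (size_t) n >= CMDLEN - used) {
            re[used] = '\0'; /* leave out a filter that does not fit */
            break;
        }
        used += (size_t) n;
    }
    return re;
}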
void back_to_form(void)
{
    printf("<HR><BR><i><a href=\"http://www.cs.mu.oz.au/~ad/code/form-gp.html\">"
           "Back to Form</a></i>.");
}
void record_details(void)
{
    /* CGI variables to log, each with the text written when it is missing */
    static const char *vars[4] = { "REMOTE_USER", "REMOTE_IDENT",
                                   "REMOTE_HOST", "REMOTE_ADDR" };
    static const char *missing[4] = { "no_ruser", "no_rid",
                                      "no_rhost", "no_raddr" };
    char *val;
    struct tm *tp;
    time_t now;
    FILE *rfp;
    int i;

    rfp = fopen(RFNM,"a");
    if (rfp == NULL) /* cannot log; carry on with the search */
        return;
    for (i = 0; i < 4; i++) {
        /* getenv() returns NULL when the server did not set the variable */
        val = getenv(vars[i]);
        if (val == NULL || strcmp(val,"") == 0)
            fprintf(rfp, "%s ", missing[i]);
        else
            fprintf(rfp, "%s ", val);
    }
    now = time(NULL);
    tp = localtime(&now);
    if (tp == NULL)
        fprintf(rfp,"no_ltime\n");
    else
        fprintf(rfp, "%s", asctime(tp));
    fclose(rfp);
}
/* HTML utilities */
char *makeword(char *line, char stop) {
int x = 0,y;
char *word = (char *) malloc(sizeof(char) * (strlen(line) + 1));
for(x=0;((line[x]) && (line[x] != stop));x++)
word[x] = line[x];
word[x] = '\0';
if(line[x]) ++x;
y=0;
while(line[y++] = line[x++]);
return word;
}
char *fmakeword(FILE *f, char stop, int *cl) {
    int wsize;
    char *word;
    int ll;
    int c; /* int, so that EOF can be told apart from a character */

    wsize = 102400;
    ll=0;
    word = (char *) malloc(sizeof(char) * (wsize + 1));
    if(word == NULL) exit(1);
    while(1) {
        c = fgetc(f);
        if(ll==wsize) { /* buffer full: grow it before storing */
            wsize+=102400;
            word = (char *)realloc(word,sizeof(char)*(wsize+1));
            if(word == NULL) exit(1);
        }
        word[ll] = (char)c;
        --(*cl);
        if((c == stop) || (c == EOF) || (!(*cl))) {
            /* keep the last character unless it is the stop or EOF */
            if(c != stop && c != EOF) ll++;
            word[ll] = '\0';
            return word;
        }
        ++ll;
    }
}
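void unescape_url(char *url) {
    register int x,y;
    for(x=0,y=0;url[y];++x,++y) {
        /* decode only when two characters follow the '%', so that a
           stray '%' at the end cannot push y past the terminator */
        if((url[x] = url[y]) == '%' && url[y+1] && url[y+2]) {
            url[x] = x2c(&url[y+1]);
            y+=2;
        }
    }
    url[x] = '\0';
}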
char x2c(char *what) {
register char digit;
digit = (what[0] >= 'A' ? ((what[0] & 0xdf) - 'A')+10 : (what[0] - '0'));
digit *= 16;
digit += (what[1] >= 'A' ? ((what[1] & 0xdf) - 'A')+10 : (what[1] - '0'));
return(digit);
}
void plustospace(char *str) {
register int x;
for(x=0;str[x];x++)
if(str[x] == '+') str[x] = ' ';
}
Copyright © 1995, Dr. Dobb's Journal