User Interfaces


Expanding a Conversation Processor for Time

Russell Suereth


Russell Suereth has been consulting for over 12 years in the New York City and Boston areas. He started designing and coding systems on IBM mainframes and now also builds PC software systems. You can write to Russell at 84 Old Denville Rd, Boonton, NJ 07005. Russell Suereth is now completing a book, to be published this summer by R&D Technical Books, entitled Processing Human Conversations. He has extended his basic processor to deal with more complex issues such as idioms, generating questions, and identifying general themes.

This article expands the language processor presented in "A Natural Language Processor," CUJ, April 1993, to include time. That processor could accept input sentences from a user, and within certain limited contexts, accurately interpret their meaning. The processor can now recognize time words and phrases to process more kinds of input sentences. I've included additional processes for tense and number in the article. These additional processes help clarify the meanings of ambiguous sentences, and generate an error response when tense or number don't agree.

The code presented here is the part of the language processor that deals with time. The complete processor is not shown here, but it is available on the code disk.

Defining Time Words and Phrases

Time phrases are sequences of words that clarify when an event occurs. Without time phrases the event can only be recognized as occurring in the past, present, or future, by examining the tense of the sentence in which the event occurs.

Time phrases identify events by specific time, elapsed time, or habitual actions. Specific time is indicated by the day, month, year, or clock time. The sentence "Jim runs at one o'clock" identifies the specific time Jim runs. Elapsed time is indicated by a duration. The sentence "Jim runs for one hour" identifies the length of time Jim runs. Habitual actions are indicated by words such as "each" and "every." The sentence "Jim runs every day" identifies an action Jim performs habitually.

A time word is a word that clarifies when an event occurs. The word may stand on its own, or be part of a time phrase. Words such as "Tuesday," "midnight," and "morning" are time words, but the words "at," "in," and "last" are also time words when used in a time phrase. The processor identifies time words by matching the input sentence to an underlying structure.

Identifying Time Words and Phrases

Listing 1 contains the code for this article. The main routine calls the check_underlying and check_time routines to identify time words in the input sentence. The check_underlying routine matches the input sentence to the underlying structures. If the match is successful, then the sentence is processed. Time words are words that match an underlying structure and have a TIME word type. The processor recognizes adjacent time words as a time phrase and assigns the value TIMEPHRASE to these words. The check_time routine copies the time phrase into the times array. The times array contains a time phrase entry for each input sentence.
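The phrase-gathering step can be pictured with a minimal sketch. It is not the code from Listing 1: the word-type codes, buffer sizes, and the collect_time_phrase name are assumptions made for illustration, and the sketch also assumes that a single TIME word counts as a one-word phrase.

```c
#include <string.h>

#define WORD_LEN   32    /* assumed maximum word length */
#define PHRASE_LEN 128   /* assumed maximum phrase length */

/* Hypothetical word-type codes; the real processor defines its own. */
enum word_type { NAME, VERB, PREP, DET, TIME, TIMEPHRASE };

/*
 * Find the first run of adjacent TIME words, relabel each word in the
 * run as TIMEPHRASE, and copy the run into phrase separated by blanks.
 * Returns the number of words copied (0 if the sentence has no TIME word).
 */
int collect_time_phrase(char words[][WORD_LEN], enum word_type types[],
                        int count, char phrase[PHRASE_LEN])
{
    int i = 0, copied = 0;

    phrase[0] = '\0';

    /* Skip ahead to the first TIME word, if any. */
    while (i < count && types[i] != TIME)
        i++;

    /* Consume the run of adjacent TIME words. */
    for (; i < count && types[i] == TIME; i++) {
        size_t need = strlen(words[i]) + (copied ? 1 : 0);

        if (strlen(phrase) + need >= PHRASE_LEN)
            break;                      /* phrase buffer is full */
        if (copied)
            strcat(phrase, " ");
        strcat(phrase, words[i]);
        types[i] = TIMEPHRASE;
        copied++;
    }
    return copied;
}
```

Given the words of "Jim runs at one o'clock" with "at," "one," and "o'clock" typed as TIME, the sketch copies the three words into the phrase buffer and relabels them, leaving the other word types untouched.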

Deriving Time Meaning

The processor derives time meaning from specific time words in the input sentence. Some time words and their meanings are shown in Table 1. When one of these words occurs in a sentence and indicates time, that word's time meaning is assigned to the sentence.

The main routine calls derive_time_meaning which looks at the input sentence's time phrase. If a time phrase word matches a coded time word, then that word's time meaning is assigned to the time_meaning array. The time_meaning array contains a time meaning entry for each input sentence.
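A table lookup is one plausible way to picture this step. The sketch below is an assumption, not the routine from Listing 1: only the LAST and NEXT meanings are taken from the article, while the two-entry table (a stand-in for Table 1), the NO_TIME_MEANING value, and the function name are hypothetical.

```c
#include <string.h>

/* LAST and NEXT appear in the article; NO_TIME_MEANING is hypothetical. */
enum time_meaning { NO_TIME_MEANING, LAST, NEXT };

struct time_word {
    const char *word;
    enum time_meaning meaning;
};

/* Stand-in for Table 1; the real table holds more coded time words. */
static const struct time_word time_words[] = {
    { "last", LAST },
    { "next", NEXT },
};

/*
 * Scan the words of a time phrase and return the meaning of the first
 * word that matches a coded time word.
 */
enum time_meaning derive_time_meaning_sketch(const char *time_phrase)
{
    char copy[128];
    char *word;
    size_t i;

    /* strtok modifies its argument, so work on a private copy. */
    strncpy(copy, time_phrase, sizeof copy - 1);
    copy[sizeof copy - 1] = '\0';

    for (word = strtok(copy, " "); word != NULL; word = strtok(NULL, " "))
        for (i = 0; i < sizeof time_words / sizeof time_words[0]; i++)
            if (strcmp(word, time_words[i].word) == 0)
                return time_words[i].meaning;

    return NO_TIME_MEANING;
}
```

In the processor the result would be stored in the time_meaning entry for the current sentence; here it is simply returned.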

Interpreting Auxiliary Meaning

Many sentences contain auxiliary phrases, which qualify the certainty of a particular event's occurrence. Example phrases are "could be" and "must have been." The auxiliary phrase helps the processor understand the sentence, but the sentence can't be fully understood if the auxiliary meaning is unclear. Unclear auxiliary meaning occurs when the sentence has no auxiliary, when the auxiliary has more than one meaning, or when the auxiliary meaning is ambiguous. If the auxiliary meaning is unclear, the processor can use the auxiliary and the sentence tense together to determine a clear meaning. Table 2 shows unclear auxiliary and tense combinations with their clear meanings.

Handling No Auxiliary

The auxiliary meaning is unclear when the sentence has no auxiliary. Sentences with no auxiliary have an implied auxiliary meaning, and the processor uses the sentence tense to identify it. For example, the sentence "Jim ran in the race" has no auxiliary and is past tense. That sentence is similar to another past-tense sentence, "Jim had run in the race." The auxiliary "had" means a particular point of time. When an input sentence has no auxiliary and is past tense, the processor assigns the particular point of time meaning to the sentence. In another example, the sentence "Jim runs in the race" has no auxiliary and is present tense. That sentence is similar to another present-tense sentence, "Jim is running in the race." The auxiliary "is," used in the present tense, means limited duration. When an input sentence has no auxiliary and is present tense, the processor assigns the limited duration meaning to the sentence.

The main routine calls derive_aux_meaning to derive a clear auxiliary meaning from an unclear meaning. The derive_aux_meaning routine looks at the auxiliaries entry for the sentence. If the auxiliary's string length is zero, then the sentence has no auxiliary and the routine looks at the tenses entry. If the tenses value is PAST, then PARTICULAR_POINT_OF_TIME is assigned to the auxiliary meaning. If tenses is PRESENT, then LIMITED_DURATION is assigned to the auxiliary meaning. If tenses is FUTURE, then the processor assigns FIXED_PLAN to the auxiliary meaning.
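The tense-to-meaning mapping described above can be sketched as a small function. The three tense values and three meaning constants come from the article; the enum layout, the UNKNOWN_MEANING value, and the function name are assumptions, and the real routine works on the processor's arrays rather than on parameters.

```c
#include <string.h>

enum tense { PAST, PRESENT, FUTURE };

/* UNKNOWN_MEANING is a hypothetical placeholder for "not derived here". */
enum aux_meaning {
    UNKNOWN_MEANING,
    PARTICULAR_POINT_OF_TIME,
    LIMITED_DURATION,
    FIXED_PLAN
};

/*
 * If the sentence has no auxiliary (empty string), derive the implied
 * auxiliary meaning from the sentence tense.
 */
enum aux_meaning implied_aux_meaning(const char *auxiliary, enum tense tense)
{
    if (strlen(auxiliary) != 0)
        return UNKNOWN_MEANING;     /* an auxiliary is present */

    switch (tense) {
    case PAST:    return PARTICULAR_POINT_OF_TIME;
    case PRESENT: return LIMITED_DURATION;
    case FUTURE:  return FIXED_PLAN;
    }
    return UNKNOWN_MEANING;
}
```

For "Jim ran in the race," the auxiliary string is empty and the tense is PAST, so the sketch returns PARTICULAR_POINT_OF_TIME.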

Handling Ambiguous Auxiliaries

An ambiguous auxiliary is also potentially confusing to the processor. An example is the auxiliary "could be" in the sentence "Jim could be running in the race." In the present tense, "could be" has two meanings. That sentence in the present tense means "Jim is able to run in the race," or "Jim is permitted to run in the race." The auxiliary meaning remains ambiguous because "could be" has two meanings in the present tense. In the future tense, "could be" has one meaning. That sentence in the future tense means "It is possible that Jim will run in the race." The auxiliary meaning is clarified because "could be" has only one meaning in the future tense.

The derive_aux_meaning routine looks for specific ambiguous auxiliaries in the sentence's auxiliaries entry. If a specific ambiguous auxiliary is found, then the routine assigns a clear meaning to the sentence's auxiliary meaning. The routine assigns a clear meaning based on the value in the sentence's tenses entry.
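For the "could be" example, the disambiguation might look like the sketch below. All three meaning names and the function name are hypothetical, since Table 2 is not reproduced here. The sketch also assumes that any tense other than FUTURE leaves "could be" unresolved, although the article discusses only the present tense.

```c
#include <string.h>

enum tense { PAST, PRESENT, FUTURE };

/* Hypothetical result codes; the processor's own names may differ. */
enum aux_meaning {
    NOT_HANDLED,       /* auxiliary is not one this sketch knows about */
    STILL_AMBIGUOUS,   /* tense does not settle the meaning */
    POSSIBILITY        /* "It is possible that ..." */
};

/*
 * Use the sentence tense to clarify the ambiguous auxiliary "could be".
 * In the future tense it has a single meaning; otherwise it is assumed
 * to remain ambiguous.
 */
enum aux_meaning resolve_ambiguous_aux(const char *auxiliary, enum tense tense)
{
    if (strcmp(auxiliary, "could be") != 0)
        return NOT_HANDLED;

    if (tense == FUTURE)
        return POSSIBILITY;

    return STILL_AMBIGUOUS;
}
```

A result of STILL_AMBIGUOUS corresponds to the case where the processor has to ask the user which meaning was intended.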

Asking for Specific Meaning

The processor can clarify ambiguous auxiliary meaning when certain tenses are used. When the auxiliary meaning remains ambiguous, however, the processor generates a response that asks for a more specific meaning. Figure 1 shows a processor session with ambiguous auxiliary meaning in the input sentence.

The make_response routine calls ask_meaning to generate a response that asks about ambiguous auxiliary meaning. The ask_meaning routine looks for specific auxiliaries in the sentence's auxiliaries entry. If a specific auxiliary is found, then the routine generates a response. The routine also looks at the sentence's tenses and numbers entries. These entries help identify the appropriate auxiliaries that can be used to make the response grammatical.
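A rough sketch of such a clarifying question follows. The wording of the question is invented for illustration and is not the text shown in Figure 1; the function name, the parameters, and the SINGULAR and PLURAL codes are likewise assumptions. The sketch handles only "could be" in the present tense and uses the sentence number to choose between "is" and "are."

```c
#include <stdio.h>
#include <string.h>

enum number { SINGULAR, PLURAL };   /* hypothetical number codes */

/*
 * Build a question asking which of the two present-tense meanings of
 * "could be" (ability or permission) the user intended.
 * Returns 1 if a question was written to out, 0 otherwise.
 */
int ask_meaning_sketch(char *out, size_t size, const char *subject,
                       const char *verb, const char *auxiliary,
                       enum number number)
{
    /* Pick the form of "be" that agrees with the subject's number. */
    const char *be = (number == SINGULAR) ? "is" : "are";

    if (strcmp(auxiliary, "could be") != 0)
        return 0;                   /* nothing to ask about */

    snprintf(out, size,
             "Do you mean %s %s able to %s, or %s %s permitted to %s?",
             subject, be, verb, subject, be, verb);
    return 1;
}
```

Called with the subject "Jim," the verb "run," the auxiliary "could be," and SINGULAR, the sketch produces "Do you mean Jim is able to run, or Jim is permitted to run?"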

Error Handling

Time phrases can refer to an event in the past, present, or future. This time reference must match the past, present, or future tense of the sentence. If the time and tense conflict, then the sentence doesn't make sense and is ungrammatical. An example is the sentence "Jim had run next week." The time and tense don't match because "had run" refers to the past, while "next week" refers to the future.

The make_response routine calls check_agreement to identify agreement errors in the input sentence. The check_agreement routine looks at the time_meaning and tenses entries for the input sentence. If the time_meaning entry is LAST, and the tenses entry is PRESENT or FUTURE, then agreement_error is called to generate an agreement error response. If the time_meaning entry is NEXT, and the tenses entry is PAST, then agreement_error is called for the agreement error response. The check_agreement routine also calls agreement_error when either the sentence number or the sentence tense is in error.
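The time and tense test can be written as a small predicate. The LAST, NEXT, PAST, PRESENT, and FUTURE values are from the article; the NO_TIME_MEANING value and the function name are assumptions, and the sketch returns a flag instead of calling agreement_error.

```c
enum tense { PAST, PRESENT, FUTURE };

/* NO_TIME_MEANING is a hypothetical value for "no time word found". */
enum time_meaning { NO_TIME_MEANING, LAST, NEXT };

/*
 * Return 1 when the time meaning and the sentence tense disagree,
 * 0 when they are compatible.
 */
int time_conflicts_with_tense(enum time_meaning meaning, enum tense tense)
{
    /* A past time reference cannot go with a present or future tense. */
    if (meaning == LAST && (tense == PRESENT || tense == FUTURE))
        return 1;

    /* A future time reference cannot go with a past tense. */
    if (meaning == NEXT && tense == PAST)
        return 1;

    return 0;
}
```

For "Jim had run next week," the time meaning is NEXT and the tense is PAST, so the predicate reports a conflict.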

The processor generates an error response for three kinds of grammar conflicts in the input sentence. A grammar conflict between time meaning and auxiliary meaning is shown in the sentence "Jim will run last week." A grammar conflict between auxiliary tense and verb tense is shown in the sentence "Jim will ran in the race." A grammar conflict among subject, auxiliary, and verb number is shown in the sentence "Jim are running in the race." The explanation response gives a detailed reason why the input sentence is ungrammatical. Figure 2 shows the explanation responses for the above ungrammatical input sentences.

The check_agreement routine calls agreement_error to generate the explanation response. The agreement_error routine is passed a value in the error_type parameter that identifies the kind of grammar conflict. If the error_type is TIME_MEANING_ERROR, then the time and auxiliary meaning conflict and the routine generates an appropriate explanation response. If the error_type value is TENSES_ERROR, then the auxiliary and verb tense conflict and an explanation response is generated. If the error_type value is NUMBER_ERROR, then the subject, auxiliary, and verb number conflict and an explanation response is generated.
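The dispatch on error_type can be sketched with a switch statement. The three error-type names are from the article, but the message texts are invented placeholders rather than the responses shown in Figure 2, and the function name and return convention are assumptions.

```c
enum error_type { TIME_MEANING_ERROR, TENSES_ERROR, NUMBER_ERROR };

/*
 * Return an explanation for the given kind of grammar conflict.
 * The wording is illustrative only.
 */
const char *agreement_error_text(enum error_type error_type)
{
    switch (error_type) {
    case TIME_MEANING_ERROR:
        return "The time phrase and the auxiliary refer to different times.";
    case TENSES_ERROR:
        return "The auxiliary and the verb have different tenses.";
    case NUMBER_ERROR:
        return "The subject, auxiliary, and verb do not agree in number.";
    }
    return "Unknown agreement error.";
}
```

The real routine builds its explanation from the words of the input sentence; fixed strings are used here only to keep the sketch short.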

Responses Based on Meaning

The processor generates a response based on auxiliary meaning when the sentence is grammatical. In the original processor the generated response was a simple "OK." This expanded processor creates an interesting response based on the auxiliary meaning. Given the input sentence "Jim is running in the race," the processor assigns the limited duration meaning to the sentence. The processor then generates the response "When will Jim stop running" based on that auxiliary meaning. Table 3 shows auxiliary meanings and associated example responses.

The make_response routine calls aux_meaning_response to generate a response based on auxiliary meaning. The aux_meaning_response routine looks for specific auxiliary meanings in the input sentence. If a specific auxiliary meaning is found, then a response for that meaning is generated with words from the input sentence. The routine also checks the sentence tense, number, and subject type so appropriate words can be used to create a grammatical response.
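The limited duration example can be sketched as follows. Only that one response is taken from the article; the function name and parameters are assumptions, and every other meaning falls back to the original processor's "OK" because the Table 3 responses are not reproduced here.

```c
#include <stdio.h>

enum aux_meaning { PARTICULAR_POINT_OF_TIME, LIMITED_DURATION, FIXED_PLAN };

/*
 * Build a response from the auxiliary meaning and words taken from the
 * input sentence. participle is the verb form as it appeared in the
 * input, for example "running".
 */
void aux_meaning_response_sketch(char *out, size_t size,
                                 enum aux_meaning meaning,
                                 const char *subject,
                                 const char *participle)
{
    if (meaning == LIMITED_DURATION)
        snprintf(out, size, "When will %s stop %s", subject, participle);
    else
        snprintf(out, size, "OK");  /* the original processor's reply */
}
```

With LIMITED_DURATION, the subject "Jim," and the participle "running," the sketch writes "When will Jim stop running" into the output buffer.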

Summary

Several processes are required to process time. The processes described in this article identify time phrases and words in the sentence; derive time meaning from the sentence; derive a clear auxiliary meaning from an unclear meaning; generate a response that asks about unclear meaning; determine time and sentence tense agreement; generate a response that explains grammar conflicts; and generate a response based on meaning.

Further expansions to the processor could compare one time to another; include time when matching input sentences; generate interesting responses based on time meaning; and allow ungrammatical sentences to be fully processed.

Processing time helps the processor identify grammatical sentences and derive more meaning from them. As a result, the processor can generate responses that, at times, almost appear human.

Bibliography

Liles, Bruce L. 1971. An Introductory Transformational Grammar. Englewood Cliffs: Prentice-Hall.

Quirk, Randolph, and Sidney Greenbaum. 1973. A Concise Grammar of Contemporary English. San Diego: Harcourt Brace Jovanovich.

Suereth, Russell. 1993. "A Natural Language Processor." The C Users Journal, April.

Suereth, Russell. 1993. "Natural Language Expansions for Tense and Number." The C Users Journal, June.