THE DR. DOBB'S HANDPRINTING RECOGNITION CONTEST

Ray Valdes

This month marks the official launch of the Dr. Dobb's Handprinting Recognition Contest. If you've been following recent issues of Dr. Dobb's Journal, you'll recall that Ron Avitzur got the ball rolling in the April issue by presenting a Macintosh-based handprint recognizer, complete with an interactive data-collector application. Ron has since written a platform-independent harness to test recognition engines. This harness works from stylus data stored in disk files, rather than requiring interactive digitizing hardware, a pen computer, or a pen operating system.

Before delving into technical details of the harness, here's a quick summary of contest rules. For this first-ever competition, we're fortunate to be able to offer an extremely tasty first prize--in the form of a PowerBook 100 generously provided by Apple Computer. The contest begins on June 15th, when the official version of the DDJ test framework, test data, and contest entry blank become available electronically. Deadline for submissions is September 15th. We'll announce a winner in our December issue.

Your recognizer can use any platform on which the DDJ test harness runs. The DDJ harness code assumes only the C standard library. However, even though you can run the harness on any platform that has a C compiler, we can only test your code on Macintosh or PC platforms. Assuming your code is portably written, this should not be a problem.

You must send in both source code and an executable. Any other written commentary or documentation is also welcome. Source code is for publication and can be in C (or, on the PC, in any language that can be linked to the OBJ files of the DDJ test harness).

Submissions will be judged primarily on recognition accuracy. Speed is a secondary consideration; third is the conciseness and elegance of your implementation.

How the Harness Works

The test-harness package contains executable, source, object, make, and data files, as well as a sample recognizer by Ron Avitzur. The READ.ME file describes all of these in detail.

The DDJ test harness first reads all information from the character-data file into an in-memory data structure. The character-data file is in binary format. For each ASCII character, there can be a variable number of character prototypes (sample characters). Each character prototype, also known as a gesture, is composed of a variable number of strokes. Each stroke is composed of a variable number of points. The process of reading in the data therefore consists of several nested for loops.
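The nesting described above (characters, prototypes, strokes, points) can be sketched roughly as follows. This is only an illustration: the type names, the ReadCharData() and FreeCharData() routines, the table size, and above all the file layout (a native-byte-order 16-bit count in front of every level, then x,y pairs) are assumptions made for this sketch. The actual binary format is the one documented in the READ.ME file.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NUM_CHARS 128   /* assumed: one slot per 7-bit ASCII code */

/* Hypothetical types; the real harness defines its own. */
typedef struct { short x, y; } Point;
typedef struct { int nPoints;   Point   *points;   } Stroke;
typedef struct { int nStrokes;  Stroke  *strokes;  } Gesture;   /* one prototype */
typedef struct { int nGestures; Gesture *gestures; } CharEntry; /* one character */

/* Read one non-negative 16-bit count (native byte order); -1 on error. */
static int ReadCount(FILE *fp)
{
    short n;
    if (fread(&n, sizeof n, 1, fp) != 1 || n < 0) return -1;
    return n;
}

/* Free everything ReadCharData allocated; safe on a partly filled table. */
void FreeCharData(CharEntry table[NUM_CHARS])
{
    int c, g, s;
    for (c = 0; c < NUM_CHARS; c++) {
        for (g = 0; table[c].gestures && g < table[c].nGestures; g++) {
            Gesture *ges = &table[c].gestures[g];
            for (s = 0; ges->strokes && s < ges->nStrokes; s++)
                free(ges->strokes[s].points);
            free(ges->strokes);
        }
        free(table[c].gestures);
    }
    memset(table, 0, NUM_CHARS * sizeof table[0]);
}

/* Fill the table from fp using the assumed layout; returns 1 on success. */
int ReadCharData(FILE *fp, CharEntry table[NUM_CHARS])
{
    int c, g, s, p, n;
    memset(table, 0, NUM_CHARS * sizeof table[0]);
    for (c = 0; c < NUM_CHARS; c++) {                    /* each character */
        if ((n = ReadCount(fp)) < 0) goto fail;
        if (n == 0) continue;
        if (!(table[c].gestures = calloc(n, sizeof(Gesture)))) goto fail;
        table[c].nGestures = n;
        for (g = 0; g < table[c].nGestures; g++) {       /* each prototype */
            Gesture *ges = &table[c].gestures[g];
            if ((n = ReadCount(fp)) < 0) goto fail;
            if (n == 0) continue;
            if (!(ges->strokes = calloc(n, sizeof(Stroke)))) goto fail;
            ges->nStrokes = n;
            for (s = 0; s < ges->nStrokes; s++) {        /* each stroke */
                Stroke *st = &ges->strokes[s];
                if ((n = ReadCount(fp)) < 0) goto fail;
                if (n == 0) continue;
                if (!(st->points = calloc(n, sizeof(Point)))) goto fail;
                st->nPoints = n;
                for (p = 0; p < n; p++)                  /* each point */
                    if (fread(&st->points[p].x, sizeof(short), 1, fp) != 1 ||
                        fread(&st->points[p].y, sizeof(short), 1, fp) != 1)
                        goto fail;
            }
        }
    }
    return 1;
fail:
    FreeCharData(table);
    return 0;
}
```

A caller would open the data file in binary mode ("rb"), pass the stream to ReadCharData(), and release the table with FreeCharData() when finished. Because the counts are read in native byte order, a file written on one machine may not load on another with this sketch; a portable reader would have to fix the byte order explicitly.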

After reading in the data, the harness loops through the top-level CharData[] array, which contains pointers to lists of prototypes. During the training phase, characters are passed to your recognizer's Train() routine. Your training routine should derive from this data a set of features that will later be used in the recognition phase.

During the recognition phase, the test harness passes a different selection of characters to your recognizer's Guess() routine, which can return up to three guesses per character. Each guess must have an associated weight or confidence value.
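To make the two-phase contract concrete, here is a deliberately naive sketch of a recognizer with a Train() and a Guess() routine. Only those two routine names come from the harness description; the signatures, the GuessEntry type, the stand-in Gesture type, and the single feature used (stroke count) are assumptions chosen to keep the example short. The real declarations are in the harness source.

```c
#define NUM_CHARS   128   /* assumed table size: 7-bit ASCII */
#define MAX_GUESSES 3     /* the harness accepts up to three guesses */

/* Stand-in for the real gesture type; only the stroke count is kept. */
typedef struct { int nStrokes; } Gesture;

/* One guess: a character plus a confidence weight (higher is better). */
typedef struct { char ch; double weight; } GuessEntry;

static double strokeSum[NUM_CHARS];    /* sum of stroke counts per character */
static int    sampleCount[NUM_CHARS];  /* number of training samples seen */

/* Training phase: accumulate the stroke count for this character. */
void Train(char ch, const Gesture *g)
{
    unsigned char c = (unsigned char)ch;
    if (c >= NUM_CHARS) return;
    strokeSum[c] += g->nStrokes;
    sampleCount[c]++;
}

/* Recognition phase: fill out[] with the best candidates, strongest
   first, and return how many were written (0 to MAX_GUESSES). */
int Guess(const Gesture *g, GuessEntry out[MAX_GUESSES])
{
    int c, i, n = 0;
    for (c = 0; c < NUM_CHARS; c++) {
        double d, w;
        if (sampleCount[c] == 0) continue;            /* never trained */
        d = strokeSum[c] / sampleCount[c] - g->nStrokes;
        if (d < 0) d = -d;
        w = 1.0 / (1.0 + d);                          /* 1.0 = exact match */
        /* Insert into the sorted list, dropping whatever falls off the end. */
        for (i = n; i > 0 && out[i - 1].weight < w; i--)
            if (i < MAX_GUESSES) out[i] = out[i - 1];
        if (i < MAX_GUESSES) { out[i].ch = (char)c; out[i].weight = w; }
        if (n < MAX_GUESSES) n++;
    }
    return n;
}
```

A recognizer this crude cannot separate characters drawn with the same number of strokes, so it is of no use as a contest entry. It shows only the shape of the interface: Train() accumulates state from labeled samples, and Guess() ranks candidates and attaches a weight to each.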

Writing a general-purpose recognizer can be a large and daunting task. For purposes of the contest, we've constrained the problem in various ways. In the test data, segmentation of strokes into individual characters has already occurred. The sample recognizer works a character at a time, as opposed to using context information (such as a word dictionary). The character set consists only of alphanumeric characters plus a few punctuation characters. Input data consists of stylus datapoints from pen-down to pen-up. There is no proximity information or velocity data, nor are there timestamps associated with point coordinates.

Hints for Contestants

The sample recognizer included with the test-harness package performs pretty well, with better than 90 percent accuracy on certain sample data. Nevertheless, it suffers from a number of limitations that you can improve upon:

The sample recognizer uses mostly local information (the relationship of one point to the following) rather than global information. It may be fruitful to select five important points from different parts of a character and establish how these points relate to each other. Another approach would be to set up a coarse 4x8 grid and color in the pixels in the grid which are touched by a character.

The sample recognizer reduces the raw data points to a smaller number of points, from which the features are then derived. Its simplification routine is straightforward, and currently "cuts off" corners; that is, it does not distinguish a corner point from any other point.

The sample recognizer normalizes every character to the same square, discarding potentially useful information about the aspect ratio of the character. The current recognizer cannot tell the difference between a tall, skinny character and a short, squat one, assuming the stroke motions are similar.

The sample recognizer stores all information about each character in a single "bin." The features for all versions of a character are therefore muddled together, which might confuse the current recognizer in the case of very different legitimate ways of writing a particular character.
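As one possible starting point for two of these hints, the sketch below builds a coarse occupancy grid for a character and records the aspect ratio of its bounding box before that information is normalized away. Everything here is an assumption made for illustration and is not part of the sample recognizer: the names, the reading of "4x8" as 4 columns by 8 rows, and the use of a simple cell-by-cell difference count to compare two grids.

```c
#include <string.h>

#define GRID_COLS 4   /* assumed: "4x8" read as 4 columns by 8 rows */
#define GRID_ROWS 8

typedef struct { int x, y; } Point;

typedef struct {
    unsigned char cell[GRID_ROWS][GRID_COLS]; /* 1 if a point landed here */
    double aspect;                            /* bounding-box width / height */
} GridFeature;

/* Map all points of one character (every stroke concatenated) onto the
   grid, scaled to the character's own bounding box. */
void ComputeGridFeature(const Point *pts, int n, GridFeature *f)
{
    int i, minX, maxX, minY, maxY, w, h;

    memset(f->cell, 0, sizeof f->cell);
    f->aspect = 0.0;
    if (n <= 0) return;

    minX = maxX = pts[0].x;
    minY = maxY = pts[0].y;
    for (i = 1; i < n; i++) {
        if (pts[i].x < minX) minX = pts[i].x;
        if (pts[i].x > maxX) maxX = pts[i].x;
        if (pts[i].y < minY) minY = pts[i].y;
        if (pts[i].y > maxY) maxY = pts[i].y;
    }
    w = maxX - minX + 1;   /* +1 keeps both extents nonzero */
    h = maxY - minY + 1;
    f->aspect = (double)w / h;

    for (i = 0; i < n; i++) {
        int col = (pts[i].x - minX) * GRID_COLS / w;  /* 0 .. GRID_COLS-1 */
        int row = (pts[i].y - minY) * GRID_ROWS / h;  /* 0 .. GRID_ROWS-1 */
        f->cell[row][col] = 1;
    }
}

/* Number of cells in which two grids differ (0 means identical grids). */
int GridDistance(const GridFeature *a, const GridFeature *b)
{
    int r, c, d = 0;
    for (r = 0; r < GRID_ROWS; r++)
        for (c = 0; c < GRID_COLS; c++)
            d += (a->cell[r][c] != b->cell[r][c]);
    return d;
}
```

Note that this version marks only the cells containing sampled points, so a quickly drawn stroke with widely spaced points can skip cells that the pen actually crossed; interpolating between consecutive points would close those gaps. Keeping a separate feature set per prototype, rather than one merged set per character, would also address the single "bin" limitation.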

The Importance of Data

As many researchers have discovered, writing good code is only part of the problem in building a recognition engine. The rest includes amassing a suitable collection of test data.

Our sample recognizer works well with our current set of data, but may stumble on other valid data that it has not previously encountered. For judging the contest, therefore, we will attempt to run all recognizers on as broad a data set as possible, including any data that you submit with your entry.

Copyright © 1992, Dr. Dobb's Journal