Java Solutions


DocumentBuilder: An Alternative to Hard-Coded String Concatenation

Matt Nicolls

Separating structure from content always begets a welcome flexibility. An effective deployment of Template Method, à la Mail Merge in this case, does the trick.


Overview

Whether it be SQL, HTML, XML, or even ASCII, odds are just about every application you’ve written has to generate a formatted String of some kind that represents the state of an object. Figure 1 shows a common (albeit bad) way to generate an INSERT SQL statement for a Customer object. Figure 2 shows a similar approach to creating an XML representation of a Customer object.

The problem with concatenating Strings this way is that the hard-coded values bind your application to the particular table layout or document format it is generating. For example, you could not rename a field of a table in your database or modify the format of your XML without having to change, recompile, retest, and redeploy your application. The string-concatenation overhead is also nontrivial.
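To make the coupling concrete, here is a minimal sketch of the kind of hard-coded concatenation described above. The Customer class, its fields, and the table and column names are all hypothetical; they are not taken from the figures.

```java
// Hypothetical model class, used only for illustration.
final class Customer {
    final int id;
    final String name;

    Customer(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

final class HardCodedSql {
    // The table name, column names, and statement layout are all baked
    // into the source. Renaming a column means editing and recompiling.
    static String insertStatement(Customer c) {
        return "INSERT INTO Customer (id, name) VALUES ("
                + c.id + ", '" + c.name + "')";
    }
}
```

Every literal in insertStatement is a detail of the database schema that has leaked into application code.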

The ideal solution is to put the SQL, XML, HTML, etc. into a template file or database that you can load at run time. Doing so will allow you to change the text at any time without having to change your code. This is a simple enough concept that I’m sure most developers have used it at one time or another, but it becomes difficult when the document you need to generate is riddled with dynamic information.

Using DocumentBuilder to generate documents relieves your application from knowing anything about the text (a.k.a. document) it is generating. This means that you can change the name of your Customer table without having to change a single line of code. Or, you can completely redesign your XML structure, and your application doesn’t even know it happened.

How It Works

DocumentBuilder is based on a concept similar to Mail Merge (found in most popular word processors). It merges a document template with data to produce something meaningful. The template contains “field codes” that DocumentBuilder looks for and then replaces the codes with data. Figure 3 shows a sample template and results for creating a Customer XML document. Figure 4 shows an HTML document.

Once DocumentBuilder has a template, it parses it and locates the field codes. As it finds each field code, it calls its own abstract method, getValue(String code), passing in the field code that it found. The getValue(code) method returns a String value that should replace the field code. And, since the getValue(code) method is abstract, it is up to the class that extends DocumentBuilder (a.k.a. subclass) to implement the logic that evaluates the code and returns an appropriate value. Though I’ve used function-like field codes in the earlier example, I could use any arbitrary string (such as Object_reference:Attribute_name).
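The parse-and-replace cycle just described can be sketched roughly as follows. This is not the code from Listing 1. The class name DocumentBuilderSketch, the setDelimiters method, and the choices to substitute an empty string for a null value and to copy an unterminated field code through unchanged are all assumptions made for illustration.

```java
// A simplified sketch of the idea, not the article's actual listing.
abstract class DocumentBuilderSketch {
    private final String template;
    private String open = "{";
    private String close = "}";

    DocumentBuilderSketch(String template) {
        this.template = template;
    }

    // Hypothetical setter; the real class may expose delimiters differently.
    void setDelimiters(String open, String close) {
        this.open = open;
        this.close = close;
    }

    // Subclasses decide what a field code means.
    protected abstract String getValue(String code);

    // The template method: the scanning algorithm is fixed here, and only
    // the value lookup is deferred to the subclass.
    final String buildDocument() {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (true) {
            int start = template.indexOf(open, pos);
            int end = (start < 0)
                    ? -1
                    : template.indexOf(close, start + open.length());
            if (end < 0) {
                // No complete field code remains: copy the rest verbatim.
                out.append(template.substring(pos));
                return out.toString();
            }
            out.append(template, pos, start);
            String code = template.substring(start + open.length(), end);
            String value = getValue(code);
            // Assumption: a null value is rendered as an empty string.
            out.append(value == null ? "" : value);
            pos = end + close.length();
        }
    }
}
```

A subclass only has to map a code such as getName to a value; everything else is inherited.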

A Closer Look

Fear not! I’ve written some of the most common implementations of DocumentBuilder. But before I talk about those, let’s first take a look at the source code for DocumentBuilder (see Listing 1 and Figure 5).

The DocumentBuilder implementation is a classic example of the Gang-of-Four Template Method design pattern. The real work is done in buildDocument, which calls abstract methods to do the implementation-specific work. A viable alternative would be to provide an interface that defines the user overrides and then pass an implementation of that interface into buildDocument as an argument. (This latter solution would be an example of the Gang-of-Four Strategy pattern.)
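For comparison, the Strategy variant might look something like the sketch below. The FieldResolver interface and the StrategyDocumentBuilder class are hypothetical names, the delimiters are fixed at “{” and “}” for brevity, and a regular expression stands in for whatever scanning the real class does.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical interface that carries the user-supplied lookup.
interface FieldResolver {
    String getValue(String code);
}

final class StrategyDocumentBuilder {
    // Matches {code}, capturing the text between the braces.
    private static final Pattern FIELD = Pattern.compile("\\{([^{}]*)\\}");

    // The resolver is passed in as an argument instead of being
    // supplied by a subclass.
    static String buildDocument(String template, FieldResolver resolver) {
        Matcher m = FIELD.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = resolver.getValue(m.group(1));
            // quoteReplacement keeps '$' and '\' in values literal.
            // Assumption: a null value is rendered as an empty string.
            m.appendReplacement(out,
                    Matcher.quoteReplacement(value == null ? "" : value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The trade-off is the usual one: Strategy lets one builder work with many interchangeable resolvers, while Template Method keeps each kind of document in its own small subclass.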

Open/Close Delimiters

The open and close delimiters tell DocumentBuilder which characters enclose field codes in document templates. The default open delimiter is a left curly brace (“{”); likewise, the default close delimiter is a right curly brace (“}”).

You can set the open and close delimiters to whatever you want. You might use “[” and “]”, or even “~” and “~”. Try not to use “<” and/or “>” when building an HTML or XML document though (see the sidebar, “Delimiter Issues”).
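One likely reason for avoiding angle brackets is easy to demonstrate. The helper below is a hypothetical illustration and not part of DocumentBuilder: it returns whatever sits between the first open/close pair in a template, which is how a simple scanner would pick out its first field code.

```java
final class DelimiterDemo {
    // Returns the text between the first open/close pair,
    // or null if the template contains no complete pair.
    static String firstFieldCode(String template, String open, String close) {
        int start = template.indexOf(open);
        if (start < 0) {
            return null;
        }
        int end = template.indexOf(close, start + open.length());
        return end < 0 ? null : template.substring(start + open.length(), end);
    }
}
```

With “[” and “]”, the template `<name>[getName]</name>` yields getName as its first field code. With “<” and “>”, the template `<name><getName></name>` yields name instead, because the markup itself is indistinguishable from a field code.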

The getValue Method

The coolest part of DocumentBuilder is that it does nearly everything for you. The only thing it leaves up to its subclasses is the getValue method. This is what makes DocumentBuilder so flexible: the processing of each field code is left to the specific implementation.

For example, I’ll talk later about three basic implementations of DocumentBuilder:

  1. HashtableDocumentBuilder
  2. ResultSetDocumentBuilder
  3. ObjectDocumentBuilder

These subclasses implement the getValue method as shown in Figure 6.

Useful Implementations

ObjectDocumentBuilder

ObjectDocumentBuilder is probably the most useful of the three implementations I’ve provided (see Listing 2). You’ll notice that the constructor (lines 18-21) accepts both a document template and an Object. The Object will be used in the getValue method to retrieve values from its accessor Methods (using reflection — if you’re unfamiliar with Java’s “reflection” APIs, see the tutorial at <http://java.sun.com/docs/books/tutorial/reflect/index.html>).

Lines 40 and 41 are where I actually grab a reference to the Method, using the fieldCode argument as the name of the Method and the Object model as the target. I wrap this call in a try/catch block, so if any Exceptions are thrown, I simply return null.
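The reflective lookup can be sketched as below. This is an approximation rather than the code in Listing 2: the ReflectiveLookup class name is hypothetical, the field code is assumed to be the bare name of a public, no-argument accessor, and the result is converted with toString().

```java
import java.lang.reflect.Method;

final class ReflectiveLookup {
    // Treats fieldCode as the name of a public no-argument method on the
    // model, invokes it, and returns the result as a String.
    static String getValue(Object model, String fieldCode) {
        try {
            Method accessor = model.getClass().getMethod(fieldCode);
            Object result = accessor.invoke(model);
            return result == null ? null : result.toString();
        } catch (Exception e) {
            // No such method, inaccessible method, or the accessor itself
            // threw: follow the article and return null.
            return null;
        }
    }
}
```

Given a Customer with a public getName() accessor, getValue(customer, "getName") would return the customer's name, while a misspelled field code would quietly come back as null.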

ResultSetDocumentBuilder

ResultSetDocumentBuilder is another useful implementation of DocumentBuilder (see Listing 3). You’ll notice that the constructor (lines 20-23) accepts both a document template and a JDBC ResultSet. The ResultSet is used in the getValue method to retrieve values from its getString(String fieldName) method.

Line 35 is where I actually grab the value of the field in the ResultSet. If the field is not found, or there are any other Exceptions, null is returned.
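In sketch form, that lookup amounts to very little code. The ResultSetLookup class name is hypothetical, and the sketch assumes the ResultSet is already positioned on the row to be merged.

```java
import java.sql.ResultSet;

final class ResultSetLookup {
    // Uses the field code as a column name in the current row.
    static String getValue(ResultSet row, String fieldCode) {
        try {
            return row.getString(fieldCode);
        } catch (Exception e) {
            // Unknown column, closed ResultSet, etc.: return null.
            return null;
        }
    }
}
```

Because the field codes are just column names, a template for this builder can be written by anyone who knows the query's result columns.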

HashtableDocumentBuilder

HashtableDocumentBuilder (see Listing 4) is similar to ResultSetDocumentBuilder, except that instead of wrapping a JDBC ResultSet, it wraps a Hashtable. The Hashtable is used in the getValue method to retrieve values from its get(Object key) method.

Line 36 is where I actually grab the value Object stored in the Hashtable. The fieldCode argument is used as the key, and if it is not found, or there are any other Exceptions, null is returned.
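A rough equivalent of that lookup is sketched here. The HashtableLookup class name is hypothetical, and the generic type parameters and the toString() conversion are assumptions that may not match Listing 4.

```java
import java.util.Hashtable;

final class HashtableLookup {
    // Uses the field code as the key and returns the stored value as a String.
    static String getValue(Hashtable<String, ?> values, String fieldCode) {
        try {
            Object value = values.get(fieldCode);
            return value == null ? null : value.toString();
        } catch (Exception e) {
            // For example, Hashtable.get(null) throws NullPointerException.
            return null;
        }
    }
}
```

This is the lightest of the three approaches: the caller fills a Hashtable with whatever values the template needs, with no model object or database involved.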

Moving On

As you can see, the DocumentBuilder utility classes are very useful. The three implementations provided should give you most of what you need, but other implementations can easily be created to meet your specific needs. So whether you need to create SQL, HTML, XML, or just plain ASCII, DocumentBuilder will help you do this easily, while relieving your application code from having hard-coded String concatenation.

Matt Nicolls is president and founder of Streamlined Technology LLC, a software-design firm based in St. Louis. Streamlined Technology provides Java mentoring, application design and development services, as well as packaged software solutions. In addition to being a member of MENSA, Matt is an expert in both enterprise software development and object-oriented analysis and design and has worked in the IT field for over 10 years. Visit his company website at <www.stream-tech.com> or email him directly at matt@stream-tech.com.