Article jul2006.tar

Arts and WebWatcher for Document Management

Joe Blaylock

The creation and maintenance of documentation arguably constitutes the most important job we have as systems administrators. It informs every aspect of our work in ways so pervasive that we hardly consider them documentation any more. We comment configuration files. We send email to users, explaining how to do things. We stand around the water cooler, talking about past co-workers and debating best practices. We write more email, for other sys admins or for ourselves, hashing things out further. All this documentation has tangible value. The users can figure out how to get their work done. The programmers know why you set things up that way, and why you'll deny their request to change it. You can remember six months from now why you chose not to use the CVS pserver and explain it to your boss. Good documentation can save you work, but more importantly, it can inform your planning, giving you a solid basis for future work.

Why, then, does writing documentation so often get the short end of the stick? In my conversations with other sys admins, it seems like we always agree that writing documentation ranks on the desirable job duties scale somewhere below taking out the trash and barely above tidying up our cubicles. Or else, we think we might want to do it someday but not until we have more time. It gets put off again and again. Then somebody absolutely must have information about the Frobnicator 3000, and we write just enough to help them get by. Naturally, the documentation produced this way goes out of date almost as quickly as it's written, and we subsequently ignore it. Maintaining old documentation seems even less popular than generating new -- a source of annoyance exacerbated by users' disturbing tendency to consider old documentation accurate.

If your work habits are like mine, you probably write documentation in a completely ad hoc fashion, satisfying individual requests as they come. You probably get it where it needs to go via email, too. And it probably won't surprise you to hear that I've rewritten some end-user instructions at least half a dozen times, with only minor variations between them. Sometimes I remember that I've written something before, so I grep my sent-mail folder, clean it up, and send it out again. Only rarely does any particular piece of documentation get requested often enough for me to make a Web page out of it. Naturally, those pages go out of date quickly.

Never mind historical context. Requests-for-quote, historical digressions, meeting summaries, commentaries on the local culture, impassioned essays about password hygiene. All get produced as one-offs and sent to the appropriate mailing lists with hardly a second thought. There, they moulder in the mountainous archives, never to be read again. They don't have even the slightest chance of making it into out-of-date Web pages.

All of that changed for me when I heard about a pair of simple tools: Arts [1] and WebWatcher [2]. They changed the way I handle all manner of administrative documentation. Each does one thing, and does it well. Neither alone can change your life, but together they provide just enough functionality to begin to make sense of the madness.

Arts

Chris Dent wrote Arts long ago, while working at a small Midwestern ISP, to address just these sorts of documentation issues. The Arts package consists of a pair of scripts that provide a document repository and manage metadata on that repository. The first of these, arts.gw, gets configured as a delivery pipeline in your MTA or procmail. It takes incoming messages, adds some metadata (including a default expiration time), and sticks the messages in a predefined location as Web documents. The second, arts.ct, gets called from cron, or more rarely the command line, and generates a Web page with a table of contents of all of the Arts documents, organized by section.
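To make the gateway idea concrete, here is a minimal sketch of what a delivery-pipeline filter of this kind might do. It is not the arts.gw source. The function name, the HTML comment format for the metadata, and the 90-day default are all assumptions made for illustration.

```python
import email
import html

DEFAULT_LIFETIME_DAYS = 90  # assumed default; Arts' real default may differ


def message_to_document(raw: str, lifetime_days: int = DEFAULT_LIFETIME_DAYS) -> str:
    """Turn a raw mail message into a small Web page with metadata comments."""
    msg = email.message_from_string(raw)
    title = msg.get("Subject", "(no subject)")
    owner = msg.get("From", "unknown")
    body = msg.get_payload()
    if not isinstance(body, str):
        # Multipart mail needs real handling; this sketch just notes it.
        body = "(multipart message omitted in this sketch)"
    return (
        "<html><head>\n"
        f"<title>{html.escape(title)}</title>\n"
        # Hypothetical metadata format, invented for this example.
        f"<!-- owner: {html.escape(owner)} -->\n"
        f"<!-- expires: {lifetime_days} days -->\n"
        "</head><body>\n"
        f"<pre>{html.escape(body)}</pre>\n"
        "</body></html>\n"
    )
```

A filter built this way would read the message from standard input (sys.stdin.read() in Python) when the MTA or procmail pipes mail to it, then write the result into the repository directory.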

Arts has a straightforward configuration and installation process, and I refer you to its documentation. I will note, however, that Arts enables anyone who can send messages to it to put data on your disk. Obviously, this can cause problems, and you should take steps to restrict who may send messages to Arts and probably also how much disk space Arts can use. Arts comes with some simple restrictions. You can choose which user groups or address domains can add documents to the repository, for example. However, anyone who spoofs an email address can easily defeat these checks. The diligent sys admin will want to restrict things more. Careful configuration of the MTA can help.
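As an illustration of the kind of first-line check described here, the sketch below accepts a message only if its From address falls in an allowed domain and the message stays under a size cap. The names ALLOWED_DOMAINS, MAX_MESSAGE_BYTES, and may_submit are hypothetical and do not come from Arts. Because the From header is trivial to forge, treat a check like this as a courtesy filter rather than as security.

```python
from email.utils import parseaddr

ALLOWED_DOMAINS = {"example-company.com"}  # hypothetical site policy
MAX_MESSAGE_BYTES = 256 * 1024             # hypothetical per-message cap


def may_submit(from_header: str, raw_size: int) -> bool:
    """Return True if the sender's domain is allowed and the message is small enough."""
    _name, address = parseaddr(from_header)
    # Everything after the last "@" is the domain; empty if no address parsed.
    domain = address.rpartition("@")[2].lower()
    return domain in ALLOWED_DOMAINS and raw_size <= MAX_MESSAGE_BYTES
```

Stronger restrictions, such as accepting mail for the Arts address only from local or authenticated senders, belong in the MTA configuration itself.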

Configuring and installing Arts comprises only a small part of its deployment, of course. Developing the habits to use it requires more effort. The process does come easily, but it can take a while before it comes naturally. A typical usage scenario might go like this:

    Bob wants to change his password on the locally installed Frobnicator software. He sends email to the archived systems administrators' mailing list, systems@example-company.com. Alice gets the message via the list and writes a set of instructions detailing password changes on Frobnicator and several other quirky systems. She does a group reply, so that the message goes to both Bob and the systems list, and hits send.

    The next day, Carl comes back from vacation and notices the instructions. He remembers that Drew just asked him about Frobnicator passwords the week before, and so he uses his email client's bounce command to send a copy of Alice's instructions to the Arts address. Now, Alice's instructions reside in a central, Web-accessible location, and Bob, Drew, and everyone else can benefit from them.

    Later, Frances emails systems and asks for help setting her user preferences in Frobnicator. Once again, Alice promptly sends a detailed email to both Frances and the systems list. A conversation ensues between Alice and Carl, and they decide that the message should get preserved. Alice bounces the message to Arts. When Bob needs to update his user preferences, he finds the instructions already written and so doesn't bother to email systems.

I hope you can see by now how even a simple tool like this can help significantly with your documentation tasks by giving you a filing system that fits (almost) transparently into your workflow. It also provides an organization system. The title of an Arts document gets set to the subject line of your email (adjustable later, of course). You can prepend a word or phrase followed by a hash mark to indicate a "section" in the table of contents into which a given document will go. For example:

To: myarts@example-company.com
Subject: Frobnicator 3000#How to set your password

You go to http://www.example-company.com/frobnicator3000 and click on
'Change Password'.  Then you change it. Have a nice day.
            
If Alice had prepended her emails to Bob and Frances with the Frobnicator section like this example, then the next time arts.ct generated the table of contents her documents would collect together in one place.
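The section convention is simple enough to sketch in a few lines. The code below splits a subject at the first hash mark and groups titles by section, roughly the job a table-of-contents generator has to do. This is an illustration, not the arts.ct source. The function names and the "Unfiled" fallback section are assumptions.

```python
from collections import defaultdict

DEFAULT_SECTION = "Unfiled"  # hypothetical name for documents without a section


def split_subject(subject: str) -> tuple[str, str]:
    """Split 'Section#Title' at the first hash mark."""
    section, sep, title = subject.partition("#")
    if not sep:
        # No hash mark: the whole subject is the title.
        return DEFAULT_SECTION, subject.strip()
    return section.strip(), title.strip()


def build_contents(subjects: list[str]) -> dict[str, list[str]]:
    """Group document titles by section, sorted for stable output."""
    contents: dict[str, list[str]] = defaultdict(list)
    for subject in subjects:
        section, title = split_subject(subject)
        contents[section].append(title)
    return {section: sorted(titles) for section, titles in sorted(contents.items())}
```

Given the subjects "Frobnicator 3000#How to set your password" and "Frobnicator 3000#Setting user preferences", both titles land under a single "Frobnicator 3000" heading.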

WebWatcher

Dent wrote WebWatcher at approximately the same time as Arts but didn't initially anticipate their use together [3]. WebWatcher compares an "expires on" stamp (added by hand or by arts.gw) in some set of Web pages with the file's age on disk. (Of course, you may change the regular expressions that WebWatcher uses to parse documents and use it to watch things other than Web pages. Discussion of such matters goes beyond the scope of an introductory article such as this.)
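The comparison itself can be sketched briefly. The example below assumes, purely for illustration, that the stamp is an HTML comment giving a lifetime in days (or "never") and that a file's age is measured from its modification time. WebWatcher's real stamp syntax and regular expressions are described in its own documentation and may differ.

```python
import re
from typing import Optional

# Hypothetical stamp format: <!-- expires: 30 days --> or <!-- expires: never -->
STAMP = re.compile(r"<!--\s*expires:\s*(never|\d+)\s*(?:days?)?\s*-->", re.IGNORECASE)
SECONDS_PER_DAY = 86400


def lifetime_days(text: str) -> Optional[int]:
    """Return the lifetime in days, or None if the stamp is missing or says never."""
    match = STAMP.search(text)
    if match is None or match.group(1).lower() == "never":
        return None
    return int(match.group(1))


def is_expired(text: str, mtime: float, now: float) -> bool:
    """A document expires when its age on disk exceeds its stamped lifetime."""
    days = lifetime_days(text)
    if days is None:
        return False
    return (now - mtime) > days * SECONDS_PER_DAY
```

In a real checker, mtime would come from os.path.getmtime(path) and now from time.time(). In this sketch a page with no stamp never expires, which is a design choice you might well want to reverse.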

When you run WebWatcher from cron, it generates output when static Web content has expired. These complaints should help spur you to keep the files up to date. You can configure WebWatcher to watch multiple different collections of files and complain to the owner recorded in each file's metadata section when it expires. It even has a boss feature; if you let your work slide for too long, it emails a third party.
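The escalation logic amounts to choosing who hears about an overdue page. Here is a minimal sketch, assuming a fixed grace period before the third party gets copied. The 14-day figure and the function name are invented for this example and are not WebWatcher settings.

```python
BOSS_GRACE_DAYS = 14  # hypothetical: how long the owner has before escalation


def recipients(owner: str, boss: str, days_overdue: int) -> list[str]:
    """Decide who should receive a complaint about one expired document."""
    if days_overdue <= 0:
        return []             # not expired yet: stay quiet
    if days_overdue > BOSS_GRACE_DAYS:
        return [owner, boss]  # ignored for too long: copy the third party
    return [owner]
```

A cron job would call something like this once per expired file and hand the resulting addresses to the local mailer.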

WebWatcher by itself solves a pretty annoying problem. The real value to the working systems administrator, however, comes from combining WebWatcher and Arts. Now, in addition to your (almost) transparent system for documentation creation and filing, you get (user adjustable) periodic reminders to keep it all up to date. If you expect some particular document not to change (like meeting minutes or requests-for-quote, for example), set it to never expire. If another changes all the time (a running log of progress on some project, say), set it to expire in just a few days.

Like Arts, the proper use of WebWatcher requires the internalization of some new practices. In particular, if you want timely documentation, you must consider WebWatcher's nag emails as high-priority tasks. Reviewing and updating your documents shouldn't get put off any longer than necessary. If a particular document generates too many reminders, then review it anyway, and adjust the expiration period so that it will stay fresh longer. Refrain from simply using "touch" on documents without review just to get WebWatcher to stop complaining. When you do so, you admit that you don't care about your documentation. Why use WebWatcher at all, if you don't care about documentation?

Conclusion

Ultimately, good documentation comes from you. You will have to make the production and maintenance of documentation into a daily habit, like reading email or getting that first cup of coffee in the morning. The discipline to consistently send good email into the document repository or to update old documents must come from you. Once acquired, however, these habits can save you time and work. Tools like Arts and WebWatcher can help ease the process, making the habits easier to acquire and live with and helping you get on to cooler work more quickly.

References

1. Arts currently resides at: http://www.jrbl.org/projects/arts

2. WebWatcher hails from: http://www.jrbl.org/projects/webwatcher

3. Chris Dent, the author of WebWatcher and Arts, still works on projects to make communication and documentation easier at: http://www.burningchrome.com/~cdent

The search for good pizza brought Joe Blaylock to Indiana University, but a job as a systems administrator made him stay. He dreams of one day becoming a successful dog trainer, or science fiction author. Send questions or comments to blaylock@indiana.edu.