
BackFire: A Flexible Backup Tool

Robert Sciuk

It is said that systems administration, like flying fighter planes from an aircraft carrier, is 90% boredom interspersed with moments of sheer terror. Okay, I said it, but there is an element of truth to it. To mitigate the panic involved when your system drive fails and a career-changing moment hangs upon your ability to recover your boss's family photos, I present the evolution of a script, called BackFire, which had its origins as NFS_Backup (see "Backup Scripts from UnixReview.com" in the April 2002 issue of Sys Admin magazine: http://www.samag.com/documents/s=7033/sam0204d/) and has been in production in various incarnations since 1996.

What Is BackFire?

BackFire is a solution for heterogeneous network backup and is capable of backing up Windows shares, local directories, or Linux, BSD, or proprietary Unix NFS links. More importantly, BackFire was conceived to be extended simply and easily to do much more. BackFire evolved from a simple Bourne shell script and was rewritten in Tcl a few years back. In preparation for this article, BackFire was given a facelift and a graphical administration tool was developed to handle setup and manage the targets.

Why Roll Your Own?

With a number of commercial offerings available, you might wonder what advantages exist in developing your own backup script. A major consideration is cost, but more important is the flexibility to extend the functionality as required -- to essentially "future-proof" your backup solution. In keeping with that idea, I decided to use open source tools wherever possible and to make best use of the machine's offerings. Although tar is used exclusively for archiving the target files, there is no reason why it could not be replaced with cpio, or any other file archiver for that matter; the choice is entirely yours. BackFire and its various predecessors have been in production use for a number of years and have served their role admirably.

A Plethora of Scripting Languages -- Why Tcl/Tk?

In the last decade or so, scripting languages have earned the respect of serious computer jocks, and their proliferation is testament to the popularity of non-compiled applications. Given the many choices of language (Perl, Python, Ruby, Java, Rexx, etc.), I settled upon a particular favorite of mine, Tcl/Tk. As a student of scripting languages, I had just purchased a book on Ruby, but even as I set about learning yet another scripting language, Tcl still had two major advantages to recommend it. First, I had already been through a somewhat thorough learning curve with it, and the language remains one of the easiest to maintain after six months away from the code. Second, because I suspected that BackFire would be an evolving utility, I fell back to the "devil you know" and have not regretted that decision for an instant.

SQLite: The Database of First Resort

The original NFS_Backup tool was designed to use a flat configuration file to describe the target mount points for each backup archive. With the goal of simple extensibility, it was felt that anyone who could use a text editor could make the necessary changes to the .CFG file to add or remove an archive from the nightly run. As BackFire evolved, the .CFG file was retained, but as I approached this article, a long-desired feature came to mind, which was to use an SQL-based database to hold the archive data. As a function of using an SQL database, we have the opportunity to enhance the security of the configuration by extending password protection to include a Blowfish encryption algorithm rather than simply relying upon the chmod protections afforded to a configuration script readable only by the superuser.

I am familiar with a number of DBMSs, including MySQL, PostgreSQL, Firebird, and various commercial offerings, and I excluded them all as having too large a footprint to rationalize embedding in a system tool of this nature. Without too much agonizing, using SQLite was a decision that practically made itself. Designed as an embedded DBMS, but with the ACID properties one might expect of a transactional platform, D. Richard Hipp's gift to the open source world is truly a remarkable achievement. The icing on the cake was the fact that SQLite comes right out of the box with a well-designed and tight integration with the Tcl/Tk language, and thus Hipp has reduced the effort of binding DBMS platform to language tool to trivia. Notable too is SQLite's license, which takes the form of a "blessing" and can be found in the source.

Online, Nearline, Offline, Offsite...

While nothing is more convenient for a computer user than to have his files reside upon the local hard drive, it is just such a device that we need to protect. BackFire will copy portions of the shared drive or directory, or a path within the NFS mount on the backup server into a compressed tar file, organized by "Target". A Target is typically derived as some function of machine and user, and each Target will have its own subdirectory on the Backup Server, which contains <N> images of the form "bkp_YYYYMMDD_TARGET.tgz". In this way, each nightly snapshot, while segregated by Target, makes date and archive information, which is embedded in the name of the backup file, available to the end user.
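The naming convention is easy to reproduce. What follows is a minimal sketch in Python rather than BackFire's Tcl, and it assumes only the layout described above: one subdirectory per Target holding files named "bkp_YYYYMMDD_TARGET.tgz". The function name, the /backup root, and the alice_pc Target are hypothetical.

```python
from datetime import date
from pathlib import Path


def archive_path(root: Path, target: str, day: date) -> Path:
    """Return <root>/<target>/bkp_YYYYMMDD_<target>.tgz for the given day."""
    # date objects accept strftime codes inside a format spec
    return root / target / f"bkp_{day:%Y%m%d}_{target}.tgz"


# Hypothetical Target name and archive root, for illustration only.
example = archive_path(Path("/backup"), "alice_pc", date(2006, 5, 1))
print(example.name)  # bkp_20060501_alice_pc.tgz
```

Because the date sorts lexically in this form, a plain directory listing of a Target's subdirectory is already in chronological order.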

It is possible, and in some circumstances desirable, to place each Archive into a read-only Samba share, accessible by user/password to allow the owners of the archived data to recover their own backups. In this way, an end user would at least in theory be able to recover his own lost data using the familiar WinZip tool, and moreover, do so without any intervention on the part of the administrator.

Setting up the shares on the target computers, Samba configuration of the backup server, end-user training, and convincing end users that their important data should reside on that portion of their C: drive that is actually backed up each night is left as an exercise for the reader. Typically, by moving the desktop or work directory onto the password-protected share, this effort will become somewhat painless, even for the most reticent end user.

By burning the results of each nightly backup onto a DVD, the latest data can be removed from the system, and potentially removed from the site to facilitate off-site storage procedures. BackFire maintains no volume expiration or protection mechanism and will silently overwrite any re-writeable DVD left in the drive during its operation. BackFire does not currently check to ensure that the .iso is under the 4.7 GB commonly available to DVD burners, but the next generation of DVD burner should further increase the capacity and utility of DVD as removable media.
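The missing size check would not take much code. The sketch below is in Python rather than BackFire's Tcl and is not part of the distributed script. The capacity constant is an assumption: it uses the nominal 4.7 GB decimal figure, and the exact usable capacity varies slightly by media type, so a production check should leave some margin.

```python
import os

# Assumption: nominal single-layer capacity, 4.7 GB in decimal bytes.
# Real media differ slightly, so treat this as an upper bound.
DVD_CAPACITY_BYTES = 4_700_000_000


def fits_on_dvd(iso_path, capacity=DVD_CAPACITY_BYTES):
    """Return True if the image at iso_path is no larger than capacity bytes."""
    return os.path.getsize(iso_path) <= capacity
```

A nightly run could call this before invoking the burner and report an oversized image in the emailed summary instead of attempting the burn.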

The Backup Server

As is typical when I implement BackFire at a client site, the platform of choice is FreeBSD. Whenever I require a simple to maintain, secure, and scalable box, headless or otherwise, I tend to gravitate towards a 'BSD. And on Intel commodity hardware, there is no better integration for this type of workhorse platform than FreeBSD. The hardware itself is unremarkable, and anything approaching a 1-GHz box with 256 MB of RAM, a DVD burner, and a couple of large hard drives should do just fine.

SAMBA and NFS Do the Heavy Lifting

As mentioned, BackFire is an evolving utility and will likely change to suit my (client's) needs. For now, BackFire assumes that any NFS links are in place when the script runs, and there is no code in place to manage the NFS mount and umount commands. I do not check for the truth of this assumption, though one might easily see where in the code to place those policy-enforcing sniglets. Currently, BackFire assumes one of three different types of archive: SAMBA, NFS, or a local Unix directory, though the code for the NFS and Unix directory types is currently shared. The Tcl procedure "backup_Target" contains a switch that determines which procedure should be called to perform the specific work at hand, and each of those procedures implements a wrapper for either tar or smbtar as the case may be. (Complete source for BackFire is available from the Sys Admin Web site: http://www.sysadminmag.com.)
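The shape of that dispatch can be sketched as follows. This is Python rather than BackFire's Tcl, and it only builds the command lines instead of running them. The dictionary keys, the "LOCAL" type label, and the function names are hypothetical; the smbtar options shown (-s server, -x share, -u user, -p password, -t output file) follow the smbtar manual page as I understand it and should be checked against the installed Samba version.

```python
def tar_command(target, archive):
    """NFS mounts and local directories share one code path."""
    return ["tar", "-czf", archive, "-C", target["path"], "."]


def smbtar_command(target, archive):
    """Windows shares go through smbtar, which writes an uncompressed tar."""
    # Note: a password on the command line is visible to other local users.
    return ["smbtar", "-s", target["server"], "-x", target["share"],
            "-u", target["user"], "-p", target["password"], "-t", archive]


# Hypothetical type labels; one handler per archive type.
HANDLERS = {"SAMBA": smbtar_command, "NFS": tar_command, "LOCAL": tar_command}


def backup_command(target, archive):
    """Pick the handler for this target's type, as a switch statement would."""
    try:
        handler = HANDLERS[target["type"]]
    except KeyError:
        raise ValueError(f"unknown target type: {target['type']!r}")
    return handler(target, archive)
```

Adding a new kind of Target then amounts to writing one more handler and registering it in the table, which is the kind of extension the switch is meant to invite.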

Should it become necessary in the future to provide sophisticated NFS management, the BackFire script already has the necessary structure in place. Furthermore, if one needed to add perhaps an FTP or Macintosh backup facility, the plumbing lends itself to the necessary extensions.

As mentioned, the Windows backups do not require a Samba mount, but rather use the "smbtar" script, which is included with the Samba 3.x distribution and is based upon the very useful "smbclient" tool. A current limitation of smbtar is that it does not offer the -z (gzip compression) option found in the version of tar shipped with FreeBSD. To keep the nearline archives on the backup server uniform, a separate gzip operation is performed on the SMB archives, so the same procedure can be used to recover any of the three backup types.
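That second pass is simple to picture. The sketch below, in Python rather than BackFire's Tcl, compresses an existing tar file into a .tgz and removes the uncompressed original. BackFire itself calls the gzip program; the function name and the use of Python's gzip module here are illustrative substitutes.

```python
import gzip
import os
import shutil


def compress_tar(tar_path, tgz_path):
    """Gzip tar_path into tgz_path, then delete the uncompressed tar."""
    with open(tar_path, "rb") as src, gzip.open(tgz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # streams in chunks, so large archives are fine
    os.remove(tar_path)
    return tgz_path
```

The result can be unpacked with the same tar -xzf invocation used for the NFS and local archives.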

So, What Does It Really Do?

Backup.TCL is launched by cron to run after hours and, for each of its TARGET archives, it will create a dated archive specific to the platform that is retained on the backup server for a user-specified number of days. This retention policy is implemented in the ironically named Tcl procedure "retention_Policy", which expects the user-specified number of days to retain online and the name of the path to check. This is an important way of ensuring that the backup server does not overrun available disk space.
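A retention pass of this kind might look like the sketch below, written in Python rather than BackFire's Tcl. It takes the same two inputs the article describes, a number of days and a path. Two things are assumptions: that age is judged by file modification time (the original could equally parse the date in the file name), and that only files matching bkp_*.tgz are candidates for removal.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def retention_policy(days, path, now=None):
    """Delete bkp_*.tgz files under path older than `days`; return their names."""
    now = time.time() if now is None else now
    cutoff = now - days * SECONDS_PER_DAY
    removed = []
    for archive in sorted(Path(path).glob("bkp_*.tgz")):
        # Assumption: modification time stands in for the backup date.
        if archive.stat().st_mtime < cutoff:
            archive.unlink()
            removed.append(archive.name)
    return removed
```

Restricting the glob to the archive naming pattern keeps a mistyped path from emptying an unrelated directory.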

The program logic is spelled out pretty clearly in the Tcl procedure "main" (yes, yes, I know, I still use C extensively), and the logical flow is fairly simple to follow (see Listing 1).

There are two output streams. The stdout from the Backup.TCL script serves as a summary of the events of the session and makes for a nice email report to the person on-site who cares about such things. Errors are logged into a file that is kept for two nights (Bkp_Log and Bkp_Log.old; see Listing 2). This should probably be changed to respect the retention policy, but I have not found it necessary given that the nightly output is emailed on a daily basis, and if one night's backup fails (owing to an unexpected shutdown), there are <N>-1 others to choose from.
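The two-night log scheme amounts to a single rename at the start of each run. Here is a minimal sketch in Python rather than BackFire's Tcl; the function name is hypothetical, and only the Bkp_Log and Bkp_Log.old file names come from the script.

```python
import os


def rotate_log(log_path):
    """Keep last night's log as <log_path>.old and start a fresh, empty log."""
    if os.path.exists(log_path):
        # os.replace overwrites any existing .old file, so only two nights survive
        os.replace(log_path, log_path + ".old")
    open(log_path, "w").close()
```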

In Practice, I'm Disappointed with Theory

When BackFire was implemented on a FreeBSD 5.3-RELEASE platform, with a particularly flakey network card (both the driver name and manufacturer will be left to the reader's imagination), I saw some disconcerting failures, and often the network was left in an unusable state. Because of the workout that BackFire will give a network, it is best to run BackFire from a late night cron script, and it is crucial to have solid hardware and drivers. As a workaround, it became necessary to reset the network interface between each Target, and a simple ifconfig <interface> down/up seemed to suffice. In keeping with this, the ifconfig program was duly added to the required list of system resources, as was the interface.
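The workaround is nothing more than two ifconfig calls per Target. A sketch in Python rather than BackFire's Tcl follows. The /sbin/ifconfig path, the pause between the two calls, and the function name are assumptions, and the command runner is passed in so that the sequence can be exercised without root privileges.

```python
import subprocess
import time


def reset_interface(interface, ifconfig="/sbin/ifconfig",
                    run=subprocess.check_call, pause=2.0):
    """Take the interface down, wait briefly, then bring it back up."""
    run([ifconfig, interface, "down"])
    time.sleep(pause)  # assumption: a short pause lets the driver settle
    run([ifconfig, interface, "up"])
```

Called between Targets as reset_interface("em0"), where em0 is a hypothetical interface name, this requires the script to run as root, which a backup launched from root's crontab already does.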

Another issue revolved around the burning of DVDs with the "growisofs" utility. It seems that the developers of growisofs thought that overwriting media from a batch script was dangerous, and the only way to do so was to use an undocumented feature involving a reference to an imaginary character in a galaxy many light years away. Moreover, growisofs, when used to burn a directory directly onto media, uses the "mkisofs" utility, and if the MKISOFS environment variable is not set, then growisofs will fail. I mention these little annoyances in the hope that the mystical behaviors found within the BackFire code can be better explained than as simple eccentricities of the author.

A Simple Schema for Simple Scripts

The schema for the SQLite database is almost trivial, and the target table corresponds directly to the old .CFG file used in BackFire's predecessors. For the interested student, the SQL schema may be found in the GlobalDefaults file. This was done so that the schema would be easy to find in case it needed to be changed or extended. As the database eliminates the use of delimited fields in flat files, it at once simplifies the code, and to some extent obscures it. Code to access the database is kept in the Database.TCL script. The system resources are maintained in the resource record, and these resources are used throughout the .TCL scripts that comprise the BackFire utility.
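To give a feel for how little is involved, here is a hypothetical schema of the same general shape, exercised through Python's sqlite3 module rather than the Tcl binding BackFire uses. Every table and column name below is invented for illustration; the real definitions are the ones in the GlobalDefaults file.

```python
import sqlite3

# Hypothetical schema: one row per backup Target, one row per system resource.
SCHEMA = """
CREATE TABLE IF NOT EXISTS target (
    name     TEXT PRIMARY KEY,
    type     TEXT NOT NULL,      -- e.g. SAMBA, NFS, or a local directory
    path     TEXT NOT NULL,
    username TEXT,
    password TEXT                -- stored encrypted, never in the clear
);
CREATE TABLE IF NOT EXISTS resource (
    name  TEXT PRIMARY KEY,
    value TEXT NOT NULL
);
"""


def open_config(db_path=":memory:"):
    """Open (or create) the configuration database and ensure the tables exist."""
    db = sqlite3.connect(db_path)
    db.executescript(SCHEMA)
    return db


db = open_config()
db.execute("INSERT INTO target (name, type, path) VALUES (?, ?, ?)",
           ("alice_pc", "NFS", "/mnt/alice"))
db.execute("INSERT INTO resource (name, value) VALUES (?, ?)", ("retention_days", "7"))
for name, kind, path in db.execute("SELECT name, type, path FROM target"):
    print(name, kind, path)
```

Compared with a delimited flat file, the queries replace the field-splitting code, at the cost of needing a tool rather than a text editor to inspect the configuration.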

Security Starts at Home

One consideration of caching the SMB user name and passwords necessary to gain access to the Windows shares is that of security. While the backup server is considered a "back-room" box and can be kept under lock and key for the most part, it is still susceptible to network access and hacking. As I typically use the administrator login and passwords to access the Windows-based Targets, significant efforts must be made to keep that data away from unauthorized prying eyes.

The sys::encrypt procedure uses the OpenSSL Blowfish encryption cipher to encipher the password data. This, of course, is reversible, and the equivalent sys::decrypt procedure is provided, though a dedicated and persistent hacker may find the key value. One might wish to revisit the security aspects of BackFire should it be run in a hostile environment. Of course, password management leads directly to the next topic, which is the setup, configuration, and administration of BackFire.

The BackFire Administration Tool

The need for an administration tool arises primarily from the need to handle and encrypt passwords. While this can be done simply in a flat file that contains the create table and insert row SQL commands, it was felt that a simple Tk-based GUI tool would launch BackFire into the "big leagues" of home-brewed, script-based backup tools, if there is such a thing. The resources are managed in the tabs labeled Archives, Programs, Devices, and Retention. The <BROWSE> button will search the path for an appropriate executable file where appropriate or open a directory search dialogue when needed. The Targets are administered in the Target tab by means of a small form that allows the user to create, edit, and delete targets simply and easily.

Co-Dependence -- A Harsh Mistress

As BackFire is just a glorified wrapper for a number of tools already close to hand for any Unix/Linux type administrator, there is no magic here. Indeed, the list of machine resources required to run BackFire is quite large and includes about 18 items: utility programs, a network device and DVD burner, source and archive paths for the script source, the archive location, the logfile, the directory for the daily snapshot and its .iso image, as well as the file retention policy. These are itemized, if not defined, in the GlobalDefaults file, which is included by the scripts that need it. The BackFire administration tool allows you to easily define the programs and devices, as well as the target directories and retention policy.

Room for Improvement

Like many tools, BackFire has grown out of a need. Re-written from Bourne Shell into Tcl/Tk, it has become more robust and error resistant, but much more can be done. As mentioned, the NFS volume management would be relatively easy to add, but much of the improvement might come by way of extending the BackFire GUI tool from an administrator's tool to more of an end-user file backup/recovery scheduler. This is a likely next step in the evolution of BackFire and may well come to pass in the near term.

BackFire was tested with relatively recent versions of FreeBSD, Tcl, Samba, and SQLite. Note that to support tabbed notebooks, the BWidgets toolkit must be added to the Tcl/Tk release for compatibility.

Resources

BWidgets 1.7.0 -- FreeBSD ports

FreeBSD 6.0-RELEASE -- http://www.freebsd.org

Growisofs -- FreeBSD ports

Gzip/gunzip -- FreeBSD

Mkisofs -- FreeBSD ports

Samba 3.0.20b -- http://www.samba.org

SQLite 3.3.4 -- http://www.sqlite.org

Tcl/Tk 8.4.10 -- http://www.tcl.tk

Robert S. Sciuk has been involved in the software industry since the 1970s and is currently the director of Control-Q Research, located near Toronto, Ontario, Canada. Control-Q specializes in open software solutions and is active in software development, complex Web hosting, and network and management consulting for small and mid-size businesses.