BackFire:
A Flexible Backup Tool
Robert Sciuk
It is said that systems administration, like flying fighter planes
from an aircraft carrier, is 90% boredom interspersed with moments
of sheer terror. Okay, I said it, but there is an element
of truth to it. To mitigate the panic involved when one's system
drive fails and a career-changing moment hangs upon your ability
to recover your boss's family photos, I present the evolution of
a script, called BackFire, which had its origins as NFS_Backup (see
"Backup Scripts from UnixReview.com" in the April 2002 issue of
Sys Admin magazine: http://www.samag.com/documents/s=7033/sam0204d/)
and has been in production in various incarnations since 1996.
What Is BackFire?
BackFire is a solution for a heterogeneous network backup and
is capable of backing up Windows shares, local directories, or Linux,
BSD, or proprietary Unix NFS links. More importantly, BackFire was
conceived to be extended simply and easily to do much more. Backfire
evolved from a simple Bourne shell script and was rewritten in Tcl
a few years back. In preparation for this article, BackFire was
given a facelift and a graphical administration tool was developed
to handle setup and manage the targets.
Why Roll Your Own?
With a number of commercial offerings available, you might wonder
what advantages exist in developing your own backup script. A major
consideration is cost, but more important is the flexibility to
extend the functionality as required -- to essentially "future-proof"
your backup solution. In keeping with that idea, I decided to use
open source tools wherever possible and to make best use of the
machine's offerings. Although tar is used exclusively for archiving
the target files, there is no reason why it could not be replaced
with cpio, or any other file archiver for that matter; the choice
is entirely yours. BackFire and its various predecessors have been
in production use for a number of years and have served their role
admirably.
A Plethora of Scripting Languages -- Why Tcl/Tk?
In the last decade or so, scripting languages have earned the
respect of serious computer jocks, and their proliferation is testament
to the popularity of a non-compiled applications. Given the many
choices of language (Perl, Python, Ruby, Java, Rexx, etc.), I settled
upon a particular favorite of mine, Tcl/Tk. As a student of scripting
languages, I had just purchased a book on Ruby, and in the process
of learning yet another scripting language, Tcl still had two major
advantages to recommend it. First, I've already been through a somewhat
thorough learning curve, and the language remains one of the easiest
to maintain after six months away from the code. Second, because
I suspected that BackFire would be an evolving utility, I fell back
to the "devil you know" and have not regretted that decision for
an instant.
SQLite: The Database of First Resort
The original NFS_Backup tool was designed to use a flat configuration
file to describe the target mount points for backup archive. With
the goal of simple extensibility, it was felt that anyone who could
use a text editor could make the necessary changes to the .CFG file
to add or remove an archive from the nightly run. As BackFire evolved,
the .CFG file was retained, but as I approached this article, a
long-desired feature came to mind, which was to use an SQL-based
database to hold the archive data. As a function of using an SQL
database, we have the opportunity to enhance the security of the
configuration by extending password protection to include a blowfish
encryption algorithm rather than simply relying upon the chmod protections
afforded to a configuration script readable only to the superuser.
I am familiar with a number of DBMSs, including MySQL, Postgresql,
Firebird, and various commercial offerings, and I excluded them
all as having too large a footprint to rationalize embedding in
a system tool of this nature. Without too much agonizing, using
SQLite was a decision that practically made itself. Designed as
an embedded DBMS, but with the ACID properties one might expect
of a transactional platform, D. Richard Hipp's gift to the open
source world is truly a remarkable achievement. The icing on the
cake was the fact that SQLite comes right out of the box with a
well designed and tight integration with the Tcl/Tk language, and
thus Hipp has reduced the effort of binding DBMS platform to language
tool to trivia. Notable too is SQLite's license, which takes the
form of a "blessing" and can be found in the source.
Online, Nearline, Offline, Offsite...
While nothing is more convenient for a computer user than to have
his files reside upon the local hard drive, it is just such a device
that we need to protect. BackFire will copy portions of the shared
drive or directory, or a path within the NFS mount on the backup
server into a compressed tar file, organized by "Target". A Target
is typically derived as some function of machine and user, and each
Target will have its own subdirectory on the Backup Server, which
contains <N> images of the form "bkp_YYYYMMDD_TARGET.tgz".
In this way, each nightly snapshot, while segregated by Target,
makes date and archive information, which is embedded in the name
of the backup file, available to the end user.
It is possible, and in some circumstances desirable, to place
each Archive into a read-only Samba share, accessible by user/password
to allow the owners of the archived data to recover their own backups.
In this way, an end user would at least in theory be able to recover
his own lost data using the familiar WinZip tool, and moreover,
do so without any intervention on the part of the administrator.
Setting up the shares on the target computers, Samba configuration
of the backup server, end-user training, and convincing end users
that their important data should reside on that portion of their
C: drive that is actually backed up each night is left as an exercise
for the reader. Typically, by moving the desktop or work directory
onto the password-protected share, this effort will become somewhat
painless, even for the most reticent end user.
By burning the results of each nightly backup onto a DVD, the
latest data can be removed from the system, and potentially removed
from the site to facilitate off-site storage procedures. BackFire
maintains no volume expiration or protection mechanism and will
silently overwrite any re-writeable DVD left in the drive during
its operation. BackFire does not currently check to ensure that
the .iso is under the 4.7 GB commonly available to DVD burners,
but the next generation of DVD burner should further increase the
capacity and utility of DVD as removable media.
The Backup Server
As is typical when I implement BackFire at a client site, the
platform of choice is FreeBSD. Whenever I require a simple to maintain,
secure, and scalable box, headless or otherwise, I tend to gravitate
towards a 'BSD. And on Intel commodity hardware, there is no better
integration for this type of workhorse platform than FreeBSD. The
hardware itself is unremarkable, and anything approaching a 1-Ghz
box with 256 MB of ram, a DVD burner, and a couple of large hard
drives should do just fine.
SAMBA and NFS Do the Heavy Lifting
As mentioned, BackFire is an evolving utility and will likely
change to suit my (client's) needs. For now, BackFire assumes that
any NFS links are in place when the script runs, and there is no
code in place to manage the NFS mount and umount commands. I do
not check for the truth of this assumption, though one might easily
see where in the code to place those policy-enforcing sniglets.
Currently, BackFire assumes one of three different types of archive:
SAMBA, NFS, or a local Unix directory, though the code for the NFS
and Unix directory are currently shared. The Tcl procedure "backup_Target"
contains a switch that determines which procedure should be called
to perform either the specific work at hand and implements a wrapper
for either tar or smbtar as the case may be. (Complete source for
BackFire is available from the Sys Admin Web site: http://www.sysadminmag.com.)
Should it become necessary in the future to provide sophisticated
NFS management, the BackFire script already has the necessary structure
in place. Furthermore, if one needed to add perhaps an FTP or Macintosh
backup facility, the plumbing lends itself to the necessary extensions.
As mentioned, the Windows backups do not require a Samba mount,
but rather use the "smbtar" script, which is included with the Samba
3.x distritution and is based upon the very useful "smbclient" tool.
A current limitation of smbtar is that it does not contain the -z
(gzip compression) option as does the version of tar shipped with
FreeBSD. To keep the nearline archives spinning on the backup archive
orthogonal, a separate gzip operation is performed on the SMB archives,
and so the same procedure can be used to recover any of the three
discernible backup types.
So, What Does It Really Do?
Backup.TCL is launched by cron to run after hours and, for each
of its TARGET archives, it will create a dated archive specific
to the platform that is retained on the backup server for a user-specified
number of days. This retention policy is implemented in the ironically
named Tcl procedure "retention_Policy", which expects the user-specified
number of days to retain online and the name of the path to check.
This is an important way of ensuring that the backup server does
not overrun available disk space.
The program logic is spelled out pretty clearly in the Tcl procedure
"main" (yes, yes, I know, I still use C extensively), and the logical
flow is fairly simple to follow (see Listing 1).
There are two output streams. The stdout from the Backup.TCL script
serves as a summary of the events of the session and makes for a
nice email report to the person on-site who cares about such things.
Errors are logged into a file that is kept for two nights (Bkp_Log
and Bkp_Log.old; see Listing 2). This should probably be changed
to respect the retention policy, but I have not found it necessary
given that the nightly output is emailed on a daily basis, and if
one night's backup fails (owing to an unexpected shutdown), there
are <N>-1 others to choose from.
In Practice, I'm Disappointed with Theory
When BackFire was implemented on a FreeBSD 5.3-RELEASE platform,
with a particularly flakey network card (both the driver name and
manufacturer will be left to the reader's imagination), I saw some
disconcerting failures, and often the network was left in an unusable
state. Because of the workout that BackFire will give a network,
it is best to run BackFire from a late night cron script, and it
is crucial to have solid hardware and drivers. As a workaround,
it became necessary to reset the network interface between each
Target, and a simple ifconfig <interface> down/up seemed
to suffice. In keeping with this, the ifconfig program was duly
added to the required list of system resources, as was the interface.
Another issue revolved around the burning of DVDs with the "growisofs"
utility. It seems that the developers of growisofs thought that
overwriting media from a batch script was dangerous, and the only
way to do so was to use an undocumented feature involving a reference
to an imaginary character in a galaxy many light years away. Moreover,
growisofs, when used to burn a directory directly onto media uses
the "mkisofs" utility, and if the MKISOFS environment variable is
not set, then growisofs will fail. I mention these little annoyances
in hope that the mystical behaviors found within the BackFire code
can be better explained than as simple eccentricities of the author.
A Simple Schema for Simple Scripts
The schema for the SQLite database is almost trivial, and the
target table corresponds directly to the old .CFG file used in BackFire's
predecessors. For the interested student, the SQL schema may be
found in the GlobalDefaults file. This was done so that the schema
would be easy to find in case it needed to be changed or extended.
As the database eliminates the use of delimited fields in flat files,
it at once simplifies the code, and to some extent obscures it.
Code to access the database is kept in the Database.TCL script.
The system resources are maintained in the resource record, and
these resources are used throughout the .TCL scripts that comprise
the BackFire utility.
Security Starts at Home
One consideration of caching the SMB user name and passwords necessary
to gain access to the Windows shares is that of security. While
the backup server is considered a "back-room" box and can be kept
under lock and key for the most part, it is still susceptible to
network access and hacking. As I typically use the administrator
login and passwords to access the Windows-based Targets, significant
efforts must be made to keep that data away from unauthorized prying
eyes.
The sys::encrypt procedure uses the OpenSSL Blowfish encryption
cipher to encipher the password data. This, of course, is reversible,
and the equivalent sys::decrypt procedure is provided, though a
dedicated and persistent hacker may find the key value. One might
wish to revisit the security aspects of BackFire should it be run
in a hostile environment. Of course, password management leads directly
to the next topic, which is the setup, configuration, and administration
of BackFire.
The BackFire Administration Tool
The need for an administration tool arises primarily from the
need to handle and encrypt passwords. While this can be done simply
in a flat file that contains the create table and insert row SQL
commands, it was felt that a simple Tk-based GUI tool would launch
BackFire into the "big leagues" of home-brewed, script-based backup
tools, if there is such a thing. The resources are managed in the
tabs labeled Archives Programs Devices and Retention. The <BROWSE>
button will search the path for an appropriate executable file where
appropriate or open a directory search dialogue when needed. The
Targets are administered in the Target tab and by a small form that
allows the user to create, edit, and delete targets simply and easily.
Co-Dependence -- A Harsh Mistress
As BackFire is just a glorified wrapper for a number of tools
already close to hand for any Unix/Linux type administrator, there
is no magic here. Indeed, the list of machine resources required
to run BackFire is quite large and includes about 18 resources,
utility programs, a network device and DVD burner, source and archive
paths for the script source, the archive location, the logfile,
the directory for the daily snapshot and its .iso image, as well
as the file retention policy. These are itemized, if not defined,
in the GlobalDefaults file, which is included by the scripts that
need it. The BackFire administration tool allows you to easily define
the programs and devices, as well as the target directories and
retention policy.
Room for Improvement
Like many tools, BackFire has grown out of a need. Re-written
from Bourne Shell into Tcl/Tk, it has become more robust and error
resistant, but much more can be done. As mentioned, the NFS volume
management would be relatively easy to add, but much of the improvement
might come by way of extending the BackFire GUI tool from an administrator's
tool to more of an end-user file backup/recovery scheduler. This
is a likely next step in the evolution of BackFire and may well
come to pass in the near term.
BackFire was tested with relatively recent versions of FreeBSD,
Tcl, Samba, and SQLite. Note that to support tabbed notebooks, the
BWidgets toolkit must be added to the Tcl/Tk release for compatibility.
Resources
BWidgets 1.7.0 -- FreeBSD ports
FreeBSD 6.0-RELEASE -- http://www.freebsd.org
Growisofs -- FreeBSD ports
Gzip/gunzip -- FreeBSD
Mkisofs -- FreeBSD ports
Samba 3.0.20b -- http://www.samba.org
SQLite 3.3.4 -- http://www.sqlite.org
Tcl/Tk 8.4.10 -- http://www.tcl.tk
Robert S. Sciuk has been involved in the software industry
since the 1970s and is currently the director of Control-Q Research
located near Toronto Ontario Canada. Control-Q specializes in open
software solutions and is active in software development, complex
Web hosting, and network and management consulting for small and
mid-size businesses. |