Respect Your Backup Admin?
Roger Feldman
After working for 5 years with Backup Administration, I was informed that my job position would be changed to Application Support and Linux Administration. I soon learned that our backups were on their way out of the country. A strategic decision described as "Right Shoring" would move most of the backup administration to Asia. I considered myself lucky to get placed in a safe job area where the support was needed close to the local customer. But, I was a bit disappointed because I was looking forward to getting involved in a state-of-the-art backup to disk system. I always felt that I would have been happy if I continued working with enterprise backup.
During my years of backup administration, I was often confronted with the attitude that being a backup administrator was somehow a lesser job. It seems as if many companies have placed some of their less-talented or less-experienced employees in the backup admin role. I have even had to take a bit of friendly teasing at times, being called "Tape Monkey" by my fellow admins. The final blow was hearing that jobs requiring less-skilled employees were the focus of the relocation of jobs to foreign lands. All of these things led me to do a bit of soul searching, and I began to question whether there really was any value left in being a backup administrator.
Backup Questions
The backup market now includes a new breed of technologies that have changed the way we do our backups. We suddenly have "snapshots" or "mirrors" that are easily configured and leave us with unattended backups several times a day. We perform backups to disk, which in some cases (where the service level agreement permits) replace writing data to tapes. Suddenly we have so many choices to make, and some of them leave us with very little administration.
So we are left with a new set of questions. Should we outsource our backup? Do we really need a full-time backup admin? Can we use tape-less technologies? Can we have unattended backups? Can we move our administration to a low-cost foreign land? After considering these issues for more than a year, I have come to some interesting conclusions. The answers are both yes and no, and they depend on correctly analyzing the scenario for the site in question.
While many high-level executives may be tempted to "Right Shore" or outsource the backups, these decisions must be carefully weighed against the real needs for recovery and site operation service levels. Management may be wary of the administrator's assessment, as it may be viewed as a costly implementation based on a personal interest in technologies. The real answer is to spend the time and money to have an IT architect re-evaluate your site's traditional backup configuration to find a cost-effective solution that can guarantee that the customer meets his service level agreements.
In this article, I will look at a series of scenarios where the need for administration-intensive backups could be eliminated and also consider situations where a full-time administrator is needed to run the backups. I will try to focus on the real skills needed to run these types of backups. I will briefly describe some of the technologies as they appear in our scenarios.
Scenario 1
Administration Level: Candidate for low administration
Customer: Small-to-medium business
Backup Policy: 6-month browse and recovery of data
Data Objects: In this case, we have 10-60 servers with mixed Unix, Linux, and Microsoft. There are clients that save dynamic client user data to the NAS shared HOME and PROJECT catalogs. The site has 1-5 databases running MSSQL on Windows, with Sybase or MySQL on Linux.
Technologies: Unaccompanied backup to disk with supplemented snapshots of NAS objects. In this case, we will choose backup to disk with no tape backup. The "shared" NAS data will be a "snapshot" that is taken at several intervals during the day.
Nearly all backup vendors are now offering some form of tape-less disk backup. If you are unaware of this, then it is time to start investigating the vendor's offerings. It won't take much time until you see their implementation. Please note that "open" or "freeware" disk backup options have been available for quite some time. Just one example is the "amanda" backup utility:
http://www.amanda.org
The concept is quite simple: you don't back up to a tape library; instead, you write your backups directly to a disk device that has space to hold backups for the period of the retention policy. This disk device should have some kind of redundancy protection. Your vendor will give you lots of choices on how to protect your "backup disk device". Several backup companies have created disk products that work in conjunction with the most popular backup software. One example is "data domain", which offers a product that uses a compression and data redundancy algorithm allowing optimal use of the storage device:
http://www.datadomain.com/products
Another popular commercial vendor product for disk backup comes from Bakbone:
http://www.bakbone.com/products/disk-to-disk/svdl.asp
You may want to look into the breed of inexpensive ATA RAID disk arrays. These disks provide for an inexpensive array that can hold your backups. This choice allows smaller companies to have a large backup disk array available to suit their backup window.
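The sizing rule behind backup to disk (enough space to hold every backup for the retention period) can be worked through as simple arithmetic. The following is a minimal sketch, not a vendor formula: the function name, the full/incremental schedule, the `reduction_ratio` for compression, and the `headroom` margin are all assumptions made for illustration.

```python
import math


def required_disk_tb(full_tb, daily_change_rate, retention_days,
                     full_interval_days=7, reduction_ratio=1.0, headroom=0.2):
    """Estimate the disk capacity (in TB) needed to hold all backups
    for the retention period.

    Assumes one full backup every `full_interval_days` and one
    incremental on every other day. All parameters are hypothetical
    planning inputs, not values taken from any backup product.
    """
    fulls_kept = math.ceil(retention_days / full_interval_days)
    incrementals_kept = retention_days - fulls_kept
    raw_tb = (fulls_kept * full_tb
              + incrementals_kept * full_tb * daily_change_rate)
    # Apply the assumed compression ratio, then add a safety margin.
    return raw_tb / reduction_ratio * (1 + headroom)


# Example: 2 TB of data, 5% daily change, 180 days of retention,
# one full backup every 30 days.
print(round(required_disk_tb(2.0, 0.05, 180, full_interval_days=30), 1))
```

The estimate is only as good as the change rate you feed it, so it is worth measuring a few weeks of real incrementals before sizing an array.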
The shared NAS data will be backed up with the vendor-specific "snapshot" or "mirror" utility, which leaves unattended backups available several times during the day. All of the major storage players have a built-in snapshot or mirror solution. Check the sites for storage giants, such as NetApp and EMC, for the details of their implementations and products.
Technical Demands: Low-level administration installation of backup client software and updates to the backup server software configuration. Part-time monitoring of backup server software and backup reports. Part-time administration of the backup disk device and contact with the vendor for maintenance and service. The NAS server will be administrated by the Server/Storage Administrator, and the snapshot backup will be considered an operational task of the Server/Storage team, not of the backup administrator. Users and administrators will need a brief instruction on how to locate and recover data using the snapshots.
In this case, the duties of backup administrator should be strictly part-time duty given to a junior member of the server administration staff. Initial installation of the disk backup may require short-term help of a vendor consultant or a senior server administrator. Documentation and instruction from the initial configuration will be transferred to the junior administrator with part-time backup duties. Problems with service, upgrades, and troubleshooting will be handled by the part-time backup administrator who will receive help from the vendor consultant or senior staff if needed.
The recovery of database objects will be placed in the hands of the database administrator for the respective databases. The database vendor backup utility will be used to write database backups directly to a pre-defined local disk partition. The backup software will then simply back up those database backup files from the pre-defined local disk area. The part-time administrator will not assist in database recovery.
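Because the backup software only picks up whatever the database utility last wrote to the local disk area, a part-time administrator may want a simple check that the dump file is recent before trusting the nightly run. The following is a minimal sketch under that assumption; the function name, the 24-hour threshold, and the file path in the usage line are hypothetical.

```python
import os
import time


def dump_is_fresh(path, max_age_hours=24, min_bytes=1, now=None):
    """Return True if the database dump exists, is at least
    `min_bytes` in size, and was modified within `max_age_hours`."""
    try:
        info = os.stat(path)
    except FileNotFoundError:
        return False
    now = time.time() if now is None else now
    age_hours = (now - info.st_mtime) / 3600
    return info.st_size >= min_bytes and age_hours <= max_age_hours


# Hypothetical usage before the file-level backup starts:
# if not dump_is_fresh("/backup/db/sales.dump"):
#     print("WARNING: database dump is missing or stale")
```

A check like this does not validate the contents of the dump; that remains the database administrator's responsibility.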
Scenario 2
Administration Level: Candidate for low administration. This model has the possibility for remote administration or outsourcing.
Customer: Small-to-medium companies and enterprise. This variation of the scenario takes the low administration concept to its limits. Moving more towards an enterprise adds a bit of complexity.
Backup Policy: 12-month browse and recovery of data. Monthly full backup to tape to be saved for 12 years.
Data Objects: In this case, we have 20-300 servers with mixed Unix, Linux, and Microsoft. There are clients that save dynamic client user data to the NAS shared HOME and PROJECT catalogs. There are also centralized applications that are shared from the NAS. The site has 3-15 databases, which in some cases need to be backed up in conjunction with backup software modules.
Technologies: We must now expand our technologies just a bit to include our old friend the "tape backup". The medium business and small enterprise will most often require some sort of long-term backup with options for archiving of data before removal. My experience is that tape libraries and drives can be the source of time-consuming administration and service. It is therefore critical to move towards the latest libraries and tape drives to ensure low administration.
I have spent countless hours wrestling with tapes from old DLT drives in out-of-date libraries while waiting for the vendor service technician to arrive with a new drive. It may be worth investing in the latest LTO technologies as a replacement for legacy tape technologies. This, of course, increases capacity and speed, which are also of importance.
The full monthly backups to tape can easily be added to your disk backup configuration, which could be similar to Scenario 1, thus allowing for a mixed tape and disk backup. Mixed tape and disk backup with additional snapshot mirrors may be the compromise of the future.
An interesting concept with the disk backups is that many of the popular backup software products incorporate the disks so that they appear as a virtual tape drive. This allows for the integration of tape and disk backup. You have the ability to configure some of your saves to disk and long-term backups to tape. There is also the concept of d2d2t (disk to disk to tape) where you first back up quickly to disk (called staging) and then copy the data to tape.
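The two stages of d2d2t can be sketched in a few lines. This is only an illustration of the flow, not how any commercial product implements staging: the function names are hypothetical, and the "tape" is assumed to be anything that accepts a tar stream (here an ordinary file path stands in for a tape device).

```python
import shutil
import tarfile
from pathlib import Path


def stage_to_disk(source_dir, staging_dir):
    """Stage 1 (disk to disk): copy the source tree into the staging area."""
    target = Path(staging_dir) / Path(source_dir).name
    shutil.copytree(source_dir, target, dirs_exist_ok=True)
    return target


def copy_to_tape(staged_dir, tape_path):
    """Stage 2 (disk to tape): write the staged copy as one tar archive.

    `tape_path` is an assumption: a regular file in this sketch, a
    tape device node at a real site.
    """
    with tarfile.open(str(tape_path), "w") as archive:
        archive.add(str(staged_dir), arcname=Path(staged_dir).name)
    return Path(tape_path)
```

The point of the split is that stage 1 finishes quickly inside the backup window, while stage 2 can run later without touching the clients.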
Our shared NAS data is expanded to a series of NAS servers but still uses the same snapshot or mirror techniques described in Scenario 1. This once again is handled not by the backup administrator but as a part of the storage administration.
Our databases now have demands to use backup software modules in conjunction with the backup software. These modules will connect the vendor database backup directly to your backup program. One example of this involves the Oracle database modules for EMC Legato Networker. Your DBA will now be working a bit more closely to create the initial installation and test of the module connections to backup. Once the backup is configured, it should run without further problems.
Technical Demands: We are still able to manage this form of enterprise backup with low administration methods. This is probably where many corporations are making their backup administration cutbacks. You have a combination of modern backup technologies that take fast and safe backups with a minimum of manpower. Here we have a possibility for offsite administration in conjunction with a low-level local technician to handle the tape rotations, removals, and vendor service contacts.
The key to low-level administration in this scenario is that the storage team now handles the NAS objects of the backup data via the snapshots. The storage team can also handle the installation of the backup disks and will cover for the junior backup administrator (or local backup contact) when he is unable to handle any higher-level problems with the backup server, backup disk devices, tape libraries, or upgrades.
This kind of model seems to be a reason why many companies are able to outsource administration or to eliminate the use of a full-time experienced administrator. We may eliminate our local backup administrators and have local "runners" who simply manage tapes, service contacts, client installations, and other daily tasks. The real question with Scenario 2 is "why" would you want to spend money to have an expensive administrator? Are there daily tasks that require advanced daily administration?
Scenario 3
Administration Level: Candidate for medium or advanced administration, which includes a backup technician(s) onsite who may work full-time.
Customer: Large or complex business and small or medium enterprise corporation.
Backup Policy: Varied policy dependent on data objects that include long-term backup and archiving.
Technologies: Mixed backup to disk and tape with skilled administrative needs for recovery of database and enterprise objects. Supplemented snapshots and knowledge of applications needed. As you see, we are using the same technologies from Scenario 2. We now have demands for local staff to be actively involved in backup and recovery. We no longer have a "hands-off" policy.
Now that we have technicians who can operate site implementations, we may consider the Solaris and Linux snapshots. The Solaris operating system has the ability to make a snapshot with "fssnap" so that you can use that technique via storage attached to a Sun server. A previous article by Peter Baer Galvin (Sys Admin, January 2002) describes "fssnap" in more detail. You may also want to investigate Linux LVM1 and LVM2 snapshot options.
Technical Demands: We now have a glimmer of hope for those backup administrators who feel threatened by some of the concepts that I described earlier in this article. We are now requiring an onsite employee to be knowledgeable of the backup, server objects, and application objects that are backed up. The techniques used are quite similar to those used in Scenario 2; the difference is that the backup administrator is a resource who is an expert for the configuration of data objects and recovery of varied types of data. It is now mission critical that staff members assist to ensure recovery speed and consistency of recovery.
This means the backup administrator will be required to have knowledge of applications that may have dynamic content. One example is helping with a recovery of CVS version control data or Rational (IBM) Clearcase version control data, which demands special knowledge of the underlying application. Such applications may require that special scripts are run to halt the application before a snapshot is taken. In-depth knowledge may be needed to confirm the consistency of such data. There may be a complex procedure to clear the application to a sane state after a failed operation or backup. This is also true of databases, which may require knowledge of vendor backup utilities, such as Oracle RMAN. Much of our current application data may be connected through two or more servers or tiers. Being able to re-create or re-initialize complex applications requires site-specific knowledge of these objects.
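The pattern of halting an application before a snapshot, and making sure it is started again even when the snapshot fails, can be sketched as follows. This is a minimal illustration under stated assumptions: `halt`, `resume`, and `take_snapshot` are hypothetical callables that a site would implement around its own scripts and its own snapshot utility.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(halt, resume):
    """Halt the application, then always resume it,
    even if the work done in between raises an error."""
    halt()
    try:
        yield
    finally:
        resume()


def snapshot_with_quiesce(halt, resume, take_snapshot):
    """Run `take_snapshot` while the application is halted
    and return whatever the snapshot step returns."""
    with quiesced(halt, resume):
        return take_snapshot()
```

In practice each callable would probably wrap a site-specific script (for example through `subprocess.run`), and a failed snapshot would still need the application-level cleanup described above.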
We now have a backup administrator with possible full-time duties. Another possibility is that these tasks are shared as part-time duties for two or more full-time administrators, who each have expertise in other areas. Outsourcing or elimination of backup duties will not be an option here.
Scenario 4
Administration Level: Candidate for senior administrative skills
Customer: Large or complex business and enterprise corporation
Backup Policy: Varied policy dependent on data objects. Long-term backup, archiving, and offsite cloning.
Technologies: Complex enterprise environment with fiber network, NAS storage, database, and complex application support.
Technical Demands: Administration of a SAN network and complex NAS storage is not an area of expertise for everyone on your staff. Choosing to base your backups on these complex objects will require that a senior administrator dedicate at least a portion of his/her job hours to backup administration and maintenance. Many now consider this to be a part of storage administration instead of traditional backup administration. The catch here is that you are very much dependent on vendor-specific product demands and issues. An excellent article by Greg Schuweiler ("Building a SAN Backup Solution", Sys Admin, April 2000) shows an example of the problems you may face.
The sales pitch sounds wonderful, and it could work if you have the exact mix of products from your vendors. With the SAN backup, you can back up the data direct from the SAN to other SAN libraries or disk objects. This means that if a server is allocated disk from the SAN, then the backup could be taken directly through the SAN and not through the client. There is also the possibility for shared libraries and remote copy of data.
The traffic of the backups is not directed through your LAN and there exists the possibility for snapshots and mirrors, which are offered by your SAN vendor. It is a revolution in technology, but it could be an expensive trap if you do not have a heavyweight architect design your solution based on the results and experience of many other site implementations.
Sites that are large enough to consider SAN solutions will most probably set high criteria for backup administration of databases and enterprise multi-tier applications. The full-time backup admin will be an expert on all of the site's applications, servers, and services and will be expected to assist in recovery within the SLA. All of the demands described in Scenario 3 will also apply to the backup admin. The site will most likely have NAS objects and snapshots.
Such sites may also have other backups, which can be easily administered with the techniques described earlier in this article. The SAN backup will not cover all of your site's data. The Scenario 3 model can be used for the data outside the SAN.
Other Considerations
There are a few items worth mentioning that are not addressed in the scenarios.
Outsourcing
Let's say that your company has successfully found a solution similar to Scenario 2, and you have an outsourced firm administrating the backup. This will require coordination with the outsourced backup team. There will be updates and service issues that must be handled between the local site and the outsource firm. Changes and updates in the site's data structure must be coordinated and implemented with the outsource team. The liaison roles, cooperation, and communication will determine whether this can be a success.
Remote Backup
One scenario that I want to mention could in fact be the subject of a whole article. A large corporation may now choose to take backups over longer distances if the bandwidth permits. The idea of a data center where a master backup takes backups from remote sites or where the remote backup center controls the backups at multiple sites is also part of the future for many large corporations. I have focused on issues for sites that have not yet been chosen for integration into a master backup center scheme.
Backups vs. Disk Protection
How you define backup is important. In this article I think of backup as data being restored via a backup program in the traditional sense. Many of the techniques mentioned are used as a form of disk protection and failover disk redundancy, which may not be seen as traditional backup. With some forms of mirrors and RAID, there is no client restoring the data; you are protecting against disk or array failure. I have tried to focus on the traditional backup role. As I stated earlier, storage administrators are slowly taking care of many of the traditional areas of backup, and these techniques and roles are merging.
Conclusion
The scenarios described here may or may not match your site, as there are endless possibilities and variations. The most important concept is to impartially examine the needs of your site and to negotiate a realistic solution that includes the interests of both management and the technical staff. The goal is to come to a decision that is based on the financial needs of the corporation, the recovery policy for the site, and the needs of the staff and customers. I may not have solved your problems here, but I hope to have stimulated a discussion and evaluation of where you will be going with backup in the future.
Making a change in your traditional backup configuration could require making changes in job duties and financial expenditures of the site. The new methods and technologies will make it easier to perform backups, yet more difficult to choose the proper solutions to match all of your data objects. There will be winners and losers; some people will be forced to move on to new job duties and others will become a new breed of backup experts.
Roger Feldman is a jack-of-all-trades Unix administrator who has been involved in just about everything at one time or another. He is currently working as a consultant for a major company at his home in Stockholm, Sweden. He can be contacted at: roger.feldman@bostream.nu.