Article

Distributing Software Modules Using rsync

Ed Schaefer and John Spurgeon

Distributing software packages to all of our servers is a tedious task. Currently, a release manager makes a connection to each server and transfers files using ftp. This involves entering passwords multiple times, waiting for transfers to complete, changing directories, and keeping files organized. We developed a shell script, distribute_release (Listing 1), that makes the job easier.

Our script has some advantages over the ftp process:

Directory trees can be used to organize release modules.
A distribution network defines how files are transferred from server to server.
When a release module is ready to be distributed, it is replicated to all of the servers in the network using rsync, which helps minimize network traffic.
Various authentication methods can be used to avoid entering passwords for each server.

We’ll describe the directory structures including creating the distribution network. Then we’ll talk about the scripts. Finally, we’ll discuss an example.

Directory Structures

Each release module is stored in the directory /var/spool/pkg/release/[module]/. A module directory can be flat, or it can contain subdirectories. Hidden directory trees under the ./release/ directory define the distribution network. Therefore, the names of these directories cannot be used as module names.

Transport protocols supported by distribute_release include nfs, rsh, and ssh. If a release module is distributed using nfs, then the directory /var/spool/pkg/release/.nfs/[module]/ contains symbolic links corresponding to the hosts in the server’s distribution network:

/var/spool/pkg/release/.nfs/[module]/[host] -> \
 /net/[host]/var/spool/pkg/release/

When using nfs, rsync functions like an improved copy command, transferring files between the directories /var/spool/pkg/release/[module]/ and /var/spool/pkg/release/.nfs/[module]/[host]/[module].

When using rsh or ssh, the directory structures are similar. With rsh, for example, empty files of the form /var/spool/pkg/release/.rsh/[module]/[host] define the hosts in the distribution network.

The Scripts

Before distribute_release can be called, the directory structures and the distribution network must be created. The script create_distribution (Listing 2) facilitates these tasks.

One argument, the name of a release module, must be passed to create_distribution. When no options are used, the local host functions as a terminal node in the distribution network. In other words, the system may receive updates from another host, but it will not propagate those updates to downstream hosts. Downstream hosts and transport protocols may be specified with the -h and -t options respectively.

When using distribute_release, the name of a release module must be passed to the script. The -q and -v options may be used to control the amount of information displayed to the user. Hosts to be included or excluded from the distribution may be specified using the -i and -e options. The -r option may be used to determine how many times the program will recursively call itself to distribute the module to successive levels in a distribution hierarchy. When using nfs, the recursive calls are made locally. With rsh and ssh, the program calls itself on a remote server.

Distribute_release first gets the argument and any command-line options. Then, for each transport protocol, the script builds a distribution list and executes the appropriate rsync command for each host in the list. If a recursion depth is specified, then another instance of distribute_release is executed in a detached screen session, allowing the parent instance to continue running while the child processes propagate the module to other hosts.

An Example

Our example network (see Figure 1) contains five servers -- bambi, pongo, pluto, nemo, and goofy. One of the release modules is named TS1 (located on bambi) and the module is named TS2 (located on pluto). By executing the create_distributions script (Listing 3) on each server, the complete distribution network for both modules is built using the proper create_distribution calls.

Consider the TS1 release module; after the module has been distributed to all of the systems in the network, the directory /var/spool/pkg/release/TS1/ contains the following files and subdirectories:

 ./README
./v1/TS1-v1.pkg
./v2/TS1-v2.pkg
./beta/TS1-v3.pkg

On bambi, the directory /var/spool/pkg/release/.ssh/TS1/ contains a file named pongo. So, executing "distribute_release TS1” on bambi synchronizes the TS1 module with pongo using ssh as the transport protocol. The TS1 module can be distributed from pongo to all servers in the network using the -r option:

distribute_release -r 2 TS1

When using ssh, passwords can be avoided by using public/private key pairs with empty passphrases. When using rsh, you can update /etc/hosts.equiv or the appropriate .rhosts file. Obviously, passwords are not an issue with nfs. Deciding which protocol to use depends on security concerns, potential performance issues, and configuration complexity.

John Spurgeon is a software developer and systems administrator for Intel’s Factory Information Control Systems, IFICS, in Hillsboro, Oregon. He is currently preparing to ride a single-speed bicycle in Race Across America in 2007.

Ed Schaefer is a frequent contributor to Sys Admin. He is a software developer and DBA for Intel’s Factory Information Control Systems, IFICS, in Hillsboro, Oregon. Ed also hosts the monthly Shell Corner column on UnixReview.com. He can be reached at: shellcorner@comcast.net.