The Argument for AFS
Roger Feldman
As long as I have been aware of the Andrew File System
(AFS), I have heard many arguments both for and against it. It seems to
create passion from the architects and administrators who use and promote
it, and there are others who want to make sure that it doesn't gain a
foothold in their IT environment. I first came in contact with AFS in the
late 1990s while it was being tested by a team of administrators who were
known for being the open software gurus of our organization. The other
administrators shook their heads as they tested Kerberos and AFS. It seemed
like a game for the techies who were mostly interested in working with the
latest cutting-edge software tools. What could this have to do with the
practical needs of the organization?
I found myself a bit reluctant to learn about these
Kerberos servers and was a bit daunted by the talk of implementing some
strange file system that required learning a whole new set of work skills.
Many of our administrators didn't want to get involved with Kerberos
and AFS and were just happy with the NFS and Windows file system sharing.
After some time, the members of the AFS testing team convinced a small
portion of our customers to use AFS, and suddenly we had Kerberos and AFS
servers in operation. Because I was known as one of the open-minded
administrators, I was deputized as Kerberos administrator.
Two years later, our organization was being
consolidated and we had to make decisions about how to distribute our
application environment, which was then dominated by NFS file sharing. By
this time, one of our departments was running a full-blown application tree
using the AFS file system. The department using AFS was known for having
the most complete and up-to-date application tree, so their AFS model was
chosen as the consolidated application environment. The rest of the Unix IT
department was then faced with learning about AFS and Kerberos. The model
was not chosen for the writable project/user file shares, and I will
explain why later in this article.
This won't be a "how to set up an AFS
environment" article. In this article, I will, however, show you
"hands on" how a few things work on an AFS server and client as
a means of exploring the main concepts, security features, and technical
functions of AFS. This can be a starting point for comparing AFS to
other file system technologies. There are many popular products on the
market that, after many years of development, can do some of the special
things for which AFS is known. AFS has, for example, always had security,
distribution, and replication features in its focus.
What Is AFS?
Most sites use a shared file system to distribute
their applications, project data, home directories, and other data, using
protocols and programs such as NFS, Samba, Windows CIFS, and NetWare. NFS
is very often used at Unix sites. AFS offers an alternative to these
well-known methods. Surprisingly, AFS is not known at all to many in the
Unix world. Even if you do not choose to implement AFS, it is worth
investigating because it has influenced the growth of other distributed
file systems and NAS products.
The history of AFS began at Carnegie Mellon
University, and later it became a commercial product at Transarc
Corporation, which was eventually bought by IBM. IBM later branched the
AFS code to open source, and it became known as the OpenAFS project, which
has its home at:
http://www.openafs.org
The latest version is openafs-1.4.1.
The concept is that you create a cell that resembles a
domain. This cell has servers that distribute a Unix directory tree
structure built up of AFS volumes that can be accessed by clients running
the OpenAFS client. The clients use Kerberos to identify themselves to the
servers, which also enforce ACL-based file permissions, further
increasing the level of security. The OpenAFS clients have an advanced
cache mechanism on their local disks, which makes often-used files and
applications appear to be local. The cache can also be used should a server
become temporarily unavailable.
The servers can easily replicate to read-only
replication servers at other sites, allowing for fast access in remote
cities, sites, and countries. AFS volumes can be moved to another server
while the users are working. The OpenAFS clients can even access other
cells. The namespace is location independent: clients see a single, global
tree no matter which server holds the data, which makes for structural
consistency. Put this all
together, and you have a distributed file system that can transparently
allow safe access to data all over the world.
A Few Examples of Server and Volume Administration
The server infrastructure and administration can be a
bit daunting at first. You may need to have Kerberos servers available, and
setting those up is outside the scope of this article. You are not forced to
use your own Kerberos servers as your authentication method, but they are
often used and can increase security and stability. This article is based on
a site that is using MIT Kerberos V as a replacement for the native
authentication server. Most sites that already have their own
Kerberos servers running for other security purposes feel safer having
Kerberos on separate machines with a redundancy scheme. Please note that
using the native authentication server and tools offers easier and more
compact administration.
Let's dive in and take a look at some of the
things that appear on an AFS server. Here are the disks from our
"writable" AFS server that will be holding the AFS volumes:
Filesystem             1K-blocks     Used Available Use% Mounted on
/dev/afs_vg/vicepa_lv  103212320 80720000  21443744  80% /vicepa
/dev/afs_vg/vicepb_lv  103212320 90172744  11991000  89% /vicepb
/dev/afs_vg/vicepc_lv  103212320 94212124   7951620  93% /vicepc
/dev/afs_vg/vicepd_lv  103212320 90483344  11680400  89% /vicepd
As you see, we have configured the disks with
mountpoints named "vicep(a-d)". The "vicep" is the
standard prefix for AFS partitions, and you append a letter "a" through
"z" to name each partition (up to 256 are available using longer
alphabetic combinations such as "aa"). Later,
you will be creating AFS volumes in these "vicep" partitions.
It is common to use a RAID disk system to hold the "vicep"
partitions.
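Because every volume lives in one of these partitions, it is worth keeping an eye on how full they are. The following is a minimal sketch, not part of AFS itself: the function name check_vicep and the 90% threshold are assumptions made for illustration. It reads "df -k" style text on standard input and reports the "vicep" partitions above the threshold.

```shell
#!/bin/sh
# Sketch: flag /vicep* partitions whose Use% exceeds a threshold.
# check_vicep and the 90% limit are illustrative choices, not AFS tools.

# check_vicep THRESHOLD   (reads "df -k" style lines on stdin)
# Prints "<mountpoint> <use%>" for each /vicep partition above THRESHOLD.
check_vicep() {
    awk -v limit="$1" '
        $NF ~ /^\/vicep[a-z]+$/ {
            pct = $(NF-1)
            sub(/%/, "", pct)                 # "93%" -> "93"
            if (pct + 0 > limit + 0) print $NF, pct "%"
        }'
}

# Example with the figures from the listing above.
# On a live server you would run: df -k | check_vicep 90
check_vicep 90 <<'EOF'
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/afs_vg/vicepa_lv 103212320 80720000 21443744 80% /vicepa
/dev/afs_vg/vicepb_lv 103212320 90172744 11991000 89% /vicepb
/dev/afs_vg/vicepc_lv 103212320 94212124 7951620 93% /vicepc
/dev/afs_vg/vicepd_lv 103212320 90483344 11680400 89% /vicepd
EOF
```

With the sample data, only /vicepc is reported.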
The directories, files, and data you create are not visible
to the clients in the "vicep" partitions themselves. You first create a
logical container called a volume, which is stored in one of the
"vicep" partitions. After you
create a volume, you choose where to mount it. Once you mount
the AFS volume, it is available for use. The standard convention in
AFS is to mount under /afs/cellname, as the /afs mount is the view that the
clients see. This is important because it allows all AFS cells to have the
same structure, and you may be accessing data from different cells. We
will now look at the top level of /afs in our cell:
% ls -al /afs
.mydomain.com
mydomain.com
In this case, we have two AFS mounts because we have
both read-only and writable mounts. The dotted ".mydomain.com" is the
writable mount, and "mydomain.com" is the read-only mount. To
create a new directory, you would work in the /afs/.mydomain.com
mountpoint, assuming that you had the permissions and an account, which are
stored in the "pts" protection database. You will also need to be
included in a special file that resides on the servers; this file is named
"UserList", and it lists users that can perform some of the
administrative commands. Then you will need to obtain a Kerberos ticket,
which in turn allows you to obtain an AFS token. This is done using
"kinit" to authenticate with Kerberos and "afslog"
(or the equivalent "aklog", depending on your Kerberos distribution)
to get your AFS token. There are several variations (depending on your
decision to use your own Kerberos or the native authentication server) on
how you authenticate, and I recommend that you make your own choices after
reading about authentication at the OpenAFS site.
With AFS token in hand, you can now create a volume
and mount it in the /afs tree. One of the most useful features of AFS
administration is that you do not need to be logged onto the AFS servers
while creating volumes and administering many aspects of your cell. You
can do most of your work from a client machine as long as you have the
correct administrative credentials. This opens up a whole new way of
thinking as your physical location becomes much less relevant in the AFS
world. You are always seeing the same /afs namespace, which is regulated
and filtered by your credentials in the cell. The concept is similar to
having a file system that has the global availability of the Web and
explains why AFS has used the slogan: "No matter where you go, there
you are".
Next, we'll create a volume and mount it under
/afs/mydomain.com/progs. The vos command is used to create and manipulate
volumes and is used quite often in AFS administration. We use
the following arguments: create to create a volume, server01 to denote the
server where the volume will reside, vicepa to specify which vice partition
to use, and finally app.newprog to denote the name of the new volume:
[/afs/.mydomain.com/progs] -> vos create server01 vicepa app.newprog
Create the mount point under the progs directory with the fs command. The
first argument after mkmount is the directory name under which the clients
will see the volume in the AFS tree; the second is the volume to mount:
[/afs/.mydomain.com/progs] -> fs mkmount newprog app.newprog
Once you have created new volumes and files, you must
"release" them with the vos release command. This pushes the changes to the
read-only replicas. You have now created the new directory
"newprog", which is available to the clients in your cell. Of
course, I have created a simplified example here. In reality, you will have
to consider an overall strategy for which layers of your file system
structure will be actual volumes and which will simply
be subdirectories inside a volume. If that sounds strange,
don't worry; it is one of the concepts that you will get used to,
although it may take a bit of thought before it all sinks in. You will
handle those issues in different ways depending on what kind of data you
are creating, such as home directories as opposed to the application
hierarchy that I have used in this example.
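The create, mount, and release steps can be collected into one small function. This is a sketch under stated assumptions rather than a finished tool: the helper names run and new_app_volume are made up, and "app.progs" is a hypothetical name for the replicated parent volume that holds the progs directory. Commands are only printed unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Sketch: create a volume, mount it, and release the parent volume.
# "app.progs" is a HYPOTHETICAL parent volume name; substitute your own.

# run CMD...  Print the command when DRY_RUN=1 (the default), else execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# new_app_volume SERVER PARTITION VOLUME MOUNTDIR PARENT_VOLUME
new_app_volume() {
    run vos create -server "$1" -partition "$2" -name "$3" || return 1
    run fs mkmount -dir "$4" -vol "$3" || return 1
    # The mount point is stored in the parent volume, so that volume is
    # released to make the new directory visible in the read-only tree.
    run vos release -id "$5"
}

new_app_volume server01 vicepa app.newprog \
    /afs/.mydomain.com/progs/newprog app.progs
```

Running the script as shown prints the three commands in order, which makes it easy to review them before setting DRY_RUN=0 with a valid admin token.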
One other important feature of AFS is worth mentioning
at this point. The "@sys" variable reflects the client OS
architecture. To see the architecture name of your client, use the fs command. (Please do not
confuse the AFS fs command with the earlier Unix fs command.)
% fs sysname
This command returns the following output:
"Current sysname is 'sun4x_58'". If you had run the
command on a Linux machine with the 2.6 kernel, it would have answered with:
"Current sysname is 'i386_linux26'".
You can take advantage of the "@sys" variable when creating installations
for each architecture release that you are supporting. The installations for
all architectures will be created under the ".sys" directory, which keeps
the layout transparent to users:
/afs/mydomain.com/progs/newprog/.sys
This allows you to create installations for several
different architectures, such as Sun and Linux:
/afs/mydomain.com/progs/newprog/.sys/sun4x_58
/afs/mydomain.com/progs/newprog/.sys/sun4x_59
/afs/mydomain.com/progs/newprog/.sys/i386_linux24
/afs/mydomain.com/progs/newprog/.sys/i386_linux26
You then install each application under the correct
architecture directory, under the current release number:
/afs/mydomain.com/progs/newprog/.sys/i386_linux26/1.2
After you create a symbolic link directly under newprog, the AFS
client will automatically resolve it to the correct architecture version of
the program. When the AFS client sees @sys in a path,
it substitutes the client's sysname, that is, its OS architecture name.
Assuming that we have installed version 1.2 of an application named
newprog, it would look like this:
[/afs/.mydomain.com/progs/newprog] -> ls -al
drwxr-xr-x 3 root root 14336 Nov 22 14:53 ..
drwxr-xr-x 8 myid mygrp 2048 Dec 8 13:19 .sys
lrwxr-xr-x 1 myid mygrp 15 Dec 8 13:16 1.2 -> .sys/@sys/1.2
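To make the substitution concrete, it can be imitated in a few lines of shell. This is only an illustration with a made-up function name (expand_sys); the real translation is done by the AFS client when it walks the path, not by a script.

```shell
#!/bin/sh
# Sketch: imitate the @sys substitution performed by the AFS client.
# expand_sys is an illustrative helper, not an AFS command.

# expand_sys PATH SYSNAME
# Replaces each occurrence of the literal string "@sys" in PATH with SYSNAME.
# Assumes SYSNAME contains no "|" or "&" characters.
expand_sys() {
    printf '%s\n' "$1" | sed "s|@sys|$2|g"
}

expand_sys ".sys/@sys/1.2" "i386_linux26"   # prints .sys/i386_linux26/1.2
expand_sys ".sys/@sys/1.2" "sun4x_58"       # prints .sys/sun4x_58/1.2
```

The same link target therefore leads a Linux client and a Solaris client to different installation directories.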
After you have created the volume, you can use the fs command again to set
the quota and permissions. AFS ACLs are set per directory rather than per
file, and they are finer grained than the standard Unix permissions, which
is considered to be one of the advantages of AFS. To check the permissions
on the current directory:
% fs listacl
This command will return something like the following:
--------------------
Access list for . is
Normal rights:
staff rlidwka
system:administrators rlidwka
system:anyuser rl
--------------------
The permissions are a combination of the following
rights, "rlidwka", which correspond to:
'r' The read right
'l' The lookup right
'i' The insert right
'd' The delete right
'w' The write right
'k' The lock right
'a' The administer right
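When reading fs listacl output, it can help to spell a rights string out. The following sketch uses a made-up helper name (describe_rights) and simply maps each letter to the right it stands for.

```shell
#!/bin/sh
# Sketch: spell out an AFS rights string such as "rlidwka" or "rl".
# describe_rights is an illustrative helper, not an AFS command.

# describe_rights RIGHTS
# Prints one right name per line, in the order the letters are given.
describe_rights() {
    rest=$1
    while [ -n "$rest" ]; do
        letter=${rest%"${rest#?}"}    # first character of $rest
        rest=${rest#?}                # remainder after that character
        case $letter in
            r) echo "read" ;;
            l) echo "lookup" ;;
            i) echo "insert" ;;
            d) echo "delete" ;;
            w) echo "write" ;;
            k) echo "lock" ;;
            a) echo "administer" ;;
            *) echo "unknown:$letter" ;;
        esac
    done
}

describe_rights rl    # the system:anyuser entry above: read, lookup
```

In the listing above, "staff" and "system:administrators" hold all seven rights, while "system:anyuser" may only look up and read.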
The ACL of a directory may be changed using the fs setacl command (fs sa for
short). By default, fs setacl adds to or alters the
existing ACL, rather than replacing it entirely. Let's look at two of
the predefined protection groups that are quite helpful in situations in
which you want to allow large groups to have read access to certain
directories:
- system:anyuser -- The system:anyuser group allows any AFS
user to access the directory. It is quite handy for a site that wants to
allow any AFS client to access and use applications without regard to
getting tickets and tokens, thus reducing administration for users that
won't be creating files in AFS.
- system:authuser -- The system:authuser requires that you
authenticate within the cell, thus creating access for all in your cell
while blocking other cells that may have access to other areas of your
site.
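As a rough sketch of how these two groups might be used with fs setacl, the helper below grants or revokes read access on a directory. The function names (run, grant_read, revoke) are made up for this example, the path is the one used throughout this article, and commands are only printed unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Sketch: open or close read access to an application directory.
# grant_read and revoke are illustrative helpers, not AFS commands.

# run CMD...  Print the command when DRY_RUN=1 (the default), else execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# grant_read DIR GROUP
# Adds or changes one ACL entry; the other entries are left alone.
grant_read() {
    run fs setacl -dir "$1" -acl "$2" rl
}

# revoke DIR GROUP
# The keyword "none" removes the entry for GROUP from the ACL.
revoke() {
    run fs setacl -dir "$1" -acl "$2" none
}

app=/afs/.mydomain.com/progs/newprog

# Option 1: anyone, with or without a token, may read the application.
grant_read "$app" system:anyuser

# Option 2: only users authenticated in this cell may read it.
revoke     "$app" system:anyuser
grant_read "$app" system:authuser
```

Because setacl only touches the entries it is given, the "staff" and "system:administrators" entries shown earlier would remain as they are.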
The fs command is also used to set the quota. Here we use the sq argument
(short for setquota) to set roughly 100 MB as our quota; the value is given
in kilobyte blocks:
[/afs/.mydomain.com/progs/newprog] -> fs sq . 100000
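The figure 100000 above is a round number. If you prefer to think in binary megabytes, a one-line conversion gives the exact block count; the helper name mb_to_quota_blocks is an assumption made for this sketch.

```shell
#!/bin/sh
# Sketch: convert megabytes (1 MB = 1024 KB here) to the kilobyte
# blocks that fs setquota expects. Illustrative helper only.

# mb_to_quota_blocks MB
mb_to_quota_blocks() {
    echo $(( $1 * 1024 ))
}

mb_to_quota_blocks 100    # prints 102400

# The long form of the command would then be:
#   fs setquota -path . -max 102400
# and "fs listquota" shows the resulting quota and current usage.
```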
I chose these previous examples to give you a hands-on
idea of how AFS works when creating and mounting a volume. I don't
want to spend too much time describing the server daemon processes as you
can get the details at:
http://www.openafs.org/documentation
Note that AFS denotes each server process as a
"server", and in reality several of these servers often reside
on the same server machine. Here is a brief look at some of the AFS server
processes:
Database servers:
Kaserver -- Handles authentication
Ptserver -- Protection database; runs on database servers
Vlserver -- Volume location server; keeps track of where volumes are located
Buserver -- Backup server; needed to run the backup utility
It is best practice to replicate the database servers
to at least one of your other AFS servers, which covers you in the event of
a crash. This process is not to be confused with
the replication of volumes.
One of the important concepts of AFS is that the clients can always contact
another server if they need information from one of the database servers.
If you want the details about this subject, please read further about the
"ubik" algorithm. AFS is consistently designed with
some form of failover in mind, which gives a certain amount of high
availability.
Fileserver types:
Fileserver -- Takes care of storing and delivering files
Bosserver -- Runs on fileservers; controls and monitors status
Upserver -- Handles updates and distributes them
Volserver -- Handles vos commands to manipulate volumes
You will want to familiarize yourself with the bos command, because it is
used both to get server status and to start and stop services. You can see
it as a command suite. If you have a large enterprise site, the actual
server administration (physical machine, RAID disk maintenance, and
processes) may be separated from the cell administration of volume
creation, user administration, and user data.
Volume replication is an important part of AFS,
especially if you are distributing a lot of read-only data like
applications. The concept is that you install a replica AFS fileserver with
the same-size disk partitions and use that machine as a read-only server at
another site or location (or possibly at your site) to handle a heavy load.
This means that if you have a writable AFS server at your main site, you
can simply replicate to your site in another location or country. Once you
have set up your replication server, you add the sites to the volumes that
are to be replicated. We use the vos command with the addsite argument. The
remote01 argument is the name of the remote read-only server, vicepa is the
vice partition on the replication server, and app.newprog is the volume to
be replicated:
[/afs/.mydomain.com/progs] -> vos addsite remote01 vicepa app.newprog
We then run vos release to replicate to the remote
read-only server:
[/afs/.mydomain.com/progs] -> vos release app.newprog
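When several volumes are to be replicated to the same site, the two commands can be wrapped in a loop. The following is a sketch only: the helper names run and replicate_to are made up, "app.oldprog" is a hypothetical second volume, and commands are printed rather than executed unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Sketch: define a read-only site for several volumes and release each one.
# "app.oldprog" is a HYPOTHETICAL volume used only to show the loop.

# run CMD...  Print the command when DRY_RUN=1 (the default), else execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# replicate_to SERVER PARTITION VOLUME...
replicate_to() {
    server=$1
    partition=$2
    shift 2
    for volume in "$@"; do
        run vos addsite -server "$server" -partition "$partition" -id "$volume" || return 1
        run vos release -id "$volume" || return 1
    done
}

replicate_to remote01 vicepa app.newprog app.oldprog
```

The addsite step is needed only once per volume and site; after later changes to a volume, vos release alone brings the replicas up to date.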
I find it very interesting how tightly this mechanism is
built into the AFS concept. Once you have installed the replication server,
it is simple to replicate. How much would that cost if you were using a
commercial product? How long has it taken for many commercial products to
catch on to this theme?
AFS Client
Many of the aspects of the AFS client are not seen by
the actual users. The administrator will have installed the AFS client,
which hopefully included the files needed for cell configuration. If your
client is doing anything more than accessing read-only applications, you
will also need Kerberos client configuration so users can obtain Kerberos
tickets and AFS tokens. You will also need to educate your users so that
they can request tickets and tokens if you have not made that a
transparent part of the login process. The name of the default cell is
stored in a file named "ThisCell" under /usr/vice/etc/. This directory
also contains a file named "CellServDB", which lists the IP
addresses and DNS names of the database server machines in the cells that you
need to contact. Don't forget that you may be contacting several
different cells. The fs command mentioned earlier can show information about your
cells:
% fs listcells
One of the vital functions and features of the client
is the cache manager that is loaded into the kernel. The cache manager is
also one of the special features of AFS. AFS uses a pre-defined area of
disk; this could be a separate partition and should allow at least several
hundred megabytes to be used for caching. When an application requests a
file, the cache manager asks the Volume Location server which file server
holds the volume, fetches the file from that file server, and places it in
the cache. You will
often read about the "callback" concept, which means that the
server promises to notify the client when a file that the client has cached
changes on the server. I always found this concept hard to believe, but it
does work, and
we have been amazed that many distributed applications run like locally
installed applications because of the cache manager. The file
/usr/vice/etc/cacheinfo contains the following line to define a 500-MB
cache located under /usr/vice/cache (the size is given in kilobyte blocks):
/afs:/usr/vice/cache:500000
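The line has three colon-separated fields: the AFS mount point, the cache directory, and the cache size in kilobyte blocks. A minimal sketch for pulling them apart follows; the function name parse_cacheinfo and the output labels are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: split a cacheinfo line into its three fields.
# parse_cacheinfo is an illustrative helper, not an AFS command.

# parse_cacheinfo LINE   where LINE is "mountpoint:cachedir:blocks"
parse_cacheinfo() {
    mountpoint=${1%%:*}      # text before the first colon
    rest=${1#*:}             # text after the first colon
    cachedir=${rest%%:*}     # second field
    blocks=${rest#*:}        # third field, in kilobyte blocks
    echo "mountpoint=$mountpoint"
    echo "cachedir=$cachedir"
    echo "blocks=$blocks"
}

parse_cacheinfo "/afs:/usr/vice/cache:500000"
# On a client you could run: parse_cacheinfo "$(cat /usr/vice/etc/cacheinfo)"
```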
There are many ways to measure, monitor, and tweak
your cache, and you can investigate those yourself if you decide to test
AFS. Here are two quick samples with the fs command:
% fs getcacheparms
% fs setcachesize
The fs command can also be used to list and set server preferences;
this can be valuable if some servers are having problems.
One final note about the client -- your site will
have to consider a strategy on how to incorporate the /afs namespace into
your local file system hierarchy namespace. This should not be a big
problem as you have probably already tackled such issues while mounting and
accessing other types of distributed file systems.
Users and PTS
At the beginning of this article, I mentioned that we
have chosen not to use AFS for the bulk of our project-writable data and
home directories. There is a very positive side to
using AFS with applications and read-only site configuration data. The
clients can use applications without much regard to the Kerberos and pts
user database, which decreases administration in AFS, Kerberos, and pts. If
you choose to use AFS for writable project data, then all users will need
to be administered in both Kerberos (or Authentication server) and pts.
You must also decide how you will synchronize their user names,
uid numbers, groups, and passwords with the regular Unix accounts and any
existing password schemes, such as NIS, that may be running at your site.
We do also have some project- and user-writable data in AFS, and I expect
that many readers may be interested in using AFS for home directories and
project data. This means that your decisions about Kerberos and user
administration will greatly affect how you administer the site.
The documentation at OpenAFS (which seems to be
derived from the IBM/Transarc docs) assumes that you are using the built-in
authentication server and kas admin suite. There are other utilities that
take advantage of this, thus making user and Kerberos administration much
easier. My main point is that there are decisions to be made, and you must
thoroughly consider your site needs before deciding your strategy.
Here are a few examples of how it will look when you
administer the pts database. Pts tracks and registers users, client
machines, and groups in AFS. You must be a member of the third pre-defined
protection group, system:administrators, to create user and machine entries.
Please note that it is standard practice to set up a separate Kerberos
"admin" account for each admin who will be listed in
system:administrators. Once you start to create your
regular users, take care to match the Unix login name and uid
number with the pts user name and id. If you have a Unix user named
"jonesj" with uid 1099, then you could create a user in pts
with the following command:
% pts createuser jonesj 1099
You will also need to create this same user in
Kerberos (or in the authentication server with the "kas" command), as
both are needed for AFS access. You can look at your pts entries using the
"examine" argument:
% pts examine jonesj
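Keeping pts names and ids in step with the Unix accounts is easier if the commands are generated from the existing account data. The sketch below is one possible approach, not a described site procedure: the function name pts_commands_from_passwd is made up, and the cutoff of uid 1000 for skipping system accounts is an assumption. It reads passwd-format lines and prints one pts createuser command per regular user.

```shell
#!/bin/sh
# Sketch: derive "pts createuser" commands from passwd-format lines so
# that pts names and ids match the Unix accounts.
# The default MIN_UID of 1000 is an ASSUMPTION for skipping system accounts.

# pts_commands_from_passwd [MIN_UID]   (reads passwd-format text on stdin)
pts_commands_from_passwd() {
    awk -F: -v min="${1:-1000}" '
        $3 + 0 >= min + 0 {
            printf "pts createuser -name %s -id %s\n", $1, $3
        }'
}

pts_commands_from_passwd 1000 <<'EOF'
root:x:0:0:root:/root:/bin/sh
jonesj:x:1099:100:J Jones:/home/jonesj:/bin/sh
EOF
# prints: pts createuser -name jonesj -id 1099
```

At a site using NIS, the input could come from "ypcat passwd" instead of a local file; the printed commands can be reviewed before they are run with admin credentials.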
It is also interesting that you can have machines
defined. Once they are added to a group entry, they can be used in ACLs.
Add to that the fact that wild cards can be used with machine addresses,
and you can really fine-tune which networks access your data.
Creating groups is straightforward and can even be
done at the user level because AFS has the concept of group ownership. You
can see who owns a group with the following:
% pts listowned [user] or [group] or [id]
To create a group, use the "creategroup" argument:
% pts creategroup jonesj:testers
Then use "adduser" to place other users in your group:
% pts adduser -user smithb -group jonesj:testers
The pts membership command will show you the groups to which a certain user
belongs or the members of a group. The pts listentries command will dump all
pts entries for the users or groups.
As you can see from these few short examples,
combining pts users, machines, and groups with the ACL listing provides
deep possibilities to fine-tune your access to sensitive information. This
is just the tip of the iceberg, and you'll have to study a while if
you are used to working with standard Unix permissions.
Backup
There are different methods used to back up AFS data.
Many of the well-known commercial backup products do not have support for
AFS. One option is to use the AFS backup utility, which works in
coordination with the buserver. The backup utility supports a whole series
of commands to define dump devices, schedule dumps, and administer backups.
Our site has opted to use the vos dump command, which is
incorporated into a Perl script that handles both full and incremental
dumps over a monthly cycle. The dumps are written to disk, where our
commercial backup product picks up the dump files. Please note that
this is not the recommended backup method. It works for our site because we
have a resource that can program in Perl, and we have tested and confirmed
that this method can consistently back up and restore our AFS data.
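To give an idea of what such a scheme involves, here is a minimal shell sketch. It is not the Perl script mentioned above: the function name dump_command, the dump directory, the file naming, and the rule "full dump on the first of the month, incremental otherwise" are all assumptions made for illustration. It only prints the vos dump command for a given day.

```shell
#!/bin/sh
# Sketch: monthly full / daily incremental dumps built on "vos dump".
# File naming, dump directory, and the day-01 rule are ASSUMPTIONS.

# dump_command VOLUME DAY MONTH YEAR DUMPDIR
# DAY and MONTH are two-digit strings, for example 01 and 12.
dump_command() {
    volume=$1 day=$2 month=$3 year=$4 dir=$5
    if [ "$day" = "01" ]; then
        # "-time 0" requests a full dump of the volume.
        echo "vos dump -id $volume -time 0 -file $dir/$volume.$year$month$day.full"
    else
        # A date requests an incremental dump of changes since that date,
        # here the first of the current month.
        echo "vos dump -id $volume -time $month/01/$year -file $dir/$volume.$year$month$day.incr"
    fi
}

dump_command app.newprog 01 12 2005 /backup/afs
dump_command app.newprog 08 12 2005 /backup/afs
```

With this layout, restoring a volume would mean applying the month's full dump followed by the most recent incremental dump using vos restore. Whatever scheme you choose, test the restore path before relying on it.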
Conclusion
In this article, I've mentioned both the good
points and some of the difficulties with AFS. There are many topics and
issues that I did not have room for in this short introduction; whole books
have been written on the subject. I didn't focus on server
installation, setup, and administration because I don't think that it
would have given you a feeling for how AFS works in a more practical sense.
I did not write this article as a mission to convert
you to AFS. I think it is beneficial to enterprise Unix administrators and
architects to be aware of the AFS model, because it has been a frontrunner
in the distributed file system world. Choosing to use AFS is a major decision
for a site and requires planning and education. Using OpenAFS requires that
you have the administrative resources, support, and knowledge to run
OpenAFS and Kerberos. I hope that you can see that it is worth
investigating the advantages of AFS with regard to security, availability,
operation, and administration.
A book that I recommend on this topic is Managing AFS: The Andrew File System by Richard Campbell (Prentice Hall). Please note that
despite the fact that it is 8 years old and that some of the information
has become obsolete, it is considered a very informative and well-written
book on AFS.
Roger Feldman is a jack-of-all-trades Unix
administrator who has been involved in just about everything at one time or
another. He is currently working as a consultant for a major company at his
home in Stockholm, Sweden. He can be contacted at: roger.feldman@bostream.nu.