Article

nov2006.tar

Understanding Exim

Nathan McCourtney

The purpose of this article is to familiarize readers with Internet mail generally and the Exim mail transfer agent specifically. Why Exim? Because, as in any trade, systems administration requires a trusty set of tools. You could learn Postfix, Sendmail, smtpd, et al., but unless you're dealing with really high volume (like at a Fortune 500 or something), you should find and learn the most simple and portable tool available for this particular task. I would argue that Exim owns that honor.

First, Exim is small. The rpm for it measures a whopping 1.1MB by my count. In the great tradition of Unix tools, it does one thing well and thus resists bloat. Second, it's portable. Any *nix worth running (and even some that aren't) have an Exim port. Thus a skill learned on one platform will serve you well on a wide spectrum of others. Exim also manages to be incredibly simple to use while offering broad functionality. As I will describe, a few key concepts are all it takes to follow along. Everything else is common sense. Third, Exim is one of the best-documented pieces of software I've ever come across. And I don't just mean by sheer volume; the documentation is clear and to the point. This particular aspect is really a breath of fresh air after poring over man pages and wikis all day.

What's A Mail Transfer Agent?

When discussing "mail servers", it is important to differentiate between those services that move mail between domains and those that make mail available to end users. Mail User Agents (aka MUAs) are front ends to mail systems. A good example would be the Mozilla Foundation's Thunderbird. A maildrop service is a program that makes the mail resident on a particular server available to the appropriate mailboxes and mail user agents. Often, maildrops implement the POP or IMAP protocols. A mail transfer agent (MTA) is the workhorse that moves mail around the network and is primarily what we are concerned with here.

Before we get into the details of the actual executable, let's first examine the various forms an MTA can take. One consideration is whether this mail server is the primary domain mail exchanger or merely an internal mail router. A Mail Exchanger (MX) is the outward-facing MTA for a particular domain. A quick host call will tell you the MTAs for a particular domain:

nsmc@grendel:~> host -t MX mccourtney.com 

mccourtney.com mail is handled by 5 mail1.mccourtney.com. 
mccourtney.com mail is handled by 10 mail.mccourtney.com.

Following a typical SMTP algorithm, any MTA trying to send to user@mccourtney.com will first attempt a connection to mail1.mccourtney.com and, failing that, will try mail.mccourtney.com. Those two boxes will either hold the mail for the appropriate account or relay it to the designated server.

But an MTA doesn't need to be an MX. Most Web servers I've managed have provided mail services to their sites. After all, why should mail from an anonymous Internet user clog up your precious corporate Exchange server (which needs all the free memory it can get)? Just tell Perl to connect to SMTP on your localhost and off the message will go. Under these circumstances, no one knows or cares which machine generates the mail. All that matters is that the mail gets to the proper destination. That's where the Simple Mail Transport Protocol (SMTP) comes in. (See the RFCs for more detail.)

Implementing a full-blown MX is beyond the scope of this article. It involves setting up the proper DNS records and arranging for users to retrieve their mail. Since neither of those is a function of Exim, I'll leave that discussion for another day. Instead, let's stay focused on the relatively simple topics of mail coming in and mail going out.

Setting Up

To begin, let's get the software installed. As I mentioned earlier, Exim is packaged for every *nix there is. In my case, I'm running SUSE 9.2 Professional. A quick look at the following URL shows us some excellent sources:

http://www.exim.org/eximwiki/ObtainingExim

Once the appropriate rpms are retrieved, installation is as simple as:

rpm -ivh --force exim-4.51-0.1.i586.rpm 
rpm -ivh --force eximon-4.51-0.1.i586.rpm

It's important to note that if you're already running Postfix, Sendmail, or some other MTA, you'll have to uninstall it first. Don't worry about the dependency complaints. Like most modern Unix mail systems, Exim mimics the commands and executables of the Sendmail interface, and once you've got the rpm installed, everything should be copasetic. In my case, the --force option was necessary because I already had Exim 4.42 running and needed to overwrite it.

When learning a new application, I generally prefer to compile from a tarball. This provides an excellent idea of the authors' view of what the layout should look like and what a vanilla configuration entails. In the case of Exim, though, I would recommend against it. Mail systems are implemented just differently enough on every OS (or even distro) that fighting the dependencies and recreating all the aliases, etc. can be a nightmare. To avoid that pain, I recommend you just install from the package and be grateful that someone else took the time to lay it all out.

A quick look at the files from the rpm speaks volumes about the system itself:

rpm -ql exim

Grepping on "bin" will demonstrate what I mentioned earlier about the sendmail emulation.

Now that everything's in place, let's make sure we're up and running. As root, do:

insserv /etc/init.d/exim

then:

rcexim start

Assuming you're running SUSE, you should have gotten a green "done" back. If so, try this little SMTP exercise:

grendel:/home/nsmc # telnet localhost 25 
Trying 127.0.0.1... 
Connected to localhost. 
Escape character is '^]'. 
220 grendel ESMTP Exim 4.51 Wed, 03 Aug 2005 23:33:34 -0400

Next, enter the following sequence of commands followed by ENTER:

ehlo localhost 
mail from:<some email address>
rcpt to:<your email address>
data 

This is a test. 

. 

quit

Make sure you have the empty lines before and after the period.

Check your email account. There should be an unusually sparse message containing the properties you entered above. This isn't much different from an automated SMTP session (minus some less important commands, of course). For that matter, this is actually a great test for any MTA. The commands should be universal. Next, do:

tail /var/log/exim/main.log

You should see some record of the mail transaction that just took place. And errors should also be reflected in /var/log/exim/reject.log.

If you run into problems, try running the following:

grendel:/home/nsmc # exim -bV

This command asks Exim to parse your config file and let you know about any errors it finds.

If the address you're sending to seems problematic, try:

grendel:/home/nsmc/txt # exim -bt user@mccourtney.com 
user@mccourtney.com 
  router = dnslookup, transport = remote_smtp 
  host mail1.mccourtney.com [66.235.200.144] MX=5 
  host mail.mccourtney.com  [66.235.200.178] MX=10

The output tells you how Exim is processing the destination address. The two most important fields are the router and the transport. These two drivers are really the heart of the "transfer" part of this MTA. To understand exactly what those two terms mean, we need to take a look at how Exim processes mail.

Under the Hood of Exim

To begin with, Exim operates on a very simple transport model. Remote reception and remote delivery take place only via SMTP (vs. an MTA like Sendmail, which can work over a wide variety of protocols). Locally, messages can be received via the command line (interactively or batched) or via the aforementioned SMTP commands. Local messages are delivered either to the appropriate user's mail file or via any of the pipes outlined in /etc/aliases.

When a message comes in, it is assigned a 16-character Message ID in the form XXXXXX-XXXXXX-XX. The first six characters are the time it was received, in seconds, since the epoch. The next six chars are the PID that received the message. The last two represent the exact fraction of the second it was received.

Two files are then created in the input directory under the mail spool. To find out where your spool directory is, or the value of any compiled option, simply run:

exim -bP spool_directory

Mine comes back as:

spool_directory = /var/spool/exim

And listing confirms the presence of the necessary directories:

grendel:/home/nsmc # ls /var/spool/exim/ 
.  ..  db  input  msglog

But let's get back to the message files. The first is named <message id>-H and contains header information. The second is named <message id>-D and contains the message body. A third file is sometimes created when delivery is incomplete to track which addresses are still outstanding: <message id>-J. This last one remains in input until delivery is successful or the message is deleted (usually by an admin or a housecleaning process).

When the message is actually processed, it steps through the configured routers in top-down order, based on their position in the configuration file. Routers are basically sets of criteria the message is compared to. If the criteria are met, the router in question sends the message to its configured transport. If they are not met, the message is passed to the next router in line, etc. (Note that it's also possible to use routers to rewrite mail addresses -- at which point they start back at the top of the router chain.)

Transports are exactly what they sound like: methods by which the messages are moved to their destination. There are several local transports but, as noted earlier, the only remote transport is SMTP.

At times, a message may be undeliverable. When that happens, the message files remain but the addresses that were successfully delivered to (if any) are removed from the header. The messages remain in the spool until their retry intervals are met and a "qrunner" process comes through. If all copies are delivered, a note is made in the log, and the message files are deleted. If not, the retry interval is reset.

Mail is typically stored in structures called "spools". And the "qrunners" I mentioned previously are basically spool sweepers. They run through the messages in the Exim input directory and attempt to deliver them. A qrunner is launched either at regular intervals specified to the Exim binary at startup (see the -q option) or manually via the command runq. (Another useful command is mailq, which shows all the messages waiting to be delivered in a Sendmail-style output.)

Now that I've outlined the basic process, let's look at some of the details of the configuration. Let's first examine the routers configured in my MTA out of the box. Here's the router used for remote domains:

dnslookup: 
  driver = dnslookup 
  domains = ! +local_domains 
  transport = remote_smtp 
  ignore_target_hosts = 0.0.0.0 : 127.0.0.0/8 
  no_more

The parameters are fairly straightforward. Basically, you can read it as follows. Pass this message to the dnslookup driver (which, unsurprisingly, routes the message using an MX lookup via DNS).

The next line says "don't bother with local domains". "+local_domains" is a named list that is set at runtime to the name of the local host. This can either be set manually with the "primary_hostname" option or retrieved by Exim from a uname() call. Essentially, this is the trigger by which local messages get kicked out of this router and on to the next one.

The "transport" line states which transport to use. This would probably be a good time to mention that routers and transports often have names similar to, or even the same as, the drivers they use. For instance, despite appearances, dnslookup is not using itself for a driver. Remember that the named instances are the ones you can configure at runtime. So, dnslookup is a named instance of the dnslookup driver. And remote_smtp is a named instance of the SMTP driver.

The "ignore_target_hosts" statement is a host list similar to the domains list above. In this case, though, it references by IP address. 0.0.0.0 and 127.0.0.0 are both ways of saying "localhost".

"no_more" simply means that if you made it through all the criteria and the dnslookup driver still failed, don't bother passing the message on to the other routers. This triggers a fail condition, and the message gets bounced back to the sender.

The rest of the routers deal with local traffic, so let's look at them all together:

system_aliases: 
  driver = redirect 
  allow_fail 
  allow_defer 
  data = ${lookup{$local_part}lsearch{/etc/aliases}} 
  file_transport = address_file 
  pipe_transport = address_pipe 
userforward: 
  driver = redirect 
  check_local_user 
  file = $home/.forward 
  no_verify 
  no_expn 
  check_ancestor 
  file_transport = address_file 
  pipe_transport = address_pipe 
  reply_transport = address_reply 
localuser: 
  driver = accept 
  check_local_user 
  transport = local_delivery 
  cannot_route_message = Unknown user

You can see that the first two use the redirect router driver. From the documentation:

The [redirect] router operates by interpreting a text string, which it obtains either by expanding the contents of the 'data' option, or by reading the entire contents of a file whose name is given in the 'file' option."

The first two routers above give excellent examples of both situations. "system_aliases" does a lookup in the /etc/aliases file à la Sendmail. "userforward" checks a file in the user's home directory for their personal mail-forwarding preferences. The latter is especially interesting in regard to the last one: localuser. You probably have realized that the only way that the localuser router could ever get passed a message would be if the userforward router rejected it, which would seem odd since they look almost identical and both run the check_local_user test. Clearly, then, there must be another implicit test (or tests) contained within the userforward definition.

This is one thing that I don't like about Exim -- syntactically, there's little or no differentiation between property settings (e.g., "no_verify") and logical operations (e.g., "file"). You have to look up every option to find out which tests are being evaluated. Moreover, the order of those tests would seem to be top to bottom, but the settings are scattered among them in no discernible order whatsoever.

In the case of userforward, "check_local_user" pulls the local part of the mail address (e.g., the "nathan" in "nathan@example.com") and issues a getpwnam() system call. If it's not the name of a system user, the router declines the message and sends it down the line to the next router. If it is the name of a system user, the user's home directory is assigned to the variable $home and the next configured test is evaluated. In this case, the next configured test is "file". With file, if $home/.forward does not exist, the router declines, and so on. Once you get used to it and know what is a check and what is a property, the simple syntax is easy to understand. But for those seeing the process for the first time, looking up each and every option can be a bit tedious.

This syntactic over-simplicity carries into the configured transports, as well. Within the redirect router, the generic "transport" option cannot be used (see the documentation for details). You must specify named transport instances for a variety of <type>_transport options. Which transport to use is determined by the data returned by the file call. If a rule configured in .forward returns a string starting with a pipe ("|"), the pipe_transport is selected, and so on. I've got to think there might a more readable way to nest these interrelations within the configuration file.

But getting back to the task at hand, the final router driver used in my configuration is "accept". It seems to be a simple catch-all that points to a general transport. The "cannot_route_message" parameter tells it what message to write to the logs if it doesn't get routed to the transport by virtue of failing the tests.

Putting this all together, my routers basically work as follows:

1. Is the domain (i.e., the "example.com" in "nathan@example.com"?) local to this host? If so, decline it to the next router. If not, resolve it. If it resolves to the localhost ip, fail it without passing it to the next router. If not, pass it to the remote_smtp transport.

2. Does the local part of the address have an entry in /etc/aliases? If not, decline it to the next router. If so, pick a transport based on whether the aliases file returns a pipe, a file destination or email address(es).

3. Is the local part the same as a local username? If not, decline it to the next router. If so, does that user have a .forward file in their home directory? If not, decline it to the next router. If so, process the rules in .forward against the destination address. Pick a transport based on the appropriate rule string returned from .forward.

4. Is the local part the same as a local username? If not, write a message to the log and bounce the mail. If so, forward it to the local_delivery named transport instance.

This represents perhaps the most basic router setup that still manages to cover all the bases. There are many, many different ways that routers can be configured, though. For instance, routing typically stops once a router has accepted a message and queued it for transport. However, you can set a parameter called "unseen" that tells the router to do its work and pass the message on to the next router. You might want to do this to archive messages, or possibly because you just like to make your Xeons earn their keep. The point is you can.

Moving on, here's a list of my default transports:

begin transports 
remote_smtp: 
  driver = SMTP 
local_delivery: 
  driver = appendfile 
  file = /var/mail/$local_part 
  delivery_date_add 
  envelope_to_add 
  return_path_add 
address_pipe: 
  driver = pipe 
  return_output 
address_file: 
  driver = appendfile 
  delivery_date_add 
  envelope_to_add 
  return_path_add 
address_reply: 
  driver = autoreply

Note that the order in which the transports are listed is immaterial because they are determined solely by the routing process. That said, let's take a look at each one in turn.

remote_smtp -- This is the named instance of the sole driver capable of remote delivery in Exim. As you can see, there aren't too many options configured here (which is to say: none). For the most part, that's because the defaults are set, and the retry and authentication aspects have their own areas in the configuration file. But you can configure almost any other facet of the basic SMTP process here, including which interface the traffic should go out on and what hosts to send to (in case the dnslookup router wasn't the origin of the message). Some other important ones are timeouts and the maximum messages per TCP/IP connection.
local_delivery -- The appendfile transport driver has its origin in the traditional Unix mail file system. Each user has a file that contains an aggregate of their messages. Each mail user agent operates on that user's file; hence, we see the /var/mail/$local_part file directive. According to this setting, the user name in the mail address (again, the "nathan" in "nathan@example.com") gets its own file.

There are obvious file system permission issues here, but suffice it to say the Exim process must be able to create files and directories under the mail spool. On my system, /var/mail is a link to /var/spool/mail, and the permissions for it and the Exim binary are as follows:

grendel:/home/nsmc # ls -ld /var/spool/mail/ 
drwxrwxrwt  2 root root 4096 2005-08-10 15:54 /var/spool/mail/ 
grendel:/home/nsmc # ls -l /usr/sbin/exim 
-rwsr-xr-x  1 root root 778791 2005-05-18 11:45 /usr/sbin/exim

The directory's sticky bit is set (meaning that deletions to each user's file are the prerogative of that user alone), and the setuid bit is active on Exim (meaning that regardless of whose identity it runs under, it has the access rights of the binary's owner -- in this case root). So Exim and the user accounts have all the access they need.

The other options (delivery_date_add, envelope_to_add, and return_to_path) are actually general transport settings. They indicate that Exim should add headers to the message on its way to the file.

address_pipe -- This is a particularly interesting feature of the application, in my opinion. Recall that this transport can be triggered by the presence of a pipe-structured alias in /etc/aliases. Pipes themselves are part of what gives Unix so much command-line power, and they allow Exim to pass a message on to almost any executable. The transport can limit which commands are runnable, but, if misconfigured, this could be a huge security hole. Be very careful when allowing anyone to pass commands on the sly like that. Note that the return_output parameter sends any feeback from the pipe to the sender.
address_file -- This is another item that uses the appendfile transport. But there's no file parameter, so how does it figure out which one to use? Looking back at the routers, you can see that address_file is referenced by the file_transport option of the system_aliases router. Logically, then, system_aliases must pass a file parameter to whichever transport is referenced in that way. According to the documentation, that's exactly what happens: a variable named $address_file is populated and passed. The rest of the options are configured and behave the same way as local_delivery.
address_autoreply -- This instance of the autoreply transport driver is triggered by the reply_transport setting of the userforward router. You'll recall that this is in turn an instance of the redirect router. Essentially, autoreply does what you'd expect; it automatically replies with a canned message. To state things a little more clearly, when a user receives a mail and has an "out of office" rule set up in her .forward file, the redirect router picks the driver configured in reply_transport and supplies it with all the info it needs to create and transmit a message back to the mail's originator. Got it? Great!

There are four other primary subsections of the configuration file: retry, rewrite, authenticators, and ACLs. Retry sets the parameters for reinitiating remote transport on delivery failures. But, because all the defaults are fine for my environment, there's nothing to configure.

There's also nothing in my rewrite, because I don't need to rewrite any addresses on the fly. In the past, I've used this really excellent feature to circumvent an anti-spam measure utilized by some of the less-sophisticated filters out there -- namely, to refuse all mail originating from the domain "localhost". That kind of limitation isn't a problem for a corporate mail exchanger, but when you're just kicking out notifications from a host in the dmz it can get a little sticky.

Truthfully, this wasn't even that good of a fix for the problem I mentioned, but it was a good lesson in just how flexible rewrites can be. Almost any aspect of a mail can be rewritten according to rules and/or regular expressions. And the names of the fields to replace are surprisingly intuitive. This was one of the easiest feature sets I've ever learned.

Authentication isn't necessary on the host we're examining because the only mail it will accept is for the localhost. If it were needed, though, we'd have every option an admin could want, from plaintext to TLS and SSL.

Access Control Lists (ACLs) allow you to set rules on what types of messages can move through your MTA. There is a series of pre-defined points that can be assigned ACLs using syntax similar to the following from the top of my configuration file:

acl_smtp_rcpt = acl_check_rcpt

This particular setting assigns the ACL "acl_check_rcpt" defined within the configuration file to the predefined event "acl_SMTP_rcpt". Basically, acl_check_rcpt makes sure that no potentially nefarious characters are passed during the part of the SMTP session when the sender is specifying the recipient. It's a security measure to prevent directory-traversal attacks and the like.

Summation

While I've only scratched the surface of the Exim Mail Transfer Agent configuration, I hope I've shed some light on what an MTA is and how Exim works in that role. Below are listed some Exim resources for more information. The best teacher, however, is experience so I encourage you to experiment with your configuration. Often, Exim works so smoothly you won't have time to really eyeball legitimate traffic. Try intentionally jamming your queue with bogus messages (on a test box, of course) and observe the behavior of the various components. Get creative!

References

Exim Download Info -- http://www.exim.org/eximwiki/ObtainingExim

Exim Specification -- http://www.exim.org/exim-html-4.50/doc/html/spec.html

Nathan McCourtney has been in the technology field for 9 years, first as a developer then as a high-availability sys admin. He is currently the Director of Data Center Operations for Angel.com.