Understanding Exim
Nathan McCourtney
The purpose of this article is to familiarize readers
with Internet mail generally and the Exim mail transfer agent specifically.
Why Exim? Because, as in any trade, systems administration requires a
trusty set of tools. You could learn Postfix, Sendmail, smtpd, et al., but
unless you're dealing with really high volume (as at a Fortune 500
company), you should find and learn the simplest, most portable tool
available for this particular task. I would argue that Exim owns that
honor.
First, Exim is small. The rpm for it measures a
whopping 1.1MB by my count. In the great tradition of Unix tools, it does
one thing well and thus resists bloat. Second, it's portable. Any
*nix worth running (and even some that aren't) has an Exim port.
Thus a skill learned on one platform will serve you well on a wide spectrum
of others. Exim also manages to be incredibly simple to use while offering
broad functionality. As I will describe, a few key concepts are all it
takes to follow along. Everything else is common sense. Third, Exim is one
of the best-documented pieces of software I've ever come across. And
I don't just mean by sheer volume; the documentation is clear and to
the point. This particular aspect is really a breath of fresh air after
poring over man pages and wikis all day.
What's A Mail Transfer Agent?
When discussing "mail servers", it is
important to differentiate between those services that move mail between
domains and those that make mail available to end users. Mail User Agents
(aka MUAs) are front ends to mail systems. A good example would be the
Mozilla Foundation's Thunderbird. A maildrop service is a program
that makes the mail resident on a particular server available to the
appropriate mailboxes and mail user agents. Often, maildrops implement the
POP or IMAP protocols. A mail transfer agent (MTA) is the workhorse that
moves mail around the network and is primarily what we are concerned with
here.
Before we get into the details of the actual
executable, let's first examine the various forms an MTA can take.
One consideration is whether this mail server is the primary domain mail
exchanger or merely an internal mail router. A Mail Exchanger (MX) is the
outward-facing MTA for a particular domain. A quick host call will tell you
the mail exchangers for a particular domain:
nsmc@grendel:~> host -t MX mccourtney.com
mccourtney.com mail is handled by 5 mail1.mccourtney.com.
mccourtney.com mail is handled by 10 mail.mccourtney.com.
Following the standard SMTP delivery algorithm, any MTA trying to
send to user@mccourtney.com will first attempt a connection to
mail1.mccourtney.com (the lower preference number is tried first) and,
failing that, will try mail.mccourtney.com. Those two boxes will either
hold the mail for the appropriate account or relay it to the designated
server.
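As a rough illustration of that ordering, the Python sketch below parses lines in the format printed by host -t MX and sorts the exchangers by preference, lowest number first. The function name mx_hosts_in_order is my own invention, and the parsing assumes the exact "mail is handled by" wording shown above.

```python
import re

# Matches lines such as:
#   mccourtney.com mail is handled by 5 mail1.mccourtney.com.
MX_LINE = re.compile(r"^(\S+) mail is handled by (\d+) (\S+?)\.?$")


def mx_hosts_in_order(host_output):
    """Return exchanger host names sorted by MX preference (lowest first)."""
    records = []
    for line in host_output.splitlines():
        match = MX_LINE.match(line.strip())
        if match:
            preference = int(match.group(2))
            records.append((preference, match.group(3)))
    # sorted() is stable, so equal preferences keep their listed order.
    return [host for preference, host in sorted(records, key=lambda r: r[0])]


sample = """\
mccourtney.com mail is handled by 10 mail.mccourtney.com.
mccourtney.com mail is handled by 5 mail1.mccourtney.com.
"""
print(mx_hosts_in_order(sample))
```

A sending MTA would walk the returned list from the front, moving to the next host only when a connection attempt fails.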
But an MTA doesn't need to be an MX. Most Web
servers I've managed have provided mail services to their sites.
After all, why should mail from an anonymous Internet user clog up your
precious corporate Exchange server (which needs all the free memory it can
get)? Just tell Perl to connect to SMTP on your localhost and off the
message will go. Under these circumstances, no one knows or cares which
machine generates the mail. All that matters is that the mail gets to the
proper destination. That's where the Simple Mail Transfer Protocol
(SMTP) comes in. (See the RFCs for more detail.)
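The text mentions Perl; the same idea can be sketched in Python with the standard smtplib and email modules. This is only a minimal illustration: the function names and the example.com addresses are placeholders of mine, and it assumes an MTA is accepting connections on localhost port 25.

```python
import smtplib
from email.message import EmailMessage


def build_notification(sender, recipient, subject, body):
    """Assemble a minimal mail message with the usual headers."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_local_mta(msg, host="localhost", port=25):
    """Hand the message to the MTA listening on this machine."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)


# Example (only works where an MTA is accepting mail on localhost:25):
# send_via_local_mta(build_notification(
#     "webform@example.com", "nathan@example.com",
#     "Feedback form", "Someone filled in the form."))
```

The script never needs to know where the mail ultimately goes; it hands the message to the local MTA and lets that server work out the routing.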
Implementing a full-blown MX is beyond the scope of
this article. It involves setting up the proper DNS records and arranging
for users to retrieve their mail. Since neither of those is a function of
Exim, I'll leave that discussion for another day. Instead,
let's stay focused on the relatively simple topics of mail coming in
and mail going out.
Setting Up
To begin, let's get the software installed. As I
mentioned earlier, Exim is packaged for every *nix there is. In my case,
I'm running SUSE 9.2 Professional. A quick look at the following URL
shows us some excellent sources:
http://www.exim.org/eximwiki/ObtainingExim
Once the appropriate rpms are retrieved, installation
is as simple as:
rpm -ivh --force exim-4.51-0.1.i586.rpm
rpm -ivh --force eximon-4.51-0.1.i586.rpm
It's important to note that if you're
already running Postfix, Sendmail, or some other MTA, you'll have to
uninstall it first. Don't worry about the dependency complaints. Like
most modern Unix mail systems, Exim mimics the commands and executables of
the Sendmail interface, and once you've got the rpm installed,
everything should be copasetic. In my case, the --force option was necessary
because I already had Exim 4.42 running and needed to overwrite it.
When learning a new application, I generally prefer to
compile from a tarball. This provides an excellent idea of the
authors' view of what the layout should look like and what a vanilla
configuration entails. In the case of Exim, though, I would recommend
against it. Mail systems are implemented just differently enough on every
OS (or even distro) that fighting the dependencies and recreating all the
aliases, etc. can be a nightmare. To avoid that pain, I recommend you just
install from the package and be grateful that someone else took the time to
lay it all out.
A quick look at the files from the rpm speaks volumes
about the system itself:
rpm -ql exim
Grepping on "bin" will demonstrate what I mentioned earlier about the sendmail emulation.
Now that everything's in place, let's make
sure we're up and running. As root, do:
insserv /etc/init.d/exim
then:
rcexim start
Assuming you're running SUSE, you should have
gotten a green "done" back. If so, try this little SMTP exercise:
grendel:/home/nsmc # telnet localhost 25
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
220 grendel ESMTP Exim 4.51 Wed, 03 Aug 2005 23:33:34 -0400
Next, enter the following sequence of commands,
pressing ENTER after each:
ehlo localhost
mail from:<some email address>
rcpt to:<your email address>
data
This is a test.
.
quit
Make sure the period sits on a line by itself; that lone period is what
tells the server the message data is complete.
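To make the role of that lone period concrete, here is a small Python sketch that generates the client side of such a session. The function name is hypothetical. The "dot-stuffing" step (doubling a leading period in a body line) is what automated clients do so that body text is never mistaken for the terminator.

```python
def smtp_session_lines(sender, recipient, body_lines):
    """Return the client-side lines for a one-message SMTP session."""
    lines = [
        "ehlo localhost",
        "mail from:<%s>" % sender,
        "rcpt to:<%s>" % recipient,
        "data",
    ]
    for line in body_lines:
        # Dot-stuffing: a body line that starts with a period gets an
        # extra one so the server does not mistake it for the terminator.
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")      # a lone period ends the message data
    lines.append("quit")
    return lines


for line in smtp_session_lines("a@example.com", "b@example.com",
                               ["This is a test.", ".hidden"]):
    print(line)
```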
Check your email account. There should be an unusually
sparse message containing the properties you entered above. This
isn't much different from an automated SMTP session (minus some less
important commands, of course). For that matter, this is actually a great
test for any MTA. The commands should be universal. Next, do:
tail /var/log/exim/main.log
You should see some record of the mail transaction
that just took place. Any errors should also be reflected in
/var/log/exim/reject.log.
If you run into problems, try running the following:
grendel:/home/nsmc # exim -bV
This command asks Exim to parse your config file and
let you know about any errors it finds.
If the address you're sending to seems problematic, try:
grendel:/home/nsmc/txt # exim -bt user@mccourtney.com
user@mccourtney.com
router = dnslookup, transport = remote_smtp
host mail1.mccourtney.com [66.235.200.144] MX=5
host mail.mccourtney.com [66.235.200.178] MX=10
The output tells you how Exim is processing the
destination address. The two most important fields are the router and the
transport. These two drivers are really the heart of the
"transfer" part of this MTA. To understand exactly what those
two terms mean, we need to take a look at how Exim processes mail.
Under the Hood of Exim
To begin with, Exim operates on a very simple
transport model. Remote reception and remote delivery take place only via
SMTP (vs. an MTA like Sendmail, which can work over a wide variety of
protocols). Locally, messages can be received via the command line
(interactively or batched) or via the aforementioned SMTP commands. Local
messages are delivered either to the appropriate user's mail file or
via any of the pipes outlined in /etc/aliases.
When a message comes in, it is assigned a 16-character
Message ID in the form XXXXXX-XXXXXX-XX, with each field encoded in base 62
(digits, then upper- and lowercase letters). The first six characters are
the time it was received, in seconds since the epoch. The next six
characters are the PID of the process that received the message. The last
two represent the fraction of the second in which it was received.
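The following Python sketch takes such an ID apart. It rests on my understanding of the Exim 4 specification, namely that each field is a base-62 number written with digits first, then uppercase, then lowercase letters, and that the last field counts sub-second units (normally 1/2000 of a second). Treat those details as assumptions to check against your own version. The sample values are made up, and the helper names are mine.

```python
import string

# Digits first, then upper case, then lower case: 62 symbols in all.
BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62_decode(text):
    """Convert a base-62 string to an integer."""
    value = 0
    for char in text:
        value = value * 62 + BASE62.index(char)
    return value


def base62_encode(value, width):
    """Convert an integer to a zero-padded base-62 string of fixed width."""
    chars = []
    for _ in range(width):
        value, digit = divmod(value, 62)
        chars.append(BASE62[digit])
    return "".join(reversed(chars))


def split_message_id(message_id):
    """Break XXXXXX-XXXXXX-XX into (epoch seconds, pid, sub-second count)."""
    seconds, pid, fraction = message_id.split("-")
    return base62_decode(seconds), base62_decode(pid), base62_decode(fraction)


# Build an ID from made-up values, then take it apart again.
example_id = "-".join([
    base62_encode(1123126414, 6),   # seconds since the epoch
    base62_encode(4321, 6),         # process ID
    base62_encode(1000, 2),         # sub-second units
])
print(example_id, split_message_id(example_id))
```

Because the timestamp comes first, sorting message IDs as plain strings also sorts them roughly by arrival time.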
Two files are then created in the input directory under the mail
spool. To find out where your spool directory is, or the value of any
other configuration option, simply run:
exim -bP spool_directory
Mine comes back as:
spool_directory = /var/spool/exim
And listing confirms the presence of the necessary directories:
grendel:/home/nsmc # ls /var/spool/exim/
. .. db input msglog
But let's get back to the message files. The
first is named <message id>-H and contains header information. The
second is named <message id>-D and contains the message body. A third
file, <message id>-J, is created during a delivery attempt as a journal of
the addresses that have already been delivered. Exim normally folds that
record into the -H file and removes the journal when the attempt ends, so
a lingering -J file usually means an attempt was interrupted. The -H and
-D files themselves remain in input until delivery is successful or the
message is deleted (usually by an admin or a housecleaning process).
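As a quick way to picture the naming scheme, this Python sketch groups the files in an input directory by message ID and reports which of the -H, -D and -J files exist for each. It runs against a throwaway directory with invented file names; pointing it at a live spool would need suitable permissions, and the function name is my own.

```python
import os
import tempfile
from collections import defaultdict


def group_spool_files(input_dir):
    """Map each message ID to the set of file suffixes present (H, D, J)."""
    messages = defaultdict(set)
    for name in os.listdir(input_dir):
        # File names look like <message id>-H, <message id>-D, <message id>-J.
        message_id, _, suffix = name.rpartition("-")
        if message_id and suffix in ("H", "D", "J"):
            messages[message_id].add(suffix)
    return dict(messages)


# Demonstrate on a throwaway directory rather than a live spool.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("1E0abc-0001Xy-Ab-H", "1E0abc-0001Xy-Ab-D",
                 "1E0abd-0001Xz-Cd-H", "1E0abd-0001Xz-Cd-D",
                 "1E0abd-0001Xz-Cd-J"):
        open(os.path.join(demo_dir, name), "w").close()
    grouped = group_spool_files(demo_dir)
    for message_id in sorted(grouped):
        print(message_id, sorted(grouped[message_id]))
```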
When the message is actually processed, it steps
through the configured routers in top-down order, based on their position
in the configuration file. Routers are basically sets of criteria the
message is compared to. If the criteria are met, the router in question
sends the message to its configured transport. If they are not met, the
message is passed to the next router in line, etc. (Note that it's
also possible to use routers to rewrite mail addresses -- at which
point the rewritten addresses start back at the top of the router chain.)
Transports are exactly what they sound like: methods
by which the messages are moved to their destination. There are several
local transports but, as noted earlier, the only remote transport is SMTP.
At times, a message may be undeliverable. When that
happens, the message files remain but the addresses that were successfully
delivered to (if any) are removed from the header. The messages remain in
the spool until their retry intervals are met and a "qrunner"
process comes through. If all copies are delivered, a note is made in the
log, and the message files are deleted. If not, the retry interval is
reset.
Mail is typically stored in structures called
"spools". And the "qrunners" I mentioned previously
are basically spool sweepers. They run through the messages in the Exim input directory and attempt to
deliver them. A qrunner is launched either at regular intervals specified
to the Exim binary at startup (see the -q option) or manually via the command runq. (Another useful command is mailq, which shows all the
messages waiting to be delivered in a Sendmail-style output.)
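A toy model may help fix the idea of a queue run. The Python sketch below is a deliberate simplification, not Exim's actual logic: it keeps one retry time per message, whereas Exim tracks retry data in more detail. The dictionary layout and the try_delivery callback are inventions for the example.

```python
def queue_run(queue, now, try_delivery):
    """One pass over the queue; returns the messages still waiting."""
    remaining = []
    for message in queue:
        if message["next_try"] > now:
            remaining.append(message)        # retry time not reached yet
            continue
        # Attempt every outstanding address; keep the ones that fail.
        message["pending"] = [addr for addr in message["pending"]
                              if not try_delivery(addr)]
        if message["pending"]:
            message["next_try"] = now + message["retry_interval"]
            remaining.append(message)
        # A message with nothing pending is complete and drops off the queue.
    return remaining


queue = [
    {"id": "msg-1", "pending": ["a@example.com", "b@example.net"],
     "next_try": 0, "retry_interval": 900},
    {"id": "msg-2", "pending": ["c@example.org"],
     "next_try": 5000, "retry_interval": 900},
]
# Pretend that only example.com is reachable right now.
queue = queue_run(queue, now=1000,
                  try_delivery=lambda addr: addr.endswith("@example.com"))
print([(m["id"], m["pending"], m["next_try"]) for m in queue])
```

In this run the first message loses its delivered address and has its retry time pushed back, while the second is skipped because its retry time has not yet arrived.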
Now that I've outlined the basic process,
let's look at some of the details of the configuration. Let's
first examine the routers configured in my MTA out of the box. Here's
the router used for remote domains:
dnslookup:
  driver = dnslookup
  domains = ! +local_domains
  transport = remote_smtp
  ignore_target_hosts = 0.0.0.0 : 127.0.0.0/8
  no_more
The parameters are fairly straightforward. Basically,
you can read it as follows. Pass this message to the dnslookup driver
(which, unsurprisingly, routes the message using an MX lookup via DNS).
The next line says "don't bother with local domains".
"+local_domains" is a named list that is set at runtime to the name of the
local host. That name can either be set manually with the
"primary_hostname" option or retrieved by Exim from a uname() call.
Essentially, this is the trigger by which local messages get kicked out of
this router and on to the next one.
The "transport" line states which
transport to use. This would probably be a good time to mention that
routers and transports often have names similar to, or even the same as,
the drivers they use. For instance, despite appearances, dnslookup is not
using itself for a driver. Remember that the named instances are the ones
you can configure at runtime. So, dnslookup is a named instance of the
dnslookup driver. And remote_smtp is a named instance of the smtp driver.
The "ignore_target_hosts" statement is a
host list similar to the domains list above. In this case, though, it
references by IP address. 0.0.0.0 and anything in 127.0.0.0/8 are both ways
of saying "localhost".
"no_more" simply means that if you made it
through all the criteria and the dnslookup driver still failed, don't
bother passing the message on to the other routers. This triggers a fail
condition, and the message gets bounced back to the sender.
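The effect of the ignore_target_hosts list can be sketched with Python's ipaddress module. This is only a model of the idea: the usable_hosts function is hypothetical, and the second host in the sample is invented to show an address being discarded.

```python
import ipaddress

# The two entries from the ignore_target_hosts setting.
IGNORED = [ipaddress.ip_network("0.0.0.0/32"),
           ipaddress.ip_network("127.0.0.0/8")]


def usable_hosts(candidates):
    """Drop any (name, ip) pair whose address falls in an ignored range."""
    kept = []
    for name, ip in candidates:
        address = ipaddress.ip_address(ip)
        if not any(address in network for network in IGNORED):
            kept.append((name, ip))
    return kept


looked_up = [("mail1.mccourtney.com", "66.235.200.144"),
             ("bogus.example.com", "127.0.0.1")]
print(usable_hosts(looked_up))
```

If every host a lookup returns is filtered out this way, the router has nothing left to deliver to, and with no_more set the address fails rather than moving on to the local routers.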
The rest of the routers deal with local traffic, so
let's look at them all together:
system_aliases:
  driver = redirect
  allow_fail
  allow_defer
  data = ${lookup{$local_part}lsearch{/etc/aliases}}
  file_transport = address_file
  pipe_transport = address_pipe

userforward:
  driver = redirect
  check_local_user
  file = $home/.forward
  no_verify
  no_expn
  check_ancestor
  file_transport = address_file
  pipe_transport = address_pipe
  reply_transport = address_reply

localuser:
  driver = accept
  check_local_user
  transport = local_delivery
  cannot_route_message = Unknown user
You can see that the first two use the redirect router
driver. From the documentation:
"The [redirect] router operates by interpreting a text
string, which it obtains either by expanding the contents of the
'data' option, or by reading the entire contents of a file
whose name is given in the 'file' option."
The first two routers above give excellent examples of
both situations. "system_aliases" does a lookup in the
/etc/aliases file à la Sendmail. "userforward" checks a
file in the user's home directory for their personal mail-forwarding
preferences. The latter is especially interesting in regard to the last one:
localuser. You probably have realized that the only way that the localuser
router could ever get passed a message would be if the userforward router
rejected it, which would seem odd since they look almost identical and both
run the check_local_user test. Clearly, then, there must be another
implicit test (or tests) contained within the userforward definition.
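To give a feel for what the lsearch lookup in system_aliases does, here is a simplified Python approximation of a linear search through alias lines. It is not Exim's parser. It assumes keys are terminated by a colon, that matching ignores case, and that lines beginning with white space continue the previous entry; the sample aliases are invented.

```python
def lsearch(key, lines):
    """Linear search: return the data for the first line whose key matches."""
    data = None
    for raw in lines:
        line = raw.rstrip("\n")
        if data is not None:
            # Lines starting with white space continue the previous entry.
            if line[:1] in (" ", "\t"):
                data += " " + line.strip()
                continue
            break
        if not line.strip() or line.lstrip().startswith("#"):
            continue                     # skip blanks and comments
        name, _, rest = line.partition(":")
        # Assumption: keys are compared without regard to case.
        if name.strip().lower() == key.lower():
            data = rest.strip()
    return data


aliases = [
    "# sample aliases\n",
    "postmaster: root\n",
    "root: nathan,\n",
    "    nathan@example.com\n",
    "archive: /var/mail/archive\n",
]
print(lsearch("root", aliases))
print(lsearch("nobody", aliases))
```

When the lookup finds nothing, the redirect router has no data to act on and declines, which is how an address without an alias moves on to userforward.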
This is one thing that I don't like about Exim
-- syntactically, there's little or no differentiation between
property settings (e.g., "no_verify") and logical operations
(e.g., "file"). You have to look up every option to find out
which tests are being evaluated. Moreover, the order of those tests would
seem to be top to bottom, but the settings are scattered among them in no
discernible order whatsoever.
In the case of userforward,
"check_local_user" pulls the local part of the mail address
(e.g., the "nathan" in "nathan@example.com") and
issues a getpwnam() library call. If it's not the name of a system user, the router
declines the message and sends it down the line to the next router. If it
is the name of a system user, the user's home directory is assigned
to the variable $home and the next configured test is evaluated. In this case, the
next configured test is "file". With file, if $home/.forward
does not exist, the router declines, and so on. Once you get used to it and
know what is a check and what is a property, the simple syntax is easy to
understand. But for those seeing the process for the first time, looking up
each and every option can be a bit tedious.
This syntactic over-simplicity carries into the
configured transports, as well. Within the redirect router, the generic
"transport" option cannot be used (see the documentation for
details). You must specify named transport instances for a variety of
<type>_transport options. Which transport to use is determined by the
data returned by the file call. If a rule configured in .forward returns a
string starting with a pipe ("|"), the pipe_transport is
selected, and so on. I've got to think there might be a more readable
way to nest these interrelations within the configuration file.
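The selection rule can be pictured as a tiny dispatch on the first character of each item of redirect data. The Python sketch below covers only the pipe and file cases described in the text (a leading "|" or a leading "/"); the function name and the sample paths are hypothetical.

```python
def transport_option_for(item):
    """Say which redirect-router option handles one item of redirect data."""
    item = item.strip()
    if item.startswith("|"):
        return "pipe_transport"      # deliver by piping to a command
    if item.startswith("/"):
        return "file_transport"      # deliver by appending to a file
    return None                      # an ordinary address: route it afresh


for item in ("|/usr/local/bin/ticket-filter", "/home/nathan/mail/archive",
             "nathan@example.com"):
    print(item, "->", transport_option_for(item))
```

Anything that is neither a pipe nor a file is treated as a new address and goes back through the routers, so it needs no transport option on the redirect router itself.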
But getting back to the task at hand, the final
router driver used in my configuration is "accept". It is a simple
catch-all that points to a general transport. The
"cannot_route_message" parameter sets the error text Exim reports when an
address fails this router's tests and, with no routers left, cannot be
routed at all.
Putting this all together, my routers basically work
as follows:
1. Is the domain (i.e., the "example.com"
in "nathan@example.com") local to this host? If so, decline it
to the next router. If not, resolve it. If it resolves to the localhost IP,
fail it without passing it to the next router. If not, pass it to the
remote_smtp transport.
2. Does the local part of the address have an entry
in /etc/aliases? If not, decline it to the next router. If so, pick a
transport based on whether the aliases file returns a pipe, a file
destination or email address(es).
3. Is the local part the same as a local username? If
not, decline it to the next router. If so, does that user have a .forward
file in their home directory? If not, decline it to the next router. If so,
process the rules in .forward against the destination address. Pick a
transport based on the appropriate rule string returned from .forward.
4. Is the local part the same as a local username? If
not, write a message to the log and bounce the mail. If so, forward it to
the local_delivery named transport instance.
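The four steps above can be modeled in a few lines of Python. This is a sketch under stated assumptions, not a reimplementation: the lookup tables stand in for DNS, /etc/aliases, the password database and the .forward files, and all names in them are invented. It also stops at the first redirect instead of routing the new address again. One detail worth showing is that no_more only matters when a router actually runs and declines; an address that fails a precondition simply moves on.

```python
LOCAL_DOMAINS = {"grendel.example.com"}
RESOLVABLE = {"example.org"}               # stand-in for DNS
ALIASES = {"postmaster": "root"}           # stand-in for /etc/aliases
LOCAL_USERS = {"nathan", "root"}           # stand-in for getpwnam()
HAS_FORWARD = {"nathan"}                   # users with a .forward file

SKIP, DECLINE = "skip", "decline"


def dnslookup(local_part, domain):
    if domain in LOCAL_DOMAINS:
        return SKIP            # precondition "domains = ! +local_domains"
    if domain not in RESOLVABLE:
        return DECLINE         # lookup found no usable host
    return "remote_smtp"


def system_aliases(local_part, domain):
    if local_part not in ALIASES:
        return DECLINE
    return "redirect to " + ALIASES[local_part]


def userforward(local_part, domain):
    if local_part not in LOCAL_USERS:
        return SKIP            # check_local_user failed
    if local_part not in HAS_FORWARD:
        return DECLINE         # no .forward file to read
    return "redirect via .forward"


def localuser(local_part, domain):
    if local_part not in LOCAL_USERS:
        return SKIP            # check_local_user failed
    return "local_delivery"


# (router, no_more) pairs in configuration-file order.
ROUTERS = [(dnslookup, True), (system_aliases, False),
           (userforward, False), (localuser, False)]


def route(address):
    local_part, domain = address.rsplit("@", 1)
    for router, no_more in ROUTERS:
        result = router(local_part, domain)
        if result == SKIP:
            continue
        if result == DECLINE:
            if no_more:
                return "%s: routing failed, bounce" % router.__name__
            continue
        return "%s: %s" % (router.__name__, result)
    return "no router accepted the address, bounce"


for address in ("someone@example.org", "someone@nowhere.invalid",
                "postmaster@grendel.example.com",
                "nathan@grendel.example.com", "root@grendel.example.com",
                "ghost@grendel.example.com"):
    print(address, "=>", route(address))
```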
This represents perhaps the most basic router setup
that still manages to cover all the bases. There are many, many different
ways that routers can be configured, though. For instance, routing
typically stops once a router has accepted a message and queued it for
transport. However, you can set a parameter called "unseen"
that tells the router to do its work and pass the message on to the next
router. You might want to do this to archive messages, or possibly because
you just like to make your Xeons earn their keep. The point is you can.
Moving on, here's a list of my default
transports:
begin transports

remote_smtp:
  driver = smtp

local_delivery:
  driver = appendfile
  file = /var/mail/$local_part
  delivery_date_add
  envelope_to_add
  return_path_add

address_pipe:
  driver = pipe
  return_output

address_file:
  driver = appendfile
  delivery_date_add
  envelope_to_add
  return_path_add

address_reply:
  driver = autoreply
Note that the order in which the transports are listed
is immaterial because the transport for a message is chosen solely by the
routing process. That said, let's take a look at each one in turn.
- remote_smtp -- This is the named
instance of the sole driver capable of remote delivery in Exim. As you can
see, there aren't too many options configured here (which is to say:
none). For the most part, that's because the defaults are set, and
the retry and authentication aspects have their own areas in the
configuration file. But you can configure almost any other facet of the
basic SMTP process here, including which interface the traffic should go
out on and what hosts to send to (in case the dnslookup router wasn't
the origin of the message). Some other important ones are timeouts and the
maximum messages per TCP/IP connection.
- local_delivery -- The appendfile
transport driver has its origin in the traditional Unix mail file system.
Each user has a file that contains an aggregate of their messages. Each
mail user agent operates on that user's file; hence, we see the
/var/mail/$local_part file directive. According to this setting, the user
name in the mail address (again, the "nathan" in
"nathan@example.com") gets its own file.
There are obvious file system permission issues here,
but suffice it to say the Exim process must be able to create files and
directories under the mail spool. On my system, /var/mail is a link to
/var/spool/mail, and the permissions for it and the Exim binary are as
follows:
grendel:/home/nsmc # ls -ld /var/spool/mail/
drwxrwxrwt 2 root root 4096 2005-08-10 15:54 /var/spool/mail/
grendel:/home/nsmc # ls -l /usr/sbin/exim
-rwsr-xr-x 1 root root 778791 2005-05-18 11:45 /usr/sbin/exim
The directory's sticky bit is set (meaning that
deleting each user's file is the prerogative of that user
alone), and the setuid bit is active on Exim (meaning that regardless of
who launches it, it runs with the access rights of the
binary's owner -- in this case root). So Exim and the user
accounts have all the access they need.
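For readers who want to see where the "t" and "s" in those listings come from, this short Python sketch decodes the special permission bits with the standard stat module. The two mode values are constructed by hand to match the listings; on a live system you would pass in os.stat(path).st_mode instead. The describe_mode helper is my own.

```python
import stat


def describe_mode(mode):
    """Return the ls-style string plus the special bits that are set."""
    special = []
    if mode & stat.S_ISUID:
        special.append("setuid")
    if mode & stat.S_ISGID:
        special.append("setgid")
    if mode & stat.S_ISVTX:
        special.append("sticky")
    return stat.filemode(mode), special


# Modes chosen to reproduce the two listings.
print(describe_mode(stat.S_IFDIR | 0o1777))   # the mail directory
print(describe_mode(stat.S_IFREG | 0o4755))   # the exim binary
```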
The other options (delivery_date_add, envelope_to_add,
and return_path_add) are actually general transport settings. They indicate
that Exim should add headers to the message on its way to the file.
- address_pipe -- This is a
particularly interesting feature of the application, in my opinion. Recall
that this transport can be triggered by the presence of a pipe-structured
alias in /etc/aliases. Pipes themselves are part of what gives Unix so much
command-line power, and they allow Exim to pass a message on to almost any
executable. The transport can limit which commands are runnable, but, if
misconfigured, this could be a huge security hole. Be very careful when
allowing anyone to pass commands on the sly like that. Note that the
return_output parameter sends any feedback from the pipe to the sender.
- address_file -- This is another item
that uses the appendfile transport. But there's no file parameter, so
how does it figure out which one to use? Looking back at the routers, you
can see that address_file is referenced by the file_transport option of the
system_aliases router. Logically, then, system_aliases must pass a file
parameter to whichever transport is referenced in that way. According to
the documentation, that's exactly what happens: a variable named
$address_file is populated and passed. The rest of the options are
configured and behave the same way as local_delivery.
- address_reply -- This instance
of the autoreply transport driver is triggered by the reply_transport
setting of the userforward router. You'll recall that this is in turn
an instance of the redirect router. Essentially, autoreply does what
you'd expect; it automatically replies with a canned message. To
state things a little more clearly, when a user receives a mail and has an
"out of office" rule set up in her .forward file, the redirect
router picks the transport configured in reply_transport and supplies it with
all the info it needs to create and transmit a message back to the
mail's originator. Got it? Great!
There are four other primary subsections of the
configuration file: retry, rewrite, authenticators, and ACLs. Retry sets
the parameters for reinitiating remote
transport on delivery failures. But, because all the defaults are fine for
my environment, there's nothing to
configure.
There's also nothing in my rewrite, because I
don't need to rewrite any addresses on the fly. In the past,
I've used this really excellent feature to circumvent an anti-spam
measure utilized by some of the less-sophisticated filters out there
-- namely, to refuse all mail originating from the domain
"localhost". That kind of limitation isn't a problem for
a corporate mail exchanger, but when you're just kicking out
notifications from a host in the DMZ, it can get a little sticky.
Truthfully, this wasn't even that good of a fix
for the problem I mentioned, but it was a good lesson in just how flexible
rewrites can be. Almost any aspect of a mail can be rewritten according to
rules and/or regular expressions. And the names of the fields to replace
are surprisingly intuitive. This was one of the easiest feature sets
I've ever learned.
Authentication isn't necessary on the host
we're examining because the only mail it will accept is for the
localhost. If it were needed, though, we'd have every option an admin
could want, from plaintext to TLS and SSL.
Access Control Lists (ACLs) allow you to set rules on
what types of messages can move through your MTA. There is a series of
pre-defined points that can be assigned ACLs using syntax similar to the
following from the top of my configuration file:
acl_smtp_rcpt = acl_check_rcpt
This particular setting assigns the ACL
"acl_check_rcpt" defined within the configuration file to the
predefined event "acl_smtp_rcpt". Basically, acl_check_rcpt
makes sure that no potentially nefarious characters are passed during the
part of the SMTP session when the sender is specifying the recipient.
It's a security measure to prevent directory-traversal attacks and
the like.
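The kind of character check involved can be sketched in Python. The two patterns below are modeled on the ones I recall from the stock configuration's rule for recipients in local domains (no leading dot, and none of @ % ! / | anywhere in the local part). Treat them as an assumption and compare them with the acl_check_rcpt section of your own file.

```python
import re

# Patterns for recipients in local domains: no leading dot, and none of
# the characters @ % ! / | anywhere in the local part.
RESTRICTED_LOCAL = [re.compile(r"^[.]"), re.compile(r"^.*[@%!/|]")]


def local_part_allowed(local_part):
    """Return False if the local part trips any restricted pattern."""
    return not any(p.search(local_part) for p in RESTRICTED_LOCAL)


for candidate in ("nathan", ".hidden", "../../etc/passwd", "user|/bin/sh"):
    print(repr(candidate), local_part_allowed(candidate))
```

Rejecting slashes and leading dots at RCPT time keeps a crafted address from ever reaching a transport that builds a file name from $local_part, such as local_delivery above.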
Summation
While I've only scratched the surface of the
Exim Mail Transfer Agent configuration, I hope I've shed some light
on what an MTA is and how Exim works in that role. Below are listed some
Exim resources for more information. The best teacher, however, is
experience, so I encourage you to experiment with your configuration. Often,
Exim works so smoothly you won't have time to really eyeball
legitimate traffic. Try intentionally jamming your queue with bogus
messages (on a test box, of course) and observe the behavior of the various
components. Get creative!
References
Exim Download Info -- http://www.exim.org/eximwiki/ObtainingExim
Exim Specification -- http://www.exim.org/exim-html-4.50/doc/html/spec.html
Nathan McCourtney has been in the technology field for
9 years, first as a developer then as a high-availability sys admin. He is
currently the Director of Data Center Operations for Angel.com.