E-Mail Security

Maintaining privacy in a world of public data transfer

Bruce Schneier

Bruce is president of Counterpane Systems, a cryptography and data-security consulting company. He is the author of Applied Cryptography and E-Mail Privacy, both published by John Wiley & Sons. Bruce is also a contributing editor to Dr. Dobb's Journal and can be reached at schneier@chinet.com.

The world of electronic mail is the world of postcards. Messages travel from machine to machine in the open, just like the messages on the backs of postcards. These messages can easily be read, altered, forged, or deleted--without anyone's knowledge.

Cryptography provides an easy and effective solution to these problems, even though few people take advantage of it.

Imagine that Alice sends a message to Bob over the Internet. The message flows through the system, going from one machine to another. When a machine gets the message, the machine reads the header, figures out if the message is for anyone who has an account on that machine, and then sends it off to another machine if it isn't. The other machine does the same, and so on. Eventually the message reaches the correct machine and is placed in Bob's electronic-mail file. The next time Bob logs in, he reads the message.

In reality, electronic mail doesn't bounce randomly from one machine to another, hoping to find its destination. The different computers on the Internet have routing tables. If a computer receives a piece of electronic mail that is for someone on another computer, it knows enough to look up that computer on the routing table and to send it in the general direction of that computer. Even so, look at the routing information next time you receive a piece of e-mail; it probably passed through quite a few intermediaries between its source and its destination.
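To see the intermediaries for yourself, you can count the "Received" lines that each relaying machine adds to a message. The following is a minimal Python sketch using the standard email module; the message and all of its host names are invented for illustration, and real headers vary in format.

```python
from email import message_from_string

# Hypothetical message; every host name here is made up for illustration.
RAW_MESSAGE = """\
Received: from relay2.example.net by mail.example.org; Mon, 3 Jan 1994 10:02:11 -0600
Received: from relay1.example.com by relay2.example.net; Mon, 3 Jan 1994 10:01:40 -0600
Received: from alice-pc.example.com by relay1.example.com; Mon, 3 Jan 1994 10:01:05 -0600
From: alice@example.com
To: bob@example.org
Subject: Lunch?

See you at noon.
"""

def relay_hops(raw_text):
    """Return the Received header values, oldest hop first."""
    msg = message_from_string(raw_text)
    hops = msg.get_all("Received", [])
    # Each relay prepends its own header, so the list arrives newest first.
    return list(reversed(hops))

hops = relay_hops(RAW_MESSAGE)
for number, hop in enumerate(hops, start=1):
    print(number, hop)
```

Every line printed is a machine that held a full copy of the message, however briefly.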

Any of these intermediaries can easily read Alice's mail. Imagine that Eve, an eavesdropper, is sitting at one of these intermediary machines on the Internet. She can be the system administrator, a clever hacker, or, if the security on the machine is poor enough, a regular user. In any case, the world is an open book to her. She can sit at her terminal and see every electronic-mail message that passes through the machine, no matter who it is addressed to. She can print a message out and show it to her friends. She can send it to the New York Times. Alice and Bob have no control over who reads their mail in transit. It doesn't matter if their mail is marked "confidential," if their computers are in locked rooms, or if they both have been subjected to rigorous psychological screening and have been selected for their discretion. By sending a message over the Internet to Bob, Alice is trusting the security of every machine the message will pass through--without even knowing which machines they will be. The only real security they have is the honesty of those machines.

Envelopes for E-Mail

If e-mail messages are like postcards, what we want are letters in envelopes. Like electronic-mail messages, letters are routed through a network. Alice drops a letter in a mailbox and postal workers send it via a variety of post offices and transport vehicles to Bob's mailbox. A dozen different people might handle a letter as it travels through the system, but none of them can read the letter. The envelope protects it.

You can mirror this process with cryptography, using strong encryption as an "envelope." By encrypting her electronic mail so that only Bob can read it, Alice ensures that Eve cannot--even if Eve intercepts it in transit. The addition of digital signatures to the electronic mail lets Bob verify that Alice sent the message and that it was not altered in transit.

Who Wants to Read Your Mail?

Anyone who wants to can read your e-mail, remote login sessions, ftp downloads, real-time conversations, and anything else you do on the net. But who would want to? The answer depends on who you are, what you are doing, and who may be interested in it.

The military-intelligence organizations of major governments are the most sophisticated of potential eavesdroppers. Reading people's mail is their business. Since the beginning of the Cold War, intelligence organizations have spent fortunes collecting, compiling, and analyzing intelligence data on each other. Just because the Cold War is over, don't think that these organizations are not still at it. Computer transmissions are just a part of that overall collection effort, but it is an important part.

This is not to say that if you are not involved with the government, you are safe from military-intelligence collection efforts. The lines between military and corporate espionage are fuzzy; many commercial technologies have military applications. Several countries routinely target foreign companies for spying, and think nothing of passing the information they collect on to companies in their own country. France and Japan are the best-known offenders, but there are undoubtedly others. The NSA has been accused of, in at least one instance, intercepting a telephone call between two European countries and passing on marketing information to a U.S. competitor. As the post-Cold-War world continues to evolve, large military-intelligence communities need new reasons to justify their existence, so industrial spying by military-intelligence organizations is likely to increase.

Governments are often interested in spying on their own citizens, as well. This is certainly true in totalitarian regimes such as China, North Korea, and Cuba, but it exists in other countries, too. The government of France prohibits encryption on civilian communications circuits unless a copy of the encryption key and algorithm is given to the authorities. The governments of both Taiwan and South Korea have been known to request that companies remove encryption from voice, data, and facsimile telephone connections. Even the United States has a long, sordid history of conducting illegal wiretaps. Any organization that would, without a court order, tap the telephones of Martin Luther King, Jr. could easily justify reading its citizens' electronic-mail messages. And since electronic mail doesn't yet have the same Constitutional protection as paper mail, it's easier.

Several U.S. government organizations might be interested in reading private e-mail. The FBI might be looking for criminals, people starting fringe political parties, people who don't floss regularly, or other unsavory characters. Pornographers are particularly popular targets. The DEA might be looking for drug dealers. (The "war on drugs" has at times been used as an excuse for questionable law-enforcement ideas.) The IRS might be looking for tax cheats. There's also the Treasury Department, the BATF, the CIA: If you're doing something even remotely mysterious, somewhere in the bowels of Washington there is a government acronym that wants to know about it.

Businesses might use espionage against rival companies. They could be interested in customer lists, employee directories, marketing plans, financial data--almost anything. Coca-Cola might pay dearly to know Pepsi's new advertising plan; Ford might be similarly interested in the designs for next year's Chevrolet models. Stockbrokers might be very interested in data about a company that may eventually affect its stock price. A salesman might be very interested in the customer database of a rival salesman, perhaps even a rival salesman in the same company.

Investigative reporters might be interested in private e-mail conversations between public individuals: politicians, corporate leaders, entertainers, and other public citizens. Remember when Washington, D.C.'s City Paper collected and published data on Supreme Court nominee Robert Bork's videotape-rental records? Or when Prince Charles's telephone conversations made it into the British tabloids? What about when reporters broke into Tonya Harding's electronic mailbox? Although there has not yet been any public instance of reporters actually going so far as publishing someone's electronic-mail messages, it is bound to happen. How would a Senate candidate feel if his college-era postings to alt.beer.belch were published?

Criminals can get valuable data from electronic mail, as well. Police have long known that people monitor cellular phone channels, listening for credit-card numbers. There's no reason why criminals can't look for the same thing amongst the electronic-mail messages moving back and forth across the networks. Some companies are already opening up shop on the Internet, offering various consumer goods for sale by credit card. It would be easy to set up an automatic program to scan the mail feed for credit-card numbers. If commerce on the Internet ever takes off, this practice could become widespread.

And finally, colleagues, friends, and family are possible spies. These are not sophisticated spies, but they may very well be the most interested. Some companies explicitly reserve the right to read all electronic mail sent by their employees, whether work-related or personal. A worker in an office might be very interested in the personal electronic-mail correspondence of a coworker, for no other reason than nosiness. A family member might be carrying on an illicit love affair. E-mail messages have already shown up in divorce court.

The Collection Problem

The biggest problem in reading someone's e-mail is finding it amongst the sea of other electronic-mail messages. It's a small needle inside an enormous haystack.

One of the National Security Agency's jobs, for instance, is to monitor computer data flowing into and out of the United States, as well as data flowing between other countries. This is a task of Herculean proportions. At least a gigabyte of computer data flows in and out of our borders every day. This includes e-mail, Internet newsgroups, remote logins, ftp downloads, real-time "chat" conversations, and everything else. Storing the data on computer tape is a massive problem, let alone reading and analyzing it.

The NSA uses computers to sift through the data in real time, looking for interesting information. Maybe the computers look for certain key words. An electronic-mail message containing the words "nuclear," "cryptography," or "presidential assassination" might be stored on tape for further analysis. They might look for data from particular people, or from particular organizations. They might look for data with a particular structure. They might have artificial-intelligence software that does things I can't even comprehend. The NSA has a lot of money to throw at this problem, and they've been working on it for a long time.
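Nobody outside the agency knows how such a filter really works, but the key-word idea itself is simple. The following Python sketch is only a guess at the simplest possible version; the watch list comes from the examples above, and the function names and sample messages are invented for illustration.

```python
import re

# Hypothetical watch list, taken from the examples in the text.
WATCH_PHRASES = ("nuclear", "cryptography", "presidential assassination")

def is_interesting(message_text, phrases=WATCH_PHRASES):
    """Return True if any watch phrase appears as whole words in the message."""
    lowered = message_text.lower()
    return any(
        re.search(r"\b" + re.escape(phrase) + r"\b", lowered)
        for phrase in phrases
    )

def sift(messages):
    """Keep only the messages worth storing for later analysis."""
    return [text for text in messages if is_interesting(text)]

stream = [
    "Lunch at noon?",
    "The new cryptography paper is out.",
    "That nuclear submarine movie was great.",
]
flagged = sift(stream)
print(len(flagged), "of", len(stream), "messages flagged")
```

Note that the movie review is flagged along with the cryptography paper. A filter this crude cannot tell the two apart; it can only decide what gets stored.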

The important point is that they have to do this in real time. There is just too much data to save. The best they can hope for is to collect as much data in a day as can be analyzed in a day. They can't collect any more, because more is coming the next day. Data collection is like a never-ending treadmill; if they fall behind, they will never catch up.

Collection is useless without analysis, and that is a far more complicated process. Unless the NSA has more advanced computing resources than I can imagine, they need people for this part. People have to read those "interesting" electronic-mail messages to determine if they are really interesting. Maybe the message with the words "nuclear" and "assassination" was really about a science-fiction movie. For a while it was popular to add a string of interesting words at the end of all messages, just to frustrate these collection efforts. Maybe the mention of blowing up the UN was frivolous, and maybe it was a message from one terrorist to another, discussing their plans. Maybe that message from an American high-tech company was innocuous, and maybe it was a foreign spy passing information back home.

Encryption as a Defense

Encryption makes the NSA's job difficult on several fronts. The most obvious is that they cannot read the various e-mail messages. This is only true if the encryption method is secure enough that the NSA can't break it. If the NSA can break the encryption method, then it is just a matter of allocating the resources necessary to break it.

However, this is only really true if encryption is not widespread. Remember the collection problem. There is an enormous amount of data flowing through the Internet every day; far too much to examine it all. The NSA's interesting-stuff checkers could easily collect encrypted messages and then route them to another computer program for further analysis, but this is only feasible if a small percentage of messages are encrypted. If 80 percent of all e-mail traffic, ftp downloads, remote login sessions, and so on are routinely encrypted, the NSA's computers will not have the time to break them--even if they are all breakable.

And even worse, the interesting-stuff checkers have a much harder time deciding which messages to ignore and which are worth breaking. If only a few messages a day are encrypted, then those are obviously interesting messages. If everyone routinely encrypts their messages--even people chatting with their friends--the NSA can't tell the interesting encrypted messages from the innocuous encrypted messages. There's too much data flowing through the network, and not enough computing power to bring to bear on the problem. Encryption, even poor encryption, can quickly make the collection problem intractable.

Traffic Analysis

The NSA isn't out of work yet. Even if it can't read your e-mail, it can collect some pretty impressive data on you through traffic analysis.

Traffic analysis is the analysis of who you send electronic mail to, who you receive electronic mail from, how long those electronic-mail messages are, and when they are sent. There's a lot of good information buried in that data if you know where to look.

Most European countries don't have itemized telephone bills. European telephone bills list how many "message units" were used from a particular phone, but not where and when these message units were used. American telephone bills list every long distance call made from the telephone number: date and time, number called, and duration of the call. The American system makes it easier to spot errors and to catch your children making hundreds of calls to 1-900-HI-SANTA, but it also allows the telephone company to learn information about your calling patterns. Do you make a lot of long-distance calls to Montana? Then maybe you are interested in these Montana vacation packages? Do you order from catalogues frequently? Then you should be on this mailing list. Do you call the Suicide Prevention Hotline regularly? Then maybe a prospective employer should hire someone else.

During World War II, the Germans used detailed calling records to round up the friends of suspected enemies of the state. Many European countries believe it is worth the loss of a detailed telephone bill to prevent that from ever happening again.

E-mail messages can yield the same information. Even if the message is encrypted, the header clearly states who the message is from, who it is to, when it was sent, and how long it is. There are anonymous remailing services on the Internet that purport to hide who a message is from, but while services such as these may prevent the average Internet user from knowing where a particular piece of electronic mail came from, they'll likely not fool a sophisticated eavesdropper such as the NSA.
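To make this concrete, here is a minimal Python sketch that pulls a traffic record out of a message whose body is unreadable. The message, the addresses, and the traffic_record function are all hypothetical, and the body is just a stand-in string rather than real ciphertext.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# Hypothetical message; the body stands in for ciphertext Eve cannot read.
RAW_MESSAGE = """\
From: Alice <alice@example.com>
To: Bob <bob@example.org>
Date: Mon, 03 Jan 1994 10:01:05 -0600
Subject: (none)

Q2lwaGVydGV4dFN0YW5kSW5Pbmx5Tm90UmVhbA==
"""

def traffic_record(raw_text):
    """Extract who, to whom, when, and how long, without reading the body."""
    msg = message_from_string(raw_text)
    return {
        "sender": parseaddr(msg["From"])[1],
        "recipient": parseaddr(msg["To"])[1],
        "sent": parsedate_to_datetime(msg["Date"]),
        "length": len(msg.get_payload()),
    }

record = traffic_record(RAW_MESSAGE)
print(record)
```

Nothing here touches the encryption. The header is in the clear because the mail system needs it to deliver the message.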

Imagine that Eve is interested in Alice, a suspected terrorist. Alice encrypts all her electronic mail, so Eve can't read the contents of her messages. However, Eve collects all the information she can on Alice's traffic patterns. Eve knows the e-mail addresses of everyone Alice regularly corresponds with. She often sends long messages to someone called Bob, who always immediately responds with a very short message. Perhaps she is sending him orders, and he is confirming receipt of those orders. One day there is a big jump in the volume of electronic mail traffic between Alice and her correspondents. Perhaps they are planning something. Then, there is silence. No mail flows between Alice and her correspondents. The next day a government building is bombed. Is this enough evidence to arrest the whole bunch of them? Perhaps not in the United States, but certainly in countries with weaker concepts of personal freedom.
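Eve's bookkeeping in a scenario like this one needs nothing more than counting. The following Python sketch assumes she has already reduced each intercepted message to a (day, sender, recipient, length) record; the log and every name in it are invented for illustration.

```python
from collections import Counter

# Hypothetical traffic log: (day, sender, recipient, length in bytes).
LOG = [
    (1, "alice", "bob", 4200), (1, "bob", "alice", 60),
    (2, "alice", "bob", 3900), (2, "bob", "alice", 55),
    (3, "alice", "bob", 4100), (3, "bob", "alice", 58),
    (3, "alice", "carol", 3000), (3, "alice", "dave", 2800),
    (3, "carol", "alice", 70), (3, "dave", "alice", 65),
]

def daily_volume(log):
    """Count messages per day; a sudden jump or silence stands out."""
    return Counter(day for day, _sender, _recipient, _length in log)

def correspondents(log, person):
    """Count messages exchanged between one person and each peer."""
    peers = Counter()
    for _day, sender, recipient, _length in log:
        if sender == person:
            peers[recipient] += 1
        elif recipient == person:
            peers[sender] += 1
    return peers

volume = daily_volume(LOG)
peers = correspondents(LOG, "alice")
print(volume)
print(peers)
```

In this made-up log the third day shows three times the usual traffic, and the long-message, short-reply pattern between Alice and Bob is visible from the lengths alone.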

Terrorists are not the only ones who fear traffic analysis. Would the FBI start investigating people for drug use simply because they corresponded over electronic mail with a convicted--or even just a suspected--drug dealer? Would a company, after receiving information that an employee is regularly corresponding with an electronic-mail address in a competitor's offices, have grounds to fire that employee? What would a jealous person think after learning that his or her spouse was corresponding regularly with a potential rival? Traffic analysis is an important intelligence tool, and its implications on personal privacy are significant.

Spoofing

Spoofing is one person impersonating another. Whether the impersonation is intended as a joke, a means to discredit or disgrace, or a means to defraud, it is a problem.

Every month or so a particular type of message is posted to a variety of newsgroups on the Internet. The message might have a header that reads "I am a child molester and I'm proud of it," or maybe a racist slogan; the text isn't any better. Then, anywhere from ten minutes to a day later, there is another message from the same person apologizing for the first message. "It was a forgery," the second posting insists, "don't believe any of it."

This may be true, but the damage is already done. People see the first message and, not knowing that it is a forgery, believe the purported sender to be whatever the message claims him to be. They reply angrily. They write a scathing letter to his system administrator demanding he be removed from the network. They report him to the police, or to some political-action group. If they know him, they may avoid his presence. They may even further damage his reputation by spreading the story to even more people. (This is particularly damaging, because those other people are even less likely to see the retraction.)

Maybe Eve wants to smear Alice. She writes an incriminating e-mail message, puts Alice's name on the bottom, forges Alice's header on top (it's not hard for a skilled hacker to fake a message header), and sends it to a public forum. Then she sends a copy to the print media.
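The reason a header is so easy to fake is that it is nothing but text supplied by whoever composes the message. The following Python sketch builds two messages in memory (nothing is sent); the addresses and the build_message helper are hypothetical. From the header alone, the two are indistinguishable.

```python
from email.message import EmailMessage

def build_message(claimed_sender, recipient, body):
    """Compose a message; the From line is whatever the author types."""
    msg = EmailMessage()
    msg["From"] = claimed_sender  # plain text; nothing verifies it
    msg["To"] = recipient
    msg["Subject"] = "Hello"
    msg.set_content(body)
    return msg

# One message Alice really wrote, and one Eve wrote in Alice's name.
genuine = build_message("alice@example.com", "bob@example.org", "Hi Bob.")
forged = build_message("alice@example.com", "bob@example.org", "Not from Alice.")

print(genuine["From"] == forged["From"])
```

A reader has only the header to go on, which is why authentication has to come from something the forger cannot type.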

There are other, less overt, ways to do damage by impersonating someone else. Imagine that Alice and Bob are collaborating via electronic mail on some project. Eve, purporting to be Bob, sends a message to Alice. In it, "Bob" claims that he has moved, and that this is his new electronic-mail address. Alice doesn't know better and changes her address directory. Now Eve can correspond with Alice, pretending to be Bob. If Eve is really clever, she can simultaneously convince Bob that she is Alice. Then, Eve can have conversations with both of them, passing messages through most of the time and only changing them on occasion. Eve can thwart whatever project Alice and Bob are working on through judicious use of misinformation. If Alice and Bob don't communicate face to face or over the telephone regularly, Eve can keep this ruse up for a long time.

Eve also could get an account in the name of a known, but not too well-known, reporter. She could promise Alice publicity in exchange for some information. Alice trusts the name on the "From" line of the mail header, and is tricked into revealing whatever information Eve wants.

Spoofing can be prevented with something called a "digital signature." Just as a written signature provides proof of authorship of (or, at least, agreement with) a physical document, a digital signature provides the same for an electronic document. With this sort of digital authentication, Alice can always check to make sure a document is actually from the person it is purported to be from. No one can send an incriminating posting purporting to be from Alice. Alice can always check who really sent a piece of electronic mail she received. And no one can pretend to be Alice to someone else and hope to get away with it.

Making Security Work

There are several things we can do to keep our digital connections secure. The simplest and easiest is to regularly use encryption and digital signatures. This means all the time. Encrypt and sign all of your correspondence, even when the content doesn't warrant secrecy. To do otherwise only invites trouble.

Currently, almost no one encrypts their e-mail because doing so is a nuisance. Unfortunately, the side effect is that anyone who does immediately arouses suspicion. If everyone regularly uses encryption, then encryption is not suspicious. No one in the post office stares at a sealed envelope, wondering what is so private that it can't be written on a postcard. If encrypted electronic mail were the rule, no one would assume an encrypted message has something to hide.

Likewise, digital signatures should be the rule. Almost no one signs their digital correspondence, so a digitally signed message is sure to arouse suspicion: What is so important about this piece of electronic mail that it has to be signed? If everyone routinely signs their messages, it won't even warrant a comment.

The way to make security ubiquitous is to make it transparent. If it is just as easy to send a signed and encrypted piece of electronic mail as it is to send an unsigned and unencrypted piece of mail, then people are more likely to choose the former. Signed and encrypted correspondence will be commonplace, yet another example of the triumph of personal privacy over Big Brother government.

RSA for Encryption and Digital Signatures

Public-key cryptography is based on the idea of a key pair. One key remains private, and the other one is public. With this tool, you can both encrypt files and create digital signatures.

If Alice wants to send a message to Bob, she encrypts it with Bob's public key. The key is public, so she can get it off the net somewhere. Bob, after he receives the encrypted message, decrypts it with his private key. The key is private, so only he can decrypt it.

If Alice wants to sign a message, she encrypts it with her private key. Bob, or anyone else, can verify the signature by decrypting the message with Alice's public key. The key is public, so anyone can verify the signature. But Alice's private key is private, so only she can sign messages.

The reason this kind of cryptography is so useful for e-mail security is that Alice and Bob do not have to meet somewhere secret and exchange keys. If they were using a conventional algorithm, they would have to agree on a secret key before they could communicate securely. With public-key cryptography, all communication can be out in the open over an insecure channel.

RSA is the most common public-key algorithm. Its security is based on the difficulty of factoring large numbers. To generate a public-key/private-key pair, first choose two large (500 bits or more) prime numbers, p and q. Then compute n=p*q. Choose e such that it has no factors in common with (p-1)*(q-1). Then compute d such that d=e^-1 mod ((p-1)*(q-1)). The public key is e and n; the private key is d and n. Destroy p and q, and do not reveal them to anyone.
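The key-generation steps can be tried with toy numbers. The following Python sketch uses the tiny primes 61 and 53 and e=17, all chosen only for illustration; real keys need primes hundreds of bits long, and the make_keypair helper is hypothetical. Here e is paired with n as the public half and d with n as the private half.

```python
from math import gcd

def make_keypair(p, q, e):
    """Build a toy RSA key pair from two primes and a chosen exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must share no factors with (p-1)*(q-1)")
    # d is the inverse of e modulo (p-1)*(q-1); needs Python 3.8 or later.
    d = pow(e, -1, phi)
    return (e, n), (d, n)

# Toy primes for illustration only; far too small to be secure.
public_key, private_key = make_keypair(p=61, q=53, e=17)
print("public:", public_key)    # (17, 3233)
print("private:", private_key)  # (2753, 3233)
```

The check that matters is that e*d leaves a remainder of 1 when divided by (p-1)*(q-1): here 17*2753 = 46801 = 15*3120 + 1.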

To encrypt a message m, compute c=m^e mod n. To decrypt, compute m=c^d mod n. To sign a message m, compute c=m^d mod n. To verify the message, compute m=c^e mod n.
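These four formulas can be checked with a toy key. The following Python sketch hard-codes the values that come from p=61 and q=53 (n=3233, e=17, d=2753), chosen only for illustration, with e as the exponent used for encryption and verification as in the formulas above. The function names are hypothetical, and a real system would also pad and hash the message.

```python
# Toy key derived from p=61, q=53; far too small for real use.
N, E, D = 3233, 17, 2753

def encrypt(m):
    """c = m^e mod n, using the recipient's public exponent."""
    return pow(m, E, N)

def decrypt(c):
    """m = c^d mod n, using the recipient's private exponent."""
    return pow(c, D, N)

def sign(m):
    """Signature = m^d mod n, using the signer's private exponent."""
    return pow(m, D, N)

def verify(m, signature):
    """Accept if signature^e mod n gives back the message."""
    return pow(signature, E, N) == m

message = 65  # the message must be a number smaller than n
ciphertext = encrypt(message)
signature = sign(message)

print(ciphertext)                  # 2790
print(decrypt(ciphertext))         # 65
print(verify(message, signature))  # True
print(verify(66, signature))       # False
```

The last line shows why a signature protects against alteration: a signature made for one message does not verify against a different one.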

There are more subtleties to this, and a complete system is quite a bit more complicated. Any modern book on cryptography covers this in more detail.

Copyright © 1994, Dr. Dobb's Journal