Detaching Attachments

The Perl Journal September, 2004

By brian d foy

brian has been a Perl user since 1994. He is founder of the first Perl Users Group, NY.pm, and Perl Mongers, the Perl advocacy organization. He has been teaching Perl through Stonehenge Consulting for the past five years, and has been a featured speaker at The Perl Conference, Perl University, YAPC, COMDEX, and Builder.com. Contact brian at comdog@panix.com.

I read my mail with Washington University's PINE and I do it on a remote server. I've been doing it that way for years and it works for me. Since I travel so much, I never know where I might be or which computer I might be using, but I can usually ssh to my shell account.

I can't look at most attachments since PINE is just a text thing. It isn't going to show me pictures or translate Word documents. Since I read this stuff on a remote account, I usually have to save the attachments, then move them into my web space so I can download them on whatever computer I may be using.

I never felt annoyed or motivated enough to fix this—until now, that is. I don't want to give up PINE, but I want to view the attachments with as little pain as possible. I wrote a script that I can pipe a message to using PINE's enable-unix-pipe-cmd feature (which you may have to turn on). When I read a message with an attachment that I want to save, I display the full message with the h command (full header mode) so PINE displays the entire message including all the attachments. I can scroll through all of this if I like, and it's the same thing my program will get as input. With the | command, PINE prompts me for a program name to send the message to; I called this program "detach" and placed it in my personal bin directory. I also created a symlink named "d" to reduce typing. This program is shown in Listing 1.

The program is deceptively short. I start off with the usual Perl invocation and strict and warnings declarations. The ExtUtils::Command module provides me a mkpath() function that acts like a portable 'mkdir -p' so I can easily create new directories. The File::Spec::Functions module has the handy catfile() function that joins together path parts according to the preference of the current operating system. The File::Path module also has an mkpath() function, but it kills the script if it fails. I don't like that sort of interface, so I stick to ExtUtils::Command.

I try to use this whenever I need to work with paths so my script will have fewer portability issues. Both of these modules have come with Perl for a long time so feel free to use them liberally.

I also pull in MIME::Parser to do all of the hard work, which makes the script as short as it is. It knows how to parse and save the attachments. Although I normally don't like this sort of tight coupling and multiple action interface, in this case, it is just what I want.

Before I used this script, I defined the environment variable ATTACHMENT_ROOT in my shell configuration file. I have this set to a directory in the protected space of my web directory so I can download the file through a web browser. That takes care of the download hassle I had before. If I don't set this variable, I use my home directory.

I read the message from STDIN with one big chunk. The do{} idiom is a favorite of mine. It acts sort of like an inline subroutine: I get a block of code to define a scope and it returns the last evaluated expression. In this case, I set the $/ (the input record separator) variable to nothing, so the diamond operator (<>) reads the entire file at once. I store the whole message, headers and all, in $message.

Once I have the message, I want to figure out where to store my attachments. I think everyone names their files the same thing, so I have a bunch of files named "tpr-article.doc" from different authors. I decided to create a directory for each sender, so I send the entire message to my from() subroutine to pull out the e-mail address, then store it in $from. My e-mail address extractor is very simple, and it's the same thing I wrote about in my last TPJ article, "Pipelines and E-mail Addresses" (TPJ, August 2004). It is not a complete (or "correct") solution, but it's good enough and it hasn't failed me yet. If I was worried about weird e-mail address or From lines, I could use one of the e-mail parsing modules.

Before I go any further, though, I want to make sure that the string in the $from address is something safe for a path. The MIME::Parser::Filer module, which does a lot of the work my script doesn't show, can take a string and ensure it's something I can safely use. If it can't rescue a bad e-mail address, it returns undef and I use "malformed-sender" as a default.

I use catfile() to join my base directory and sender directory. As I said earlier, catfile() does the right thing for the operating system. In the next line, I use the do{} block to create a scope that I can modify with unless(). If the directory already exists, I skip this step, but if it doesn't, I put my new directory name into a local() version of @ARGV then call mkpath(), which uses the values in @ARGV as its arguments. It's an odd bit of history, but that's the way it works.

Once the directory exists, I tell MIME::Parser that I want to use that directory to store the files. This is where my attachments will show up. After I set the directory, I call parse_data() and all the interesting stuff happens.

The files end up in one of my private web directories, so I go to that URL, which is just a directory listing, then choose the link to the e-mail address of the message. There are my attachments. It's not as easy as clicking a link in web mail, but now everything that everyone sends me gets stored in an easily accessible web directory.

TPJ

Listing 1

#!/usr/local/bin/perl5.8.0
use strict;
use warnings;

use ExtUtils::Command qw(mkpath);
use File::Spec::Functions qw(catfile);
use MIME::Parser;

my $Base    = $ENV{ATTACHMENT_ROOT} || $ENV{HOME};
my $message = do { local $/; <> };
my $from    = from( \$message );

my $parser = MIME::Parser->new();
$from = $parser->filer->exorcise_filename($from)
  || 'malformed-sender';

my $path = catfile( $Base, $from );
do { local @ARGV = ( $path ); mkpath; } unless -d $path;

$parser->output_dir( $path );

my $entity = $parser->parse_data( $message );

sub from
    {
    my $message = shift;

    my( $from ) = $$message =~ m/^From:\s+(.*)/mg;

    $from =~ s/\s* \(.*?\) \s*//x;
    $from =~ s/.* < (.*@.*) > .*/$1/x;

    return $from;
    }

Back to article