The Perl Journal April 2003
Most Perl scripts start with a line beginning with #! and containing the word perl (for example, #!/usr/local/bin/perl -w). This line tells the UNIX operating system that it shouldn't treat the current file as a binary executable, but should instead invoke the specified perl interpreter with the specified options, and that interpreter will take care of the Perl script. Although the concept of this first line is simple, writing it to be universally applicable tends to be very hard.
For example, specifying #! /usr/local/bin/perl -w will cause a confusing my_script.pl: No such file or directory message on many out-of-the-box Linux systems, on which perl is located in /usr/bin/. Specifying #! /usr/bin/perl -w solves the problem on those Linux boxes, but will break compatibility with most Solaris systems, on which the standard place for perl is /usr/local/bin/perl. The bad news is that there is no standard place for perl to be specified after #!. To make matters worse, some older UNIX systems impose very strict rules on what can be specified in the #! line. As a result, you may choose between:
This article describes the last option: the Magic Perl Header, a multiline solution for starting a Perl script on any UNIX operating system.
The most obvious #! one-liners are just not good enough:
It is clear that there is no single-#!-line solution to the problem in the general case, because there is no portable way to start Perl to run a script. A multiple-line solution will be necessary. In this section, I will begin to build this solution. I will identify problems and limitations along the way, and in the next section, present the final, complete magic header that will allow you to start a Perl script on any UNIX system.
The only portable beginning for a script is:
#! /bin/sh
/bin/sh is available on all UNIX systems, but it might be a symlink to any shell, including Bourne shell variants (such as Bash and ash), Korn shell variants (such as pdksh and zsh), and C shell variants (such as csh and tcsh). Many UNIX utilities, and the libc system(3) function (conforming to ANSI C, POSIX.2, BSD 4.3) rely on a working /bin/sh. So it is fairly reasonable to assume that /bin/sh exists and is a Bourne, Korn, or C shell variant. On Linux, /bin/sh is usually a symlink to /bin/bash. (On Linux install disks, it is sometimes a symlink to /bin/ash or the built-in ash of BusyBox.) On Win32 MinGW MSYS, /bin/sh is Bash, but there is no /bin/bash. On Solaris, /bin/sh is Sun's own simplistic Bourne-shell clone, and Digital UNIX also has a simple Bourne-shell clone in /bin/sh. (The line #! /bin/sh -- that is seen in many shell scripts to allow arbitrary filenames for the executable won't work here because tcsh gives an error for the -- switch.)
We can write a simple shell wrapper that will find the perl executable in $PATH and run it with the correct switches. In fact, this is the only way that this works on Win32 systems, using .bat batch files. A candidate for the solution is:
## file my_script.sh, version 1
#! /bin/sh
perl my_script.pl
## file my_script.pl
# real Perl code begins here
This has the following problems:
1. It doesn't pass command-line arguments.
2. It doesn't propagate exit() status.
3. It cannot find the Perl script on the $PATH; it will take it from the current directory, which is usually wrong, and might also present a security issue.
4. It needs two separate files.
Problems 1-3 can be overcome quite easily:
## file my_script.sh, version 2
#! /bin/sh
exec perl -S -- my_script.pl "$@"
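To see why version 2 forwards its arguments with "$@" rather than a bare $*, here is a minimal sketch for a Bourne-type shell. The helper names count_args, forward_unquoted, and forward_quoted are hypothetical and are not part of the article's scripts; count_args merely stands in for perl.

```shell
#!/bin/sh
# count_args is a hypothetical stand-in for perl: it prints how many
# arguments it received.
count_args() {
  echo "$#"
}

# A bare $* is re-split on whitespace, so "one two" becomes two words.
forward_unquoted() {
  count_args $*
}

# "$@" hands every argument over exactly as it arrived.
forward_quoted() {
  count_args "$@"
}

forward_unquoted "one two" three   # prints 3
forward_quoted "one two" three     # prints 2
```

With the quoted form, a filename containing spaces reaches the Perl script as a single argument.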
All Bourne and Korn shells (such as GNU Bash, ash, zsh, pdksh, and Solaris /bin/sh) can interpret my_script.sh correctly. However, C shells use a different notation for "all the arguments passed to the shell, unmodified." They use $argv:q instead of "$@". The perlrun(1) manual page describes a memorable construct that detects the C shell:
eval '(exit $?0)' && eval 'echo "Korn and Bourne"'
echo All
The message "All" gets echoed on all three shell types, but only Korn and Bourne shells print the "Korn and Bourne" message. (In zsh, the result depends on the value of $?, but it won't cause a problem since zsh understands both the csh and Bourne shell constructs we use.) The trick here is that $? is the exit status of the previous command, with the initial value of 0, but $?0 in the C shell is a test that returns "1" because the variable $0 exists.
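The Bourne-shell half of this trick can be checked in isolation. The sketch below assumes a Bourne-type /bin/sh (it does not exercise the C shell branch), and the names status_text and is_bourne_like are hypothetical.

```shell
#!/bin/sh
# In a Bourne-type shell, $?0 is simply the exit status $? followed by a
# literal 0. After a successful command it therefore expands to "00".
true
status_text=$?0
echo "$status_text"

# is_bourne_like is a hypothetical wrapper around the perlrun(1) test.
# It calls true first so that $? is known to be 0, which makes the
# subshell run "exit 00" and so succeed.
is_bourne_like() {
  true
  eval '(exit $?0)'
}

if is_bourne_like; then
  echo "Korn and Bourne"
fi
```

The explicit true matters: if the previous command had failed with status 1, the subshell would run exit 10 and the test would be false even in a Bourne shell.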
We can change echo in the C shell detection code to exec perl, and that's it:
## file my_script.sh, version 3
#! /bin/sh
eval '(exit $?0)' && exec perl -S "$0" "$@"
exec perl -S -- "$0" $argv:q
Now we're ready to make our first wizard step: Combine my_script.pl and my_script.sh into a single file, which invokes itself using perl when run from the shell. (Forget about csh-compatibility for a moment; we'll get to that later.)
A simple attempt would be:
#! /bin/sh
eval 'echo DEBUG; exec perl -S $0 ${1+"$@"}'
if 0;
# real Perl code begins here
Unfortunately, it doesn't run the real Perl code, but it produces an infinite number of DEBUG messages. That's because Perl has a built-in hack: If the first line begins with #! and it doesn't contain the word perl, Perl executes the specified program instead of parsing the script. See the beginning of the perlrun(1) manual page for further details.
In the following simple trick, suggested by the perlrun(1) manual page, we include the word perl in the first line:
#! /bin/sh -- # -*- perl -*-
eval 'exec perl -S $0 ${1+"$@"}'
if 0;
# real Perl code begins here
This fails to work on many systems, including Linux, because the OS invokes the command line (/bin/sh, -- # -*- perl -*-, ./my_script.pl), passing everything after the interpreter name as a single argument, and the shell gives an unpleasant error message about the completely bogus switch.
So we can omit the first line:
eval 'exec perl -S $0 ${1+"$@"}'
if 0;
# real Perl code begins here
This solution is inspired by Thomas Esser's epstopdf utility, and it seems to work on Linux systems with both perl my_script.pl and ./my_script.pl. But we can do better. The major flaw in this script is that it relies on the fact that the operating system recognizes executables beginning with ASCII characters as scripts, and runs them through /bin/sh. On some systems, a "Cannot execute binary file" or "Exec format error" may occur.
Note that this script is quite tricky since the first line is valid in both Perl and Bourne-compatible shells. (It doesn't work in the C shell, but we'll solve that problem later on.)
The solution has another problem: If someone gives the script a weird filename with spaces and other funny characters in it, such as:
-e system(halt)
then the command
perl -S -e system(halt)
will be executed, which is a disaster when there is a dangerous program named halt on the user's $PATH. This problem can be solved easily, by quoting $0 from the shell, and prefixing it with -- to prevent Perl from recognizing further options.
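The effect of the missing quotes can be reproduced safely without running perl at all. In this sketch, show_args is a hypothetical stand-in for perl that only prints the arguments it receives, and script_name holds the hostile filename from the text.

```shell
#!/bin/sh
# show_args is a hypothetical stand-in for perl: it prints each argument
# it receives on its own line, in brackets.
show_args() {
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

script_name='-e system(halt)'   # the hostile filename from the text

# Unquoted: the shell splits the name into two words, so the first one
# looks like an -e switch to the program being started.
show_args -S $script_name

# Quoted and preceded by --: the name arrives as one argument, after the
# end-of-options marker.
show_args -S -- "$script_name"
```

The first call prints three arguments ([-S], [-e], [system(halt)]); the second also prints three, but the last one is the intact filename [-e system(halt)] following [--].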
We have two conflicting requirements for the #! line: The portability requirement is that it must be exactly #! /bin/sh; but it must contain the word perl to avoid the infinite DEBUG loop described earlier. There is no single line that can satisfy both of these requirements, but what about having two lines, then running perl -x, so the OS will parse the first and Perl will find the second?
#! /bin/sh
eval 'exec perl -S -x "$0" ${1+"$@"}'
if 0;
#!perl -w
# real Perl code begins here
The trick here is that Perl, when invoked with the -x switch, ignores everything up to #!perl. Users of non-UNIX systems should invoke this script with perl -x. UNIX users may freely choose any of perl my_script.pl, perl -x my_script.pl, ./my_script.pl, and even sh my_script.pl.
The subtle bilingual tricks in this script are worth studying. When the file is read by perl -x, it quickly skips to the real Perl code. When the file is read by the shell, it executes the line with eval: it calls perl -x with the script filename and command-line arguments. The double-quotes and $@ are shell script wizardry, so things will work even when arguments contain spaces or quotes. The -S option tells Perl to search for the file in $PATH again because most shells leave $0 unchanged (i.e., $0 is the command the user has typed in).
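The ${1+"$@"} idiom deserves a closer look. It expands to "$@" only when a first argument exists, and to nothing otherwise; a commonly cited reason for writing it this way is that some old Bourne shells turn a bare "$@" into one empty argument when there are none. The sketch below assumes a Bourne-type shell, and the helpers count_args and forward are hypothetical names, not part of the header.

```shell
#!/bin/sh
# count_args is a hypothetical helper that prints how many arguments it got.
count_args() {
  echo "$#"
}

# forward mimics the header line: ${1+"$@"} expands to "$@" only if $1 is set.
forward() {
  count_args ${1+"$@"}
}

forward                   # no arguments: nothing is forwarded, prints 0
forward "a b" 'it'\''s'   # prints 2: spaces and quotes stay intact
```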
Although the second and the third lines contain valid no-op Perl code, Perl never interprets these lines because of the -x switch. These lines are also completely ignored by perl my_script.pl because that immediately invokes /bin/sh. However, when the user loads this script with the do Perl built-in, the second and third lines get compiled and interpreted, a harmless no-op code is run, and no syntax error occurs.
There are still deficiencies that remain:
With regard to the line number problem, the do Perl built-in can be used to reread the script with a construct like this:
BEGIN{ if(!$second_run){ $second_run=1; do($0); die $@ if $@; exit } }
BEGIN is required here to prevent Perl from compiling the whole file and possibly complaining about syntax errors with the wrong line numbers. The die $@ if $@ instruction will print runtime error messages correctly. See perlvar(1) for details about $@. Unfortunately the code
BEGIN{ if(!$second_run){ $second_run=1; do($0); die $@ if $@; exit } }
die 42;
yields an extra error message "BEGIN failed--compilation aborted." This error is confusing because die 42 causes a run-time error, not a compile-time error. To get rid of the message, we should eliminate exit somehow, and tell Perl not to continue parsing the input after } }. We'll use the __END__ token to stop parsing early enough.
The locale warning is a multiline message starting with "perl: warning: Setting locale failed." Perl emits this if the locale settings specified in the environment variables LANG, LC_ALL, and LC_* are incorrect. See perllocale(1) for details. The real fix for this warning is installing and specifying locale correctly. However, most Perl scripts don't use locale anyway, so a broken locale doesn't do any harm to them.
Although Perl is a good diagnostics tool for locale problems, most of the time we don't want such warning messages, especially not in CGI (these warnings would fill the web server's log file), or some system daemon processes, when the program is prohibited from writing to stderr on normal operation. The system administrator should really fix locale settings, but that can take time. Most users don't have time to wait weeks to run a single Perl script that doesn't depend on locale anyway.
The perllocale(1) man page says that PERL_BADLANG should be set to a true value to get rid of locale warnings. Actually, PERL_BADLANG must be set to a nonempty, nonnumeric string (for example, PERL_BADLANG=1 doesn't work). So we'll set it to PERL_BADLANG=x in the shell script section. Note that this has no effect if Perl is invoked before the shell. For example, perl, perl -x, perl -S, and perl -x -S all emit the warning long before the shell has a chance to change PERL_BADLANG.
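In the shell section this amounts to an assignment followed by an export. The sketch below is only an illustration of that step, not the header itself: it uses the two-statement form because older Bourne shells do not accept export NAME=value, and it uses a nested sh -c as a stand-in for the perl process that would inherit the variable.

```shell
#!/bin/sh
# Portable form: assign first, then export in a separate statement.
PERL_BADLANG=x
export PERL_BADLANG

# Any child process started from here inherits the setting. A nested sh
# stands in for perl in this sketch.
sh -c 'echo "PERL_BADLANG=$PERL_BADLANG"'
```

Because the variable must already be in the environment when perl starts, setting it from inside the Perl code would come too late.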
Combining it all together, we have the final version of the Magic Perl Header; see Example 1. The file should have the executable attribute on UNIX systems.
This header is valid in multiple languages, so its meaning depends on the interpreter. Fortunately, the final effect of the header in all interpreters is that perl gets invoked running the real Perl code after the header. Let's see how the header achieves this:
#! /bin/sh
true && eval '...; exec perl -T -x -S "$0" ${1+"$@"}' # comment
garbage
So they run perl -x.
#! /bin/sh --
false && eval '...' ; eval '...; exec perl -T -x -S "$0" $argv:q' # comment
garbage
So they run perl -x.
#!perl -w
untaint $0; do $0; die $@ if $@; __END__
garbage
So it runs the current file again with do, ignoring the #! lines. This makes error line numbers come out correctly.
eval 'garbage' if 0;
eval 'garbage' . q+garbage+ if 0;
# real Perl code
So the real Perl code gets executed, even on old UNIX systems, no matter how the user starts the program. The header is suitable for inclusion into CGI scripts. (In non-CGI programs, where extreme security is not important, occurrences of the -T option can be removed.)
All of the following work perfectly, without the locale warning:
DIR/nice.pl # preferred
ash DIR/nice.pl
sh DIR/nice.pl
bash DIR/nice.pl
csh DIR/nice.pl
tcsh DIR/nice.pl
ksh DIR/nice.pl
zsh DIR/nice.pl
The following invocations are fine:
perl -x -S DIR/nice.pl # locale-warning
perl DIR/nice.pl # locale-warning
perl -x DIR/nice.pl # locale-warning
perl -x -S nice.pl # locale-warning; only if on $PATH, recommended on Win32
perl nice.pl # locale-warning; only from curdir
perl -x nice.pl # locale-warning; only from curdir
nice.pl # only if on $PATH (or $PATH contains '.')
The following don't work, because buggy Perl 5.004 tries to run /bin/sh -S nice.pl:
perl -S nice.pl # doesn't work
perl -S DIR/nice.pl # doesn't work
Of course, there is a noticeable performance penalty: /bin/sh is started each time the script is invoked. This cannot be completely avoided because PERL_BADLANG has to be set before perl gets invoked. After the shell has finished running, one line of helper Perl code is parsed (after #!perl), and the do causes five lines of helper code to be parsed. The time and memory spent on these six lines is negligible. So the only action that slows script startup is the shell. If the user sets and exports PERL_BADLANG=x, fast startup is possible by calling:
perl -x -S nice.pl
perl -x DIR/nice.pl
In a Makefile, you should write:
export PERL_BADLANG=x
goal:
	perl -x DIR/nice.pl
The command-line options -n and -p would fail with this header. This is not a serious problem because -n can be implemented as wrapping the code inside while (<>) { ... }, and -p can be changed to the wrapping while (<>) { ... } continue { print }.
I've implemented a Header Wizard that automatically adds the Magic Perl Header to existing Perl scripts. The Header Wizard is available from http://www.inf.bme.hu/~pts/Magic.Perl.Header/magicc.pl.zip. [For convenience, we have also posted this at http://www.tpj.com/source/, though downloading from the author's site guarantees that you get the most recent version. -Ed.]
The easy recipe for the universally executable Perl script:
1. Write your Perl script as usual. You may call exit() and die() as you like.
2. Specify the #! ... perl line as usual. You may put any number of options, but the -T option must either be missing or specified alone (separated with spaces). Example: #! /dummy/perl -wi.bak -T. See perlrun(1) and perlsec(1) for more information about the -T option.
3. Run magicc.pl (the Header Wizard), which will prepend an eight-line magic header containing the right options to the script, and it will make the script file executable (with chmod +x ...). (The -T option will be moved after both exec perls, and other options will be moved after #!perl because Perl looks for switches only there.)
4. Run your script with a slash, but without sh or perl on the command line. For example: ./my_script.pl arg0 arg1 arg2. After you have moved the script into $PATH, run it just as my_script.pl arg0 arg1 arg2. (This avoids the locale warnings and makes options take effect.) Should these invocations fail on a UNIX system for whatever reason, please feel free to e-mail me. As a quick fix, run the script with perl -x -S ./my_script.pl arg0 arg1 arg2.
5. Note that on Win32 systems, perl -x -S is the only way to run the script. You may write a separate .bat file that does this.
6. Tell your users that they should run the script the way described in Step 4. There is a high chance that it will work even for those who don't follow the documentation.
For such a widely implemented language, Perl can be surprisingly hard to invoke reliably on a variety of platforms. I hope this Header Wizard helps you to write Perl scripts that will start with a minimum of fuss on just about any system.