Article jul2006.tar

Practicing Best Perl

Randal L. Schwartz

Roughly a year ago, my friend Damian Conway published a hefty tome called Perl Best Practices. In it, he managed to gather 256 strongly suggested ideas and behaviors that had made his Perl hacking more successful for him and his customers over the years. As a reviewer on the book, I was happy enough with what I had seen to provide a quote that was eventually selected for the back cover:

    As a manager of a large Perl project, I'd ensure that every member of my team has a copy of Perl Best Practices on their desk, and use it as the basis for an in-house guide.

A year later, looking back, I'm still happy with what I've seen, including how some of my clients have taken my advice to heart. While I don't intend for this column to be a book review, I do want to provide some context for the rest of what I have to say this time around.

I've been writing computer programs for more than 35 years, including 25 years of doing it and getting paid for it. One of the hardest things to convey in little snippets of code and random Perlmonks postings is the larger picture of "don't do this because I got burned doing that a long time ago". Apparently, the young 'uns these days just want to get something hacked out, or figure that their problem is just completely unique and some advice I may be able to dish out in a one-liner can't possibly apply to them.

Or they think they know better. That's fine. We need the enthusiasm of the unscarred youth to explore new and better spaces. But time after time, many of them come to realize that maybe the old greybeards actually had some sane thing to say about their task.

For example, a frequent request comes along on how to have a variable name contain all or part of another variable name. In Perl, we can certainly accommodate access to the package variables using symbolic references and (with some difficulty) the lexicals with a well-formed eval-string operation.

But the caveat I include (with either my own posting or as a footnote to someone else's unqualified answer) is don't do that. For many people asking the question, it's often a puzzling response, because they see me giving, and yet taking away, in the same answer. My fear, of course, is that they listen to the "how" and completely ignore the "why not" and run off to write code that will be unmaintainable and might possibly expose some security holes.

But this is the difference between knowing how to code in Perl and knowing the best way to code in Perl. I know from my years of practice that code that blurs data and variable names will be hard to maintain and prone to problems. But I have to convey that in a way that seems more about intuition than by reeling off all those moments in the past that give the basis of my conclusions.

Naturally, the "Yeah, but there's more than one way to do it" war chant is often returned, but I think that's misunderstanding what Larry Wall means when he says that. Larry wants Perl to have the power of expression to suit the coder and situation, including perhaps having multiple ways to say the same thing to emphasize various aspects. He doesn't intend the phrase to imply "... and all ways are equally valid and suitable for every occasion".

This is where Damian Conway's book comes in to play. Damian has helped sort out the things that most Perl experts agree are more likely to produce better code faster and easier, narrowing down the many ways to do things into the ways people seem to get more things done. And although some of the things might be considered arbitrary, or perhaps even controversial, Damian makes strong arguments for each item. So, even if you disagree, you can say, "Hey, he's got a good point here."

To illustrate my point, let's look at a few of Damian's "Best Practices", albeit illustrated with my own examples when I think of them.

For example, in Chapter 2, we see "Never place two statements on the same line". Sure, it sounds simple. But there are some important implications of this advice.

To begin with, a statement in Perl is a logical step: the kind of thing that you'd want to add, remove, cut, or paste. If you have two statements on a line, it's harder to edit your program to have more steps.

But more importantly perhaps, the Perl debugger can place a breakpoint only on a line-by-line basis. So, although the second statement might be a logical stopping point during single-stepping or code evaluation, by putting the statement mid-line, we no longer have that option. While Perl normally doesn't care about increased or decreased whitespace, we see an important semantic change here by not following this (now hopefully motivated) advice.

When I first read that advice, it sat with me like "well, of course". But that's because I had already been burned by not being able to set a breakpoint on a mid-line statement, so I carry the scar, vowing never to get burned that way again. That's what makes a book like this have a great deal of value, giving others the chance to learn from old scars.

The very next advice, "Code in Paragraphs", is also something I do quite naturally and frequently, which you know if you've been reading my past columns and books. I like to use whitespace to create "paragraphs" of statements (considering the statement as a "sentence"). For example, in a subroutine call, I place an extra blank line after any code that sorts out the initial processing of @_:

sub marine { 
  my $wave = shift; 
  my $direction = shift; 


  ... more processing here ... 

} 
            
The extra blank line gives some "breathing room" to the eye, as well as suggesting that I'm "changing gears" a bit in the next section. The blank line costs only a single \n character, and yet I'm saving a bit of time for everyone reading the program. In addition to adding these blank lines every dozen or fewer code lines, I generally add a topic comment in front of the following chunk:

## compute the value 
... code here 
... to do 
... the computation 


## copy the data to the cache 
... more 
... code 


## update the cache freshness 
... code here 


## return the value 
return $the_value; 
Each comment begins with a double-hash ## so that my eye can immediately jump to it, and the comment describes the actions taken by the next few lines of code. I rarely write more than one line in these comments; consider them a "headline".

Again, it's a little thing, but it's amazing how much more readable the code is when you do these "little things" consistently.

Chapter 4 gives the advice "Use named constants, but don't use constant". I found that rather shocking, and initially (mockingly) offensive because the core module constant had been written by my fellow Stonehenge employee, Tom Phoenix. However, Damian goes on to describe the much more powerful and useful Readonly module (found in the CPAN), of which I had previously been unaware. Compare the following with use constant:

use constant PI => 3.2; 
print "In Indiana, Pi might have been @{[PI]}\n"; 
versus the equivalent with Readonly:

use Readonly; 
Readonly my $PI = 3.2; 
print "In Indiana, Pi might have been $PI\n"; 
Yes, the Readonly interface creates actual scalars (rather than subroutines as with use constant), which can be much more easily interpolated into strings, used as bareword keys, or even work nicely as readonly arrays and hashes.

So, even a beardless Perl "greybeard" like me can learn a new trick from a book like Perl Best Practices, and that's pretty cool. So, I suggest you go out immediately and add this book to your shelf (real or virtual), and until next time, enjoy!

Randal L. Schwartz is a two-decade veteran of the software industry -- skilled in software design, system administration, security, technical writing, and training. He has coauthored the "must-have" standards: Programming Perl, Learning Perl, Learning Perl for Win32 Systems, and Effective Perl Programming. He's also a frequent contributor to the Perl newsgroups, and has moderated comp.lang.perl.announce since its inception. Since 1985, Randal has owned and operated Stonehenge Consulting Services, Inc.