Article mar2006.tar

Inside-Out Objects

Randal L. Schwartz

In my previous article, "Generating Object Accessors", which appeared in the January 2006 issue, I created a traditional hash-based Perl object: a Rectangle with two attributes (width and height) using the constructor and accessors like so:

package Rectangle;
sub new {
  my $class = shift;
  my %args = @_;
  my $self = {
    width => $args{width} || 0,
    height => $args{height} || 0,
  };
  return bless $self, $class;
}
sub width {
  my $self = shift;
  return $self->{width};
}
sub set_width {
  my $self = shift;
  $self->{width} = shift;
}
sub height {
  my $self = shift;
  return $self->{height};
}
sub set_height {
  my $self = shift;
  $self->{height} = shift;
}
I can construct a 3-by-4 rectangle easily:

my $r = Rectangle->new(width => 3, height => 4);
At this point, $r is an object of type Rectangle, but it's also simply a hashref. For example, the code in set_width merely dereferences a value like $r to gain access to the hash element with a key of width. But does Perl require such code to be located within the Rectangle package? No. As a user of the Rectangle class, I could easily say:

$r->{width} = 5;
and update the width from 3 to 5. This is "peering inside the box" and will lead to fragile code, because we've now exposed the implementation of the object, not just the interface.

For example, suppose we modify the set_width method to ensure that the width is never negative:

use Carp qw(croak);
sub set_width {
  my $self = shift;
  my $width = shift;
  croak "$self: width cannot be negative: $width"
    if $width < 0;
  $self->{width} = $width;
}
If the $width is less than 0, we croak, triggering a fatal exception, but blaming the caller of this method. (We don't blame ourselves, and croak is a great way to pass the blame.)

At this point, we'll trap erroneous settings:

$r->set_width(-3); # will die
But if someone has broken the box open, we get no fault:

$r->{width} = -3; # no death
This is bad: once the data implementation has been exposed, the author of the Rectangle class no longer controls the behavior of its objects.
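
To make the difference concrete, here is a minimal, self-contained sketch that runs both paths side by side. The trimmed-down class and the variable names ($trapped, $after_setter, $after_poke) are mine, chosen only for illustration:

```perl
use strict;
use warnings;

{
    package Rectangle;
    use Carp qw(croak);

    sub new {
        my ($class, %args) = @_;
        my $self = {
            width  => $args{width}  || 0,
            height => $args{height} || 0,
        };
        return bless $self, $class;
    }

    sub width { return $_[0]{width} }

    sub set_width {
        my ($self, $width) = @_;
        croak "width cannot be negative: $width" if $width < 0;
        $self->{width} = $width;
    }
}

my $r = Rectangle->new(width => 3, height => 4);

# Through the interface: the check fires and the old value survives.
my $trapped      = eval { $r->set_width(-3); 1 } ? 0 : 1;
my $after_setter = $r->width;

# Around the interface: nothing stops the bad value.
$r->{width} = -3;
my $after_poke = $r->width;

print "setter trapped: $trapped, width after setter: $after_setter\n";
print "width after direct poke: $after_poke\n";
```

The eval block is there only so the script can report the exception instead of dying on it.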

Besides exposing the implementation, another problem is that I have to be careful of typos. Suppose in rewriting the set_width method, I accidentally transposed the last two letters of the hash key:

  $self->{widht} = $width;
This is perfectly legal Perl and would not throw any compile-time or run-time errors. Even use strict isn't helping here, because I'm not misspelling a variable name, just naming a "new" hash key. Without good unit tests and integration tests, I might not even catch this error. Yes, there are some solutions to ensure that a hash's keys come from only a particular set of permitted keys, but these generally slow down the hash access significantly.
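
Here is a small sketch of both halves of that paragraph: the typo sailing through on a plain hash, and one way of restricting a hash to a fixed set of keys. I'm using lock_keys from the core Hash::Util module as the example of such a mechanism; that choice is my assumption for illustration, and other approaches (tied hashes, for instance) exist:

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys);

# A plain hash accepts the misspelled key without complaint.
my $plain = { width => 3, height => 4 };
$plain->{widht} = 5;    # typo: silently creates a third key
my @plain_keys = sort keys %$plain;

# A restricted hash refuses keys that were not present when it was locked.
my $locked = { width => 3, height => 4 };
lock_keys(%$locked);
my $ok    = eval { $locked->{widht} = 5; 1 };
my $error = $@;

print "plain keys: @plain_keys\n";
print "locked hash rejected typo: ", ($ok ? "no" : "yes"), "\n";
```

Note that the restricted hash catches the typo only at run time, when the bad line actually executes.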

We can solve both of these problems at once, without significantly impacting the performance of our programs, by using what's come to be known as an inside-out object. First popularized by Damian Conway in the neoclassic Object-Oriented Perl book, an inside-out object creates a series of parallel hashes for the attributes (much like we had to do back in the Perl 4 days, before we had hashrefs). For example, instead of creating a single object for a rectangle that is 3 by 4:

my $r = { width => 3, height => 4 };
we can record its attributes in two separate hashes, keyed by some unique string:

my $r = "some unique string";
$width{$r} = 3;
$height{$r} = 4;
Now, to get the width of the rectangle, we use the unique string:

my $width = $width{$r};
and to update the height, we use that same string:

$height{$r} = 10;
When we turn on use strict and declare the %width and %height attribute hashes, this will trap any typos related to attribute names:

use strict;
my %width;
my %height;
...
my $r = "another unique string";
$height{$r} = 7; # ok
$widht{$r} = 3; # won't compile!
The typo on the width is now caught, because we don't have a %widht hash. Hooray. That solves the second problem. But how do we solve the first problem, and where do we get this "unique string", and how do we get methods on our object?
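
If you'd like to watch the compiler reject the typo without killing your own script, one way is to hand the snippet to a string eval and inspect $@. This is just a sketch; the $snippet, $compiled, and $error names are mine:

```perl
use strict;
use warnings;

# Compile the snippet with a string eval so the failure can be inspected
# instead of aborting this script.
my $snippet = <<'END_CODE';
use strict;
my %width;
my %height;
my $r = "another unique string";
$height{$r} = 7;
$widht{$r}  = 3;   # typo
1;
END_CODE

my $compiled = eval $snippet;
my $error    = $@;

print "compiled: ", ($compiled ? "yes" : "no"), "\n";
print "error: $error";
```

Because the failure happens at compile time, none of the snippet runs, not even the correct assignment to %height.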

If I assign a blessed anonymous empty hash to $r:

my $r = bless {}, "Rectangle";
then when the value of $r is used as a string, I get a nice unique string:

Rectangle=HASH(0x400180FE)
where the number comes from the hex representation of the internal memory address of the object. As long as this reference is alive, that memory address will not be reused. Aha, there's our unique string:
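
A quick sketch to check that claim: two live objects stringify differently, and the hex part matches what refaddr from the core Scalar::Util module reports. The use of refaddr here is my addition for illustration, not something the technique requires:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $r1 = bless {}, "Rectangle";
my $r2 = bless {}, "Rectangle";

# Using a reference as a string yields CLASS=TYPE(0xADDRESS).
my $s1 = "$r1";
my $s2 = "$r2";

print "$s1\n$s2\n";
printf "address part of first: 0x%x\n", refaddr($r1);
```

The actual addresses will differ from run to run, so don't expect to see the same number twice.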

sub new_7_by_3 {
  my $self = bless {}, shift;
  $height{$self} = 7;
  $width{$self} = 3;
  return $self;
}
And this is what our constructor does! By blessing the object, we'll return to the same package for methods. By having an anonymous hashref, we're guaranteed a unique number. And as long as the lexical %height and %width hashes are in scope, we can access and update the attributes.

But what are we returning? Sure, it's a hashref, but it's empty. There's no code that we can use to get from $r to the attribute hashes:

my $r = Rectangle->new_7_by_3;
The only way we can get the height is to have code in the same scope as the definitions of the attribute hashes:

sub height {
  my $self = shift;
  return $height{$self};
}
And then we can use that code in our main program:

my $height = $r->height;
The first parameter is $r, which gets used only for its unique string value, as a key into the lexical %height hash! It all Just Works.
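
Here's a self-contained sketch of that arrangement, with the attribute hashes confined to a bare block so that only the methods can see them. The bare block and the checking variables at the bottom are my own scaffolding:

```perl
use strict;
use warnings;

{
    package Rectangle;

    # Lexicals: visible only to code inside this block.
    my %width;
    my %height;

    sub new_7_by_3 {
        my $self = bless {}, shift;
        $height{$self} = 7;
        $width{$self}  = 3;
        return $self;
    }

    sub height {
        my $self = shift;
        return $height{$self};
    }
}

my $r = Rectangle->new_7_by_3;

my $via_method = $r->height;    # goes through the class
my $key_count  = keys %$r;      # the object itself holds nothing
$r->{height}   = -1;            # scribbling on the empty hash...
my $after_poke = $r->height;    # ...does not reach the real attribute

print "via method: $via_method, keys in object: $key_count, after poke: $after_poke\n";
```

Outside code can still put things into the empty hash, but the class never looks there, so the real height stays put.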

Well, for some meaning of Works. We still have a couple of things to fix. First, there's really no reason to make an anonymous hash, because we never put anything into it, so we might as well make it a scalar:

my $self = bless \(my $dummy), shift;
Because Perl doesn't have a primitive anonymous scalar constructor, I'm cheating by making a $dummy variable.
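
As a sketch, here is the $dummy form next to an equivalent idiom you may see in other people's code, a do-block that yields a fresh scalar. The new_anon constructor is a hypothetical name I've added just to show the two side by side:

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

{
    package Rectangle;

    # The form used in this article: a throwaway lexical supplies the scalar.
    sub new {
        my $class = shift;
        return bless \(my $dummy), $class;
    }

    # Hypothetical second constructor: a do-block yields a fresh scalar
    # on every call.
    sub new_anon {
        my $class = shift;
        return bless \do { my $anon }, $class;
    }
}

my $r1 = Rectangle->new;
my $r2 = Rectangle->new;
my $r3 = Rectangle->new_anon;

print "$_\n" for $r1, $r2, $r3;
print "underlying type: ", reftype($r1), "\n";
```

Either way, each call produces a distinct scalar, which is all we need for a unique key.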

Second, we've got some tidying up to do. When a value is no longer being referenced by any variable, we say it goes out of scope. When a traditional hashref-based object goes out of scope, any elements of the hash are also discarded, usually causing the values to also go out of scope (unless they are also referenced by some other live value). This all happens quite automatically and efficiently.

However, when our inside-out object goes out of scope, it doesn't "contain" anything. Its address-as-a-string is still being used as a key in one or more attribute hashes, though, and we need to get rid of those entries to mimic the traditional object mechanism. So, we'll need to add a DESTROY method:

sub DESTROY {
  my $dead_body = $_[0];
  delete $height{$dead_body};
  delete $width{$dead_body};
  my $super = $dead_body->can("SUPER::DESTROY");
  goto &$super if $super;
}
Note that after deleting our attributes, we also call any superclass destructor, so that it has a chance to clean up, too.
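
To convince yourself that the cleanup really happens, here's a sketch that counts the entries in %width before, during, and after an object's lifetime. The live_count method is a hypothetical helper added only for this demonstration, and I've left out the superclass call because this sketch has no superclass:

```perl
use strict;
use warnings;

{
    package Rectangle;
    my %width;
    my %height;

    sub new {
        my ($class, %args) = @_;
        my $self = bless \(my $dummy), $class;
        $width{$self}  = $args{width}  || 0;
        $height{$self} = $args{height} || 0;
        return $self;
    }

    sub DESTROY {
        my $dead_body = $_[0];
        delete $height{$dead_body};
        delete $width{$dead_body};
    }

    # Hypothetical helper, for this demonstration only.
    sub live_count { return scalar keys %width }
}

my $before = Rectangle->live_count;
my $during;
{
    my $r = Rectangle->new(width => 3, height => 4);
    $during = Rectangle->live_count;
}   # the last reference disappears here, so DESTROY runs
my $after = Rectangle->live_count;

print "before: $before, during: $during, after: $after\n";
```

Without the DESTROY method, the last count would stay at 1, and a long-running program would slowly leak attribute entries.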

Let's put it all together:

package Rectangle;
my %width;
my %height;
sub new {
  my $class = shift;
  my %args = @_;
  my $self = bless \(my $dummy), $class;
  $width{$self} = $args{width} || 0;
  $height{$self} = $args{height} || 0;
  return $self;
}
sub DESTROY {
  my $dead_body = $_[0];
  delete $height{$dead_body};
  delete $width{$dead_body};
  my $super = $dead_body->can("SUPER::DESTROY");
  goto &$super if $super;
}
sub width {
  my $self = shift;
  return $width{$self};
}
sub set_width {
  my $self = shift;
  $width{$self} = shift;
}
sub height {
  my $self = shift;
  return $height{$self};
}
sub set_height {
  my $self = shift;
  $height{$self} = shift;
}
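
To tie this back to the first problem, here is a sketch of a cut-down version of the class (width only) with the non-negative check from earlier folded into set_width, followed by two attempts to go around it. The checking variables are mine, and this is one possible arrangement rather than the only one:

```perl
use strict;
use warnings;

{
    package Rectangle;
    use Carp qw(croak);
    my %width;

    sub new {
        my ($class, %args) = @_;
        my $self = bless \(my $dummy), $class;
        $width{$self} = $args{width} || 0;
        return $self;
    }

    sub DESTROY { delete $width{ $_[0] } }

    sub width { return $width{ $_[0] } }

    sub set_width {
        my ($self, $width) = @_;
        croak "width cannot be negative: $width" if $width < 0;
        $width{$self} = $width;
    }
}

my $r = Rectangle->new(width => 3);

# The setter still guards the attribute.
my $setter_trapped = eval { $r->set_width(-3); 1 } ? 0 : 1;

# The old trick no longer works: the object is not a hashref.
my $poke_ok    = eval { $r->{width} = -3; 1 };
my $poke_error = $@;

# Writing to the referenced scalar is legal, but the class never reads it.
$$r = -3;
my $final_width = $r->width;

print "setter trapped: $setter_trapped, final width: $final_width\n";
```

The only route to the width is now through the methods, so the check in set_width can't be bypassed from outside the package.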
Not bad! That's only slightly more complex than a traditional hashref implementation and a lot safer for the "outside". Of course, this is a lot of code to get right, so the best thing is to let someone else do the hard work. See Class::Std and Object::InsideOut for some budding frameworks to build these objects. Until next time, enjoy!

Randal L. Schwartz is a two-decade veteran of the software industry -- skilled in software design, system administration, security, technical writing, and training. He has coauthored the "must-have" standards: Programming Perl, Learning Perl, Learning Perl for Win32 Systems, and Effective Perl Programming. He's also a frequent contributor to the Perl newsgroups, and has moderated comp.lang.perl.announce since its inception. Since 1985, Randal has owned and operated Stonehenge Consulting Services, Inc.