Inside-Out Objects
Randal L. Schwartz
In my previous article, "Generating Object Accessors", which appeared
in the January 2006 issue, I created a traditional hash-based Perl
object: a Rectangle with two attributes (width and height), with
a constructor and accessors like so:
package Rectangle;
sub new {
    my $class = shift;
    my %args = @_;
    my $self = {
        width => $args{width} || 0,
        height => $args{height} || 0,
    };
    return bless $self, $class;
}
sub width {
    my $self = shift;
    return $self->{width};
}
sub set_width {
    my $self = shift;
    $self->{width} = shift;
}
sub height {
    my $self = shift;
    return $self->{height};
}
sub set_height {
    my $self = shift;
    $self->{height} = shift;
}
I can construct a 3-by-4 rectangle easily:
my $r = Rectangle->new(width => 3, height => 4);
At this point, $r is an object of type Rectangle, but it's also simply
a hashref. For example, the code in set_width merely dereferences
a value like $r to gain access to the hash element with a key of width.
But does Perl require such code to be located within the Rectangle
package? No. As a user of the Rectangle class, I could easily
say:
$r->{width} = 5;
and update the width from 3 to 5. This is "peering inside the box"
and will lead to fragile code, because we've now exposed the implementation
of the object, not just the interface.
For example, suppose we modify the set_width method to
ensure that the width is never negative:
use Carp qw(croak);
sub set_width {
    my $self = shift;
    my $width = shift;
    croak "$self: width cannot be negative: $width"
        if $width < 0;
    $self->{width} = $width;
}
If the $width is less than 0, we croak, triggering a fatal
exception, but blaming the caller of this method. (We don't blame
ourselves, and croak is a great way to pass the blame.)
At this point, we'll trap erroneous settings:
$r->set_width(-3); # will die
But if someone has broken the box open, we get no fault:
$r->{width} = -3; # no death
This is bad: because the data implementation has been exposed, the
author of the Rectangle class no longer controls the behavior of its
objects.
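The difference between the two paths can be checked in one small program. This is only a sketch: it trims the class down to the width attribute, and the variable names $trapped and $width_after are made up for the demonstration.

```perl
use strict;
use warnings;

package Rectangle;
use Carp qw(croak);

sub new {
    my ($class, %args) = @_;
    return bless { width => $args{width} || 0 }, $class;
}

sub width { return $_[0]{width} }

sub set_width {
    my ($self, $width) = @_;
    croak "$self: width cannot be negative: $width" if $width < 0;
    $self->{width} = $width;
}

package main;

my $r = Rectangle->new(width => 3);

# Going through the accessor: croak fires, and eval traps the exception.
my $trapped = !eval { $r->set_width(-3); 1 };

# Going around the accessor: nothing checks the value.
$r->{width} = -3;
my $width_after = $r->width;

print "accessor trapped the bad value: ", ($trapped ? "yes" : "no"), "\n";
print "width after the direct write: $width_after\n";
```

The accessor refuses the negative value, yet the object still ends up holding it, because nothing stands between outside code and the hash.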
Besides exposing the implementation, another problem is that I
have to be careful of typos. Suppose in rewriting the set_width
method, I accidentally transposed the last two letters of the hash
key:
$self->{widht} = $width;
This is perfectly legal Perl and would not throw any compile-time
or run-time errors. Even use strict isn't helping here, because
I'm not misspelling a variable name, just naming a "new" hash key.
Without good unit tests and integration tests, I might not even catch
this error. Yes, there are some solutions to ensure that a hash's
keys come from only a particular set of permitted keys, but these
generally slow down the hash access significantly.
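One such solution, shown here only as a sketch and not necessarily the one meant above, is the restricted-hash support in the core Hash::Util module. Its lock_keys function freezes the set of permitted keys to those already present, so a misspelled key becomes a run-time error. The variable names $ok and $error are assumptions for the demonstration.

```perl
use strict;
use warnings;

package Rectangle;
use Hash::Util qw(lock_keys);

sub new {
    my ($class, %args) = @_;
    my $self = bless {
        width  => $args{width}  || 0,
        height => $args{height} || 0,
    }, $class;
    lock_keys(%$self);    # only "width" and "height" are allowed from now on
    return $self;
}

package main;

my $r = Rectangle->new(width => 3, height => 4);

# The misspelled key is rejected at run time.
my $ok    = eval { $r->{widht} = 5; 1 };
my $error = $ok ? '' : $@;

# A permitted key can still be written, by anyone.
$r->{width} = 5;

print "typo accepted: ", ($ok ? "yes" : "no"), "\n";
print "error: $error";
```

Note that this only addresses the typo problem, and only at run time. Outside code can still write any value it likes into a permitted key.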
We can solve both of these problems at once, without significantly
impacting the performance of our programs, by using what's come to
be known as an inside-out object. First popularized by Damian
Conway in the neoclassic Object-Oriented Perl book, an inside-out
object creates a series of parallel hashes for the attributes (much
like we had to do back in the Perl 4 days before we had hashrefs).
For example, instead of creating a single object for a rectangle
that is 3 by 4:
my $r = { width => 3, height => 4 };
we can record its attributes in two separate hashes, keyed by some
unique string:
my $r = "some unique string";
$width{$r} = 3;
$height{$r} = 4;
Now, to get the width of the rectangle, we use the unique string:
my $width = $width{$r};
and to update the height, we use that same string:
$height{$r} = 10;
When we turn on use strict and declare the %width and
%height attribute hashes, this will trap any typos related
to attribute names:
use strict;
my %width;
my %height;
...
my $r = "another unique string";
$height{$r} = 7; # ok
$widht{$r} = 3; # won't compile!
The typo on the width is now caught, because we don't have a %widht
hash. Hooray. That solves the second problem. But how do we solve
the first problem, and where do we get this "unique string", and how
do we get methods on our object?
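To watch the compile-time failure without aborting a whole program, the snippet can be compiled at run time with a string eval. This is a sketch for experimentation only; the names $snippet, $compiled, and $error are assumptions.

```perl
use strict;
use warnings;

# Keep the code in a string so that its compile error can be inspected
# instead of stopping this program.
my $snippet = <<'END_CODE';
use strict;
my %width;
my %height;
my $r = "another unique string";
$height{$r} = 7;
$widht{$r}  = 3;
1;
END_CODE

my $compiled = eval $snippet;    # undef when compilation fails
my $error    = $@;

print "compiled: ", ($compiled ? "yes" : "no"), "\n";
print "error: $error";
```

Because the failure happens during compilation, none of the snippet runs, not even the correct assignment to %height.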
If I assign a blessed anonymous empty hash to $r:
my $r = bless {}, "Rectangle";
then when the value of $r is used as a string, I get a nice
unique string:
Rectangle=HASH(0x400180FE)
where the number comes from the hex representation of the internal
memory address of the object. As long as this reference is alive,
that memory address will not be reused. Aha, there's our unique string:
sub new_7_by_3 {
    my $self = bless {}, shift;
    $height{$self} = 7;
    $width{$self} = 3;
    return $self;
}
And this is what our constructor does! By blessing the object, we'll
return to the same package for methods. By having an anonymous hashref,
we're guaranteed a unique number. And as long as the lexical %height
and %width hashes are in scope, we can access and update the
attributes.
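The uniqueness claim is easy to check. The sketch below creates two live objects, compares their string forms, and uses refaddr from the core Scalar::Util module to show that the number in the string is the reference's address. The names $key1, $key2, and $rebuilt are assumptions for the demonstration.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $r1 = bless {}, "Rectangle";
my $r2 = bless {}, "Rectangle";

# Used as a string, a blessed reference looks like CLASS=TYPE(0xADDRESS).
my $key1 = "$r1";
my $key2 = "$r2";

# refaddr() returns the same address as a plain number.
my $rebuilt = sprintf "Rectangle=HASH(0x%x)", refaddr($r1);

# Using a reference as a hash key stringifies it automatically.
my %height;
$height{$r1} = 7;
$height{$r2} = 9;

print "$key1\n$key2\n";
print "two live objects, ", scalar(keys %height), " distinct keys\n";
```

The hash key is only a string copy of the address. It does not keep the object alive, and holding the string alone gives no way back to the reference.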
But what are we returning? Sure, it's a hashref, but it's empty.
There's no code that we can use to get from $r to the attribute
hashes:
my $r = Rectangle->new_7_by_3;
The only way we can get the height is to have code in the same scope
as the definitions of the attribute hashes:
sub height {
    my $self = shift;
    return $height{$self};
}
And then we can use that code in our main program:
my $height = $r->height;
The first parameter is $r, which gets used only for its unique
string value, as a key into the lexical %height hash! It all
Just Works.
Well, for some meaning of Works. We still have a couple of things
to fix. First, there's really no reason to make an anonymous hash,
because we never put anything into it, so we might as well make
it a scalar:
my $self = bless \(my $dummy), shift;
Because Perl doesn't have a primitive anonymous scalar constructor,
I'm cheating by making a $dummy variable.
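A blessed scalar reference also closes the box for good, which the following sketch illustrates. The constructor here takes no arguments and stores nothing, purely to isolate the behavior of the reference itself; $distinct, $ok, and $error are names assumed for the demonstration.

```perl
use strict;
use warnings;

package Rectangle;

sub new {
    my $class = shift;
    # Each call creates a fresh $dummy; the returned reference keeps it alive.
    return bless \(my $dummy), $class;
}

package main;

my $r1 = Rectangle->new;
my $r2 = Rectangle->new;

# Two live objects still stringify differently.
my $distinct = ("$r1" ne "$r2");

# There is no hash behind the reference, so hash-style access dies.
my $ok    = eval { my $w = $r1->{width}; 1 };
my $error = $ok ? '' : $@;

print "$r1\n";
print "distinct: ", ($distinct ? "yes" : "no"), "\n";
print "error: $error";
```

With the hashref version, outside code could still stash its own keys in the empty hash. With a scalar reference, any attempt to treat the object as a hash fails at run time.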
Second, we've got some tidying up to do. When a value is no longer
being referenced by any variable, we say it goes out of scope.
When a traditional hashref-based object goes out of scope, any elements
of the hash are also discarded, usually causing the values to also
go out of scope (unless they are also referenced by some other live
value). This all happens quite automatically and efficiently.
However, when our inside-out object goes out of scope, it doesn't
"contain" anything. Its address-as-a-string is still being used as a key
in one or more attribute hashes, and we need to delete those entries
to mimic the traditional object mechanism. So, we'll need to add
a DESTROY method:
sub DESTROY {
    my $dead_body = $_[0];
    delete $height{$dead_body};
    delete $width{$dead_body};
    my $super = $dead_body->can("SUPER::DESTROY");
    goto &$super if $super;
}
Note that after deleting our attributes, we also call any superclass
destructor, so that it has a chance to clean up, too.
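The effect of the destructor can be observed by counting the entries in an attribute hash before, during, and after an object's lifetime. In this sketch, live_count is a hypothetical helper added only for the demonstration, and the superclass call is left out because this stand-alone class has no parent.

```perl
use strict;
use warnings;

package Rectangle;

my %width;
my %height;

sub new {
    my ($class, %args) = @_;
    my $self = bless \(my $dummy), $class;
    $width{$self}  = $args{width}  || 0;
    $height{$self} = $args{height} || 0;
    return $self;
}

sub DESTROY {
    my $dead_body = $_[0];
    delete $height{$dead_body};
    delete $width{$dead_body};
}

# Hypothetical helper, for this demonstration only:
# reports how many objects currently have a width entry.
sub live_count { return scalar keys %width }

package main;

my $before = Rectangle->live_count;
my $during;
{
    my $r = Rectangle->new(width => 3, height => 4);
    $during = Rectangle->live_count;
}    # the last reference disappears here, so DESTROY runs
my $after = Rectangle->live_count;

print "before: $before, during: $during, after: $after\n";
```

Without the DESTROY method, the entries would stay behind in %width and %height after the object was gone, and a later object allocated at the same address could find stale data under its key.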
Let's put it all together:
package Rectangle;
my %width;
my %height;
sub new {
    my $class = shift;
    my %args = @_;
    my $self = bless \(my $dummy), $class;
    $width{$self} = $args{width} || 0;
    $height{$self} = $args{height} || 0;
    return $self;
}
sub DESTROY {
    my $dead_body = $_[0];
    delete $height{$dead_body};
    delete $width{$dead_body};
    my $super = $dead_body->can("SUPER::DESTROY");
    goto &$super if $super;
}
sub width {
    my $self = shift;
    return $width{$self};
}
sub set_width {
    my $self = shift;
    $width{$self} = shift;
}
sub height {
    my $self = shift;
    return $height{$self};
}
sub set_height {
    my $self = shift;
    $height{$self} = shift;
}
Not bad! That's only slightly more complex than a traditional hashref
implementation and a lot safer for the "outside". Of course, this
is a lot of code to get right, so the best thing is to let someone
else do the hard work. See Class::Std and Object::InsideOut
for some budding frameworks to build these objects. Until next time,
enjoy!
Randal L. Schwartz is a two-decade veteran of the software
industry -- skilled in software design, system administration, security,
technical writing, and training. He has coauthored the "must-have"
standards: Programming Perl, Learning Perl, Learning
Perl for Win32 Systems, and Effective Perl Programming.
He's also a frequent contributor to the Perl newsgroups, and has
moderated comp.lang.perl.announce since its inception. Since 1985,
Randal has owned and operated Stonehenge Consulting Services, Inc.