Cooking with Maypole, Part II

The Perl Journal June, 2004

By Simon Cozens

Simon is a freelance programmer and author, whose titles include Beginning Perl (Wrox Press, 2000) and Extending and Embedding Perl (Manning Publications, 2002). He's the creator of over 30 CPAN modules and a former Parrot pumpking. Simon can be reached at simon@simon-cozens.org.

We began our gastronomic adventures with Maypole last time, when we constructed a recipe collection and a way to index and investigate the current state of food stocks in the house. Now we're going to combine the two concepts, and search for recipes that fit what we've got available to eat.

First, though, we'll take a brief look at how Maypole works and what it actually does.

How Maypole Works

In Part I, we saw Maypole primarily in terms of putting an interface onto an existing data structure and applying templates to this. However, to think of it like this is to miss the flexibility and extensibility of Maypole as a web application framework.

Perhaps the best way to think of Maypole is as a tool for mapping a URL onto an "action," where an action is specified as a method call and a template. So the URL "/recipe/view/12" is asking for the view action to be performed on a recipe class, with argument "12." Practically, this means that the view method will be called on the Larder::Recipe class on the object representing the 12th row in the table, and then the view template used to display the results.

This process is carried out by the gradual fleshing out of a "Maypole request object"—analogous to an Apache request object but at a much higher level. As well as containing the means to communicate with the web server (such as an Apache::Request object), it begins with the configuration and some idea of the path requested: /recipe/view/12.

Next, it decomposes the path down to its components:

{ table => "recipe",
  action => "view",
  args => [ 12 ]
}

Then it associates the table with the Larder::Recipe class and calls the view method. Finally, the output from the templating stage gets added to the request, and the request is sent back to the front-end Maypole class (usually Apache::MVC or CGI::Maypole) for eventual output to the browser.
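
To make the decomposition step concrete, here is a minimal sketch in plain Perl. It is not Maypole's own parser: parse_path and the %class_for lookup table are hypothetical names invented for this illustration, and Maypole works out the table-to-class mapping for itself.

```perl
use strict;
use warnings;

# Hypothetical lookup table; Maypole derives its own from the database.
my %class_for = (
    recipe   => "Larder::Recipe",
    contents => "Larder::Contents",
);

# Split a request path into table, action and arguments.
sub parse_path {
    my ($path) = @_;
    my ($table, $action, @args) = grep { length } split m{/}, $path;
    return {
        table  => $table,
        action => $action,
        args   => \@args,
        class  => $class_for{$table},
    };
}

my $req = parse_path("/recipe/view/12");
printf "%s->%s(%s)\n",
    $req->{class}, $req->{action}, join(", ", @{ $req->{args} });
# prints: Larder::Recipe->view(12)
```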

Avoiding Rotten Tomatoes

We'll begin our extension of the Larder application by adding an action to list the contents of our larder that are in imminent danger of going off. To help us do this, we'll write a utility method in the Larder::Contents class called ripe_food, which returns all the objects that need to be eaten. This is pure Class::DBI for the time being.

First, SQL dates are a bit of a pain to do anything sensible with, so we have Class::DBI automatically inflate them to Time::Piece objects:

Larder::Contents->has_a(use_by => 'Time::Piece',
     inflate => sub { Time::Piece->strptime(shift, "%Y-%m-%d") },
     deflate => 'ymd',
       );

Now, every time we call use_by, we get a Time::Piece object. This doesn't really affect our display because Time::Piece stringifies nicely.

We can now look for those Contents objects that have a use_by date under five days away:

use Time::Seconds;
use Time::Piece;

sub ripe_food {
  my $class = shift;
  # Parentheses keep the addition from being taken as an argument to localtime
  my $deadline = localtime() + 5 * ONE_DAY;
  grep { $_->use_by <= $deadline } $class->retrieve_all;
}

(If more efficiency is required, we could do the searching in the SQL by using Class::DBI::AbstractSearch, but my larders aren't big enough to warrant it.)
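
To see the date test in isolation, here is a small sketch that runs the same comparison over plain hashes instead of Class::DBI objects. The ymd and ripe helpers and the stock list are made up for the example, and "now" is passed in as a fixed date so that the result is repeatable.

```perl
use strict;
use warnings;
use Time::Piece;
use Time::Seconds;

# Parse an SQL-style date the same way the inflate handler does.
sub ymd { Time::Piece->strptime(shift, "%Y-%m-%d") }

# The same test as ripe_food, on plain hashes, with "now" supplied.
sub ripe {
    my ($now, @items) = @_;
    my $deadline = $now + 5 * ONE_DAY;
    return grep { $_->{use_by} <= $deadline } @items;
}

# Invented stock list for illustration.
my @stock = (
    { name => "Cheese", use_by => ymd("2004-06-01") },
    { name => "Ham",    use_by => ymd("2004-06-03") },
    { name => "Rice",   use_by => ymd("2004-12-25") },
);

print join(", ", map { $_->{name} } ripe(ymd("2004-06-01"), @stock)), "\n";
# prints: Cheese, Ham
```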

To turn this method into an action that can be called from the Web, we need to create an "Exported" method that places these objects into the Maypole request object:

sub must_eat :Exported {
  my ($self, $r) = @_;
  $r->{objects} = [ $self->ripe_food ];
}

We create a template in contents/must_eat, and we can now view these items from the URL /contents/must_eat. In order to suggest recipes that use these up, we need to link ingredients to recipes.

Categories and Ingredients

Our data loader in Part I looked like this:

use Larder;
use XML::Simple;
use File::Slurp;
for my $recipe (<xml/*>) {
  my $xml = read_file($recipe);
  my $structure = XMLin($xml, ForceArray => 1)->{recipe}->[0];
  my $name = $structure->{head}->[0]->{title}->[0];
  my @ingredients = @{$structure->{ingredients}[0]{ing}};
  my @categories = @{$structure->{head}[0]{categories}[0]{cat}};
  Larder::Recipe->find_or_create({
    name => $name,
    xml => $xml
  });
}

Now we're going to extend this, and our Larder class, to support the linkages between recipes and categories, and recipes and their ingredients. Class::DBI makes it easy for us to do this: We tell it the name of the accessor we want, the name of the mapping class, and the name of the accessor in that class that returns what we want.

In our case, we first have to tell Class::DBI about the relationship between the ingredient table and the food table:

Larder::Ingredient->has_a(food => "Larder::Food");

and vice versa:

Larder::Food->has_many(ingredients => "Larder::Ingredient");

So, in our case, we want the Larder::Recipe class to have an accessor called ingredients, which uses Larder::Ingredient to get a list of ingredients for a recipe and calls food on each one to return a Larder::Food object. The code for that looks like this:

Larder::Recipe->has_many(ingredients => [ Larder::Ingredient => "food" ]);

Similarly, we can get from a Larder::Food object to the recipes that contain it:

Larder::Ingredient->has_a(recipe => "Larder::Recipe");
Larder::Food->has_many(recipes => [ Larder::Ingredient => "recipe" ]);

And all the same for categories. Of course, we could do this a little more easily by using my module Class::DBI::Loader::Relationship. This uses Class::DBI::Loader (which Maypole also uses; this is no coincidence) to express the relationships between classes in more natural terms. It's not as powerful, but it is easier to remember:

Larder->config->{loader}->relationship(
  "A recipe has categories via categorizations"
);
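
If the mapped has_many relationships seem opaque, it may help to see the same traversal done by hand. The sketch below uses plain hashes rather than Class::DBI; the link rows, the foods_for and recipes_for helpers, and the "Garlic Bread" recipe are all invented for illustration. Each direction filters the link rows and then follows the other column, which is roughly what the mapping asks Class::DBI to do for us.

```perl
use strict;
use warnings;

# One row per (recipe, food) pair, like the ingredient table.
my @ingredient = (
    { recipe => "Aioli",        food => "Garlic" },
    { recipe => "Aioli",        food => "Mayonnaise" },
    { recipe => "Garlic Bread", food => "Garlic" },
);

# Recipe -> foods: select the link rows, then take "food" from each.
sub foods_for {
    my ($recipe) = @_;
    return map { $_->{food} } grep { $_->{recipe} eq $recipe } @ingredient;
}

# Food -> recipes: the same walk in the other direction.
sub recipes_for {
    my ($food) = @_;
    return map { $_->{recipe} } grep { $_->{food} eq $food } @ingredient;
}

print join(", ", foods_for("Aioli")), "\n";     # Garlic, Mayonnaise
print join(", ", recipes_for("Garlic")), "\n";  # Aioli, Garlic Bread
```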

With our relationships set up, we can tell the loader to associate a recipe with its categories and ingredients:

for (@categories) {
  my $cat = Larder::Category->find_or_create({ name => $_ });
  $recipe->add_to_categories({ category => $cat });
}

for (@ingredients) {
  my ($ing, $amt) = ($_->{item}[0], $_->{amt}[0]);
  my $ingredient = Larder::Food->find_or_create({ name => $ing });
  my $quantity = $amt->{qty}[0] . " " . $amt->{unit}[0];
  $recipe->add_to_ingredients({ food => $ingredient,
                                quantity => $quantity });
}
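
All those [0] subscripts come from ForceArray, which wraps every element in an array reference even when there is only one. As a sketch, here is the shape we are assuming a single ingredient entry takes, inferred from the accessors in the loader rather than from the RecipeML specification; the describe helper and the sample values are hypothetical.

```perl
use strict;
use warnings;

# Assumed shape of one <ing> entry after XMLin(..., ForceArray => 1).
my $ing = {
    item => [ "Garlic" ],
    amt  => [ { qty => [ "2" ], unit => [ "cloves" ] } ],
};

# Pull out the food name and a "quantity unit" string, as the loader does.
sub describe {
    my ($entry) = @_;
    my $amt = $entry->{amt}[0];
    my $quantity = $amt->{qty}[0] . " " . $amt->{unit}[0];
    return ($entry->{item}[0], $quantity);
}

my ($food, $quantity) = describe($ing);
print "$food: $quantity\n";   # prints: Garlic: 2 cloves
```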

Now we can create a template that suggests some recipes to go with our moribund ingredients:

<h1> You need to use up some food! </h1>

[% FOR content = contents %]
  <h2> [% content.food.name %] </h2>

  <P> Needs to be used up by [% content.use_by %] </P>

  <P>
  Some recipes you could make with this:
  </P>
  <UL>
     [% FOR recipe = content.food.recipes %] 
    <LI> <A HREF="/recipe/view/[% recipe.id %]"> [% recipe %]</A>
    [% END %]
  </UL>
[% END %]

Of course, these aren't necessarily the best recipes for the job; ideally, we're going to find the recipes that use up as many of the ingredients as possible. We can't do this with a plain database search. My initial plan was to gather up all the potential recipes, then score them based on the ingredients that they use and the immediacy of getting rid of the ingredient.

But then an even more interesting technology came along.

Plucene

Plucene is a Perl port of the Java Lucene search engine, a Jakarta project (see http://jakarta.apache.org/lucene/ for more information). Rather than a standalone search tool, it is a library with which you can construct your own indexing and searching tools. The easiest interface to it is through the Perl module Plucene::Simple, which we'll use for indexing the recipes.

Plucene works in terms of "documents," which are a little like pages in a book. When you're building an index to a book, the index will relate a word or phrase from the page (the index term) to a page number. It doesn't directly relate the word to the entire contents of the page, or the index would be many times longer than the book itself! Instead, the reader is responsible for turning the page number into the original page contents. If we're indexing a book with Plucene, we might create documents like this:

@documents = (
1 => {
  chapter_title => "Preface",
  text => "We have many emotions as we ..."
},
3 => {
  chapter_title => "Ethos",
  text => "We have made fundamental assumptions  ..."
},
...
);

We create a Plucene::Simple object that represents the index:

use Plucene::Simple;
my $index = Plucene::Simple->open("/home/simon/ow_book/index");

And we can add the documents:

$index->add(@documents);
$index->optimize;

The optimization step defragments the index once we've finished adding a lot of data at once. To run a search, we open the index again and call its search method:

use Plucene::Simple;
my $index = Plucene::Simple->open("/home/simon/ow_book/index");
my @results = $index->search("we");

Now a search for "we" would return "1" and "3," and we could narrow down our search by looking for "we chapter_title:Ethos," which would return only page 3. It's assumed that we have an easy way of turning the ID, "3," into the full text of the book's page.
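
The idea of an index that hands back IDs rather than page text can be sketched in a few lines of plain Perl. This toy is not how Plucene is implemented: the %index layout and the lookup function are invented for illustration, and it simply requires every term to match, with no scoring at all.

```perl
use strict;
use warnings;

my %documents = (
    1 => { chapter_title => "Preface", text => "We have many emotions" },
    3 => { chapter_title => "Ethos",
           text => "We have made fundamental assumptions" },
);

# Map each term to the set of page IDs it appears on; no page text is kept.
my %index;
for my $id (keys %documents) {
    for my $field (keys %{ $documents{$id} }) {
        for my $word (split /\W+/, lc $documents{$id}{$field}) {
            $index{$word}{$id}          = 1;   # term in any field
            $index{"$field:$word"}{$id} = 1;   # field-qualified term
        }
    }
}

# Return the IDs of the pages that contain every term given.
sub lookup {
    my @terms = map { lc $_ } @_;
    my %hits;
    $hits{$_}++ for map { keys %{ $index{$_} || {} } } @terms;
    return sort { $a <=> $b } grep { $hits{$_} == @terms } keys %hits;
}

print join(", ", lookup("we")), "\n";                         # 1, 3
print join(", ", lookup("we", "chapter_title:Ethos")), "\n";  # 3
```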

However, we're not indexing a book, but a set of recipes. In this case, our documents are going to look like this:

52 => {
  title => "Aioli",
  categories => "Salads Condiment Classic",
  ingredients => "Garlic Mayonnaise",
}

where "52" is the ID of the recipe in the recipe table. We can, of course, look this up again by doing Larder::Recipe->retrieve(52). We're going to skip the process of indexing the directions part of the recipe because we're only interested in searching for particular ingredients at the moment. So, once again, we edit our recipe loader script, which currently has this at the end of the loop:

Larder::Recipe->find_or_create({
  name => $name,
  xml => $xml
});

We need to keep hold of that recipe's ID and build up our hash of things to index:

my $recipe = Larder::Recipe->find_or_create({
    name => $name,
    xml => $xml
  });
my $hash = {
  title => $name,
  categories => (join " ", @categories),
  ingredients => (join " ", map { $_->{item}[0] } @ingredients)
};
$index->add( $recipe->id => $hash );

Finally, we optimize the index outside of the loop once everything has been added:

$index->optimize;

Now we have, in addition to our database of recipes, an index by which we can look for entries in that database. How does this help us find good recipes for food that's going off?

Locating the Best Recipe

Once our index has been built up, we can now start searching for recipes by their ingredients, and Plucene automatically makes sure that those recipes that match "better"—that is, use more of the ingredients—are returned first. So, for instance, if we have some carrots, bacon, and mushrooms to use up, we can create a simple test search script like this:

use Plucene::Simple;
use Larder;
my $index = Plucene::Simple->open("pl_index");
for ($index->search("carrots bacon mushrooms")) {
  my $r = Larder::Recipe->retrieve($_);
  print $r->name,"\n", join ", ", map { $_->name } $r->ingredients;
  print "\n\n";
}

And we'll be given a list of recipes that contain any of those ingredients, but starting with the best matches:

24 Hour Vegetable Salad
Iceberg lettuce, Mushrooms, Peas, Carrots, Egg whites, Bacon,
Cheese, Fat-free mayonnaise, Lemon Juice

Beef Burgundy
Mushrooms, Onions, Butter, Bacon, Sirloin Steaks, Flour,
Burgundy, Beef Broth, Bay leaf, Garlic, Ground Thyme, Carrots, 
Salt And Pepper, Noodles, Chopped Parsley

Bacon Supper Snack
Gammon, Tomatoes, Stuffing, Butter, Mushrooms, Soft White
Bread Crumbs, Salt and pepper, Mixed Herbs, Egg

All-In-One-Breakfast
Whole Wheat Bread, Butter, Mushrooms, Tomatoes, Cheese, Bacon

...

The last two recipes shown here (and there were many more) don't contain all three ingredients, but do contain two; as we carry on down the list, we get less and less specific.

What we're doing, then, is using the built-in scoring techniques of a standard search engine to find the best recipes for us—web search engines are all about finding the most appropriate pages relating to the user's terms; we're using the same mechanism to find the most appropriate lunch relating to what's in the cupboard.
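
For comparison, here is roughly what the hand-rolled alternative would look like: a sketch that ranks recipes by nothing more than the number of wanted ingredients they contain. The rank function and the abbreviated ingredient lists are made up for this example; a real search engine's scoring weighs rather more than a simple count.

```perl
use strict;
use warnings;

# Abbreviated, lower-cased ingredient lists for illustration.
my %recipes = (
    "24 Hour Vegetable Salad" =>
        [qw(lettuce mushrooms peas carrots bacon cheese)],
    "All-In-One-Breakfast" =>
        [qw(bread butter mushrooms tomatoes cheese bacon)],
    "Aioli" => [qw(garlic mayonnaise)],
);

# Order recipe names by how many wanted ingredients they use,
# dropping recipes that use none of them.
sub rank {
    my ($recipes, @wanted) = @_;
    my %score;
    for my $name (keys %$recipes) {
        my %has = map { $_ => 1 } @{ $recipes->{$name} };
        $score{$name} = grep { $has{$_} } @wanted;   # count of matches
    }
    return sort { $score{$b} <=> $score{$a} || $a cmp $b }
           grep { $score{$_} } keys %score;
}

print "$_\n" for rank(\%recipes, qw(carrots bacon mushrooms));
# prints:
# 24 Hour Vegetable Salad
# All-In-One-Breakfast
```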

Now we can turn our "must eat" page into a page that helps us search for the best recipes to eat:

sub must_eat :Exported {
  my ($self, $r) = @_;
  my $index = Plucene::Simple->open("recipe_index");
  my @ripe = $self->ripe_food;
  $r->{objects} = \@ripe;
  # Contents objects reach their name through the related food
  my @terms = map { '"'. $_->food->name. '"' } @ripe;
  my @results = map { Larder::Recipe->retrieve($_) }
      $index->search(join " ", @terms);
  $r->{template_args}{recipes} = \@results;
  $r->{template_args}{highlight} = { map { $_->food->name => 1 } @ripe };
}

Notice that we surround our ingredient names in double quotes—Plucene understands the concept of phrase matches, as one would expect from a search engine: "fish fingers" searches for recipes containing fish fingers, whereas fish fingers (no quotes) will search for recipes that make use of both fish and fingers. (One can only hope that neither of those searches will turn up any hits in your recipes.) When we've retrieved the results from our search engine and turned them into recipe objects, we add them to our set of arguments to the template. We also add a hash of the ingredient names we're looking for—this will help us highlight the ingredients when we're producing a summary of the recipe. So, for instance, we want our page to look like Figure 1.

The associated template would go as follows:

<h2> You need to use up some food! </h2>
<P>
The following food is getting a bit ripe:
</P>
<UL>
[% FOR content = contents %]
  <LI> [% content.food.name %]
[% END %]
</UL>

<H2> Suggested recipes </H2>
<P>
These recipes will help you use up those ingredients:
</P>

Now we look at each recipe in our search results:

[% FOR recipe = recipes %]
  <h3> <A HREF="/recipe/view/[%recipe.id%]"> [% recipe.name %] </a> </h3>
  Requires:
  <p>
  [% FOR ingredient = recipe.ingredients;
     SET name = ingredient.food.name;

And now for each ingredient, we can show their names and check whether or not to highlight them:

     '<span class="searchresult">' IF highlight.$name;
     name;
     '</span>' IF highlight.$name;
     END;
  %]
  </p>
[% END %]
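
The highlighting logic is easy to check outside the template. This sketch does the same lookup in plain Perl; ingredient_list is a hypothetical helper, and it joins the names with commas, which the template above does not do.

```perl
use strict;
use warnings;

# Wrap each name found in the highlight hash in the same span
# that the template emits.
sub ingredient_list {
    my ($highlight, @names) = @_;
    return join ", ", map {
        $highlight->{$_} ? qq{<span class="searchresult">$_</span>} : $_
    } @names;
}

my %highlight = map { $_ => 1 } qw(Bacon Mushrooms);
print ingredient_list(\%highlight, qw(Butter Mushrooms Cheese Bacon)), "\n";
# prints: Butter, <span class="searchresult">Mushrooms</span>, Cheese, <span class="searchresult">Bacon</span>
```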

Hooray—now we not only know which recipes will use up the dying ingredients, but also which ones will include the most of them at once. There's one final touch we can add to our application before we head off to the kitchen—a sense of urgency.

If something needs to be used by today, we want a recipe that uses it today. Let's change ripe_food so that it returns each ingredient paired with a score representing the need to eat it:

sub ripe_food {
  my $class = shift;
  my $deadline = localtime() + 5 * ONE_DAY;
  # Return (object, score) pairs; the score rises as the use_by date nears
  map { ($_, int(($deadline - $_->use_by) / ONE_DAY)) }
  grep { $_->use_by <= $deadline } $class->retrieve_all;
}

Now if we have some cheese that really needs to be eaten today and some ham that has two days to go, we get:

( Cheese => 5, Ham => 3 )
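
One way to arrive at those scores is to count the whole days between the use_by date and the five-day deadline. The sketch below does the arithmetic with fixed dates and epoch seconds so that it is repeatable; the urgency and ymd helpers and the chosen dates are assumptions made for the example.

```perl
use strict;
use warnings;
use Time::Piece;
use Time::Seconds;

sub ymd { Time::Piece->strptime(shift, "%Y-%m-%d") }

# Whole days between the use_by date and the five-day deadline:
# the sooner the food expires, the higher the score.
sub urgency {
    my ($now, $use_by) = @_;
    my $deadline = $now + 5 * ONE_DAY;
    return int( ($deadline->epoch - $use_by->epoch) / ONE_DAY );
}

# Pretend it is lunchtime on 1 June 2004.
my $now = Time::Piece->strptime("2004-06-01 13:00", "%Y-%m-%d %H:%M");

printf "Cheese => %d, Ham => %d\n",
    urgency($now, ymd("2004-06-01")),   # must be eaten today
    urgency($now, ymd("2004-06-03"));   # two days to go
# prints: Cheese => 5, Ham => 3
```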

We want Plucene to score up recipes that contain cheese, relative to those that contain ham. We can do this using a boost factor in the search term. Plucene allows us to search for "Cheese"^5 "Ham"^3; now it tries to find recipes that have both ham and cheese in them, then those that contain cheese, then those that contain ham. With a list of 10 or 20 ingredients to get rid of, this is more or less guaranteed to give us recipes that use up the widest range of the most desperate ingredients first, giving us the most economical ways to clean out our cupboards. We'll need to modify the must_eat action to understand the list returned:

sub must_eat :Exported {
  my ($self, $r) = @_;
  my $index = Plucene::Simple->open("recipe_index");
  my @ripe = $self->ripe_food;
  $r->{objects} = [];
  my @terms;
  while (my ($obj, $score) = splice(@ripe, 0, 2)) {
    push @{$r->{objects}}, $obj;
    # Contents objects reach their name through the related food
    push @terms, '"'. $obj->food->name. '"^'. $score;
    $r->{template_args}{highlight}{$obj->food->name}++;
  }

  my @results = map { Larder::Recipe->retrieve($_) }
      $index->search(join " ", @terms);
  $r->{template_args}{recipes} = \@results;
}
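
The query-building part of that action can be tried on its own, without an index or a database. This sketch takes a flat list of name and score pairs and produces the boosted query string; boosted_query is a hypothetical helper name, and it works on plain strings rather than Contents objects.

```perl
use strict;
use warnings;

# Turn a flat (name => score, ...) list into a boosted phrase query.
sub boosted_query {
    my @pairs = @_;
    my @terms;
    while (my ($name, $score) = splice(@pairs, 0, 2)) {
        push @terms, qq{"$name"^$score};
    }
    return join " ", @terms;
}

print boosted_query(Cheese => 5, Ham => 3), "\n";
# prints: "Cheese"^5 "Ham"^3
```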

We're using a list of pairs instead of a real hash because the "keys" are objects, and Perl doesn't let us use objects as hash keys—they store just fine, but as they are stored, they get stringified and we can't use them as objects again when we retrieve from the hash. The technique of using

while (my ($key, $value) = splice(@list, 0, 2)) {

where you'd normally expect

while (my ($key, $value) = each %hash) {

is quite a common one you can use where you'd like to use objects as hash keys.
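
A short demonstration shows the difference. Toy::Food below is a hypothetical stand-in for one of our database objects: used as a hash key, it comes back as a plain string, but in a list of pairs it survives as an object.

```perl
use strict;
use warnings;

{
    package Toy::Food;   # hypothetical stand-in for a Class::DBI object
    sub new  { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub name { $_[0]{name} }
}

my $cheese = Toy::Food->new("Cheese");

# As a hash key, the object is stringified and cannot be used again.
my %score = ( $cheese => 5 );
my ($key) = keys %score;
print ref($key) ? "still an object\n" : "just a string\n";
# prints: just a string

# In a flat list of pairs, the object is kept intact.
my @pairs = ( $cheese => 5 );
while (my ($obj, $n) = splice(@pairs, 0, 2)) {
    print $obj->name, " => $n\n";
}
# prints: Cheese => 5
```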

With this in place, Plucene scores each ingredient according to its freshness, in a nice simple way that frees us from having to think up a complicated algorithm to do the job. That's code re-use! And now, what's in the fridge?

Happy Cooking

I hope you've enjoyed our two-part foray into cooking with Perl; we've covered quite a lot of ground on the way. This time we've focused particularly on Maypole and showed how to turn it from a simple front-end to databases into a web application framework on which to base more complex applications. We've also taken a look at Plucene, a pure Perl search engine that allows us to index and search through all kinds of data—including recipes!

If you want to find out more about Maypole, there's a growing set of documentation at http://maypole.simon-cozens.org/docs/, including a large manual with several examples of real-life applications. Plucene can be downloaded from CPAN, and there's a longer introduction to it at http://www.perl.com/pub/a/2004/02/19/plucene.html. Finally, for a load of great recipes in RecipeML, try http://dsquirrel.tripod.com/recipeml/indexrecipes2.html.

Bon appetit!

TPJ