The Perl Journal July 2003
Most people now know what a Wiki is: in essence, it is a website that can be edited by anyone who comes across it, using a simple form in a web browser. The original Wiki concept was the work of Ward Cunningham nearly a decade ago. Ward's original design principles for Wiki can be seen at http://c2.com/cgi/wiki?WikiDesignPrinciples.
After having worked on grubstreet, the open community guide to London, for several months, I started to feel that some of these design principles were more important to me than others. Concentrating on a few key principles allows "slippage" of the other principles, and you might end up with something quite different from what you started with. CGI::Wiki is an example of this. It was originally planned as a rewrite of UseModWiki in more modern Perl, but along the way, it acquired a lot more power and flexibility.
CGI::Wiki is designed to be flexible about storage and indexing of its data. The main constructor takes between one and three arguments. The only mandatory argument is a datastore object. An indexer object can also be supplied, to aid searching, and a formatter object can be supplied if you wish to use a Wiki syntax other than the default.
my $datastore = CGI::Wiki::Store::SQLite->new(
    dbname => "/home/wiki/store.db" );
my $indexdb = Search::InvertedIndex::DB::DB_File_SplitHash->new(
    -map_name  => "/home/wiki/indexes.db",
    -lock_mode => "EX" );
my $indexer = CGI::Wiki::Search::SII->new( indexdb => $indexdb );
my $formatter = CGI::Wiki::Formatter::UseMod->new;
my $wiki = CGI::Wiki->new( store     => $datastore,
                           search    => $indexer,
                           formatter => $formatter );
The datastore types currently available are all stored in a database: your choice of Postgres, MySQL, or SQLite. You could write a flat-file backend if you liked; nobody has wanted one enough yet, though. The SQLite backend (see DBD::SQLite or http://www.hwaci.com/sw/sqlite/ for details) allows you to store a full RDBMS in a single file, so it will suit most situations where running an actual database server would be inconvenient.
The recommended search indexer is Search::InvertedIndex, though support for DBIx::FullTextSearch (which has phrase searching built in, but only works with MySQL) is also provided.
The formatters are perhaps the most fun bit. The default formatter uses the Text::WikiFormat formatting conventions, but a custom formatter is very easy to write. CGI::Wiki::Formatter::UseMod, which provides usemod-style syntax, and CGI::Wiki::Formatter::Pod, which allows you to write your Wiki entirely in POD, can both be found on CPAN, and should serve as examples if you want to write your own.
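As a rough illustration of how small a custom formatter can be, here is a minimal sketch. It assumes that all CGI::Wiki needs from a formatter object is a format method that takes raw node text and returns HTML; the package name My::Formatter::Shout and its single markup rule (a line starting with "!" becomes a heading) are invented for this example, so check the published formatters for the full interface.

```perl
package My::Formatter::Shout;   # hypothetical name, not a CPAN module
use strict;
use warnings;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# Assumed interface: take raw node text, return HTML.
sub format {
    my ($self, $raw) = @_;
    my @out;
    foreach my $line ( split /\n/, $raw ) {
        # Escape the characters that are special in HTML.
        $line =~ s/&/&amp;/g;
        $line =~ s/</&lt;/g;
        $line =~ s/>/&gt;/g;
        if ( $line =~ /^!\s*(.+)$/ ) {
            push @out, "<h2>$1</h2>";      # "!" marks a heading
        } elsif ( $line =~ /\S/ ) {
            push @out, "<p>$line</p>";     # any other non-blank line
        }
    }
    return join "\n", @out;
}

package main;

my $formatter = My::Formatter::Shout->new;
print $formatter->format("! Pubs\nThe Calthorpe Arms & others"), "\n";
```

An object like this would then be passed as the formatter argument to the CGI::Wiki constructor, in place of the UseMod formatter shown earlier.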
So what are the important differences between CGI::Wiki and most other Wiki implementations? For one thing, it's not actually a Wiki; it's a toolkit for building Wikis, and things that resemble Wikis. You can't just install it and have an instant Wiki, though it hardly takes any time at all to create a simple one.
Another major difference is its support for metadata. Most Wiki implementations treat their content as an undifferentiated block of text; indexing and categorization are done by hand, and searching rarely gets any more sophisticated than phrase matching.
Wiki users have developed various conventions to get around these limitations. The most widely used one seems to be the convention of adding WikiWords such as "CategoryPattern" and "CategoryPerl" to the bottom of a page about a programming pattern implemented in Perl. (The concept of WikiWords is much wider than this trick of faking up categories.) All pages about Perl can then be found by searching the Wiki for pages that contain the word "CategoryPerl." A hierarchy of categories can be created by creating a page for CategoryPerl and putting "CategoryProgrammingLanguages" at the bottom; and so on.
CGI::Wiki allows you, as the writer of Wiki software, to attach any kind of metadata you like to a given node. A list of categories is an obvious choice; it's easy then to create a plug-in to look for everything in Category Foo or one of its subcategories. It's also very easy to provide your users with a macro to inline a category index into any given page. The index can even be a collapsible hierarchical listing, if you like. Other kinds of metadata that have proven very useful include location data: latitude and longitude. I'll discuss some other useful metadata below.
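To make the subcategory idea concrete, here is a small sketch in plain Perl that works on in-memory data rather than on a CGI::Wiki store. The %category_of and %parent_of hashes, their contents, and the is_within and nodes_in_category functions are all invented for illustration; a real plug-in would pull the same information from node metadata.

```perl
use strict;
use warnings;

# Hypothetical data: the categories attached to each node as metadata...
my %category_of = (
    "Visitor Pattern"       => [ "Pattern", "Perl" ],
    "Schwartzian Transform" => [ "Perl" ],
    "Monads"                => [ "Haskell" ],
);
# ...and each category's parent category, if it has one.
my %parent_of = (
    "Perl"    => "Programming Languages",
    "Haskell" => "Programming Languages",
);

# True if $cat is $wanted or sits somewhere beneath it in the hierarchy.
sub is_within {
    my ($cat, $wanted) = @_;
    my %seen;   # guards against accidental loops in the hierarchy
    while ( defined $cat and not $seen{$cat}++ ) {
        return 1 if $cat eq $wanted;
        $cat = $parent_of{$cat};
    }
    return 0;
}

# All nodes in $wanted or in any of its subcategories, sorted by name.
sub nodes_in_category {
    my ($wanted) = @_;
    my @found = grep {
        my $node = $_;
        grep { is_within( $_, $wanted ) } @{ $category_of{$node} };
    } keys %category_of;
    return sort @found;
}

print join( "; ", nodes_in_category("Programming Languages") ), "\n";
```

Walking up from each node's categories keeps the lookup simple; a category index macro could call the same function once per category it wants to list.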
At the moment, the most fully developed CGI::Wiki application available is OpenGuides, a complete system for managing a collaboratively written guide to a city or town.
OpenGuides makes heavy use of CGI::Wiki's metadata support. It uses the ability to store and retrieve location data, so it can easily handle queries such as "find me all pubs within 300m of Holborn Station," or "find me all Chinese restaurants within 500m of Trafalgar Square." It also uses a locale field (more on this later) for grouping places into neighborhoods within a city.
CGI::Wiki provides simple metadata access directly; for example, to find everything in Holborn, we use this method call:
my @nodes = $wiki->list_nodes_by_metadata( metadata_type => "locale", metadata_value => "Holborn" );
More complicated queries are done via plug-ins. Queries such as "pubs within 300m of Holborn Station" are handled by CGI::Wiki::Plugin::Locator::UK, for example. Anyone can write a CGI::Wiki plugin; more detailed guidelines are available in perldoc CGI::Wiki::Extending, but in essence your plug-in can choose between building on the simple access methods provided by CGI::Wiki, and accessing the database backend directly with SQL. The UK locator plug-in uses the latter technique for speed.
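For a feel of what a query such as "pubs within 300m of Holborn Station" involves, here is a sketch that works on plain coordinate data rather than through the plug-in. It assumes each place has an easting and northing in metres, so that straight-line distance is simple Pythagoras; the coordinates, the place names "Pub A" and "Pub B", and the within_distance function are made up for this example, and this is not how the locator plug-in itself is implemented.

```perl
use strict;
use warnings;

# Invented grid coordinates in metres: [ easting, northing ].
my %coords = (
    "Holborn Station" => [ 530_500, 181_500 ],
    "Pub A"           => [ 530_600, 181_700 ],   # roughly 224m away
    "Pub B"           => [ 531_200, 181_500 ],   # 700m away
);

# Return the names of all other places within $args{metres} of $args{origin}.
sub within_distance {
    my (%args) = @_;
    my ($ox, $oy) = @{ $coords{ $args{origin} } };
    my @found;
    foreach my $name ( sort keys %coords ) {
        next if $name eq $args{origin};
        my ($x, $y) = @{ $coords{$name} };
        my $dist = sqrt( ($x - $ox)**2 + ($y - $oy)**2 );
        push @found, $name if $dist <= $args{metres};
    }
    return @found;
}

print join( ", ", within_distance( origin => "Holborn Station",
                                   metres => 300 ) ), "\n";
```

Looping over every place is fine for a sketch, but it is easy to see why a plug-in serving a whole city would rather push the distance test down into the database.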
Many of OpenGuides' metadata fields are motivated by the desire to output RDF (machine-readable versions of the nodes) so that different OpenGuides installs can communicate with each other, and so that other Internet applications can make use of OpenGuides data. For example, an IRC bot interface to the London OpenGuides install might use LWP::Simple to request a URI such as:
http://un.earth.li/~kake/cgi-bin/wiki.cgi?action=index;index_type=locale;index_value=Chinatown;format=rdf
and receive a response like
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:dc="http://purl.org/dc/1.0/"
xmlns:gs="http://the.earth.li/~kake/xmlns/gs/0.1/"
>
<gs:locale rdf:about="http://un.earth.li/~kake/cgi-bin/wiki.cgi?action=index;index_type=locale;index_value=Chinatown;format=rdf">
<gs:name>Locale Chinatown</gs:name>
<gs:object>
<rdf:Description rdf:about="http://un.earth.li/~kake/cgi-bin/wiki.cgi?De_Hems,_W1D_5BW">
<dc:title>De Hems, W1D 5BW</dc:title>
<rdfs:seeAlso rdf:resource="http://un.earth.li/~kake/cgi-bin/wiki.cgi?id=De_Hems,_W1D_5BW;format=rdf" />
</rdf:Description>
</gs:object>
<gs:object>
<rdf:Description rdf:about="http://un.earth.li/~kake/cgi-bin/wiki.cgi?Golden_Harvest,_WC2H_7BE">
<dc:title>Golden Harvest, WC2H 7BE</dc:title>
<rdfs:seeAlso rdf:resource="http://un.earth.li/~kake/cgi-bin/wiki.cgi?id=Golden_Harvest,_WC2H_7BE;format=rdf" />
</rdf:Description>
</gs:object>
</gs:locale>
</rdf:RDF>
Parsing this with RDF::Core::Parser is simple:
use LWP::Simple;
use RDF::Core::Parser;
use URI::Escape;
my $locale = "Chinatown";
my $content = get( "http://un.earth.li/~kake/cgi-bin/wiki.cgi?action=index;"
    . "format=rdf;index_type=locale;index_value=" . uri_escape($locale) );
my $parser = RDF::Core::Parser->new( Assert => \&_assert, BaseURI => "foo" );
my (@finds, %name);
$parser->parse( $content );
my $return = "things in $locale: ";
if ( @finds ) {
$return .= join("; ", map { $name{$_} } @finds ) . "\n";
} else {
$return .= "none, sorry.\n";
}
print $return;
sub _assert {
my %triple = @_;
if ( $triple{predicate_uri} eq
"http://the.earth.li/~kake/xmlns/gs/0.1/object" ) {
push @finds, $triple{object_uri};
}
if ( $triple{predicate_uri} eq "http://purl.org/dc/1.0/title" ) {
$name{$triple{subject_uri}} = $triple{object_literal};
}
}
And now we can talk to our OpenGuides install over IRC:
15:12 <Kake> grotbot: things in Chinatown
15:12 <grotbot> OK, working on it
<grotbot> Kake: things in Chinatown: Crispy Duck, W1D 6PR; De Hems, W1D 5BW; Golden Harvest, WC2H 7BE; HK Diner; Hung's, W1D 6PR; Misato, W1D 6PG; Tai, W1D 4DH; Tokyo Diner; Zipangu, WC2H 7JJ
So we have three ways to look at the same data: as an HTML page in a web browser, as RDF, or via an IRC bot. How about a fourth: a Scalable Vector Graphics plot?
use strict;
use SVG::Plot;
use URI::Escape;
use CGI::Wiki;
use CGI::Wiki::Plugin::Locator::UK;
my $wiki = CGI::Wiki->new( ... );
my $locator = CGI::Wiki::Plugin::Locator::UK->new;
$wiki->register_plugin( plugin => $locator );
my @nodes = $wiki->list_nodes_by_metadata(
metadata_type => "locale",
metadata_value => "Chinatown" );
my @points;
foreach my $node (@nodes) {
my ($x, $y) = $locator->coordinates( node => $node );
if ($x and $y) {
my $uri = "http://un.earth.li/~kake/cgi-bin/wiki.cgi?" . uri_escape($node);
push @points, [ $x, $y, $uri ];
}
}
my $plot = SVG::Plot->new( points => \@points,
max_width => 800,
max_height => 600,
point_size => 3 );
my $svg = $plot->plot;
print "Content-Type: image/svg+xml\n\n";
print $svg;
Unlike a traditional Wiki, a metadata-enabled Wiki allows you to work with aggregated data. One excellent application of this is collaborative mapping of physical spaces: not just the objective statistics such as latitude and longitude, but subjective measures like psychological distance between two places, or membership in a neighborhood.
Consider neighborhoods in London; places like Holborn, Bloomsbury, Chelsea, Fulham, Islington. Unlike boroughs, which have clear boundaries defined by the extent of local council responsibility, neighborhoods are vague and fuzzy. They overlap. Is my office in Holborn or is it in Bloomsbury? Is it in both? Do I live in Hammersmith or Fulham? Or both? Or neither?
The OpenGuides node edit form includes a field where one or more "locales" can be entered for a page. If a later editor thinks you've got the locales wrong, they can add or delete locales. So each time a given locale is left alone during an edit, that could be considered some kind of "vote" for that place being in that locale. Here's how you'd write a plug-in to track these votes.
package OpenGuides::LocaleVote;
use strict;
use vars qw( $VERSION @ISA $plugin_key );
$VERSION = '0.01';
$plugin_key = "og_localevote";
use Carp "croak";
use CGI::Wiki::Plugin;
@ISA = qw( CGI::Wiki::Plugin );
We inherit from CGI::Wiki::Plugin for easy access to the Wiki's backend datastore. We define a $plugin_key to identify the namespace for the tables we'll be writing to.
use CGI::Wiki;
use OpenGuides::LocaleVote;
my $wiki = CGI::Wiki->new( ... );
my $ballot = OpenGuides::LocaleVote->new;
$wiki->register_plugin( plugin => $ballot );
$wiki->write_node( "Calthorpe Arms", "nice pub", undef, { locale => [ "Holborn", "Bloomsbury" ] } );
my %locales = $ballot->get_locales( node => "Calthorpe Arms" );
print "Votes for Holborn: $locales{Holborn}";
The API is nice and simple: create a Wiki object, create a plugin object, register the plugin with the Wiki, write some data, and pull out the votes for a given node.
sub new {
my $class = shift;
my $self = { table => "p_" . $plugin_key . "_votes" };
bless $self, $class;
return $self;
}
We follow the convention described in CGI::Wiki::Extending for our table name, and store it on instantiation, for convenience.
sub on_register {
my $self = shift;
my $table = $self->{table};
my $datastore = $self->datastore;
my $dbh = $self->datastore->dbh
or croak "Not implemented for non-database datastores";
my $store_class = ref $datastore;
$store_class =~ s/CGI::Wiki::Store:://;
# Check table is set up.
if ( $store_class eq "Pg" ) {
my $sth = $dbh->prepare(
"SELECT count(*) FROM pg_tables WHERE tablename=?" );
$sth->execute($table);
my ($table_ok) = $sth->fetchrow_array;
$sth->finish;
unless ($table_ok) {
$dbh->do( "CREATE TABLE $table (node varchar(200), locale text, votes integer )" );
}
} elsif ( $store_class eq "MySQL" ) {
...
} elsif ( $store_class eq "SQLite" ) {
...
} else {
croak "Store class $store_class unknown";
}
}
The on_register method is called when the plug-in is registered, that is, when register_plugin is called on the Wiki object. on_register is our chance to check that the table we need to use has been set up in the backend database. We are careful to allow for the possibility that our plug-in may be used by Wikis with non-database datastores, if anyone ever writes one.
Since we are inheriting from CGI::Wiki::Plugin, before on_register is called, CGI::Wiki will store its datastore, indexer and formatter in our object, so we can get hold of the database handle by calling $self->datastore->dbh, and then we have full access to the database. Following the conventions in CGI::Wiki::Extending, we only write to tables labeled with our $plugin_key, though we can read from any table.
sub post_write {
my ($self, %args) = @_;
my $table = $self->{table};
my $dbh = $self->datastore->dbh
or croak "Not implemented for non-database datastores";
my $node = $args{node};
my $metadata = $args{metadata};
my $locs = $metadata->{locale} or return 1;
my @locales = ref $locs ? @$locs : ( $locs );
foreach my $locale (@locales) {
my $sth = $dbh->prepare(
"SELECT votes FROM $table WHERE node=? AND locale=?" );
$sth->execute( $node, $locale );
my ($votes) = $sth->fetchrow_array;
$sth->finish;
if ($votes) {
$votes++;
my $sth = $dbh->prepare(
"UPDATE $table SET votes=? WHERE node=? AND locale=?" );
$sth->execute( $votes, $node, $locale );
} else {
my $sth = $dbh->prepare(
"INSERT INTO $table (node, locale, votes) VALUES(?,?,1)");
$sth->execute( $node, $locale );
}
}
return 1;
}
post_write is called after each node is written. This is where we do the actual storing of the votes.
sub get_locales {
my ($self, %args) = @_;
my $table = $self->{table};
my $node = $args{node};
my $dbh = $self->datastore->dbh
or croak "Not implemented for non-database datastores";
my $sth = $dbh->prepare( "SELECT locale, votes FROM $table WHERE node=?" );
$sth->execute($node);
my %locales;
while ( my ($locale, $vote) = $sth->fetchrow_array ) {
$locales{$locale} = $vote;
}
return %locales;
}
Finally, get_locales just does a straight SQL query to find the locales a given node has been placed in, and the number of votes for each locale.
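One way the vote counts might be used is to pick the most-voted locale for a node. The sketch below works on a plain hash shaped like the return value of get_locales; the sample counts and the top_locale function are invented for illustration and are not part of the plug-in.

```perl
use strict;
use warnings;

# Return the locale with the most votes; ties go to the name that sorts first.
sub top_locale {
    my (%locales) = @_;
    my ($best) = sort { $locales{$b} <=> $locales{$a} or $a cmp $b }
                 keys %locales;
    return $best;
}

my %locales = ( Holborn => 5, Bloomsbury => 3 );   # made-up counts
print "Most voted: ", top_locale(%locales), "\n";
```

With the real plug-in, the hash would come from a call such as $ballot->get_locales( node => "Calthorpe Arms" ) instead of being written out by hand.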
What else can we do with this? I really don't know the extent of it. It seems that every time I talk with a group of programmers I come up with a new thing I can use CGI::Wiki for. As mentioned at the start of this article, I feel some of the original Wiki design principles are more immutable than others. In particular, I feel that almost anything calling itself a Wiki is greatly enhanced by being two things:
Everything else is just there waiting for you to stretch it in interesting ways.