| WebFetch documentation | Contained in the WebFetch distribution. |
WebFetch::Output::TWiki - WebFetch output to TWiki web site
This is an output module for WebFetch which places the data in pages on a TWiki web site. Some of its configuration information is read from a TWiki page. Calling or command-line parameters point to the TWiki page which has the configuration and a search key to locate the correct line in a table.
From the command line...
perl -w -I$libdir -MWebFetch::Input::Atom -MWebFetch::Output::TWiki -e "&fetch_main" -- --dir "/path/to/fetch/worskspace" --source "http://search.twitter.com/search.atom?q=%23twiki" --dest=twiki --twiki_root=/var/www/twiki --config_topic=Feeds.WebFetchConfig --config_key=twiki
From Perl code...
use WebFetch;
my $obj = WebFetch->new(
"dir" => "/path/to/fetch/workspace",
"source" => "http://search.twitter.com/search.atom?q=%23twiki",
"source_format" => "atom",
"dest" => "twiki",
"dest_format" = "twiki",
"twiki_root" => "/var/www/twiki",
"config_topic" => "Feeds.WebFetchConfig",
"config_key" => "twiki",
);
$obj->do_actions; # process output
$obj->save; # save results
The configuration information on feeds is kept in a TWiki page. You can
specify any page with a web and topic name, for example --config_topic=Feeds.WebFetchConfig .
The contents of that configuration page could look like this, though with any feeds you want to configure. The "Key" field matches the --config_key command-line parameter, and then brings in the rest of the configuration info from that line. An example is shown below.
The following table is used by !WebFetch to configure news feeds
%STARTINCLUDE%
| *Key* | *Web* | *Parent* | *Prefix* | *Template* | *Form* | *Options* | *Modul
e* | *Source* |
| ikluft-twitter | Feeds | TwitterIkluftFeed | TwitterIkluft | AtomFeedTemplate | AtomFeedForm | separate_topics | Atom | http://twitter.com/statuses/user_timeline/37786023.rss |
| twiki-twitter | Feeds | TwitterTwikiFeed | TwitterTwiki | AtomFeedTemplate | AtomFeedForm | separate_topics | Atom | http://search.twitter.com/search.atom?q=%23twiki |
| cnn | Feeds | RssCnn | RssCnn | RssFeedTemplate | RssFeedForm | separate_topics | RSS | http://rss.cnn.com/rss/cnn_topstories.rss |
%STOPINCLUDE%
=back
The %STARTINCLUDE% and %STOPINCLUDE% are not required. However, if
present, they are used as boundaries for the inclusion like in a normal
INCLUDE operation on TWiki.
TWiki is a wiki (user-editable web site) with features enabling collaboration in an enterprise environment. It implements the concept of a "structured wiki", allowing structure and automation as needed and retaining the informality of a wiki. Automated input/updates such as from WebFetch::Output::TWiki is one example.
See http://twiki.org/ for the Open Source community-maintained software or http://twiki.net/ for enterprise support.
WebFetch::Output::TWiki was developed for TWiki Inc (formerly TWiki.Net).
WebFetch was written by Ian Kluft
Send patches, bug reports, suggestions and questions to
maint@webfetch.org.
Please report any bugs or feature requests to bug-webfetch-output-twiki at rt.cpan.org, or through
the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=WebFetch-Output-TWiki. I will be notified, and then you'll
automatically be notified of progress on your bug as I make changes.
| WebFetch documentation | Contained in the WebFetch distribution. |
# # WebFetch::Output::TWiki - save data into a TWiki web site # # Copyright (c) 2009 Ian Kluft. This program is free software; you can # redistribute it and/or modify it under the terms of the GNU General Public # License Version 3. See http://www.webfetch.org/GPLv3.txt package WebFetch::Output::TWiki; use warnings; use strict; use WebFetch; use base "WebFetch"; use DB_File; # define exceptions/errors use Exception::Class ( "WebFetch::Output::TWiki::Exception::NoRoot" => { isa => "WebFetch::Exception", alias => "throw_twiki_no_root", description => "WebFetch::Output::TWiki needs to be provided " ."a twiki_root parameter", }, "WebFetch::Output::TWiki::Exception::NotFound" => { isa => "WebFetch::Exception", alias => "throw_twiki_not_found", description => "the directory in the twiki_root parameter " ."doesn't exist or doesn't have a lib subdirectory", }, "WebFetch::Output::TWiki::Exception::Require" => { isa => "WebFetch::Exception", alias => "throw_twiki_require", description => "failed to import TWiki or TWiki::Func modules", }, "WebFetch::Output::TWiki::Exception::NoConfig" => { isa => "WebFetch::Exception", alias => "throw_twiki_no_config", description => "WebFetch::Output::TWiki needs to be provided " ."a config_topic parameter", }, "WebFetch::Output::TWiki::Exception::ConfigMissing" => { isa => "WebFetch::Exception", alias => "throw_twiki_config_missing", description => "WebFetch::Output::TWiki is missing a required " ."configuration parameter", }, "WebFetch::Output::TWiki::Exception::Oops" => { isa => "WebFetch::Exception", alias => "throw_twiki_oops", description => "WebFetch::Output::TWiki returned errors from " ."saving one or more entries", }, "WebFetch::Output::TWiki::Exception::FieldNotSpecified" => { isa => "WebFetch::Exception", alias => "throw_field_not_specified", description => "a required field was not defined or found", }, );
# globals/defaults our @Options = ( "twiki_root=s", "config_topic=s", "config_key=s" ); our $Usage = "--twiki_root path-to-twiki --config_topic web.topic " ."--config_key keyword"; our @default_field_names = ( qw( key web parent prefix template form options )); # no user-servicable parts beyond this point # register capabilities with WebFetch __PACKAGE__->module_register( "cmdline", "output:twiki" );
# read the TWiki configuation sub get_twiki_config { my $self = shift; WebFetch::debug "in get_twiki_config"; # find the TWiki modules if ( ! exists $self->{twiki_root}) { throw_twiki_no_root( "TWiki root directory not defined" ); } if (( ! -d $self->{twiki_root}) or ( ! -d $self->{twiki_root}."/lib" )) { throw_twiki_not_found( "can't find TWiki root or lib at " .$self->{twiki_root}); } # load the TWiki modules WebFetch::debug "loading TWiki modules"; push @INC, $self->{twiki_root}."/lib"; eval { require TWiki; require TWiki::Func; }; if ( $@ ) { throw_twiki_require ( $@ ); } # initiate TWiki library, create session as user "WebFetch" $self->{twiki_obj} = TWiki->new( "WebFetch" ); # get the contents of the TWiki topic which contains our configuration if ( !exists $self->{config_topic}) { throw_twiki_no_config( "TWiki configuration page for WebFetch " ."not defined" ); } my ( $web, $topic ) = split /\./, $self->{config_topic}; WebFetch::debug "config_topic: ".$self->{config_topic} ." -> $web, $topic"; if (( ! defined $web ) or ( ! defined $topic )) { throw_twiki_no_config( "TWiki configuration page for WebFetch " ."must be defined in the format web.topic" ); } # check if a config_key was specified before we read the configuration if ( !exists $self->{config_key}) { throw_twiki_no_config( "TWiki configuration key for WebFetch " ."not defined" ); } # read the configuration info my $config = TWiki::Func::readTopic( $web, $topic ); # if STARTINCLUDE and STOPINCLUDE are present, use only what's between if ( $config =~ /%STARTINCLUDE%\s*(.*)\s*%STOPINCLUDE%/s ) { $config = $1; } # parse the configuration WebFetch::debug "parsing configuration"; my ( @fnames, $line ); $self->{twiki_config_all} = []; $self->{twiki_keys} = {}; foreach $line ( split /\r*\n+/s, $config ) { if ( $line =~ /^\|\s*(.*)\s*\|\s*$/ ) { my @entries = split /\s*\|\s*/, $1; WebFetch::debug "read entries: ".join( ', ', @entries ); # first line contains field headings if ( ! @fnames) { # save table headings as field names my $field; foreach $field ( @entries ) { my $tmp = lc($field); $tmp =~ s/\W//g; push @fnames, $tmp; } next; } WebFetch::debug "field names: ".join " ", @fnames; # save the entries # it isn't a heading row if we got here # transfer array @entries to named fields in %config WebFetch::debug "data row: ".join " ", @entries; my ( $i, $key, %config ); for ( $i=0; $i < scalar @fnames; $i++ ) { $config{ $fnames[$i]} = $entries[$i]; if ( $fnames[$i] eq "key" ) { $key = $entries[$i]; } } # save the %config row in @{$self->{twiki_config_all}} if (( defined $key ) and ( !exists $self->{twiki_keys}{$key})) { push @{$self->{twiki_config_all}}, \%config; $self->{twiki_keys}{$key} = ( scalar @{$self->{twiki_config_all}}) - 1; } } } # select the line which is for this request if ( ! exists $self->{twiki_keys}{$self->{config_key}}) { throw_twiki_no_config "no configuration found for key " .$self->{config_key}; } $self->{twiki_config} = $self->{twiki_config_all}[$self->{twiki_keys}{$self->{config_key}}]; WebFetch::debug "twiki_config: ".join( " ", %{$self->{twiki_config}}); } # write to a TWiki page sub write_to_twiki { my $self = shift; my ( $config, $name ); # get config variables $config = $self->{twiki_config}; # parse options my ( $option ); $self->{twiki_options} = {}; foreach $option ( split /\s+/, $self->{twiki_config}{options}) { if ( $option =~ /^([^=]+)=(.*)/ ) { $self->{twiki_options}{$1} = $2; } else { $self->{twiki_options}{$option} = 1; } } # determine unique identifier field my $id_field; if ( exists $self->{twiki_options}{id_field}) { $id_field = $self->{twiki_options}{id_field}; } if ( ! defined $id_field ) { $id_field = $self->wk2fname( "id" ); } if ( ! defined $id_field ) { $id_field = $self->wk2fname( "url" ); } if ( ! defined $id_field ) { $id_field = $self->wk2fname( "title" ); } if ( ! defined $id_field ) { throw_field_not_specified "identifier field not specified"; } $self->{id_field} = $id_field; # determine from options whether each item is making metadata or topics if ( exists $self->{twiki_options}{separate_topics}) { $self->write_to_twiki_topics; } else { $self->write_to_twiki_metadata; } } # write to separate TWiki topics sub write_to_twiki_topics { my $self = shift; # get config variables my $config = $self->{twiki_config}; my $name; foreach $name ( qw( key web parent prefix template form )) { if ( !exists $self->{twiki_config}{$name}) { throw_twiki_config_missing( "missing config parameter " .$name ); } } # get text of template topic my ($meta, $template ) = TWiki::Func::readTopic( $config->{web}, $config->{template}); # open DB file for tracking unique IDs of articles already processed my %id_index; tie %id_index, 'DB_File', $self->{dir}."/".$config->{key}."_id_index.db", &DB_File::O_CREAT|&DB_File::O_RDWR, 0640; # determine initial topic name my ( %topics, @topics ); @topics = TWiki::Func::getTopicList( $config->{web}); foreach ( @topics ) { $topics{$_} = 1; } my $tnum_counter = 0; my $tnum_format = $config->{prefix}."-%07d"; # create topics with metadata from each WebFetch data record my $entry; my @oopses; my $id_field = $self->{id_field}; $self->data->reset_pos; while ( $entry = $self->data->next_record ) { # check that this entry hasn't already been forwarded to TWiki if ( exists $id_index{$entry->byname( $id_field )}) { next; } $id_index{$entry->byname( $id_field )} = time; # select topic name my $topicname = sprintf $tnum_format, $tnum_counter; while ( exists $topics{$topicname}) { $tnum_counter++; $topicname = sprintf $tnum_format, $tnum_counter; } $tnum_counter++; $topics{$topicname} = 1; my $text = $template; WebFetch::debug "write_to_twiki_topics: writing $topicname"; # create topic metadata #my $meta = TWiki::Meta->new ( $self->{twiki_obj}, $config->{web}, $topicname ); $meta->put( "TOPICPARENT", { name => $config->{parent}}); $meta->put( "FORM", { name => $config->{form}}); my $fnum; for ( $fnum = 0; $fnum <= $self->data->num_fields; $fnum++ ) { WebFetch::debug "meta: " .$self->data->field_bynum($fnum) ." = ".$entry->bynum($fnum); ( defined $self->data->field_bynum($fnum)) or next; ( $self->data->field_bynum($fnum) eq "xml") and next; ( defined $entry->bynum($fnum)) or next; WebFetch::debug "meta: OK"; $meta->putKeyed( "FIELD", { name => $self->data->field_bynum($fnum), value => $entry->bynum($fnum)}); } # save a special title field for TWiki indexes my $index_title = $entry->title; $index_title =~ s/[\t\r\n\|]+/ /gs; $index_title =~ s/^\s*//; $index_title =~ s/\s*$//; if ( length($index_title) > 60 ) { substr( $index_title, 56 ) = "..."; } WebFetch::debug "title: $index_title"; $meta->putKeyed( "FIELD", { name => "IndexTitle", title => "Indexing title", value => $index_title }); # save the topic my $oopsurl = TWiki::Func::saveTopic( $config->{web}, $topicname, $meta, $text ); if ( $oopsurl ) { WebFetch::debug "write_to_twiki_topics: " ."$topicname - $oopsurl"; push @oopses, $entry->title." -> " .$topicname." ".$oopsurl; } } # check for errors if ( @oopses ) { throw_twiki_oops( "TWiki saves failed:\n".join "\n", @oopses ); } } # write to successive items of TWiki metadata sub write_to_twiki_metadata { my $self = shift; # get config variables my $config = $self->{twiki_config}; my $name; foreach $name ( qw( key web parent )) { if ( !exists $self->{twiki_config}{$name}) { throw_twiki_config_missing( "missing config parameter " .$name ); } } # determine metadata title field my $title_field; if ( exists $self->{twiki_options}{title_field}) { $title_field = $self->{twiki_options}{title_field}; } if ( ! defined $title_field ) { $title_field = $self->wk2fname( "title" ); } if ( ! defined $title_field ) { throw_field_not_specified "title field not specified"; } # determine metadata value field my $value_field; if ( exists $self->{twiki_options}{value_field}) { $value_field = $self->{twiki_options}{value_field}; } if ( ! defined $value_field ) { $value_field = $self->wk2fname( "summary" ); } if ( ! defined $value_field ) { throw_field_not_specified "value field not specified"; } # open DB file for tracking unique IDs of articles already processed my %id_index; tie %id_index, 'DB_File', $self->{dir}."/".$config->{key}."_id_index.db", &DB_File::O_CREAT|&DB_File::O_RDWR, 0640; # get text of topic my ($meta, $text) = TWiki::Func::readTopic( $config->{web}, $config->{parent}); # start metadata line counter my $mnum_counter = 0; my $mnum_format = "line-%07d"; # create metadata lines for each entry my $entry; my @oopses; my $id_field = $self->{id_field}; $self->data->reset_pos; while ( $entry = $self->data->next_record ) { # check that this entry hasn't already been forwarded to TWiki if ( exists $id_index{$entry->byname( $id_field )}) { next; } $id_index{$entry->byname( $id_field )} = time; # select metadata field name my ( $value, $metaname ); $value = $meta->get( "FIELD", $metaname = sprintf( $mnum_format, $mnum_counter )); while ( defined $value ) { $value = $meta->get( "FIELD", $metaname = sprintf( $mnum_format, ++$mnum_counter )); } # write the value $meta->putKeyed( "FIELD", { name => $metaname, title => $entry->byname( $title_field ), value => $entry->byname( $value_field ), }); } # save the topic my $oopsurl = TWiki::Func::saveTopic( $config->{web}, $config->{parent}, $meta, $text ); if ( $oopsurl ) { throw_twiki_oops "TWiki saves failed: " .$config->{parent}." ".$oopsurl; } } # TWiki format handler sub fmt_handler_twiki { my $self = shift; my $filename = shift; # get configuration from TWiki $self->get_twiki_config; # write to TWiki topic $self->write_to_twiki; # no savables - mark it OK so WebFetch::save won't call it an error $self->no_savables_ok; 1; }
1; # End of WebFetch::Output::TWiki