| Text-KwikiFormatish documentation | Contained in the Text-KwikiFormatish distribution. |
<strong> instead of <b><code> instead of <tt>like---this<blockquote> tagsText::KwikiFormatish - convert Kwikitext into XML-compliant HTML
use Text::KwikiFormatish; my $xhtml = Text::KwikiFormatish::format($text);
NOTE: This module is based off of the old CGI::Kwiki formatter. Ideally, Text::KwikiFormat would be written to use the new the new Kwiki formatter.
CGI::Kwiki includes a formatter (CGI::Kwiki::Formatter) for converting Kwikitext (a nice form of wikitext) to HTML. Unfortunately, it isn't easy to use the formatter outside the CGI::Kwiki environment. Additionally, the HTML produced by the formatter isn't XHTML-1 compliant. This module aims to fix both of these issues and provide an interface similar to Text::WikiFormat.
Essentially, this module is the code from Brian Ingerson's
CGI::Kwiki::Formatter with a format subroutine, code relating to slides
removed, tweaked subroutinesa, and more.
Since the wikitext spec for input wikitext for this module differs a little from the default Kwiki formatter, I thought it best to call it "Formatish" instead of *the* Kwiki Format.
format() takes one or two arguments, with the first always being the
wikitext to translate. The second is a hash of options, but currently the only
option supported is prefix in case you want to prefix wiki links with
sommething. For example,
my $xml = Text::KwikiFormatish::format(
$text,
prefix => '/wiki/',
);
CGI::Kwiki::Formatter was designed to be subclassable so that the formatting engine could be easily customized. Information on how the Kwiki formatter works can be found at HowKwikiFormatterWorks.
For example, say you wanted to override the markup for strong (bold) text. You
decide that it would make much more sense to write strong text as HEYthis is
bold textHEY. You would subclass Text::KwikiFormatish and use it like so:
package My::Formatter;
use base 'Text::KwikiFormatish';
# I simply copied this from Text/KwikiFormatish.pm and tweaked it
sub bold {
my ($self, $text) = @_;
$text =~ s#(?<![$WORD])HEY(\S.*?\S|\S)HEY(?![$WORD])#<strong>$1</strong>#g;
return $text;
}
package main;
my $data = join '', <>;
print My::Formatter->new->process( $data );
Process the given TEXT as KwikiText and return XHTML.
process_order() returns a list of the formatting rules that will be applied
when format is called for this object. If called with a set of formatting
rules (names of class methods), that set of formatting rules will supercede the
default set.
These are the methods you'll probably override most often. They define the regular expressions that the formatter uses to split text as well as what to do with each chunk.
Many of these methods have corrosponding format_xxx methods, which take the
chunk and format it as XHTML.
The default user functions are icon, img and glyph. In the default markup, plugins are entered in the form of [&name arg1 arg2 ...].
[&icon /icons/fun.png]
[&img some_image.jpg]
[&img another_image.jpg This image will have a title attribute]
<strong> instead of <b><code> instead of <tt>like---this<blockquote> tagsI've included two plugins, img and icon, to do basic image support besides the standard operation of including an image when the URL ends with a common image extension.
Here's some kwiki text. (Compare with KwikiFormattingRules.)
= Level 1 Header
== Level 2 with optional trailing equals ==
Kwikitext provides a bit more flexibility than regular wikitext.
All HTML code is <escaped>. Horizontal rules are four or more hyphens:
----
While you can add an mdash---like this.
##
## you can add comments in the kwikitext which appear as XML comments
##
== Links
=== Itemized Lists
* Fruit
** Oranges
** Apples
* Eggs
* Salad
=== Enumerated Lists
##
## below are zero's, not "oh's"
##
0 One
0 Two
0 Three
* Comments in the wikitext
* Easier:
** Bold/strong
** Italic/emphasized
== More Markup
*strong or bold text*
//emphasized or italic text//
indented text is verbatim (good for code)
== Links
WikiLink
!NotAWikiLink
http://www.kwiki.org/
[Kwiki named link http://www.kwiki.org/]
== Images
http://search.cpan.org/s/img/cpan_banner.png
== KwikiFormatish plugins
This inserts an image with the CSS class of "icon" -- good for inserting a right-aligned image for text to wrap around.
[&icon /images/logo.gif]
The following inserts an image with an optional caption:
[&img /images/graph.gif Last Month's Earnings]
Maintained by Ian Langworth - ian@cpan.org
Based on CGI::Kwiki::Formatter by Brian Ingerson.
Thanks to Jesse Vincent for the
process_order patch, related documentation and testing.
Additional thanks to Mike Burns, Ari Pollak and Ricardo SIGNES for additional testing.
This is free software. You may use it and redistribute it under the same terms as perl itself.
| Text-KwikiFormatish documentation | Contained in the Text-KwikiFormatish distribution. |
package Text::KwikiFormatish; use strict; use warnings; our $VERSION = '1.11'; use CGI::Util qw(escape unescape);
use vars qw($UPPER $LOWER $ALPHANUM $WORD $WIKIWORD @DEFAULTPROCESSORDER); $UPPER = '\p{UppercaseLetter}'; $LOWER = '\p{LowercaseLetter}'; $ALPHANUM = '\p{Letter}\p{Number}'; $WORD = '\p{Letter}\p{Number}\p{ConnectorPunctuation}'; $WIKIWORD = "$UPPER$LOWER\\p{Number}\\p{ConnectorPunctuation}"; @DEFAULTPROCESSORDER = qw( function header_1 header_2 header_3 header_4 header_5 header_6 escape_html horizontal_line comment lists code paragraph named_http_link no_http_link http_link no_mailto_link mailto_link no_wiki_link force_wiki_link wiki_link inline negation bold italic underscore mdash table );
sub format { my ( $raw, %args ) = @_; # create instance of formatter my $f = __PACKAGE__->new(); # translate Text::Wikiformat args to Kwiki formatter args $f->{_node_prefix} = $args{prefix} if exists $args{prefix}; # do the deed return $f->process($raw); }
sub process { my ( $self, $wiki_text ) = @_; my $array = []; push @$array, $wiki_text . "\n"; for my $method ( $self->process_order ) { $array = $self->_dispatch( $array, $method ); } return $self->_combine_chunks($array); }
sub process_order { my $self = shift; @{ $self->{'process_order'} } = @_ if (@_); return ( @{ $self->{'process_order'} } ); } sub _dispatch { my ( $self, $old_array, $method ) = @_; return $old_array unless $self->can($method); my $new_array; for my $chunk (@$old_array) { if ( ref $chunk eq 'ARRAY' ) { push @$new_array, $self->_dispatch( $chunk, $method ); } else { if ( ref $chunk ) { push @$new_array, $chunk; } else { push @$new_array, $self->$method($chunk); } } } return $new_array; } sub _combine_chunks { my ( $self, $chunk_array ) = @_; my $formatted_text = ''; for my $chunk (@$chunk_array) { $formatted_text .= ( ref $chunk eq 'ARRAY' ) ? $self->_combine_chunks($chunk) : ( ref $chunk ) ? $$chunk : $chunk; } return $formatted_text; }
sub new { my ( $class, %args ) = @_; my $self = {}; bless $self, $class; my %defs = ( node_prefix => './', ); my %collated = ( %defs, %args ); foreach my $k ( keys %defs ) { $self->{ "_" . $k } = $collated{$k}; } $self->process_order(@DEFAULTPROCESSORDER); $self->init(%args); return $self; }
sub init { }
sub split_method { my ( $self, $text, $regexp, $method ) = @_; my $i = 0; map { $i++ % 2 ? \$self->$method($_) : $_ } split $regexp, $text; }
sub escape_html { my ( $self, $text ) = @_; $text =~ s/&/&/g; $text =~ s/</</g; $text =~ s/>/>/g; $text; }
sub function { my ( $self, $text ) = @_; $self->split_method( $text, qr{\[\&(\w+\b.*?)\]}, '_function_format', ); } sub _function_format { my ( $self, $text ) = @_; my ( $method, @args ) = split; $self->_isa_function($method) ? $self->$method(@args) : "<!-- Function not supported here: $text -->\n"; } sub _isa_function { my ( $self, $function ) = @_; defined { map { ( $_, 1 ) } $self->user_functions }->{$function} and $self->can($function); }
sub table { my ( $self, $text ) = @_; my @array; while ( $text =~ /(.*?)(^\|[^\n]*\|\n.*)/ms ) { push @array, $1; my $table; ( $table, $text ) = $self->_parse_table($2); push @array, $table; } push @array, $text if length $text; return @array; } sub _parse_table { my ( $self, $text ) = @_; my $error = ''; my $rows; while ( $text =~ s/^(\|(.*)\|\n)// ) { $error .= $1; my $data = $2; my $row = []; for my $datum ( split /\|/, $data ) { $datum =~ s/^\s*(.*?)\s*$/$1/; if ( $datum =~ s/^<<(\S+)$// ) { my $marker = $1; while ( $text =~ s/^(.*\n)// ) { my $line = $1; $error .= $line; if ( $line eq "$marker\n" ) { $marker = ''; last; } $datum .= $line; } if ( length $marker ) { return ( $error, $text ); } } push @$row, $datum; } push @$rows, $row; } return ( $self->format_table($rows), $text ); }
sub format_table { my ( $self, $rows ) = @_; my $cols = 0; for (@$rows) { $cols = @$_ if @$_ > $cols; } my $table = qq{<table border="1">\n}; for my $row (@$rows) { $table .= qq{<tr valign="top">\n}; for ( my $i = 0; $i < @$row; $i++ ) { my $colspan = ''; if ( $i == $#{$row} and $cols - $i > 1 ) { $colspan = ' colspan="' . ( $cols - $i ) . '"'; } my $cell = $self->escape_html( $row->[$i] ); $cell = qq{<pre>$cell</pre>\n} if $cell =~ /\n/; $cell = ' ' unless length $cell; $table .= qq{<td$colspan>$cell</td>\n}; } $table .= qq{</tr>\n}; } $table .= qq{</table>\n}; return \$table; }
sub no_wiki_link { my ( $self, $text ) = @_; $self->split_method( $text, qr{!([$UPPER](?=[$WORD]*[$UPPER])(?=[$WORD]*[$LOWER])[$WORD]+)}, 'no_wiki_link_format', ); }
sub no_wiki_link_format { my ( $self, $text ) = @_; return $text; }
sub wiki_link { my ( $self, $text ) = @_; $self->split_method( $text, qr{([$UPPER](?=[$WORD]*[$UPPER])(?=[$WORD]*[$LOWER])[$WORD]+)}, 'wiki_link_format', ); }
sub force_wiki_link { my ( $self, $text ) = @_; $self->split_method( $text, qr{(?<!\!)\[([$ALPHANUM\-:]+)\]}, 'wiki_link_format', ); }
sub wiki_link_format { my ( $self, $text ) = @_; my $url = $self->escape($text); my $wiki_link = qq{<a href="./$url">$text</a>}; return $wiki_link; }
sub no_http_link { my ( $self, $text ) = @_; $self->split_method( $text, qr{(!(?:https?|ftp|irc):\S+?)}m, 'no_http_link_format', ); }
sub no_http_link_format { my ( $self, $text ) = @_; $text =~ s#!##; return $text; }
sub http_link { my ( $self, $text ) = @_; $self->split_method( $text, qr{((?:https?|ftp|irc):\S+?(?=[),.:;]?\s|$))}m, 'http_link_format', ); }
sub http_link_format { my ( $self, $text ) = @_; if ( $text =~ /^http.*\.(?i:jpg|gif|jpeg|png)$/ ) { return $self->img_format($text); } else { return $self->link_format($text); } }
sub no_mailto_link { my ( $self, $text ) = @_; $self->split_method( $text, qr{(![$ALPHANUM][$WORD\-\.]*@[$WORD][$WORD\-\.]+)}m, 'no_mailto_link_format', ); }
sub no_mailto_link_format { my ( $self, $text ) = @_; $text =~ s#!##; return $text; }
sub mailto_link { my ( $self, $text ) = @_; $self->split_method( $text, qr{([$ALPHANUM][$WORD\-\.]*@[$WORD][$WORD\-\.]+)}m, 'mailto_link_format', ); }
sub mailto_link_format { my ( $self, $text ) = @_; my $dot = ( $text =~ s/\.$// ) ? '.' : ''; qq{<a href="mailto:$text">$text</a>$dot}; }
sub img_format { my ( $self, $url ) = @_; return qq{<img src="$url" />}; }
sub link_format { my ( $self, $text ) = @_; $text =~ s/(^\s*|\s+(?=\s)|\s$)//g; my $url = $text; $url = $1 if $text =~ s/(.*?) +//; $url =~ s/https?:(?!\/\/)//; return qq{<a href="$url">$text</a>}; }
sub named_http_link { my ( $self, $text ) = @_; $self->split_method( $text, qr{(?<!\!)\[([^\[\]]*?(?:https?|ftp|irc):\S.*?)\]}, 'named_http_link_format', ); }
sub named_http_link_format { my ( $self, $text ) = @_; if ( $text =~ m#(.*)((?:https?|ftp|irc):.*)# ) { $text = "$2 $1"; } return $self->link_format($text); }
sub inline { my ( $self, $text ) = @_; $self->split_method( $text, qr{(?<!\!)\[=(.*?)(?<!\\)\]}, 'inline_format', ); }
sub inline_format { my ( $self, $text ) = @_; "<code>$text</code>"; }
sub negation { my ( $self, $text ) = @_; $text =~ s#\!(?=\[)##g; return $text; }
sub bold { my ( $self, $text ) = @_; $text =~ s#(?<![$WORD])\*(\S.*?\S|\S)\*(?![$WORD])#<strong>$1</strong>#g; return $text; }
sub italic { my ( $self, $text ) = @_; $text =~ s#(?<![$WORD<])//(\S.*?\S|\S)//(?![$WORD])#<em>$1</em>#g; return $text; }
sub underscore { my ( $self, $text ) = @_; $text =~ s#(?<![$WORD])_(\S.*?\S)_(?![$WORD])#<u>$1</u>#g; return $text; }
sub code { my ( $self, $text ) = @_; $self->split_method( $text, qr{(^ +[^ \n].*?\n)(?-ms:(?=[^ \n]|$))}ms, 'code_format', ); }
sub code_format { my ( $self, $text ) = @_; $self->_code_postformat( $self->_code_preformat($text) ); } sub _code_preformat { my ( $self, $text ) = @_; my ($indent) = sort { $a <=> $b } map {length} $text =~ /^( *)\S/mg; $text =~ s/^ {$indent}//gm; #return $self->escape_html($text); ## already done in process order return $text; } sub _code_postformat { my ( $self, $text ) = @_; return "<pre>$text</pre>\n"; }
sub lists { my ( $self, $text ) = @_; my $switch = 0; return map { my $level = 0; my @tag_stack; if ( $switch++ % 2 ) { my $text = ''; my @lines = /(.*\n)/g; for my $line (@lines) { $line =~ s/^([0\*]+) //; my $new_level = length($1); my $tag = ( $1 =~ /0/ ) ? 'ol' : 'ul'; if ( $new_level > $level ) { for ( 1 .. ( $new_level - $level ) ) { push @tag_stack, $tag; $text .= "<$tag>\n"; } $level = $new_level; } elsif ( $new_level < $level ) { for ( 1 .. ( $level - $new_level ) ) { $tag = pop @tag_stack; $text .= "</$tag>\n"; } $level = $new_level; } $text .= "<li>$line</li>"; } for ( 1 .. $level ) { my $tag = pop @tag_stack; $text .= "</$tag>\n"; } $_ = $self->lists_format($text); } $_; } split m!(^[0\*]+ .*?\n)(?=(?:[^0\*]|$))!ms, $text; }
sub lists_format { my ( $self, $text ) = @_; return $text; }
sub paragraph { my ( $self, $text ) = @_; my $switch = 0; return map { unless ( $switch++ % 2 ) { $_ = $self->paragraph_format($_); } $_; } split m!(\n\s*\n)!ms, $text; }
sub paragraph_format { my ( $self, $text ) = @_; return '' if $text =~ /^[\s\n]*$/; return $text if $text =~ /^<(o|u)l>/i; return "<p>\n$text\n</p>\n"; }
sub horizontal_line { my ( $self, $text ) = @_; $self->split_method( $text, qr{^(----+)\s*$}m, 'horizontal_line_format', ); }
sub horizontal_line_format { my ($self) = @_; my $text = "<hr/>\n"; return $text; }
sub mdash { my ( $self, $text ) = @_; $text =~ s/([$WORD])-{3}([$WORD])/$1—$2/g; return $text; }
sub comment { my ( $self, $text ) = @_; $self->split_method( $text, qr{^\#\#(.*)$}m, 'comment_line_format', ); }
sub comment_line_format { my ( $self, $text ) = @_; return "<!-- $text -->\n"; }
for my $num ( 1 .. 6 ) { no strict 'refs'; *{"header_$num"} = sub { my ( $self, $text ) = @_; $self->split_method( $text, qr#^={$num} (.*?)(?: =*)?\n#m, "header_${num}_format", ); }; *{"header_${num}_format"} = sub { my ( $self, $text ) = @_; $text =~ s/=+\s*$//; $text = $self->escape_html($text); return "<h$num>$text</h$num>\n"; }; }
sub user_functions { qw( icon img glyph ); }
sub icon { my ( $self, $href ) = @_; return qq( <img src="$href" class="icon" alt="(icon)" /> ); }
sub img { my ( $self, $href, @title ) = @_; my $title = join( ' ', @title ) || ''; my $output = qq( <p style="text-align:center;"><img src="$href" alt="(see caption below)" title="$title" align="middle" border="0" /> ); $output .= @title ? "<br/><small>$title</small>" : ''; return $output . '</p>'; }
sub glyph { # FIXME - BROKEN! Plugins like to separate the paragraphs my ( $self, $href, @title ) = @_; my $title = join( ' ', @title ) || '*'; return qq( <img src="$href" alt="$title" title="$title" align="middle" border="0" /> ); }
1;