| YAPE-Regex documentation | Contained in the YAPE-Regex distribution. |
YAPE::Regex::Element - sub-classes for YAPE::Regex elements
This document refers to YAPE::Regex::Element version 4.00.
use YAPE::Regex 'MyExt::Mod';
# this sets up inheritance in MyExt::Mod
# see YAPE::Regex documentation
YAPE MODULES
The YAPE hierarchy of modules is an attempt at a unified means of parsing
and extracting content. It attempts to maintain a generic interface, to
promote simplicity and reusability. The API is powerful, yet simple. The
modules do tokenization (which can be intercepted) and build trees, so that
extraction of specific nodes is doable.
This module provides the classes for the YAPE::Regex objects. The base
class for these objects is YAPE::Regex::Element. The object classes are
numerous.
YAPE::Regex::Element
This class contains fallback methods for the other classes.
my $str = $obj->text;
Returns a string representation of the content of the regex node itself, not
any nodes contained in it. This is an empty string for nodes that have no text.
my $str = $obj->string;
Returns a string representation of the regex node itself (its text followed by its quantity and non-greedy flag), not any nodes contained in it.
my $str = $obj->fullstring;
Returns a string representation of the regex node, including any nodes contained in it.
my $quant = $obj->quant;
Returns a string with the quantity, and a ? if the node is non-greedy. The
quantity is one of *, +, ?, {M,N}, or an empty string.
my $ng = $obj->ngreed;
Returns a ? if the node is non-greedy, and an empty string otherwise.
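The difference between these accessors is easiest to see on a small tree. The following is a minimal sketch that does not load YAPE::Regex at all: it defines stand-in packages in a hypothetical Sketch:: namespace whose methods are modeled on the fallback methods described above. The explicit @ISA lines and the defaulting of missing arguments are assumptions made only so the snippet is self-contained and warning-free.

```perl
use strict;
use warnings;

# Hypothetical stand-ins (Sketch::*), modeled on the methods described above.
package Sketch::Element;
sub text       { exists $_[0]{TEXT} ? $_[0]{TEXT} : "" }
sub string     { $_[0]->text . "$_[0]{QUANT}$_[0]{NGREED}" }
sub fullstring { $_[0]->string }
sub quant      { "$_[0]{QUANT}$_[0]{NGREED}" }
sub ngreed     { $_[0]{NGREED} }

package Sketch::macro;
our @ISA = ('Sketch::Element');    # assumption: inheritance wired by hand
sub new {
    my ($class, $match, $q, $ng) = @_;
    # default the optional arguments so string interpolation never sees undef
    bless { TEXT => $match, QUANT => $q // '', NGREED => $ng // '' }, $class;
}
sub text { "\\$_[0]{TEXT}" }

package Sketch::capture;
our @ISA = ('Sketch::Element');    # assumption: inheritance wired by hand
sub new {
    my ($class, $nodes, $q, $ng) = @_;
    bless { CONTENT => $nodes || [], QUANT => $q || '', NGREED => $ng || '' }, $class;
}
sub string { '(' }
sub fullstring {
    my $self = shift;
    # opening paren, every child rendered in full, then the close and quantity
    join '', $self->string,
        map($_->fullstring, @{ $self->{CONTENT} }),
        ")$self->{QUANT}$self->{NGREED}";
}

package main;

my $digits  = Sketch::macro->new('d', '+', '?');
my $capture = Sketch::capture->new([$digits], '*');

print $digits->text, "\n";          # \d
print $digits->string, "\n";        # \d+?
print $digits->quant, "\n";         # +?
print $digits->ngreed, "\n";        # ?
print $capture->string, "\n";       # (
print $capture->fullstring, "\n";   # (\d+?)*
```

For a leaf node, string and fullstring agree; they differ only for nodes that contain other nodes, where string covers just the opening part of the node.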
YAPE::Regex::anchor
This class represents anchors. Objects have the following methods:
my $anchor = YAPE::Regex::anchor->new($type,$q,$ng);
Creates a YAPE::Regex::anchor object. Takes three arguments: the anchor
(^, \A, $, \Z, \z, \B, \b, or \G), the quantity, and
the non-greedy flag. The quantity should be an empty string.
my $anc = YAPE::Regex::anchor->new('\A', '', '?');
# /\A?/
my $type = $anchor->type;
Returns the string anchor.
YAPE::Regex::macro
This class represents character-class macros. Objects have the following methods:
my $macro = YAPE::Regex::macro->new($type,$q,$ng);
Creates a YAPE::Regex::macro object. Takes three arguments: the macro
(w, W, d, D, s, or S), the quantity, and the non-greedy
flag.
my $macro = YAPE::Regex::macro->new('s', '{3,5}');
# /\s{3,5}/
my $text = $macro->text;
Returns the macro.
print $macro->text; # '\s'
my $type = $macro->type;
Returns the string macro.
YAPE::Regex::oct
This class represents octal escapes. Objects have the following methods:
my $oct = YAPE::Regex::oct->new($type,$q,$ng);
Creates a YAPE::Regex::oct object. Takes three arguments: the octal number
(as a string), the quantity, and the non-greedy flag.
my $oct = YAPE::Regex::oct->new('040');
# /\040/
my $text = $oct->text;
Returns the octal escape.
print $oct->text; # '\040'
my $type = $oct->type;
Returns the string oct.
YAPE::Regex::hex
This class represents hexadecimal escapes. Objects have the following methods:
my $hex = YAPE::Regex::hex->new($type,$q,$ng);
Creates a YAPE::Regex::hex object. Takes three arguments: the hexadecimal
number (as a string), the quantity, and the non-greedy flag.
my $hex = YAPE::Regex::hex->new('20','{2,}');
# /\x20{2,}/
my $text = $hex->text;
Returns the hexadecimal escape.
print $hex->text; # '\x20'
my $type = $hex->type;
Returns the string hex.
YAPE::Regex::utf8hex
This class represents UTF hexadecimal escapes. Objects have the following methods:
my $utf8hex = YAPE::Regex::utf8hex->new($type,$q,$ng);
Creates a YAPE::Regex::utf8hex object. Takes three arguments: the
hexadecimal number (as a string), the quantity, and the non-greedy flag.
my $utf8hex = YAPE::Regex::utf8hex->new('beef','{0,4}');
# /\x{beef}{0,4}/
my $text = $utf8hex->text;
Returns the hexadecimal escape.
print $utf8hex->text; # '\x{beef}'
my $type = $utf8hex->type;
Returns the string utf8hex.
YAPE::Regex::backref
This class represents back-references. Objects have the following methods:
my $bref = YAPE::Regex::backref->new($type,$q,$ng);
Creates a YAPE::Regex::backref object. Takes three arguments: the number of
the back-reference, the quantity, and the non-greedy flag.
my $bref = YAPE::Regex::backref->new(2,'','?');
# /\2?/
my $text = $bref->text;
Returns the back-reference escape.
print $bref->text; # '\2'
my $type = $bref->type;
Returns the string backref.
YAPE::Regex::ctrl
This class represents control character escapes. Objects have the following methods:
my $ctrl = YAPE::Regex::ctrl->new($type,$q,$ng);
Creates a YAPE::Regex::ctrl object. Takes three arguments: the control
character, the quantity, and the non-greedy flag.
my $ctrl = YAPE::Regex::ctrl->new('M');
# /\cM/
my $text = $ctrl->text;
Returns the control character escape.
print $ctrl->text; # '\cM'
my $type = $ctrl->type;
Returns the string ctrl.
YAPE::Regex::named
This class represents named characters. Objects have the following methods:
my $named = YAPE::Regex::named->new($type,$q,$ng);
Creates a YAPE::Regex::named object. Takes three arguments: the name of the
character, the quantity, and the non-greedy flag.
my $named = YAPE::Regex::named->new('GREEK SMALL LETTER BETA');
# /\N{GREEK SMALL LETTER BETA}/
my $text = $named->text;
Returns the character escape text.
print $named->text; # '\N{GREEK SMALL LETTER BETA}'
my $type = $named->type;
Returns the string named.
YAPE::Regex::Cchar
This class represents the \C escape (a single C character). Objects have the following methods:
my $Cchar = YAPE::Regex::Cchar->new($q,$ng);
Creates a YAPE::Regex::Cchar object. Takes two arguments: the quantity and
the non-greedy flag.
my $Cchar = YAPE::Regex::Cchar->new('{2}');
# /\C{2}/
my $text = $Cchar->text;
Returns the escape sequence.
print $Cchar->text; # '\C'
my $type = $Cchar->type;
Returns the string Cchar.
YAPE::Regex::slash
This class represents any other escaped characters. Objects have the following methods:
my $slash = YAPE::Regex::slash->new($type,$q,$ng);
Creates a YAPE::Regex::slash object. Takes three arguments: the backslashed
character, the quantity, and the non-greedy flag.
my $slash = YAPE::Regex::slash->new('t','','?');
# /\t?/
my $text = $slash->text;
Returns the escaped character.
print $slash->text; # '\t'
my $type = $slash->type;
Returns the string slash.
YAPE::Regex::any
This class represents the dot metacharacter. Objects have the following methods:
my $any = YAPE::Regex::any->new($q,$ng);
Creates a YAPE::Regex::any object. Takes two arguments: the quantity and
the non-greedy flag.
my $any = YAPE::Regex::any->new('{1,3}');
# /.{1,3}/
my $type = $any->type;
Returns the string any.
YAPE::Regex::class
This class represents character classes. Objects have the following methods:
my $class = YAPE::Regex::class->new($chars,$neg,$q,$ng);
Creates a YAPE::Regex::class object. Takes four arguments: the characters
in the class, a ^ if the class is negated (an empty string otherwise), the
quantity, and the non-greedy flag.
my $class = YAPE::Regex::class->new('aeiouy','^');
# /[^aeiouy]/
my $text = $class->text;
Returns the character class.
print $class->text; # '[^aeiouy]'
my $type = $class->type;
Returns the string class.
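Judging from the source listing at the end of this page, the text method of a class node also renders a property form when the negation argument is p or P, instead of the bracketed form. The sketch below reproduces that branch in a standalone, hypothetical helper named class_text; it is not a function of the module, and the property name IsAlpha is used purely as sample input.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of YAPE::Regex) modeled on the text method
# of the class node shown in the source listing.
sub class_text {
    my ($chars, $neg) = @_;
    $neg //= '';
    return $neg =~ /[pP]/
        ? '\\' . $neg . '{' . $chars . '}'   # property form: \p{...} or \P{...}
        : "[$neg$chars]";                    # bracketed form: [...] or [^...]
}

my $negated  = class_text('aeiouy', '^');
my $plain    = class_text('aeiouy');
my $property = class_text('IsAlpha', 'p');

print "$negated\n";    # [^aeiouy]
print "$plain\n";      # [aeiouy]
print "$property\n";   # \p{IsAlpha}
```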
YAPE::Regex::text
This class represents literal text. Objects have the following methods:
my $text = YAPE::Regex::text->new($text,$q,$ng);
Creates a YAPE::Regex::text object. Takes three arguments: the text, the
quantity, and the non-greedy flag. The quantity and non-greedy modifier should
only be present for single-character text, because of the way the parser
renders the quantity and non-greedy modifier.
my $text = YAPE::Regex::text->new('alphabet','');
# /alphabet/
my $text = YAPE::Regex::text->new('x','?','?');
# /x??/
my $type = $text->type;
Returns the string text.
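The single-character restriction matters because a quantity is rendered directly after the text, and Perl binds a quantifier to the last character only. The following plain-Perl check (no YAPE classes involved; the variable names are illustrative) shows what a rendered 'alphabet' followed by '?' would actually match.

```perl
use strict;
use warnings;

# A quantifier after literal text applies to the last character only,
# so 'alphabet' followed by '?' means "alphabe", then an optional "t".
my $rendered = 'alphabet' . '?';       # text followed by its quantity
my $re       = qr/\A$rendered\z/;

print "alphabe matches\n"             if 'alphabe'  =~ $re;
print "alphabet matches\n"            if 'alphabet' =~ $re;
print "empty string does not match\n" unless '' =~ $re;
```

All three lines are printed: the whole word is not optional, only its final letter is.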
YAPE::Regex::alt
This class represents alternation. Objects have the following methods:
my $alt = YAPE::Regex::alt->new;
Creates a YAPE::Regex::alt object.
my $alt = YAPE::Regex::alt->new;
# /|/
my $type = $alt->type;
Returns the string alt.
YAPE::Regex::comment
This class represents in-line comments. Objects have the following methods:
my $comment = YAPE::Regex::comment->new($comment,$x);
Creates a YAPE::Regex::comment object. Takes two arguments: the text of the
comment, and whether or not the /x regex modifier is in effect for this
comment. Note that Perl's regex engine will stop a (?#...) comment at the
first ), regardless of what you do.
my $comment = YAPE::Regex::comment->new(
  "match an optional string of digits"
);
# /(?#match an optional string of digits)/
my $comment = YAPE::Regex::comment->new(
  "match an optional string of digits",
  1
);
# / # match an optional string of digits/x
my $type = $comment->type;
Returns the string comment.
my $x_on = $comment->xcomm;
Returns true or false, depending on whether the comment is under the /x regex
modifier.
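The note about (?#...) ending at the first ) can be checked in plain Perl. In this small sketch (the pattern text is made up for illustration), the comment contains a parenthesized remark, so the comment ends early and the rest is parsed as ordinary pattern text with an unmatched closing parenthesis.

```perl
use strict;
use warnings;

# The comment ends at the first ')', so " still comment)" is parsed as
# pattern text and its ')' has no matching '('.
my $pattern = 'a(?#note (nested) still comment)b';
my $ok      = eval { my $re = qr/$pattern/; 1 };

print $ok ? "compiled\n" : "failed to compile: $@";
```

Comment text passed to the constructor should therefore avoid a literal ) when the (?#...) form is used.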
YAPE::Regex::whitespace
This class represents whitespace under the /x regex modifier. Objects have
the following methods:
my $ws = YAPE::Regex::whitespace->new($text);
Creates a YAPE::Regex::whitespace object. Takes one argument: the text of
the whitespace.
my $ws = YAPE::Regex::whitespace->new(' ');
# / /x
my $text = $ws->text;
Returns the whitespace.
print $ws->text; # ' '
my $type = $ws->type;
Returns the string whitespace.
YAPE::Regex::flags
This class represents (?ismx) flags. Objects have the following methods:
my $flags = YAPE::Regex::flags->new($add,$sub);
Creates a YAPE::Regex::flags object. Takes two arguments: a string of the
modes to have on, and a string of the modes to explicitly turn off. The flags
are displayed in alphabetical order.
my $flags = YAPE::Regex::flags->new('is','m');
# /(?is-m)/
my $type = $flags->type;
Returns the string flags.
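The alphabetical ordering means the rendered flags do not depend on the order in which they were passed. As a rough sketch of that behavior, the standalone helper below (flags_string is a hypothetical name, not a function of the module) sorts each list and omits the minus sign when nothing is turned off.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of YAPE::Regex: renders an inline-flags
# group with each flag list sorted alphabetically.
sub flags_string {
    my ($on, $off) = @_;
    $on  = join '', sort { $a cmp $b } split //, ($on  // '');
    $off = join '', sort { $a cmp $b } split //, ($off // '');
    return "(?$on" . (length $off ? "-$off" : '') . ')';
}

my $both    = flags_string('si', 'm');
my $on_only = flags_string('xi');

print "$both\n";      # (?is-m)
print "$on_only\n";   # (?ix)
```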
YAPE::Regex::cut
This class represents the cut assertion. Objects have the following methods:
my $cut = YAPE::Regex::cut->new(\@nodes);
Creates a YAPE::Regex::cut object. Takes one argument: a reference to an
array of objects to be contained in the cut.
my $REx = YAPE::Regex::class->new('aeiouy','','+');
my $cut = YAPE::Regex::cut->new([$REx]);
# /(?>[aeiouy]+)/
my $type = $cut->type;
Returns the string cut.
YAPE::Regex::lookahead
This class represents lookaheads. Objects have the following methods:
my $look = YAPE::Regex::lookahead->new($pos,\@nodes);
Creates a YAPE::Regex::lookahead object. Takes two arguments: a boolean
value indicating whether or not the lookahead is positive, and a reference to an
array of objects to be contained in the lookahead.
my $REx = YAPE::Regex::class->new('aeiouy');
my $look = YAPE::Regex::lookahead->new(0,[$REx]);
# /(?![aeiouy])/
my $pos = $look->pos;
Returns true if the lookahead is positive.
print $look->pos ? 'pos' : 'neg'; # 'neg'
my $type = $look->type;
Returns the string lookahead. Use the pos method to tell positive and negative lookaheads apart.
YAPE::Regex::lookbehind
This class represents lookbehinds. Objects have the following methods:
my $look = YAPE::Regex::lookbehind->new($pos,\@nodes);
Creates a YAPE::Regex::lookbehind object. Takes two arguments: a boolean
value indicating whether or not the lookbehind is positive, and a reference to
an array of objects to be contained in the lookbehind.
my $REx = YAPE::Regex::class->new('aeiouy','^');
my $look = YAPE::Regex::lookbehind->new(1,[$REx]);
# /(?<=[^aeiouy])/
my $pos = $look->pos;
Returns true if the lookbehind is positive.
print $look->pos ? 'pos' : 'neg'; # 'pos'
my $type = $look->type;
Returns the string lookbehind. Use the pos method to tell positive and negative lookbehinds apart.
YAPE::Regex::conditional
This class represents conditionals. Objects have the following methods:
my $cond = YAPE::Regex::conditional->new($br,$t,$f,$q,$ng);
Creates a YAPE::Regex::conditional object. Takes five arguments: the number of the
back-reference (that's all that's supported in the current version), an array
reference to the "true" pattern, an array reference to the "false" pattern, and
the quantity and non-greedy flag.
my $cond = YAPE::Regex::conditional->new(
  2,
  [],
  [ YAPE::Regex::text->new('foo') ],
  '?',
);
# /(?(2)|foo)?/
my $br = $cond->backref;
Returns the number of the back-reference the conditional depends on.
print $cond->backref; # 2
my $type = $cond->type;
Returns the string cond. Use the backref method to get the number of the
back-reference.
YAPE::Regex::group
This class represents non-capturing groups. Objects have the following methods:
my $group = YAPE::Regex::group->new($on,$off,\@nodes,$q,$ng);
Creates a YAPE::Regex::group object. Takes five arguments: the modes turned
on, the modes explicitly turned off, a reference to an array of objects in the
group, the quantity, and the non-greedy flag. The modes are displayed in
alphabetical order.
my $group = YAPE::Regex::group->new(
  'i',
  's',
  [
    YAPE::Regex::macro->new('d', '{2}'),
    YAPE::Regex::macro->new('s'),
    YAPE::Regex::macro->new('d', '{2}'),
  ],
  '?',
);
# /(?i-s:\d{2}\s\d{2})?/
my $type = $group->type;
Returns the string group.
YAPE::Regex::capture
This class represents capturing groups. Objects have the following methods:
my $capture = YAPE::Regex::capture->new(\@nodes,$q,$ng);
Creates a YAPE::Regex::capture object. Takes three arguments: a reference
to an array of objects in the group, the quantity, and the non-greedy flag.
my $capture = YAPE::Regex::capture->new(
  [
    YAPE::Regex::macro->new('d', '{2}'),
    YAPE::Regex::macro->new('s'),
    YAPE::Regex::macro->new('d', '{2}'),
  ],
);
# /(\d{2}\s\d{2})/
my $type = $capture->type;
Returns the string capture.
YAPE::Regex::code
This class represents code blocks. Objects have the following methods:
my $code = YAPE::Regex::code->new($block);
Creates a YAPE::Regex::code object. Takes one argument: a string holding
a block of code.
my $code = YAPE::Regex::code->new(q({ push @poss, $1 }));
# /(?{ push @poss, $1 })/
my $type = $code->type;
Returns the string code.
YAPE::Regex::later
This class represents postponed ("later") code blocks, written (??{ ... }). Objects have the following methods:
my $later = YAPE::Regex::later->new($block);
Creates a YAPE::Regex::later object. Takes one argument: a string holding
a block of code.
my $later = YAPE::Regex::later->new(q({ push @poss, $1 }));
# /(??{ push @poss, $1 })/
my $type = $later->type;
Returns the string later.
YAPE::Regex::close
This class represents closed parentheses. Objects have the following methods:
my $close = YAPE::Regex::close->new($q,$ng);
Creates a YAPE::Regex::close object. Takes two arguments: the quantity and
the non-greedy flag. These objects are never needed in the tree; however, they are
returned in the parsing stage, so that you know when they've been reached.
my $close = YAPE::Regex::close->new('?','?');
# /)??/
my $type = $close->type;
Returns the string close.
This is a listing of things to add to future versions of this module.
Following is a list of known or reported bugs.
Visit YAPE's web site at http://www.pobox.com/~japhy/YAPE/.
The YAPE::Regex documentation, for information on the main class.
The original author is Jeff "japhy" Pinyan (CPAN ID: PINYAN).
Gene Sullivan (gsullivan@cpan.org) is a co-maintainer.
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.
package YAPE::Regex::Element;

$VERSION = '4.00';

sub text { exists $_[0]{TEXT} ? $_[0]{TEXT} : "" }
sub string { $_[0]->text . "$_[0]{QUANT}$_[0]{NGREED}" }
sub fullstring { $_[0]->string }
sub quant { "$_[0]{QUANT}$_[0]{NGREED}" }
sub ngreed { $_[0]{NGREED} }

package YAPE::Regex::anchor;

sub new {
  my ($class,$match,$q,$ng) = @_;
  bless { TEXT => $match, QUANT => $q, NGREED => $ng }, $class;
}
sub type { 'anchor' }

package YAPE::Regex::macro;

sub new {
  my ($class,$match,$q,$ng) = @_;
  bless { TEXT => $match, QUANT => $q, NGREED => $ng }, $class;
}
sub text { "\\$_[0]{TEXT}" }
sub type { 'macro' }

package YAPE::Regex::oct;

sub new {
  my ($class,$match,$q,$ng) = @_;
  bless { TEXT => $match, QUANT => $q, NGREED => $ng }, $class;
}
sub text { "\\$_[0]{TEXT}" }
sub type { 'oct' }

package YAPE::Regex::hex;

sub new {
  my ($class,$match,$q,$ng) = @_;
  bless { TEXT => $match, QUANT => $q, NGREED => $ng }, $class;
}
sub text { "\\x$_[0]{TEXT}" }
sub type { 'hex' }

package YAPE::Regex::backref;

sub new {
  my ($class,$match,$q,$ng) = @_;
  bless { TEXT => $match, QUANT => $q, NGREED => $ng }, $class;
}
sub text { "\\$_[0]{TEXT}" }
sub type { 'backref' }

package YAPE::Regex::ctrl;

sub new {
  my ($class,$match,$q,$ng) = @_;
  bless { TEXT => $match, QUANT => $q, NGREED => $ng }, $class;
}
sub text { "\\c$_[0]{TEXT}" }
sub type { 'ctrl' }

package YAPE::Regex::named;

sub new {
  my ($class,$match,$q,$ng) = @_;
  bless { TEXT => $match, QUANT => $q, NGREED => $ng }, $class;
}
sub text { "\\N{$_[0]{TEXT}}" }
sub type { 'named' }

package YAPE::Regex::Cchar;

sub new {
  my ($class,$q,$ng) = @_;
  bless { QUANT => $q, NGREED => $ng }, $class;
}
sub text { '\C' }
sub type { 'Cchar' }

package YAPE::Regex::slash;

sub new {
  my ($class,$match,$q,$ng) = @_;
  bless { TEXT => $match, QUANT => $q, NGREED => $ng }, $class;
}
sub text { "\\$_[0]{TEXT}" }
sub type { 'slash' }

package YAPE::Regex::any;

sub new {
  my ($class,$q,$ng) = @_;
  bless { TEXT => '.', QUANT => $q, NGREED => $ng }, $class;
}
sub type { 'any' }

package YAPE::Regex::class;

sub new {
  my ($class,$match,$neg,$q,$ng) = @_;
  bless { TEXT => $match, NEG => $neg, QUANT => $q, NGREED => $ng }, $class;
}
sub text {
  $_[0]{NEG} =~ /[pP]/ ?
    "\\$_[0]{NEG}\{$_[0]{TEXT}\}" :
    "[$_[0]{NEG}$_[0]{TEXT}]"
}
sub type { 'class' }

package YAPE::Regex::text;

sub new {
  my ($class,$match,$q,$ng) = @_;
  bless { TEXT => $match, QUANT => $q, NGREED => $ng }, $class;
}
sub type { 'text' }

package YAPE::Regex::alt;

sub new { bless { NGREED => "", QUANT => "" }, $_[0] }
sub text { '' }
sub string { '|' }
sub type { 'alt' }

package YAPE::Regex::comment;

sub new {
  my ($class,$text,$X) = @_;
  bless { TEXT => $text, XCOMM => $X }, $class;
}
sub string { $_[0]{XCOMM} ? " # $_[0]{TEXT}" : "(?#$_[0]{TEXT})" }
sub xcomm { $_[0]{XCOMM} }
sub type { 'comment' }

package YAPE::Regex::whitespace;

sub new {
  my ($class,$text) = @_;
  bless { TEXT => $text }, $class;
}
sub type { 'whitespace' }
sub string { $_[0]{TEXT} }

package YAPE::Regex::flags;

sub new {
  my ($class,$add,$sub) = @_;
  my %mode = map { $_ => 1 } split //, $add ||= "";
  delete @mode{split //, $sub ||= ""};
  $add = join "", sort split //, $add;
  $sub = join "", sort split //, $sub;
  bless { MODE => \%mode, ON => $add, OFF => $sub }, $class;
}
sub string { "(?$_[0]{ON}" . ($_[0]{OFF} && "-$_[0]{OFF}") . ')' }
sub type { 'flags' }

package YAPE::Regex::cut;

sub new {
  bless {
    CONTENT => $_[1] || [],
    QUANT => $_[2] || "",
    NGREED => $_[3] || "",
  }, $_[0]
}
sub fullstring {
  join "",
    $_[0]->string,
    map($_->fullstring, @{ $_[0]{CONTENT} }),
    ")$_[0]{QUANT}$_[0]{NGREED}"
}
sub string { '(?>' }
sub type { 'cut' }

package YAPE::Regex::lookahead;

sub new { bless { POS => $_[1], CONTENT => $_[2] || [] }, $_[0] }
sub fullstring {
  join "",
    $_[0]->string,
    map($_->fullstring, @{ $_[0]{CONTENT} }),
    ')'
}
sub string { '(?' . ('!','=')[$_[0]{POS}] }
sub type { 'lookahead' }
sub pos { $_[0]{POS} }

package YAPE::Regex::lookbehind;

sub new { bless { POS => $_[1], CONTENT => $_[2] || [] }, $_[0] }
sub fullstring {
  join "",
    $_[0]->string,
    map($_->fullstring, @{ $_[0]{CONTENT} }),
    ')'
}
sub string { '(?<' . ('!','=')[$_[0]{POS}] }
sub type { 'lookbehind' }
sub pos { $_[0]{POS} }

package YAPE::Regex::conditional;

sub new {
  bless {
    OPTS => 1,
    CONTENT => $_[1] || [],
    TRUE => $_[2] || [],
    FALSE => $_[3] || [],
    QUANT => $_[4] || "",
    NGREED => $_[5] || "",
  }, $_[0];
}
sub fullstring {
  join "",
    $_[0]->string,
    map($_->fullstring, @{ $_[0]{TRUE} }),
    $_[0]{OPTS} == 2 ? (
      '|', map($_->fullstring, @{ $_[0]{FALSE} }),
    ) : (),
    ")$_[0]{QUANT}$_[0]{NGREED}";
}
sub string {
  '(?' . (ref $_[0]{CONTENT} ?
    $_[0]{CONTENT}[0]->fullstring :
    "($_[0]{CONTENT})"
  )
}
sub backref { $_[0]{CONTENT} }
sub type { 'cond' }

package YAPE::Regex::group;

sub new {
  my $on = join "", sort split //, $_[1] || "";
  my $off = join "", sort split //, $_[2] || "";
  bless {
    ON => $on,
    OFF => $off,
    CONTENT => $_[3] || [],
    QUANT => $_[4] || "",
    NGREED => $_[5] || "",
  }, $_[0]
}
sub fullstring {
  join "",
    $_[0]->string,
    map($_->fullstring, @{ $_[0]{CONTENT} }),
    ")$_[0]{QUANT}$_[0]{NGREED}"
}
sub string {
  $_[0]{OFF} ?
    "(?$_[0]{ON}-$_[0]{OFF}:" :
    "(?$_[0]{ON}:"
}
sub type { 'group' }

package YAPE::Regex::capture;

sub new {
  bless {
    CONTENT => $_[1] || [],
    QUANT => $_[2] || "",
    NGREED => $_[3] || "",
  }, $_[0]
}
sub fullstring {
  join "",
    $_[0]->string,
    map($_->fullstring, @{ $_[0]{CONTENT} }),
    ")$_[0]{QUANT}$_[0]{NGREED}"
}
sub string { '(' }
sub type { 'capture' }

package YAPE::Regex::code;

sub new {
  bless {
    CONTENT => $_[1],
    QUANT => "",
    NGREED => "",
  }, $_[0]
}
sub text { "(?$_[0]{CONTENT})" }
sub type { 'code' }

package YAPE::Regex::later;

sub new {
  bless {
    CONTENT => $_[1],
    QUANT => "",
    NGREED => "",
  }, $_[0]
}
sub text { "(??$_[0]{CONTENT})" }
sub type { 'later' }

package YAPE::Regex::close;

sub new { bless { QUANT => $_[1] || "", NGREED => $_[2] || "" }, $_[0] }
sub string { ")$_[0]{QUANT}$_[0]{NGREED}" }
sub type { 'close' }

1;