Regexp-Common documentation Contained in the Regexp-Common distribution.

NAME

Regexp::Common::balanced -- provide regexes for strings with balanced parenthesized delimiters or arbitrary delimiters.

SYNOPSIS

    use Regexp::Common qw /balanced/;

    while (<>) {
        /$RE{balanced}{-parens=>'()'}/
                                   and print "balanced parentheses\n";
    }




DESCRIPTION

Please consult the manual of Regexp::Common for a general description of how this interface works.

Do not use this module directly, but load it via Regexp::Common.

$RE{balanced}{-parens}

Returns a pattern that matches a string that starts with the nominated opening parenthesis or bracket, contains characters and properly nested parenthesized subsequences, and ends in the matching parenthesis.

More than one type of parenthesis can be specified:

        $RE{balanced}{-parens=>'(){}'}

in which case all specified parenthesis types must be correctly balanced within the string.
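As a minimal sketch of how this might be used, the following anchors the pattern with ^ and $ so that the whole string has to be a single balanced group. The sample strings are illustrative only and were chosen for this example.

```perl
use strict;
use warnings;
use Regexp::Common qw /balanced/;

# Anchor the pattern so the entire string must be one balanced group.
my $pat = $RE{balanced}{-parens=>'(){}'};
my $re  = qr/^$pat$/;

for my $str ('(a{b}c)', '{x(y)z}', '((a)') {
    printf "%-10s %s\n", $str, $str =~ $re ? "balanced" : "not balanced";
}
```

Without the anchors the pattern may match any balanced substring, which is what the SYNOPSIS relies on.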

If we are using {-keep} (see Regexp::Common):

$1

captures the entire expression
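A short sketch of the {-keep} form, assuming an unanchored match against a made-up input string: the first balanced group found in the text should end up in $1.

```perl
use strict;
use warnings;
use Regexp::Common qw /balanced/;

# Illustrative input; the balanced group starts at the first "(".
my $text = 'f(a, g(b)) + 1';

# With -keep, the whole balanced group is captured in $1.
if ($text =~ /$RE{balanced}{-parens=>'()'}{-keep}/) {
    print "captured: $1\n";
}
```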

$RE{balanced}{-begin => "begin"}{-end => "end"}

Returns a pattern that matches a string that is properly balanced using the begin and end strings as start and end delimiters. Multiple sets of begin and end strings can be given by separating them with | characters (a literal | can be escaped with a backslash).

    qr/$RE{balanced}{-begin => "do|if|case"}{-end => "done|fi|esac"}/

will match properly balanced strings that either start with do and end with done, start with if and end with fi, or start with case and end with esac.
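The following sketch tries that pattern against a few hand-written strings. The strings are assumptions made for illustration, and the pattern is anchored so that the whole string must form one balanced block.

```perl
use strict;
use warnings;
use Regexp::Common qw /balanced/;

my $block = $RE{balanced}{-begin => "do|if|case"}{-end => "done|fi|esac"};

# The last string opens two "if" blocks but closes only one.
for my $str ('if a; if b; fi; fi', 'case x esac', 'if a; if b; fi') {
    my $verdict = $str =~ /^$block$/ ? "balanced" : "not balanced";
    print "$str: $verdict\n";
}
```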

If -end contains fewer cases than -begin, the last case of -end is repeated. If it contains more cases than -begin, the extra cases are ignored. If either -begin or -end is missing or empty, the delimiters are taken from -parens instead, which defaults to '()', so that -begin => '(' and -end => ')' are assumed.
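The pairing rule can be illustrated in plain Perl. The helper below, pair_delimiters, is hypothetical and not part of Regexp::Common; it is only a sketch of the rule as described, and it assumes that neither string contains a backslash-escaped | and that -end has at least one case.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Regexp::Common): pair each -begin case
# with an -end case following the rule described above.
# Assumption: no backslash-escaped "|" in either string.
sub pair_delimiters {
    my ($begin, $end) = @_;
    my @starts   = grep { length } split /\|/, $begin;
    my @finishes = grep { length } split /\|/, $end;

    # Too few -end cases: repeat the last one until the counts agree.
    push @finishes, ($finishes[-1]) x (@starts - @finishes)
        if @finishes < @starts;

    # Too many -end cases: the extras are simply never consumed.
    return map { [ $_, shift @finishes ] } @starts;
}

my @pairs = pair_delimiters("do|if|case", "done|fi");
print join(", ", map { "$_->[0]..$_->[1]" } @pairs), "\n";
# prints "do..done, if..fi, case..fi"
```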

If we are using {-keep} (see Regexp::Common):

$1

captures the entire expression
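As a rough sketch, {-keep} combined with a global match can be used to collect every outermost block from a longer text. The input string and the @blocks array are assumptions made for this example.

```perl
use strict;
use warnings;
use Regexp::Common qw /balanced/;

# Illustrative input containing two top-level if/fi blocks.
my $text = 'if a; fi; x; if b; if c; fi; fi';

# Each successful match leaves one complete block in $1.
my @blocks;
while ($text =~ /$RE{balanced}{-begin => "if"}{-end => "fi"}{-keep}/g) {
    push @blocks, $1;
}

print "$_\n" for @blocks;
```

Because matching resumes after the end of each block, nested blocks are reported as part of their enclosing block rather than separately.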

SEE ALSO

Regexp::Common for a general description of how to use this interface.

AUTHOR

Damian Conway (damian@conway.org)

MAINTENANCE

This package is maintained by Abigail (regexp-common@abigail.be).

BUGS AND IRRITATIONS

Bound to be plenty.

For a start, there are many common regexes missing. Send them in to regexp-common@abigail.be.

LICENSE and COPYRIGHT


package Regexp::Common::balanced; {

use Regexp::Common qw /pattern clean no_defaults/;

use strict;
use warnings;

use vars qw /$VERSION/;
$VERSION = '2010010201';

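# Closing delimiter for each opening bracket that -parens understands.
my %closer = ( '{'=>'}', '('=>')', '['=>']', '<'=>'>' );
# Index of the most recently compiled pattern in @Regexp::Common::balanced.
my $count = -1;
# Maps each (-begin, -end) pair to the index of its compiled pattern.
my %cache;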

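# Build (or fetch from the cache) the pattern for the given -begin and
# -end strings, each a "|"-separated list of delimiters.
sub nested {
    my ($start, $finish) = @_;

    return $Regexp::Common::balanced [$cache {$start} {$finish}]
            if exists $cache {$start} {$finish};

    # The pattern recurses into itself: $r is a postponed subexpression
    # that looks the finished pattern up by index at match time.
    $count ++;
    my $r = '(??{$Regexp::Common::balanced ['. $count . ']})';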

    my @starts   = map {s/\\(.)/$1/g; $_} grep {length}
                        $start  =~ /([^|\\]+|\\.)+/gs;
    my @finishes = map {s/\\(.)/$1/g; $_} grep {length}
                        $finish =~ /([^|\\]+|\\.)+/gs;

    # If there are fewer ends than starts, repeat the last end.
    push @finishes => ($finishes [-1]) x (@starts - @finishes);

    my @re;
    local $" = "|";
    foreach my $begin (@starts) {
        my $end = shift @finishes;

        my $qb  = quotemeta $begin;
        my $qe  = quotemeta $end;
        my $fb  = quotemeta substr $begin => 0, 1;
        my $fe  = quotemeta substr $end   => 0, 1;

        my $tb  = quotemeta substr $begin => 1;
        my $te  = quotemeta substr $end   => 1;

        use re 'eval';

        my $add;
        if ($fb eq $fe) {
            push @re =>
                   qr /(?:$qb(?:(?>[^$fb]+)|$fb(?!$tb)(?!$te)|$r)*$qe)/;
        }
        else {
            my   @clauses =  "(?>[^$fb$fe]+)";
            push @clauses => "$fb(?!$tb)" if length $tb;
            push @clauses => "$fe(?!$te)" if length $te;
            push @clauses =>  $r;
            push @re      =>  qr /(?:$qb(?:@clauses)*$qe)/;
        }
    }

    $cache {$start} {$finish} = $count;
    $Regexp::Common::balanced [$count] = qr/@re/;
}


pattern name    => [qw /balanced -parens=() -begin= -end=/],
        create  => sub {
            my $flag = $_[1];
            unless (defined $flag -> {-begin} && length $flag -> {-begin} &&
                    defined $flag -> {-end}   && length $flag -> {-end}) {
                my @open  = grep {index ($flag->{-parens}, $_) >= 0}
                             ('[','(','{','<');
                my @close = map {$closer {$_}} @open;
                $flag -> {-begin} = join "|" => @open;
                $flag -> {-end}   = join "|" => @close;
            }
            my $pat = nested @$flag {qw /-begin -end/};
            return exists $flag -> {-keep} ? qr /($pat)/ : $pat;
        },
        version => 5.006,
        ;

}

1;

__END__