| Regex-PreSuf documentation | view source | Contained in the Regex-PreSuf distribution. |
Regex::PreSuf - create regular expressions from word lists
use Regex::PreSuf;
my $re = presuf(qw(foobar fooxar foozap));
# $re should be now 'foo(?:zap|[bx]ar)'
The presuf() subroutine builds regular expressions out of 'word lists', lists of strings. The regular expression matches the same words as the word list. These regular expressions normally run faster than a simple-minded '|'-concatenation of the words.
Examples:
'foobar fooxar' => 'foo[bx]ar'
'foobar foozap' => 'foo(?:bar|zap)'
'foobar fooar' => 'foob?ar'
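The claim that the generated expression matches the same words as the word list can be checked by hand. The following is a minimal sketch using only core Perl: it does not call presuf() itself but takes the pattern string quoted in the synopsis as given and compares it against a plain '|'-concatenation of the same words. The candidate strings are made up for illustration.

```perl
use strict;
use warnings;

# Word list and optimized pattern as quoted in the synopsis.
my @words     = qw(foobar fooxar foozap);
my $optimized = 'foo(?:zap|[bx]ar)';

# The simple-minded equivalent: quote each word and join with '|'.
my $naive = join '|', map { quotemeta } @words;

# Anchor both patterns so that only whole strings count as matches.
my $opt_re   = qr/\A(?:$optimized)\z/;
my $naive_re = qr/\A(?:$naive)\z/;

# The listed words plus a few strings that should not match either pattern.
my @candidates = (@words, qw(foo foobaz fooyar barfoo));

# Collect every candidate on which the two patterns give different answers.
my @disagree = grep { (/$opt_re/ ? 1 : 0) != (/$naive_re/ ? 1 : 0) } @candidates;

print @disagree ? "patterns disagree on: @disagree\n" : "patterns agree\n";
```

Anchoring with \A and \z matters here: without it, both patterns would also match inside longer strings.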
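The downsides:
* The original order of the words is not necessarily respected, for example because character class matches are collected together, separately from the '|' alternations.
* The module ignores any special meaning of regular expression metacharacters such as .*?+{}[]^$; they are treated as plain ordinary boring characters.
For the second downside there is an exception. The module has some rudimentary grasp of how to use the 'any character' metacharacter. If you call presuf() with the anychar option, a '.' in the word list is treated as that metacharacter: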
my $re = presuf({ anychar=>1 }, qw(foobar foo.ar fooxar));
# $re should be now 'foo.ar'
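One consequence is worth spelling out. Since '.' matches any character except a newline, a pattern such as 'foo.ar' accepts more strings than the word list contains. The sketch below uses core Perl only and takes the pattern from the example above as given. The extra test strings are invented for illustration.

```perl
use strict;
use warnings;

# Pattern as quoted in the anychar example, anchored to whole strings.
my $re = qr/\Afoo.ar\z/;

# The three words from the list all match ...
my @listed = grep { /$re/ } qw(foobar foo.ar fooxar);

# ... but so do strings that were never in the list.
my @unlisted = grep { /$re/ } qw(fooqar foo9ar foo-ar);

print "listed matches:   @listed\n";
print "unlisted matches: @unlisted\n";
```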
The module finds the common prefixes and suffixes of the words and then recursively looks at the remaining differences. By default, however, only common prefixes are used, because for many languages (natural or artificial) this seems to produce the fastest matchers. To also allow suffixes, use
my $re = presuf({ suffixes=>1 }, ...);
To use only suffixes use
my $re = presuf({ prefixes=>0 }, ...);
(this implicitly enables suffixes)
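To make the prefix and suffix idea concrete, here is a rough sketch of the first step only: finding what a list of words has in common at either end. The helpers common_prefix and common_suffix are hypothetical names written for this illustration. They are not part of Regex::PreSuf and do not claim to reflect how the module is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Regex::PreSuf): longest common prefix
# of a list of strings.
sub common_prefix {
    my @words = @_;
    return '' unless @words;
    my $prefix = shift @words;
    for my $word (@words) {
        # Shorten the candidate until it is a prefix of this word.
        chop $prefix while length($prefix) && index($word, $prefix) != 0;
    }
    return $prefix;
}

# Hypothetical helper: longest common suffix, computed on reversed strings.
sub common_suffix {
    my $rev = common_prefix(map { scalar reverse $_ } @_);
    return scalar reverse $rev;
}

my @words  = qw(foobar fooxar foozap);
my $prefix = common_prefix(@words);
my $suffix = common_suffix(qw(foobar fooxar));

# What is left of each word once the shared prefix is removed.
my @rest = map { substr $_, length $prefix } @words;

print "prefix: $prefix\n";
print "suffix: $suffix\n";
print "remaining: @rest\n";
```

With the shared prefix 'foo' factored out, the remaining pieces 'bar', 'xar' and 'zap' are what a recursive step would look at next.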
In case you want to flood your session with debug messages, you can turn on debugging by saying
Regex::PreSuf::debug(1);
How to turn them off again is left as an exercise for the kind reader.
Jarkko Hietaniemi
This code is distributed under the same copyright terms as Perl itself.