String::Tokeniser - Perl extension for, uhm, tokenising strings.


Contained in the String-Tokeniser distribution.

NAME

String::Tokeniser - Perl extension for, uhm, tokenising strings.

SYNOPSIS

  use String::Tokeniser;

DESCRIPTION

String::Tokeniser provides an interface to a tokeniser class, allowing one to manipulate strings on a token-by-token basis without having to keep track of list element numbers and so on.

CONSTRUCTOR

new ( $sentence, [0|-1|$regexp], [$exception...] )

Creates a String::Tokeniser object, tokenises $sentence, and resets the token counter.

The second argument determines how a ``token'' is defined: a value of 0 or undef means that underscores are included in a token; -1 means that they are not. Alternatively, you can supply your own regular expression, which will be passed to split to determine the tokens.

An optional list of exceptions may follow: tokens that would otherwise be split in two, but should be treated as one.
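
As a rough illustration of how a split pattern decides what counts as a token, the sketch below uses plain core Perl rather than the module itself. The helper tokenise and the patterns \b and \b|_ are assumptions chosen for the example; this document does not say which patterns the module uses internally for 0 and -1.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of String::Tokeniser): split a sentence
# into tokens using a caller-supplied pattern.
sub tokenise {
    my ($sentence, $pattern) = @_;
    # Discard any empty fields so only real tokens remain.
    return grep { length } split /$pattern/, $sentence;
}

# Assumed pattern: split on word boundaries, so underscores stay inside tokens.
my @with_underscores = tokenise('foo_bar baz', '\b');

# Assumed pattern: also split on underscores, so they separate tokens.
my @without_underscores = tokenise('foo_bar baz', '\b|_');

print join('|', @with_underscores), "\n";
print join('|', @without_underscores), "\n";
```

With these assumed patterns, foo_bar stays a single token in the first call and becomes foo and bar in the second.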

METHODS

moretokens

Tells you if you have any more tokens left to deal with.

skiptoken([n])

Moves the `pointer' forward by one token, or by n tokens if n is given.

thistoken

Returns the current token; that is, the token under the `pointer'.

lasttoken

Returns the previous token; that is, the one immediately before the `pointer'.

gettoken

Equivalent to skiptoken;thistoken - the usual way of grabbing the next token in the list in turn.

nexttoken

Looks ahead one token, but does not change the `pointer' position.

lookahead([n])

Returns a string composed of the next n tokens, but does not change the `pointer' position.
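
The idea of looking ahead without moving the `pointer' can be sketched in core Perl as follows. This is not the module's implementation: the array @tokens, the index $pos, and the helper peek are all hypothetical, and it assumes that ``next'' means the tokens after the current one.

```perl
use strict;
use warnings;

my @tokens = qw(alpha beta gamma delta);
my $pos    = 0;    # index of the current token (the `pointer')

# Hypothetical peek helper: join the n tokens after $pos, leaving $pos alone.
sub peek {
    my ($n) = @_;
    my $last = $pos + $n;
    $last = $#tokens if $last > $#tokens;    # clamp at the end of the list
    return join '', @tokens[ $pos + 1 .. $last ];
}

my $ahead = peek(2);
print "peeked '$ahead'; pointer still at index $pos ($tokens[$pos])\n";
```

Because peek only reads from the list, calling it repeatedly returns the same string until the pointer itself is moved.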

gimme($string)

Assuming the run of tokens will end in $string, returns everything from the current `pointer' position until $string is found. Returns a two-element list: first, why the search terminated (either EOF, meaning we hit the end of the token list without success, or FOUND, meaning $string was found), and second, the tokens up to and including $string (or up to the end of the list, whichever came first).
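
The two-element return value described above can be sketched in core Perl like this. The function collect_until is a hypothetical stand-in, not the module's code; it assumes that $string matches a single token and that the collected tokens come back joined into one string, neither of which is stated in this document.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the described search: gather tokens from $pos
# onwards until $wanted is seen, and report why the scan stopped.
sub collect_until {
    my ($tokens, $pos, $wanted) = @_;
    my $collected = '';
    for my $i ($pos .. $#{$tokens}) {
        $collected .= $tokens->[$i];
        # Stop as soon as the wanted token has been included.
        return ('FOUND', $collected) if $tokens->[$i] eq $wanted;
    }
    # Ran off the end of the list without seeing $wanted.
    return ('EOF', $collected);
}

my @tokens = ('print', ' ', '"', 'hi', '"', ';');

my ($why, $text) = collect_until(\@tokens, 0, ';');
print "$why: $text\n";

my ($why2, $text2) = collect_until(\@tokens, 0, '}');
print "$why2: $text2\n";
```

Checking the first element before using the second lets a caller tell a complete statement from one cut short by the end of the input.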

save

Saves the current pointer position. Can be called multiple times; the saved positions form a stack.

restore

Restores a previously saved position.
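
A save stack of pointer positions can be sketched in core Perl as shown here. The names $pos, @saved, save_pos, and restore_pos are hypothetical, and the last-in, first-out order is an assumption based on the description of save as a stack.

```perl
use strict;
use warnings;

my $pos   = 0;     # current pointer position
my @saved = ();    # stack of saved positions

# Hypothetical helpers mirroring the described save/restore behaviour.
sub save_pos    { push @saved, $pos }
sub restore_pos { $pos = pop @saved if @saved }

save_pos();        # remember position 0
$pos = 3;
save_pos();        # remember position 3
$pos = 7;

restore_pos();     # most recent save comes back first
my $first_restore = $pos;
restore_pos();     # then the earlier one
my $second_restore = $pos;

print "restored to $first_restore, then to $second_restore\n";
```

This pattern suits speculative parsing: save before trying one interpretation of the tokens, and restore if it fails.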

FEATURES

At present, there is no support for exceptions which spread over three or more tokens, although this is planned.

AUTHOR

Originally written by Simon Cozens; maintained by Alberto Simões <ambs@cpan.org>.

SEE ALSO

WEBPerl::Changetie

