Hailo::Role::Tokenizer - A role representing a Hailo tokenizer


Hailo documentation  | view source Contained in the Hailo distribution.

NAME

Hailo::Role::Tokenizer - A role representing a Hailo tokenizer

METHODS

new

This is the constructor. It takes no arguments.

make_tokens

Takes a line of input and returns an array reference of tokens. A token is an array reference containing two elements: a spacing attribute and the token text. The spacing attribute is an integer which is stored along with the token text in the database. The following values are currently in use:

0 - normal token
1 - prefix token (no whitespace follows it)
2 - postfix token (no whitespace precedes it)
3 - infix token (no whitespace follows or precedes it)
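
To make the token structure concrete, here is a minimal sketch of how such a token list might look in Perl. The constant names and the sample input are assumptions made for illustration; the role only documents the integer values, and a real tokenizer may split the same input differently.

```perl
use strict;
use warnings;

# Hypothetical constant names; the role itself only documents the integers.
use constant {
    SPACING_NORMAL  => 0,
    SPACING_PREFIX  => 1,
    SPACING_POSTFIX => 2,
    SPACING_INFIX   => 3,
};

# One possible token list for the input: well-known (mostly)
# Each token is an array reference: [ spacing attribute, token text ].
my $tokens = [
    [ SPACING_NORMAL,  'well'   ],
    [ SPACING_INFIX,   '-'      ],   # no whitespace on either side
    [ SPACING_NORMAL,  'known'  ],
    [ SPACING_PREFIX,  '('      ],   # no whitespace follows it
    [ SPACING_NORMAL,  'mostly' ],
    [ SPACING_POSTFIX, ')'      ],   # no whitespace precedes it
];

# Labels indexed by spacing value, used only for display here.
my @label = ('normal', 'prefix', 'postfix', 'infix');

for my $token (@$tokens) {
    my ($spacing, $text) = @$token;
    printf "%-8s %s\n", $label[$spacing], $text;
}
```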

make_output

Takes an array reference of tokens and returns a line of output. A token is an array reference as described in make_tokens. The tokens will be joined together into a sentence according to the whitespace attributes associated with the tokens, as well as any formatting provided by the tokenizer implementation.
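
The joining rule can be sketched as follows. The join_tokens helper is hypothetical and is not part of Hailo. It applies only the spacing attributes (0 to 3 as listed under make_tokens) and none of the additional formatting that a real tokenizer implementation may provide.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Hailo: joins tokens using only the
# spacing attribute of each [ spacing, text ] pair.
sub join_tokens {
    my ($tokens) = @_;
    my $output    = '';
    my $glue_next = 0;    # true when the previous token forbids a following space

    for my $i (0 .. $#$tokens) {
        my ($spacing, $text) = @{ $tokens->[$i] };

        # Postfix (2) and infix (3) tokens attach to the preceding token.
        my $no_space_before = $spacing == 2 || $spacing == 3;

        $output .= ' ' if $i > 0 && !$glue_next && !$no_space_before;
        $output .= $text;

        # Prefix (1) and infix (3) tokens attach to the following token.
        $glue_next = $spacing == 1 || $spacing == 3;
    }

    return $output;
}

my $line = join_tokens([
    [0, 'well'], [3, '-'], [0, 'known'],
    [1, '('], [0, 'mostly'], [2, ')'],
]);
print "$line\n";    # prints: well-known (mostly)
```

A space is emitted between two tokens only when neither side forbids it, so a prefix token followed by a postfix token would also be joined without whitespace.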

AUTHORS

Hinrik Örn Sigurðsson, hinrik.sig@gmail.com

Ævar Arnfjörð Bjarmason <avar@cpan.org>

LICENSE AND COPYRIGHT
