Hailo::Tokenizer::Words - A tokenizer for Hailo which splits on whitespace and word boundaries, mostly.


Contained in the Hailo distribution.


NAME

Hailo::Tokenizer::Words - A tokenizer for Hailo which splits on whitespace and word boundaries, mostly.

DESCRIPTION

This tokenizer does its best to handle various languages. It knows about most apostrophes, quotes, and sentence terminators.
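
As a rough illustration of what splitting on whitespace and word boundaries can look like, here is a minimal Python sketch. It is not Hailo's implementation and does not use Hailo's API: the function name `rough_tokenize` and the regular expression are assumptions made for this example, and the real tokenizer handles many more cases (quotes, sentence terminators, and other language-specific details).

```python
import re

# Hypothetical pattern, not taken from Hailo: a word is a run of word
# characters that may contain internal apostrophes (straight or curly);
# any other non-space character becomes a token of its own.
TOKEN_RE = re.compile(r"\w+(?:['’]\w+)*|[^\w\s]")


def rough_tokenize(text):
    """Split text on whitespace and word boundaries (illustrative only)."""
    return TOKEN_RE.findall(text)


tokens = rough_tokenize("Don't panic, it's fine.")
print(tokens)
```

Keeping the apostrophe inside the word pattern is what stops contractions such as "Don't" from being split into three tokens, while punctuation next to a word is still separated from it.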

AUTHOR

Hinrik Örn Sigurðsson, hinrik.sig@gmail.com

LICENSE AND COPYRIGHT
