Text::Sentence - module for splitting text into sentences


HTML-Summary documentation  | view source Contained in the HTML-Summary distribution.


NAME

Text::Sentence - module for splitting text into sentences

SYNOPSIS

    use Text::Sentence qw( split_sentences );
    use locale;
    use POSIX qw( locale_h );

    setlocale( LC_CTYPE, 'iso_8859_1' );
    @sentences = split_sentences( $text );

DESCRIPTION

The Text::Sentence module provides the function split_sentences, which splits text into its constituent sentences using an approximate regular expression. If you set the locale before calling it, it uses locale-dependent capitalization to identify sentence boundaries. Certain well-known exceptions, such as abbreviations, may cause incorrect segmentation.
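To illustrate why abbreviations trip up this kind of approach, here is a minimal sketch of a regex-based splitter. It is not the regex that Text::Sentence uses; the helper name naive_split and the rule (split after ".", "!" or "?" when whitespace and an uppercase letter follow) are assumptions made only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Text::Sentence: split after '.', '!'
# or '?' when the next non-space character is an uppercase letter.
sub naive_split {
    my ($text) = @_;
    return split /(?<=[.!?])\s+(?=[[:upper:]])/, $text;
}

my @sentences = naive_split('It rained. We met Dr. Smith today. Was it fun?');

# The abbreviation "Dr." is followed by a capital letter, so the naive
# rule wrongly treats it as the end of a sentence.
print "$_\n" for @sentences;
```

With this rule the sample text comes back as four pieces rather than three, because "Dr." is mistaken for a sentence end. Under "use locale", the [[:upper:]] class follows the current LC_CTYPE setting, which is one way locale-dependent capitalization can enter such a regex.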

FUNCTIONS

split_sentences( $text )

The split_sentences function takes a scalar containing ASCII text as its argument and returns a list of the sentences into which the text has been split.

    @sentences = split_sentences( $text );
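Because the result depends on the locale in effect, it may be worth checking that the locale was actually set before calling split_sentences. The sketch below assumes only the standard POSIX module; the helper name set_ctype is hypothetical, and the locale name 'iso_8859_1' is taken from the synopsis and may not exist on every system.

```perl
use strict;
use warnings;
use POSIX qw( setlocale LC_CTYPE );

# Hypothetical helper: try to set LC_CTYPE and return the locale name
# reported by setlocale, or undef if the request could not be honoured.
sub set_ctype {
    my ($name) = @_;
    return setlocale( LC_CTYPE, $name );
}

my $wanted = 'iso_8859_1';    # name from the synopsis; system-dependent
my $got    = set_ctype($wanted);

if ( defined $got ) {
    print "LC_CTYPE set to $got\n";
}
else {
    print "Locale '$wanted' is not available; keeping the current locale\n";
}
```

If the locale is unavailable, split_sentences can still be called, but capitalization is then judged under whatever locale is already in effect.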

SEE ALSO

    locale
    POSIX

AUTHOR

Ave Wrigley <wrigley@cre.canon.co.uk>

COPYRIGHT
