| Lingua-ZH-Keywords documentation | view source | Contained in the Lingua-ZH-Keywords distribution. |
Lingua::ZH::Keywords - Extract keywords from Chinese text
# Exports keywords() by default
use Lingua::ZH::Keywords;
print join(",", keywords($text)); # Prints five keywords
print join(",", keywords($text, 10)); # Prints ten keywords
This module uses a very simple algorithm: it removes stopwords from
the text, then counts up what it considers to be the most important
keywords. The keywords subroutine returns a list of keywords in
order of relevance: five by default, or as many as the optional
second argument requests.
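The stopword-then-count idea can be sketched in a few lines of plain Perl. This is not the module's actual implementation, only an illustration under loud assumptions: the helper name sketch_keywords and the three-word stopword list are hypothetical, raw frequency stands in for "relevance", and the input is assumed to be already split into whitespace-separated words, which real Chinese text is not.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical stopword list, for illustration only; the module ships its own.
my %stop = map { $_ => 1 } qw(的 是 在);

# Hypothetical helper, NOT part of Lingua::ZH::Keywords.
# Assumes $text is already segmented into whitespace-separated words.
sub sketch_keywords {
    my ($text, $count) = @_;
    $count = 5 unless defined $count;    # five keywords by default

    my %freq;
    for my $word (split ' ', $text) {
        next if $stop{$word};            # drop stopwords
        $freq{$word}++;                  # count the remaining words
    }

    # Most frequent first; ties are broken by string order for determinism.
    my @ranked = sort { $freq{$b} <=> $freq{$a} or $a cmp $b } keys %freq;
    $#ranked = $count - 1 if @ranked > $count;    # keep only the top $count
    return @ranked;
}

binmode(STDOUT, ':encoding(UTF-8)');
my @top = sketch_keywords('電腦 的 語言 是 電腦 語言 電腦 在 網路', 2);
print join(",", @top), "\n";
```

Here the word counted three times ranks first and the word counted twice ranks second; the stopwords never reach the tally.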
The stopwords list is accessible as @Lingua::ZH::Keywords::StopWords.
If the input $text is a Unicode string, the returned keywords
will also be Unicode strings; otherwise they are assumed to be
Big5-encoded bytestrings.
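The difference between the two kinds of input can be made concrete with the core Encode module. The sketch below only prepares and inspects the strings; the variable names are illustrative, and the call to keywords() is left commented out so that the snippet runs without this distribution installed.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(encode decode is_utf8);

# A Unicode (character) string: the literal is decoded because of "use utf8".
my $unicode = '中文關鍵詞';

# The same text as a Big5-encoded bytestring.
my $big5 = encode('big5', $unicode);

# Decoding the bytes gives back the original character string.
my $back = decode('big5', $big5);

printf "unicode: %d characters, flagged as characters: %s\n",
    length($unicode), (is_utf8($unicode) ? 'yes' : 'no');
printf "big5:    %d bytes, flagged as characters: %s\n",
    length($big5), (is_utf8($big5) ? 'yes' : 'no');

# Either form could then be passed to the module, for example:
#   use Lingua::ZH::Keywords;
#   my @from_unicode = keywords($unicode);
#   my @from_big5    = keywords($big5);
```

Decoding to a Unicode string up front is one way to avoid ambiguity when the source encoding is something other than Big5.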
Algorithm adapted from the Lingua::EN::Keywords module by Simon Cozens, <simon@simon-cozens.org>.
Autrijus Tang <autrijus@autrijus.org>
Copyright 2003 by Autrijus Tang <autrijus@autrijus.org>.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.