WebService::GoogleHack::Text - Basic text-processing utilities, such as extracting words and sentences from files.


WebService-GoogleHack documentation  | view source Contained in the WebService-GoogleHack distribution.



SYNOPSIS


    use WebService::GoogleHack::Text;

    # create an object of type Text

    my $text = WebService::GoogleHack::Text->new();

    # returns a hash of words

    my %words = $text->getWords("file location");

    # returns a hash of 3-word sentences

    my %sentences = $text->getSentences("file location", 3);

    # reads the configuration file

    my $config = $text->readConfig("location of configuration file");

    # removes HTML tags and returns the remaining text

    my $plain = $text->removeHTML("string");




DESCRIPTION


This is a simple text-processing package that supports the GoogleHack and Rate modules. Given a file of words, it retrieves the words and stores them in a hash. Given a file of text, it can also form n-word sentences.

PACKAGE METHODS


__METHOD__->new()

Purpose: This function creates an object of type Text and returns a blessed reference.

__METHOD__->init(Params Given Below)

Purpose: This function can be used to initialize the member variables.

Valid arguments are :

__METHOD__->getSentences(file_name,sentence_length,trace_file)

Purpose: Given a file of text or a variable containing text, this function tries to retrieve sentences from it.

Valid arguments are :

Returns: An array of strings.
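
As a rough illustration of forming n-word sentences, the sketch below uses only core Perl. It assumes that an "n-word sentence" means a run of n consecutive words, which this documentation does not confirm. The helper name n_word_sequences is hypothetical and is not part of WebService::GoogleHack::Text.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of WebService::GoogleHack::Text.
# Assumption: an "n-word sentence" is a run of n consecutive words.
sub n_word_sequences {
    my ($text, $n) = @_;

    # Lowercase, split on non-word characters, drop empty fields.
    my @words = grep { length } split /\W+/, lc $text;

    my @sequences;
    # Slide a window of $n words over the list; the range is empty
    # when there are fewer than $n words.
    for my $i (0 .. @words - $n) {
        push @sequences, join ' ', @words[$i .. $i + $n - 1];
    }
    return @sequences;
}

my @seqs = n_word_sequences("The quick brown fox jumps", 3);
print "$_\n" for @seqs;
```

To process a file in the same way, its contents would first be read into a string and passed to the helper.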

__METHOD__->getWords(file_name,trace_file)

Purpose: Given a file of text, this function tries to retrieve words from it.

Valid arguments are :

Returns: A hash of words.
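
The idea of collecting words into a hash can be sketched in core Perl as follows. This is only an illustration: the helper name word_counts is hypothetical, and the choice of lowercased words as keys with occurrence counts as values is an assumption, since the documentation does not say what the hash values are.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of WebService::GoogleHack::Text.
# Assumption: keys are lowercased words, values are occurrence counts.
sub word_counts {
    my ($text) = @_;
    my %count;
    $count{$_}++ for grep { length } split /\W+/, lc $text;
    return %count;
}

my %words = word_counts("the cat and the hat");
printf "%s => %d\n", $_, $words{$_} for sort keys %words;
```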

__METHOD__->removeXML(text)

Purpose: Removes XML tags. The XML::TokeParser package must be installed.

Valid arguments are :

Returns: The text with XML tags removed.

__METHOD__->removeHTML(text)

Purpose: Removes HTML tags. The HTML::TokeParser package must be installed.

Valid arguments are :

Returns: The text with HTML tags removed.
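
For readers unfamiliar with HTML::TokeParser, the sketch below shows one way tags can be stripped with it: walk the token stream and keep only the text tokens. It is not necessarily how this module does it. The helper name strip_html is hypothetical, and the sketch does not decode entities or skip script content.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper, not part of WebService::GoogleHack::Text.
sub strip_html {
    my ($html) = @_;

    # Passing a scalar reference makes the parser read from the string.
    my $parser = HTML::TokeParser->new(\$html)
        or die "Cannot create parser";

    my $plain = '';
    while (my $token = $parser->get_token) {
        # Text tokens have the form ["T", $text, $is_data];
        # start tags, end tags, and comments are skipped.
        $plain .= $token->[1] if $token->[0] eq 'T';
    }
    return $plain;
}

print strip_html('<p>Hello <b>world</b></p>'), "\n";
```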

__METHOD__->getSurroundingWords(filename,stemmer)

Purpose: This function is used to read a configuration file containing information such as the Google-API key, the words list etc.

Valid arguments are :

Returns: An object which contains the parsed information.

__METHOD__->readConfig(filename)

Purpose: This function is used to read a configuration file containing information such as the Google-API key, the words list etc.

Valid arguments are :

Returns: An object which contains the parsed information.

AUTHOR


Pratheepan Raveendranathan, <rave0029@d.umn.edu>

Ted Pedersen, <tpederse@d.umn.edu>

BUGS


SEE ALSO


GoogleHack home page - http://google-hack.sourceforge.net

Pratheepan Raveendranathan - http://www.d.umn.edu/~rave0029/research

Ted Pedersen - www.d.umn.edu./~tpederse

Google-Hack Mailing List <google-hack-users@lists.sourceforge.net>

COPYRIGHT AND LICENSE


