Net::YASA - Interface to YASA (Yet Another Suffix Array)


Net-YASA documentation Contained in the Net-YASA distribution.

Index


Code Index:

NAME

Top

Net::YASA - Interface to YASA (Yet Another Suffix Array)

VERSION

Top

Version 0.03

SYNOPSIS

Top

This module will submit content to the YASA WebService to return a list of terms and corresponding frequencies.

    use Net::YASA;

    my $foo = Net::YASA->new();
    my $termset = $foo->extract(<some_of_utf8_words>);
    print 'TermSet 1: ', $$termset[0], "\n";
    print 'TermSet 2: ', $$termset[1], "\n";
    ...

EXPORT

Top

A list of functions that can be exported. You can delete this section if you don't export anything, such as for a purely object-oriented module.

METHODS

Top

new

extract

AUTHOR

Top

Cheng-Lung Sung, <clsung at FreeBSD.org>

BUGS

Top

Please report any bugs or feature requests to bug-net-yasa at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Net-YASA. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SEE ALSO

Top

YASA (Yet Another Suffix Array) web site: http://yasa.newzilla.org

SUPPORT

Top

You can find documentation for this module with the perldoc command.

    perldoc Net::YASA

You can also look for information at:

* AnnoCPAN: Annotated CPAN documentation

http://annocpan.org/dist/Net-YASA

* CPAN Ratings

http://cpanratings.perl.org/d/Net-YASA

* RT: CPAN's request tracker

http://rt.cpan.org/NoAuth/Bugs.html?Dist=Net-YASA

* Search CPAN

http://search.cpan.org/dist/Net-YASA

ACKNOWLEDGEMENTS

Top

COPYRIGHT & LICENSE

Top


Net-YASA documentation Contained in the Net-YASA distribution.
package Net::YASA;

use warnings;
use strict;
use utf8;

use Encode qw/encode decode/;
use LWP::UserAgent;

our $VERSION = '0.03';
our $AUTOLOAD;
our %ok_field;

for my $attr ( qw(content minfreq minlength ) ) { $ok_field{$attr}++; }

sub AUTOLOAD {
    my $self = shift;
    my $attr = $AUTOLOAD;
    $attr =~ s/.*:://;
    return if $attr eq 'DESTROY';

    if ($ok_field{$attr}) {
	$self->{lc $attr} = shift if @_;
	return $self->{lc $attr};
    } else {
	my $superior = "SUPER::$attr";
	$self->$superior(@_);
    }
}

use constant YASA_WEB_URL => 'http://yasa.newzilla.org/run/';
sub new {
    my $class = shift;
    my $self = {
	_ua => undef,
	minlength => 1,
	minfreq => 2,
	output => 'xml',
	_content => undef
    };
    if(@_) {
	my %arg = @_;

	foreach (keys %arg) {
	    $self->{lc($_)} = $arg{$_};
	}
    }
    $self->{_ua} = LWP::UserAgent->new;
    $self->{_ua}->timeout(30);
    $self->{_ua}->agent('CPAN::Net::YASA');
    bless($self, $class);
    return($self);
}

sub extract {
    my ($self, $content) = @_;
    die 'No content specified' unless $content ne "";
    my $ua = $self->{_ua};
    my $response = $ua->post(
	YASA_WEB_URL.$self->{output}."/",
	{   
	    'min' => $self->minlength,
	    'freq' => $self->minfreq,
	    'content' => encode("utf8",$content), 
	}
    );
    die "Error in extracting data from YASA!\n" unless $response->is_success();
    if ($self->{output} eq "json" and eval {
	    require JSON::Any;
	    1;
	}) {
	my $result = $response->content();
        my $j = JSON::Any->new;

	my $data = $j->decode($result);
	return $data;
    } 
    else {
	my $xml = decode("utf8",$response->content());
	my @results = ();
	while ($xml =~ m#<Term>([^<]*)</Term><Freq>(\d+)</Freq>#g) {
	    push @results, $1."\t".$2;
	}
        return \@results;
    }
}

1; # End of Net::YASA