WWW::Pastebin::UbuntuNlOrg::Retrieve - retrieve pastes from http://paste.ubuntu-nl.org/ website


WWW-Pastebin-UbuntuNlOrg-Retrieve documentation Contained in the WWW-Pastebin-UbuntuNlOrg-Retrieve distribution.

NAME

WWW::Pastebin::UbuntuNlOrg::Retrieve - retrieve pastes from http://paste.ubuntu-nl.org/ website

SYNOPSIS

    use strict;
    use warnings;

    use WWW::Pastebin::UbuntuNlOrg::Retrieve;

    my $paster = WWW::Pastebin::UbuntuNlOrg::Retrieve->new;

    my $res_ref = $paster->retrieve('http://paste.ubuntu-nl.org/60877/');

    printf "The paste was posted on %s by %s\nIt is written in %s\n%s\n",
                @$res_ref{ qw(posted_on  name  lang  content) };

DESCRIPTION

The module provides an interface for retrieving pastes from the http://paste.ubuntu-nl.org/ website via Perl.

CONSTRUCTOR

new

    my $paster = WWW::Pastebin::UbuntuNlOrg::Retrieve->new;

    my $paster = WWW::Pastebin::UbuntuNlOrg::Retrieve->new(
        timeout => 10,
    );

    my $paster = WWW::Pastebin::UbuntuNlOrg::Retrieve->new(
        ua => LWP::UserAgent->new(
            timeout => 10,
            agent   => 'PasterUA',
        ),
    );

Constructs and returns a brand new juicy WWW::Pastebin::UbuntuNlOrg::Retrieve object. Takes two arguments, both optional, given as key/value pairs. Possible arguments are as follows:

timeout

    ->new( timeout => 10 );

Optional. Specifies the timeout argument of LWP::UserAgent's constructor, which is used for retrieving. Defaults to: 30 seconds.

ua

    ->new( ua => LWP::UserAgent->new( agent => 'Foos!' ) );

Optional. If the timeout argument is not enough for your needs of mutilating the LWP::UserAgent object used for retrieving, feel free to specify the ua argument, which takes an LWP::UserAgent object as a value. Note: the timeout argument to the constructor will not do anything if you specify the ua argument as well. Defaults to: a plain boring default LWP::UserAgent object with its timeout argument set to whatever WWW::Pastebin::UbuntuNlOrg::Retrieve's timeout argument is set to and its agent argument set to mimic Firefox.

METHODS

retrieve

    my $results_ref = $paster->retrieve('http://paste.ubuntu-nl.org/60877/')
        or die $paster->error;

    my $results_ref = $paster->retrieve('60877')
        or die $paster->error;

Instructs the object to retrieve the paste specified in the argument. Takes one mandatory argument, which can be either a full URI to the paste you want to retrieve or just its ID. On failure returns either undef or an empty list, depending on the context, and the reason for the error will be available via the error() method. On success returns a hashref with the following keys/values:

    $VAR1 = {
        'lang' => 'Perl',
        'posted_on' => 'March 24th 18:15',
        'content' => '{ test => 1 }, [ foo => \'bar\' ]',
        'name' => 'Zoffix'
    };

content

    { 'content' => '{ test => 1 }, [ foo => \'bar\' ]' }

The content key will contain the actual content of the paste. See also the content() method which is overloaded for this class.

posted_on

    { 'posted_on' => 'March 24th 18:15' }

The posted_on key will contain the date/time indicating when the paste was created.

lang

    { 'lang' => 'Perl' }

The lang key will contain the (computer) language of the paste (as was specified by the poster).

name

    { 'name' => 'Zoffix' }

The name key will contain the name of the person who created the paste.
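The name and posted_on values come from a single heading on the paste page. The sketch below shows how such a heading could be split, assuming the text has the form "Posted by NAME on DATE" that the module's parser matches. The split_heading() helper is hypothetical and exists only for illustration; it is not part of the module's API.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): splits the heading text of
# a paste page into the poster's name and the posting date/time.
sub split_heading {
    my ($heading) = @_;

    # Both captures are greedy, so the split happens at the last " on ".
    my ( $name, $posted_on ) = $heading =~ /Posted by (.+) on (.+)/;

    return ( $name, $posted_on );
}

my ( $name, $posted_on )
    = split_heading('Posted by Zoffix on March 24th 18:15');

print "name: $name\nposted_on: $posted_on\n";
```

If the heading does not match, both values come back undefined, so a caller would want to check them before use.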

error

    $paster->retrieve('60877')
        or die $paster->error;

Takes no arguments, returns an error message explaining why the last call to retrieve() failed. On failure retrieve() returns either undef or an empty list, depending on the context, so check its return value before calling error().
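The "undef or an empty list depending on the context" behaviour is what a bare return gives you in Perl. The following is a minimal sketch of that pattern using a stand-in class; Hypothetical::Fetcher, its ID check, and its messages are assumptions made for illustration and are not the module's actual implementation.

```perl
use strict;
use warnings;

package Hypothetical::Fetcher;    # stand-in class, not the real module

sub new { return bless { error => undef }, shift }

sub error { return $_[0]{error} }

# Record the message, then use a bare return: it yields undef in scalar
# context and an empty list in list context.
sub _set_error {
    my ( $self, $message ) = @_;
    $self->{error} = $message;
    return;
}

sub retrieve {
    my ( $self, $id ) = @_;

    # The caller's context propagates through this return statement.
    return $self->_set_error("Invalid paste ID: $id")
        unless $id =~ /\A\d+\z/;

    return { content => "paste $id" };
}

package main;

my $fetcher = Hypothetical::Fetcher->new;

my $scalar = $fetcher->retrieve('bogus');    # scalar context
my @list   = $fetcher->retrieve('bogus');    # list context

printf "scalar defined: %s, list size: %d, error: %s\n",
    ( defined $scalar ? 'yes' : 'no' ), scalar @list, $fetcher->error;
```

Because the failure value is false in both contexts, the `or die $paster->error` idiom works whether the result is assigned to a scalar or to a list.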

id

    my $paste_id = $paster->id;

Must be called after a successful call to retrieve(). Takes no arguments, returns the ID number of the last retrieved paste, regardless of whether an ID or a URI was given to retrieve().

uri

    my $paste_uri = $paster->uri;

Must be called after a successful call to retrieve(). Takes no arguments, returns a URI object pointing to the last retrieved paste, regardless of whether an ID or a URI was given to retrieve().
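To see why id() and uri() give the same answers for either kind of argument, here is a sketch of the normalization step, modelled on the pattern in the module's source. The normalize_paste() helper is hypothetical and returns plain strings, whereas the module itself returns a URI object.

```perl
use strict;
use warnings;

# Hypothetical helper modelled on the module's source; returns a URI
# string and the paste ID for either a full URI or a bare ID.
sub normalize_paste {
    my ($what) = @_;

    # Pull the numeric ID out of a full paste URI, if one was given.
    my ($id) = $what =~ m{
        (?:http://)? (?:www\.)? paste\.ubuntu-nl\.org/ (\d+) /?
    }xi;

    # Otherwise treat the argument itself as the ID.
    $id = $what unless defined $id;

    # Trim surrounding whitespace.
    $id =~ s/^\s+|\s+$//g;

    return ( "http://paste.ubuntu-nl.org/$id/", $id );
}

for my $arg ( 'http://paste.ubuntu-nl.org/60877/', '60877', ' 60877 ' ) {
    my ( $uri, $id ) = normalize_paste($arg);
    print "$uri (ID $id)\n";
}
```

Note that this sketch does not validate a bare ID; anything that is not a recognizable paste URI is passed through as the ID after trimming.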

results

    my $last_results_ref = $paster->results;

Must be called after a successful call to retrieve(). Takes no arguments, returns the exact same hashref the last call to retrieve() returned. See retrieve() method for more information.

content

    my $paste_content = $paster->content;

    print "Paste content is:\n$paster\n";

Must be called after a successful call to retrieve(). Takes no arguments, returns the actual content of the paste. Note: this class overloads string interpolation to call this method, so you can simply interpolate the object in a string to get the contents of the paste.
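Interpolation support of this kind is usually built with Perl's core overload pragma. The sketch below shows the general technique with a stand-in class; Hypothetical::Paste is an assumption made for illustration and does not reproduce the module's own code.

```perl
use strict;
use warnings;

package Hypothetical::Paste;    # stand-in class, not the real module

# Stringification calls content(), so "$object" yields the paste text.
use overload '""' => sub { $_[0]->content };

sub new {
    my ( $class, %args ) = @_;
    return bless { content => $args{content} }, $class;
}

sub content { return $_[0]{content} }

package main;

my $paste = Hypothetical::Paste->new( content => '{ test => 1 }' );

# Interpolating the object is equivalent to calling $paste->content.
print "Paste content is:\n$paste\n";
```

One consequence of this design is that the object no longer stringifies to the usual Class=HASH(0x...) form, which is worth remembering when debugging.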

ua

    my $old_LWP_UA_obj = $paster->ua;

    $paster->ua( LWP::UserAgent->new( timeout => 10, agent => 'foos' ) );

Returns the LWP::UserAgent object currently used for retrieving pastes. Takes one optional argument, which must be an LWP::UserAgent object; the object you specify will be used in any subsequent calls to retrieve().

SEE ALSO

LWP::UserAgent, URI

AUTHOR

Zoffix, <zoffix at cpan.org> (http://zoffix.com/, http://haslayout.net/)

BUGS

Please report any bugs or feature requests to bug-www-pastebin-ubuntunlorg-retrieve at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=WWW-Pastebin-UbuntuNlOrg-Retrieve. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc WWW::Pastebin::UbuntuNlOrg::Retrieve

You can also look for information at:

* RT: CPAN's request tracker

http://rt.cpan.org/NoAuth/Bugs.html?Dist=WWW-Pastebin-UbuntuNlOrg-Retrieve

* AnnoCPAN: Annotated CPAN documentation

http://annocpan.org/dist/WWW-Pastebin-UbuntuNlOrg-Retrieve

* CPAN Ratings

http://cpanratings.perl.org/d/WWW-Pastebin-UbuntuNlOrg-Retrieve

* Search CPAN

http://search.cpan.org/dist/WWW-Pastebin-UbuntuNlOrg-Retrieve

COPYRIGHT & LICENSE


package WWW::Pastebin::UbuntuNlOrg::Retrieve;

use warnings;
use strict;

our $VERSION = '0.001';

use base 'WWW::Pastebin::Base::Retrieve';
use HTML::TokeParser::Simple;
use HTML::Entities;
use URI;

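# Accepts either a full paste URI or a bare paste ID and returns
# a URI object for the paste along with its ID.
sub _make_uri_and_id {
    my ( $self, $what ) = @_;

    # Try to extract the numeric ID from a paste URI.
    my ( $id ) = $what =~ m{
                (?:http://)? (?:www\.)? paste\.ubuntu-nl\.org/ (\d+) /?
        }xi;

    # No URI match: treat the argument itself as the ID.
    $id = $what
        unless defined $id;

    # Trim leading and trailing whitespace.
    $id =~ s/^\s+|\s+$//g;

    return ( URI->new("http://paste.ubuntu-nl.org/$id/"), $id );
}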

sub _parse {
    my ( $self, $content ) = @_;
    
    my $parser = HTML::TokeParser::Simple->new( \$content );
    
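    my %data;

    # Parser state: 'level' records how far parsing got (reported on
    # failure); the get_* flags mark that the next text token is wanted.
    my %nav;
    @nav{ qw(start level get_name_date  get_lang  get_content) } = (0) x 5;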
    while ( my $t = $parser->get_token ) {
        if ( $nav{start} == 0 and $t->is_start_tag('h1') ) {
            @nav{ qw(start  level  get_name_date) } = (1, 1, 1);
        }
        elsif ( $nav{get_name_date} == 1 and $t->is_text ) {
            @nav{ qw(level get_name_date) } = (2, 0);
            @data{ qw(name  posted_on) } = $t->as_is
            =~ /Posted by (.+) on (.+)\s*/;
        }
        elsif ( $t->is_start_tag('option') 
            and defined $t->get_attr('selected')
        ) {
            @nav{ qw(level  get_lang) } = ( 3, 1 );
        }
        elsif ( $nav{get_lang} == 1 and $t->is_text ) {
            @nav{ qw(level  get_lang) } = ( 4, 0 );
            $data{lang} = $t->as_is;
        }
        elsif ( $t->is_start_tag('textarea')
            and defined $t->get_attr('name')
            and $t->get_attr('name') eq 'content'
        ) {
            @nav{ qw(level  get_content) } = ( 5, 1 );
        }
        elsif ( $nav{get_content} == 1 and $t->is_text ) {
            $nav{is_success} = 1;
            $data{content} = $t->as_is;
            last;
        }
    }
    
    unless ( $nav{is_success} ) {
        return $self->_set_error(
            "Parser error: $nav{level}\nContent:\n$content\n"
        );
    }
    
    decode_entities $_ for values %data;
    
    $self->content( $data{content} );
    return \%data;
}

1;
__END__