WWW::Scraper::Beaucoup - Scrapes Beaucoup's Super Search


Scraper documentation Contained in the Scraper distribution.

NAME

WWW::Scraper::Beaucoup - Scrapes Beaucoup's Super Search

SYNOPSIS

    use WWW::Scraper;
    use WWW::Scraper::Response::Job;

    my $scraper = new WWW::Scraper('Beaucoup');

    # %options holds the option values described under OPTIONS.
    $scraper->setup_query($query, \%options);

    while ( my $response = $scraper->next_response() ) {
        # $response is a WWW::Scraper::Response::Job.
    }

DESCRIPTION

WWW::Scraper::Beaucoup extends WWW::Scraper.

It builds search requests for Beaucoup (http://www.Beaucoup.com) and interprets the result pages.

OPTIONS

loc

A great many location strings are accepted, categorized by state. See Beaucoup.com for the available values ("3648 locations!" as of June 2001).

cat
      --- All Categories ---
      Clerical/Administrative
      Computing/MIS
      Customer Service/Support
      Education/Training
      Engineering
      Financial Services
      Government/Non Profit
      Health Care
      Human Resources
      Manufacturing/Business Operations
      Marketing/Advertising
      Media
      Other
      Professional Services
      Sales
      Travel/Hospitality

To select a job function, append a "-" and the job function to the category. To select "All Job Functions in Category", give the category alone, without the "-" and the job function.

The available job functions depend on the job category:

Clerical/Administrative
    Other

Computing/MIS
    Database Administration
    Internet Development
    Network/System Administration
    Other
    Quality Assurance/Testing
    Software Development
    Systems Analysis
    Technical Support/Help Desk

Customer Service/Support
    Other

Education/Training
    Colleges/Universities
    K to 12 Education
    Other
    Technical/Trade Schools
    Training

Engineering
    Chemical
    Civil
    Design/Industrial
    Electrical/Hardware
    Mechanical
    Operations
    Other

Financial Services
    Accounting
    Banking
    Finance
    Insurance
    Other
    Securities/Asset Management

Government/Non Profit
    Other

Health Care
    Administration
    Medical
    Nursing
    Other
    Pharmaceutical

Human Resources
    Other

Manufacturing/Business Operations
    Construction/Trades
    Facilities Management
    Logistics/Distribution
    Manufacturing
    Other
    Program/Project Management
    Purchasing

Marketing/Advertising
    Advertising
    Market Research
    Marketing Communications
    Other
    Product Management
    Public Relations

Media
    Broadcasting
    Graphic Arts/Design
    Journalism
    Other
    Publishing/Technical Writing

Other
    Other

Professional Services
    Legal Services
    Management Consulting
    Other

Sales
    Account Management
    Business Development
    Direct Sales
    Merchandising/Retail
    Other

Travel/Hospitality
    Other
    Restaurant/Food Services
    Travel/Recreation/Lodging
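
As an illustration of how a "cat" value is put together from the lists above, here is a minimal sketch. The helpers cat_option and form_encode are hypothetical names introduced for this example and are not part of WWW::Scraper. The encoding step only shows how such a value appears in the sample URL quoted in the module source; whether a caller ever needs to encode the value by hand is not stated in this documentation.

```perl
use strict;
use warnings;

# Hypothetical helper: join a job category and an optional job function
# into the "Category-Job Function" form described above. Leaving the
# function off selects all job functions in the category.
sub cat_option {
    my ($category, $function) = @_;
    return $category unless defined $function && length $function;
    return "$category-$function";
}

# Hypothetical helper: form-encode a value the way the sample URL does
# (reserved characters become %XX, spaces become "+").
sub form_encode {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord $1)/ge;
    $value =~ tr/ /+/;
    return $value;
}

my $cat = cat_option('Computing/MIS', 'Software Development');
print "$cat\n";                          # Computing/MIS-Software Development
print form_encode($cat), "\n";           # Computing%2FMIS-Software+Development
print cat_option('Engineering'), "\n";   # Engineering
```

The encoded form matches the cat parameter in the sample URL, cat=Computing%2FMIS-Software+Development.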

AUTHOR

WWW::Scraper::Beaucoup is written and maintained by Glenn Wood, http://search.cpan.org/search?mode=author&query=GLENNWOOD.

COPYRIGHT


package WWW::Scraper::Beaucoup;


#####################################################################
use strict;
use vars qw(@ISA $VERSION);
@ISA = qw(WWW::Scraper);
$VERSION = sprintf("%d.%02d", q$Revision: 1.7 $ =~ /(\d+)\.(\d+)/);

use WWW::Scraper(qw(1.48 trimLFs trimLFLFs));

# SAMPLE
# http://www.Beaucoup.com/js/jobsearch-results.html?loc=CA-San+Jose+Area&cat=Computing%2FMIS-Software+Development&srch=Perl&job=1
my $scraperRequest = 
   { 
      'type' => 'FORM'       # Type of query generation is 'FORM'
     # This is the basic URL on which to build the query.
     ,'url' => 'http://Beaucoup.com'
     # This is the Scraper attributes => native input fields mapping
     ,'nativeQuery' => 'q'
     ,'nativeDefaults' =>
                      {    'query'   => undef
#                          ,'phrases' => 'off'
#                          ,'rpp'     => '10'
#                          ,'cb'      => 'Beaucoup'
#                          ,'qtype'   => '0'
#                          ,'lang'    => '1'
#                          ,'timeout' => '4'
                          ,'Search.x' => 1
                          ,'Search.y' => 1
                      }
     ,'fieldTranslations' =>
             {
                 '*' =>
                     {    '*'             => '*'
                     }
             }
      # Some more options for the Scraper operation.
     ,'cookies' => 0
   };

my $scraperFrame =
[ 'HTML', 
    [ 
        # This page shows <B>1-10</B> out of a total of <B>20</B> results for:
        [ 'COUNT', 'There are (\d+) results for:' ]
       ,[ 'COUNT', '\d+-\d+ out of (\d+) for ' ]
       ,[ 'COUNT', '\d+-\d+ out of a total of (\d+) results for' ]
       ,[ 'NEXT', 'next' ]
       ,[ 'HIT*',
          [ 
            [ 'REGEX', '\d+\.\s<a href="([^"]*)"[^>]*>(.*?)<br>(.*?)<br>', 'url', 'title', 'description' ]

# Sample hit that the REGEX above is meant to match:
#1. <a href="/j.php?rc=40&atag=2,,4,da&j=http%3A%2F%2Fgta.sky-scraper.net%2F&kw=scraper">sky-scraper.net: best sites for GRAND THEFT AUTO</a>
#<br>sky-scraper.net,  Visit Casino On Net and receive up to $200 sign up bonus. Home. Sat,  12 Apr 2003 GMT. ...
#<br><i>http://gta.sky-scraper.net/</i>  <b><font size=1>(Netscape)</font></b><br><small><a href="/j.php?rc=40&atag=2,,4,da&j=http%3A%2F%2Fgta.sky-scraper.net%2F&kw=scraper" target=_new>Open link in new window</a></small><br><br>
          ]
        ]
    ]
];


sub testParameters {
    my ($self) = @_;
    
    if ( ref $self ) {
        $self->{'isTesting'} = 1;
    }
    
    return {
                 'SKIP' => &WWW::Scraper::TidyXML::isNotTestable() 
                ,'testNativeQuery' => 'scraper'
                ,'expectedOnePage' => 5
                ,'expectedMultiPage' => 20
                ,'expectedBogusPage' => 2000
           };
}

# Access methods for the structural declarations of this Scraper engine.
sub scraperRequest { $scraperRequest }
sub scraperFrame { $_[0]->SUPER::scraperFrame($scraperFrame); }
sub scraperDetail{ undef }

1;

__END__
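
The hit pattern in $scraperFrame can be exercised on its own against the sample hit quoted in the source comments. The sketch below is a standalone illustration, not the way WWW::Scraper itself applies the frame. Two things are assumptions: the /s modifier (needed here so that "." can cross the line breaks in the sample) and the tag-stripping cleanup loop, which is a hypothetical stand-in for whatever post-processing the framework performs.

```perl
use strict;
use warnings;

# Sample hit, shortened from the comment in the module source.
# The single-quoted heredoc keeps "$200" from being interpolated.
my $html = <<'HTML';
1. <a href="/j.php?rc=40&atag=2,,4,da&j=http%3A%2F%2Fgta.sky-scraper.net%2F&kw=scraper">sky-scraper.net: best sites for GRAND THEFT AUTO</a>
<br>sky-scraper.net,  Visit Casino On Net and receive up to $200 sign up bonus. Home. Sat,  12 Apr 2003 GMT. ...
<br><i>http://gta.sky-scraper.net/</i>
HTML

# Same pattern as the REGEX entry in $scraperFrame.
my $hit_pattern = '\d+\.\s<a href="([^"]*)"[^>]*>(.*?)<br>(.*?)<br>';

# Assumption: /s lets "." match the newlines inside the sample hit.
my ($url, $title, $description) = $html =~ /$hit_pattern/s
    or die "sample hit did not match\n";

# Hypothetical cleanup: the raw captures still contain the closing </a>
# tag and trailing newlines, so strip tags and trim whitespace.
for ($title, $description) {
    s/<[^>]*>//g;
    s/^\s+|\s+$//g;
}

print "url:         $url\n";
print "title:       $title\n";
print "description: $description\n";
```

The three captures correspond, in order, to the 'url', 'title' and 'description' field names listed after the pattern in the frame.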