NAME
CPAN::Visitor - Generic traversal of distributions in a CPAN repository
VERSION
version 0.002
SYNOPSIS
    use CPAN::Visitor;

    my $visitor = CPAN::Visitor->new( cpan => "/path/to/cpan" );

    # Prepare to visit all distributions
    $visitor->select();

    # Or a subset of distributions
    $visitor->select(
        subtrees => [ qr{D/DA}, qr{A/AD} ],  # relative to authors/id/
        exclude  => qr{/Acme-},              # No Acme- dists
        match    => qr{/Test-}               # Only Test- dists
    );

    # Action is specified via a callback
    $visitor->iterate(
        visit => sub {
            my $job = shift;
            print $job->{distfile} if -f 'Build.PL'
        }
    );

    # Or start with a list of files
    $visitor = CPAN::Visitor->new(
        cpan  => "/path/to/cpan",
        files => \@distfiles,    # e.g. ANDK/CPAN-1.94.tar.gz
    );
    $visitor->iterate( visit => \&callback );

    # Iterate in parallel
    $visitor->iterate( visit => \&callback, jobs => 5 );
DESCRIPTION
CPAN::Visitor is a very generic, callback-driven tool for iterating over the distributions in a CPAN repository.
It needs better documentation and tests, but is provided for others to examine, use, or contribute to.
USAGE
new
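    my $visitor = CPAN::Visitor->new( @args );

Object attributes include "cpan" (the path to the CPAN repository) and "files" (an array-ref of distribution files, e.g. ANDK/CPAN-1.94.tar.gz), both shown in the SYNOPSIS.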
select
    $visitor->select( @args );

Valid arguments include "subtrees", "exclude", and "match", as shown in the SYNOPSIS.
The "select" method returns a count of files selected.
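The filtering that "select" performs can be pictured with a small stand-in written in plain Perl. This is a sketch under assumptions, not the module's implementation: the helper name select_paths and the sample paths are hypothetical, and it assumes a path is kept when it matches "match" and does not match "exclude".

```perl
use strict;
use warnings;

# Hypothetical stand-in for the filtering step of select(); the real
# method searches a repository on disk, this one only filters a list.
sub select_paths {
    my ( $paths, %args ) = @_;
    my @kept = grep {
        my $path = $_;
        ( !$args{match} || $path =~ $args{match} )
          && ( !$args{exclude} || $path !~ $args{exclude} )
    } @$paths;
    return @kept;
}

# Sample distfile paths, made up for illustration
my @paths = (
    'D/DA/DAUTHOR/Test-Foo-0.01.tar.gz',
    'D/DA/DAUTHOR/Acme-Bar-1.00.tar.gz',
    'A/AD/ADEV/Test-Baz-2.10.tar.gz',
);

my @selected = select_paths(
    \@paths,
    match   => qr{/Test-},
    exclude => qr{/Acme-},
);

# Like select(), report how many files were kept
my $count = scalar @selected;
print "$count selected\n";
```

Passing the patterns as named arguments mirrors the call style in the SYNOPSIS, so the same qr{} objects could be reused with the real method.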
iterate
    $visitor->iterate( @args );

Valid arguments include "visit" (a callback) and "jobs" (the number of parallel jobs), as shown in the SYNOPSIS.
See "ACTION CALLBACKS" for more. Generally, you only need to provide the "visit" callback, which is called from inside the unpacked distribution directory.
The "iterate" method always returns true.
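Because "visit" runs from inside the unpacked distribution directory, relative file tests such as -f 'Build.PL' work in the callback. The sketch below imitates that behavior with core modules only. The helper visit_in_dir and the fake distribution directory are hypothetical, and this is not how the module itself is written.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Hypothetical helper: run a callback from inside $dir, then return to
# the starting directory, even if the callback dies.
sub visit_in_dir {
    my ( $dir, $job, $visit ) = @_;
    my $orig = getcwd();
    chdir $dir or die "Cannot chdir to $dir: $!";
    my $result = eval { $visit->($job) };
    my $error  = $@;
    chdir $orig or die "Cannot return to $orig: $!";
    die $error if $error;
    return $result;
}

# Fake "unpacked distribution" containing only an empty Build.PL
my $dist_dir = tempdir( CLEANUP => 1 );
open my $fh, '>', "$dist_dir/Build.PL" or die "Cannot write Build.PL: $!";
close $fh;

my $job = { distfile => 'ANDK/CPAN-1.94.tar.gz' };

# The relative file test works because the callback runs inside $dist_dir
my $found = visit_in_dir( $dist_dir, $job, sub { -f 'Build.PL' ? 1 : 0 } );
print "$job->{distfile} has a Build.PL\n" if $found;
```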
ACTION CALLBACKS
Each selected distribution is processed with a series of callback functions. These are each passed a hash-ref with information about the particular distribution being processed.
    sub myvisit {
        my $job = shift;
        # do stuff
    }
The job hash-ref is initialized with several fields, including "distfile" and "result".
The "result" field is used to accumulate the return values from action callbacks. For example, the return value from the default 'extract' action is the unpacked distribution directory:
    $job->{result}{extract}  # distribution directory path
You do not need to store the results yourself -- the "iterate" method takes care of it for you.
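This bookkeeping can be sketched in plain Perl. The code is a simplified stand-in, not the module's own: the helper run_actions, the two-action list, and the directory path are hypothetical. They are chosen only to show each return value being stored under $job->{result} by action name and then read by a later callback.

```perl
use strict;
use warnings;

# Hypothetical driver: call each action in order and record its return
# value in $job->{result} under the action's name.
sub run_actions {
    my ( $job, @actions ) = @_;
    for my $action (@actions) {
        my ( $name, $code ) = @$action;
        $job->{result}{$name} = $code->($job);
    }
    return $job;
}

my $job = { distfile => 'ANDK/CPAN-1.94.tar.gz', result => {} };

run_actions(
    $job,
    # Stand-in for the default 'extract' action; the path is made up
    [ extract => sub { return '/tmp/hypothetical/CPAN-1.94' } ],
    # A later callback can read what an earlier one returned
    [
        visit => sub {
            my $job = shift;
            return "visited $job->{result}{extract}";
        }
    ],
);

print "$job->{result}{visit}\n";
```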
Callbacks occur in a fixed order. Some callbacks skip further processing if their return value is false.
These allow complete customization of the iteration process. For example, one could do something like this:
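A minimal sketch of such a customization follows, under stated assumptions. It swaps in Archive::Tar to list the names in an archive rather than unpacking it, assumes the job hash-ref carries the archive path in a field called "distpath" (hypothetical here), and builds a sample tarball and job hash by hand so the callback can be run on its own.

```perl
use strict;
use warnings;
use Archive::Tar;
use File::Spec;
use File::Temp qw(tempdir);

# Custom 'extract' action: return the file names inside the archive
# instead of unpacking it to disk. Assumes $job->{distpath} holds the
# path to the archive (a hypothetical field name).
my $list_only = sub {
    my $job = shift;
    my $tar = Archive::Tar->new( $job->{distpath} ) or return;
    return [ $tar->list_files ];
};

# Build a tiny sample tarball so the callback can be exercised alone
my $dir     = tempdir( CLEANUP => 1 );
my $archive = File::Spec->catfile( $dir, 'Sample-0.01.tar' );
my $tar     = Archive::Tar->new;
$tar->add_data( 'Sample-0.01/Build.PL',      "# build script\n" );
$tar->add_data( 'Sample-0.01/lib/Sample.pm', "package Sample; 1;\n" );
$tar->write($archive) or die $tar->error;

# Hand-built job hash, for illustration only
my $job = { distpath => $archive, result => {} };
$job->{result}{extract} = $list_only->($job);

# A 'visit' callback could then examine the names without touching disk
my @names = sort @{ $job->{result}{extract} };
print "$_\n" for @names;
```

With the module, a callback like $list_only would be passed to "iterate" as the "extract" action. Any default action that expects "extract" to return a directory would presumably need to be overridden as well.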
This could potentially speed up iteration if only the file names within the distribution are of interest and not the contents of the actual files.
BUGS
Please report any bugs or feature requests using the CPAN Request Tracker web interface at <http://rt.cpan.org/Dist/Display.html?Queue=CPAN-Visitor>
When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.
SEE ALSO
AUTHOR
David Golden <dagolden@cpan.org>
COPYRIGHT AND LICENSE
This software is Copyright (c) 2010 by David Golden.
This is free software, licensed under:
The Apache License, Version 2.0, January 2004