CPAN::Cache - Abstract locally-cached logical subset of a CPAN mirror


CPAN-Cache documentation Contained in the CPAN-Cache distribution.

NAME

CPAN::Cache - Abstract locally-cached logical subset of a CPAN mirror

DESCRIPTION

There have been any number of scripts and modules written that contain as part of their functionality some form of locally stored partial mirror of the CPAN dataset.

CPAN::Cache does the same thing, except that downloading and storing CPAN data is all that it does. It should therefore not introduce any additional dependencies or bloat, and should be much easier to reuse than existing modules, which are generally more task-specific.

The intent is that this module will be usable by everything that is in the business of pulling modules from CPAN, storing them locally, and doing something with them.

In this way, it does little more than mirror data from a remote URI, except that CPAN::Cache also provides some additional intelligence about which files are static (will never change) and which are not, and is typed specifically as a mirror of CPAN rather than any other sort of mirror.

By building this module as a separate distribution, it is hoped we can improve separation of concerns in the CPAN-related modules and ensure cleaner, smaller, and more robust tools that interact with the CPAN in the most correct ways.

METHODS

new

  my $cache = CPAN::Cache->new(
      remote_uri => 'http://search.cpan.org/CPAN/',
      local_dir  => '/tmp/cpan',
      );

remote_uri

The remote_uri accessor returns a URI object for the remote CPAN repository.

local_dir

The local_dir accessor returns the filesystem path for the root directory of the CPAN cache.

file path/to/file.txt

The file method takes the path of a file within the repository, and returns a URI::ToDisk object representing its location on both the server and the local filesystem.

Paths should always be provided in unix/web format, not the local filesystem's format.

Returns a URI::ToDisk object, or throws an exception if passed a bad path.
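The mapping from one repository path to two locations can be illustrated with a minimal sketch that uses only core modules. The helper name cpan_locations and the sample tarball path are hypothetical and are not part of CPAN::Cache; the real module delegates this work to URI::ToDisk, so treat this only as an approximation of the idea.

```perl
use strict;
use warnings;
use File::Spec ();

# Hypothetical helper, not part of CPAN::Cache: map a unix-style
# repository path to its remote URI and its local filesystem path.
sub cpan_locations {
    my ($remote_uri, $local_dir, $path) = @_;
    defined $path and length $path or die "No CPAN path provided";

    # Strip a leading slash from the path and trailing slashes from the root URI
    $path       =~ s{^/}{};
    $remote_uri =~ s{/+$}{};

    # The URI always uses forward slashes; the local path uses
    # whatever separator the current platform requires.
    my $uri  = join '/', $remote_uri, $path;
    my $file = File::Spec->catfile( $local_dir, split m{/}, $path );

    return ( $uri, $file );
}

# Illustrative path only
my ($uri, $file) = cpan_locations(
    'http://search.cpan.org/CPAN/',
    '/tmp/cpan',
    '/authors/id/A/AD/ADAMK/CPAN-Cache-0.02.tar.gz',
);
print "$uri\n$file\n";
```

Accepting only unix/web paths as input keeps calling code portable, since the platform-specific separator is applied in exactly one place.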

get path/to/file.txt

The get method takes the path of a file within the repository, and fetches it from the remote repository, storing it at the appropriate local path.

Paths should always be provided in unix/web format, not the local filesystem's format.

Returns the URI::ToDisk for the file if retrieved successfully, false if the file does not exist within the repository, or throws an exception on error.

mirror path/to/file.txt

The mirror method takes the path of a file within the repository, and mirrors it from the remote repository, storing it at the appropriate local path.

Using this method is preferable for items like indexes, for which you want to ensure you have the current version but do not want to download afresh each time.

Paths should always be provided in unix/web format, not the local filesystem's format.

Returns the URI::ToDisk for the file if mirrored successfully, false if the file did not exist in the repository, or throws an exception on error.
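In the module source, mirror skips the network entirely when a path is static and the file already exists locally. The sketch below isolates that decision using only core modules. The helper mirror_if_needed and the $fetch callback are hypothetical stand-ins for the real HTTP call, and the file name is purely illustrative.

```perl
use strict;
use warnings;
use File::Spec ();
use File::Temp ();

# Hypothetical helper, not part of CPAN::Cache: decide whether a
# request to the server is needed. $fetch is a caller-supplied code
# reference standing in for the real HTTP mirror operation.
sub mirror_if_needed {
    my ($file, $is_static, $fetch) = @_;

    # A static file that is already on disk can never change
    return 'cached' if $is_static and -f $file;

    $fetch->($file);
    return 'fetched';
}

my $dir      = File::Temp::tempdir( CLEANUP => 1 );
my $file     = File::Spec->catfile( $dir, 'Example-1.00.tar.gz' );
my $requests = 0;

# Fake fetcher: counts calls and writes a placeholder file
my $fetch = sub {
    my $target = shift;
    $requests++;
    open my $fh, '>', $target or die "Cannot write $target: $!";
    print {$fh} "placeholder content\n";
    close $fh or die "Cannot close $target: $!";
};

my @results = (
    mirror_if_needed( $file, 1, $fetch ),    # static, not yet on disk
    mirror_if_needed( $file, 1, $fetch ),    # static, already on disk
    mirror_if_needed( $file, 0, $fetch ),    # not static: always checks
);
print "@results ($requests requests)\n";
# prints "fetched cached fetched (2 requests)"
```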

static

The static method determines whether a given path within CPAN is able to change or not.

In the CPAN, some files such as index files and checksums can change, while other files such as the tarball files are static, and once committed to the repository will never be changed (although they may be deleted).

In a caching scenario, this means that if the file exists locally, we will never need to return to the server to check for a new version, which enables additional optimisations for CPAN-related algorithms.

Returns true if the file will never change, false if not, or throws an exception on error.
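The classification rules can be sketched as a standalone function. This is a minimal approximation based on the rules visible in the module source (CHECKSUMS and .readme files may change, everything else under authors/ is immutable, and anything unrecognised is treated as changeable). The function name is_static and the sample paths are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical standalone version of the static check
sub is_static {
    my $path = shift;
    $path =~ s{^/}{};                         # tolerate a leading slash

    return '' if $path =~ m{/CHECKSUMS$};     # checksum files change
    return '' if $path =~ m{\.readme$};       # .readme files can change
    return 1  if $path =~ m{^authors/};       # the rest of authors/ is immutable
    return '';                                # safe default: assume it can change
}

# Illustrative paths only
for my $path (
    'authors/id/A/AD/ADAMK/CHECKSUMS',
    'authors/id/A/AD/ADAMK/CPAN-Cache-0.02.readme',
    'authors/id/A/AD/ADAMK/CPAN-Cache-0.02.tar.gz',
    'modules/02packages.details.txt.gz',
) {
    printf "%-50s %s\n", $path, is_static($path) ? 'static' : 'may change';
}
```

Defaulting to false is the conservative choice: a wrong "may change" answer only costs an extra request, while a wrong "static" answer would leave stale data in the cache.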

TO DO

- Write a proper test suite, not just a compile test (even though this was taken from working JSAN code)

SUPPORT

Bugs should be reported via the CPAN bug tracker

http://rt.cpan.org/NoAuth/ReportBug.html?Queue=CPAN-Cache

For other issues, contact the author.

AUTHOR

Adam Kennedy <adamk@cpan.org>

SEE ALSO

CPAN::Index, CPAN::Mini, DBIx::Class

COPYRIGHT


package CPAN::Cache;

use 5.005;
use strict;
use Carp           ();
use File::Basename ();
use File::Spec    ();
use File::Path    ();
use File::HomeDir ();
use URI::ToDisk   ();
use Params::Util  '_INSTANCE';
use LWP::Simple   ();

use vars qw{$VERSION};
BEGIN {
	$VERSION = '0.02';
}





#####################################################################
# Constructor and Accessors

sub new {
	my $class = shift;
	my $self  = bless { @_ }, $class;

	# Apply boolean flags cleanly
	$self->{verbose}  = !! $self->{verbose};
	$self->{readonly} = !! $self->{readonly};

	# Fall back to defaults for the remote URI and local directory
	my $uri  = $self->{remote_uri}
	           || 'http://search.cpan.org/CPAN/';
	my $path = $self->{local_dir}
	           || File::Spec->catdir(
			File::HomeDir->my_data, '.perl', 'CPAN-Cache'
			);

	# Strip superfluous trailing slashes
	$path =~ s/\/+$//;
	$uri  =~ s/\/+$//;

	# Create the local_dir path if needed
	-e $path or File::Path::mkpath($path);
	-d $path or Carp::croak("local_dir: Path '$path' is not a directory");
	-w $path or Carp::croak("local_dir: No write permissions to path '$path'");

	# Create the mirror object and save the updated values
	$self->{_mirror} = URI::ToDisk->new( $path => $uri )
		or Carp::croak("Unexpected error creating URI::ToDisk object");

	$self;
}

sub remote_uri {
	$_[0]->{_mirror}->URI;
}

sub local_dir {
	$_[0]->{_mirror}->path;
}

# Undocumented until it is usable
sub trace {
	$_[0]->{trace};
}

# Undocumented until it is usable
sub verbose {
	$_[0]->{verbose};
}

# Undocumented until it is usable
sub readonly {
	$_[0]->{readonly};
}





#####################################################################
# Interface Methods

sub file {
	my $self = shift;
	my $path = $self->_path(shift);

	# Split into parts and find the location for it.
	$self->{_mirror}->catfile( split /\//, $path );
}

sub get {
	my $self = shift;
	my $file = $self->file(shift);

	# Check local dir exists
	my $dir = File::Basename::dirname($file->path);
	-d $dir or File::Path::mkpath($dir);

	# Fetch the file from the server
	my $rc = LWP::Simple::getstore( $file->uri, $file->path );
	if ( LWP::Simple::is_success($rc) ) {
		return $file;
	} elsif ( $rc == LWP::Simple::RC_NOT_FOUND ) {
		return undef;
	} else {
		Carp::croak("$rc error retrieving " . $file->uri);
	}
}

sub mirror {
	my $self = shift;
	my $path = $self->_path(shift);
	my $file = $self->file($path);

	# If and only if a path is static and the file already exists,
	# it is guaranteed not to change, and we don't have to do the
	# mirroring operation.
	if ( $self->static($path) and -f $file->path ) {
		return $file;
	}

	# Check local dir exists
	my $dir = File::Basename::dirname($file->path);
	-d $dir or File::Path::mkpath($dir);

	# Fetch the file from the server
	my $rc = LWP::Simple::mirror( $file->uri => $file->path );
	if ( LWP::Simple::is_success($rc) ) {
		return $file;
	} elsif ( $rc == LWP::Simple::RC_NOT_MODIFIED ) {
		return $file;
	} elsif ( $rc == LWP::Simple::RC_NOT_FOUND ) {
		return '';
	} else {
		Carp::croak("HTTP $rc error mirroring " . $file->uri);
	}
}

sub static {
	my $self = shift;
	my $path = $self->_path(shift);

	# All checksum files will change
	if ( $path =~ m~/CHECKSUMS$~ ) {
		return '';
	}

	# The .readme files can apparently be changed
	if ( $path =~ m~\.readme$~ ) {
		return '';
	}

	# The authors directory is otherwise immutable
	if ( $path =~ m~^authors/~ ) {
		return 1;
	}

	# The safe option is to default to false for the rest
	return '';
}





#####################################################################
# Support Methods

# Validate a CPAN file path
sub _path {
	my $self = shift;
	my $path = shift or Carp::croak("No CPAN path provided");

	# Strip any leading slash
	$path =~ s(^\/)();

	$path;
}

1;