File::Info - Store file information persistently for fast lookup


Contained in the File-Info distribution.


NAME

File::Info - Store file information persistently for fast lookup

SYNOPSIS

  use File::Info qw( $PACKAGE $VERSION );

  my $info = File::Info->new($dir);
  # $fn is "basename"; contains no directory portion
  my $hex  = $info->md5_hex($fn);  # Reads cached data if possible

DESCRIPTION

This package stores per-file information for speedy lookup later. It is intended for file information that takes significant time to determine (e.g., the MD5 sum of a large file), so that it need not be recalculated unnecessarily. This may be particularly helpful when searching across many files for some specific property.

File statistics are recalculated on demand. If the file size or modification time has changed since the values were last calculated, the cached values are purged and recalculated.

File information is stored on a per-directory basis. Each file info file is stored in the directory containing the files it describes, and refers to those files by name, without paths.
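The invalidation rule described above can be sketched with core modules only. This is a minimal illustration, not the module's implementation: the names cached_lookup and md5_of_file, and the in-memory %cache layout, are assumptions made for the example (the real module stores its data persistently).

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical in-memory cache: directory => basename => { size, mtime, value }.
my %cache;

# Return the cached value for $fn in $dir, recomputing it with $code
# when the file's size or modification time no longer matches.
sub cached_lookup {
    my ($dir, $fn, $code) = @_;
    my $path = "$dir/$fn";
    my ($size, $mtime) = (stat $path)[7, 9];
    die "Cannot stat $path: $!" unless defined $size;

    my $entry = $cache{$dir}{$fn};
    if ($entry && $entry->{size} == $size && $entry->{mtime} == $mtime) {
        return $entry->{value};    # size and mtime unchanged: reuse
    }

    my $value = $code->($path);    # stale or missing: recalculate
    $cache{$dir}{$fn} = { size => $size, mtime => $mtime, value => $value };
    return $value;
}

# Example calculation: hex MD5 of the whole file.
sub md5_of_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    my $hex = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $hex;
}
```

Calling cached_lookup($dir, $fn, \&md5_of_file) twice in a row reads the file only once, provided its size and modification time are unchanged between the calls.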

CLASS CONSTANTS

TYPE_CONSTANTS

As returned by the type method. These constants are exported on request, either individually or together with the ':types' tag.

TYPE_UNKNOWN

File type not identified

TYPE_JPEG

A 'JPEG' image file.

TYPE_PAR

A 'par' (parity archive) file.

CLASS COMPONENTS

CLASS HIGHER-LEVEL FUNCTIONS

CLASS HIGHER-LEVEL PROCEDURES

add_global_lookup

Add a lookup function globally, so that it is available to every instance and through the class interface. A method with the same name will be created, to provide the cached lookup.

ARGUMENTS

name

The name may consist only of letters, digits, and underscore characters. The first character must be a letter, and at least one digit or lower-case letter must be present.

Built-in names will always be lower-case. If you stick to lower-case too, then you will need to make no change if your identifier should get absorbed into the core. On the other hand, if you use some upper-case letters (e.g., StudlyCaps), then you are assured that you will never clash with internal names.

These other names are reserved:

  add_local_lookup add_global_lookup isa import new dirname

code

The code to call to calculate the value. The code will be passed the absolute name of the file to look up, and is expected to return a suitable value. The value will be cached.
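The naming rules for the name argument can be expressed as a small check. The following is a sketch only: is_valid_lookup_name is a hypothetical helper written for this example, and the module may validate names differently.

```perl
use strict;
use warnings;

# Names listed in the documentation as reserved.
my %RESERVED = map { $_ => 1 }
    qw( add_local_lookup add_global_lookup isa import new dirname );

# Hypothetical helper: true if $name satisfies the stated rules.
sub is_valid_lookup_name {
    my ($name) = @_;
    return 0 unless defined $name;
    # Letters, digits and underscores only; the first character is a letter.
    return 0 unless $name =~ /\A[A-Za-z][A-Za-z0-9_]*\z/;
    # At least one digit or lower-case letter.
    return 0 unless $name =~ /[a-z0-9]/;
    # Not one of the reserved names.
    return 0 if $RESERVED{$name};
    return 1;
}
```

Under these rules, sha1_hex and StudlyCaps are accepted, while ALLCAPS, _private, 1st and new are rejected.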

INSTANCE CONSTRUCTION

new

Create and return a new File::Info instance.

ARGUMENTS

_dirname

Name of the directory represented

INSTANCE COMPONENTS

INSTANCE HIGHER-LEVEL FUNCTIONS

dirname

The name of the directory to which this instance refers

STANDARD LOOKUPS

Each of the following functions takes a filename (without path, relative to the directory of the instance), and returns the relevant value for the file.

Alternatively, they may be called as class methods, in which case the filename must be absolute. This mode never invokes a local method (see add_local_lookup), and is less efficient if multiple lookups are made on files in the same directory.

md5_hex

The MD5 signature of the file, as 16 pairs of hex characters. The Digest::MD5 module (version 2 or above) is required to be present.

md5

The MD5 signature of the file, as a 16-byte binary value. The Digest::MD5 module (version 2 or above) is required to be present.

md5_16khex

The MD5 signature of the first 16k of the file, as 16 pairs of hex characters. The Digest::MD5 module (version 2 or above) is required to be present.

md5_16k

The MD5 signature of the first 16k of the file, as a 16-byte binary value. The Digest::MD5 module (version 2 or above) is required to be present.
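As an illustration of the two 16k variants, the following sketch computes a prefix digest directly with Digest::MD5. It assumes that "16k" means 16 * 1024 bytes; read_prefix, prefix_md5 and prefix_md5_hex are hypothetical names for this example, and no caching is shown.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5 md5_hex);

# Assumption: "16k" means 16 * 1024 bytes.
use constant PREFIX_BYTES => 16 * 1024;

# Read up to PREFIX_BYTES from the start of $path, in binary mode.
sub read_prefix {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    my $buf = '';
    # read() may return fewer bytes than requested, so loop until done.
    while (length($buf) < PREFIX_BYTES) {
        my $n = read($fh, $buf, PREFIX_BYTES - length($buf), length($buf));
        die "Read error on $path: $!" unless defined $n;
        last if $n == 0;    # end of file
    }
    close $fh;
    return $buf;
}

# 16-byte binary digest of the prefix.
sub prefix_md5     { return md5(read_prefix($_[0])) }

# The same digest as 32 hex characters (16 pairs).
sub prefix_md5_hex { return md5_hex(read_prefix($_[0])) }
```

The hex form is simply the binary form rendered two characters per byte, so unpack('H*', prefix_md5($path)) gives the same string as prefix_md5_hex($path). For files shorter than the prefix length, the digest covers the whole file.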

line_count

The number of lines in the file. More accurately, the number of "\n" characters in the file (as for wc). No attempt is made to guess the line terminator of the running system, since that would lead to inconsistent results for the same file on (say) a Samba-mounted drive accessed from both Windoze and UN*X.
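The counting rule can be sketched as follows. count_newlines is a hypothetical name used for this example; the file is read in binary mode so that the result does not depend on the platform's line-ending translation.

```perl
use strict;
use warnings;

# Count "\n" bytes in $path, reading in fixed-size binary chunks.
sub count_newlines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;    # no CRLF translation on any platform
    my ($count, $buf) = (0, '');
    while (read($fh, $buf, 65536)) {
        $count += ($buf =~ tr/\n//);    # tr returns the number of matches
    }
    close $fh;
    return $count;
}
```

As with wc, a final line that lacks a trailing "\n" is not counted, and a "\r\n" pair counts once.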

type

The file type, as determined by reading the file itself. This is similar in intent to the file command under UN*X, with one distinction: the returned value is a TYPE_x constant.
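Content-based detection of this kind usually inspects the leading bytes of the file. The sketch below recognises only JPEG data, which begins with the bytes FF D8 FF. MY_TYPE_UNKNOWN, MY_TYPE_JPEG and sniff_type are hypothetical stand-ins; the module's actual constant values and detection logic are not documented here.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the module's TYPE_x constants; the real
# values are not documented here.
use constant { MY_TYPE_UNKNOWN => 0, MY_TYPE_JPEG => 1 };

# Classify $path by its first three bytes.
sub sniff_type {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    my $head = '';
    defined(read($fh, $head, 3)) or die "Read error on $path: $!";
    close $fh;
    # JPEG data starts with the SOI marker (FF D8), followed by the
    # FF byte that introduces the next marker.
    return MY_TYPE_JPEG if $head eq "\xFF\xD8\xFF";
    return MY_TYPE_UNKNOWN;
}
```

Because the decision is based on content rather than on the file name, a JPEG file with a misleading extension is still identified as a JPEG.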

par_set_hash

Behaviour is defined only for files whose type is TYPE_PAR.

This is the hash used to identify par files that belong to a single set. It is a 16-byte binary value.

par_set_hash_hex

Behaviour is defined only for files whose type is TYPE_PAR.

As for par_set_hash, but as 16 pairs of hex characters representing the 16 bytes.

INSTANCE HIGHER-LEVEL PROCEDURES

add_local_lookup

Add a lookup function to this instance only. A method with the same name will be created, to provide the cached lookup.

The new method affects this instance only; other instances keep their own local methods. The local method overrides any global method of the same name. However, using the class interface (e.g., File::Info->local($absname)) always invokes the global method, if any (and fails, if not).

ARGUMENTS

name

The name may consist only of letters, digits, and underscore characters. The first character must be a letter, and at least one digit or lower-case letter must be present.

Built-in names will always be lower-case. If you stick to lower-case too, then you will need to make no change if your identifier should get absorbed into the core. On the other hand, if you use some upper-case letters (e.g., StudlyCaps), then you are assured that you will never clash with internal names.

These other names are reserved:

  add_local_lookup add_global_lookup isa import new dirname

code

The code to call to calculate the value. The code will be passed the absolute name of the file to look up, and is expected to return a suitable value. The value will be cached.
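The precedence between local and global lookups can be illustrated with a small stand-alone class. My::LookupDemo and its methods are hypothetical and exist only for this sketch; it shows the resolution order described above and omits caching entirely.

```perl
use strict;
use warnings;

package My::LookupDemo;    # hypothetical class, not part of File::Info

my %GLOBAL;                # name => code, shared by every instance

sub new {
    my ($class, $dir) = @_;
    return bless { dir => $dir, local => {} }, $class;
}

sub add_global { my ($class, $name, $code) = @_; $GLOBAL{$name} = $code; return }
sub add_local  { my ($self,  $name, $code) = @_; $self->{local}{$name} = $code; return }

# Instance call: $fn is a basename; a local lookup wins over a global one.
# Class call: $fn must be absolute; only global lookups are consulted.
sub lookup {
    my ($invocant, $name, $fn) = @_;
    if (ref $invocant) {
        my $code = $invocant->{local}{$name} || $GLOBAL{$name}
            or die "No lookup named '$name'\n";
        return $code->("$invocant->{dir}/$fn");
    }
    my $code = $GLOBAL{$name} or die "No global lookup named '$name'\n";
    return $code->($fn);
}

package main;
```

In both cases the code reference receives an absolute file name, matching the description of the code argument. A lookup that was registered only locally fails when requested through the class.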

EXAMPLES

BUGS

REPORTING BUGS

Email the author.

AUTHOR

Martyn J. Pearce fluffy@cpan.org

COPYRIGHT

SEE ALSO

