BuildCache - subroutines for handling the makepp build cache


makepp documentation  | view source Contained in the makepp distribution.


NAME

BuildCache -- subroutines for handling the makepp build cache

SYNOPSIS

    $bc = new BuildCache("/path/to/build_cache", $create_flags_hash);
    $bc->cache_file($file_info_of_file_to_archive, $file_key);
    $bc->cleanup();	 # Clean out files that haven't been used for a while.
    $bc_entry = $bc->lookup_file($file_key);

    $build_info = $bc_entry->build_info;
    $bc_entry->copy_from_cache($output_finfo);

The BuildCache package

The BuildCache is a cache system that makepp uses to store the results of compilation so that they can be reused later. If a file with the same input signature is needed again, it can be fetched from the cache immediately instead of being rebuilt. This can cut down compilation time significantly in a number of cases.

Cache format

The cache is actually a directory hierarchy where the filename of each file is the build cache key. For example, if the build cache key of a file is 0123456789abcdef, the actual file name might be 01/234/56789abcdef_xyz.o. On some file systems, performance suffers if there are too many files per directory, so BuildCache can automatically break them up into directories as shown.
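The directory split described above can be sketched as follows. This is only an illustration, not makepp's code: the helper name key_to_relative_path and its list of cut offsets are assumptions made for this example, and the sketch leaves out the _xyz.o suffix shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: split a cache key into nested directory components.
# @cuts are cumulative character offsets at which a "/" is inserted;
# (2, 5) reproduces the 01/234/... layout from the example above.
sub key_to_relative_path {
    my ($key, @cuts) = @_;
    my @parts;
    my $start = 0;
    for my $cut (@cuts) {
        last if $cut >= length $key;     # key too short to split further
        push @parts, substr($key, $start, $cut - $start);
        $start = $cut;
    }
    push @parts, substr($key, $start);   # remainder becomes the file name
    return join '/', @parts;
}

print key_to_relative_path('0123456789abcdef', 2, 5), "\n";
# prints 01/234/56789abcdef
```

Passing no cut offsets leaves the key unsplit, which corresponds to a flat cache directory.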

BuildCache remembers the key that it was given, which is presumably some sort of hash of all the inputs that went into building the file. It also stores the build info structure for the file. This is intended to help in the very rare case of a key collision, where several different files have the same key. BuildCache cannot store multiple files with the same key, but because the build information is stored, it is at least possible to detect that the cached file is the wrong one.
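A minimal sketch of how stored build information could expose a key collision. The function build_info_matches and the field names COMMAND and SORTED_DEPS are chosen for illustration only; the build info is modelled here as a plain hash, which may not match makepp's internal representation.

```perl
use strict;
use warnings;

# Hypothetical check: does the build info stored with a cache entry match
# the build info we expect for the target? Both are plain hashes here.
sub build_info_matches {
    my ($stored, $expected) = @_;
    for my $field (keys %$expected) {
        return 0 unless defined $stored->{$field}
                     && $stored->{$field} eq $expected->{$field};
    }
    return 1;
}

# Two different builds that (by assumption) produced the same cache key.
my %stored   = (COMMAND => 'cc -c a.c -o a.o', SORTED_DEPS => 'a.c a.h');
my %expected = (COMMAND => 'cc -c b.c -o b.o', SORTED_DEPS => 'b.c b.h');

print build_info_matches(\%stored, \%expected)
    ? "cache entry is usable\n"
    : "key collision: entry was built from different inputs\n";
```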

Use of FileInfo

We do not use the FileInfo class to store information about the files in the build cache. The reason is that we don't want to waste the memory storing all the results. Typically things are looked up once in the build cache and never examined again, so it's a waste of memory to build up the FileInfo structures for them. For this reason, for any files in the build cache directories, we do the stat and other operations directly instead of calling the FileInfo subroutines.

We do use the FileInfo subroutines for files stored elsewhere, however.

new BuildCache("/path/to/cache");

Opens an existing build cache.

cache_file

   $build_cache->cache_file($file_info, $file_key, $build_info);

Copies or links the file into the build cache under the given file key. The build information is stored alongside the file, so that when the file is retrieved we can verify that it is exactly what we want.

Returns a true value if the operation succeeded, false if any part failed. If anything failed while updating the build cache, the cache is cleaned up and left in a consistent state.

lookup_file

  $bc_entry = $bc->lookup_file($file_key);

Looks up a file by its cache key. Returns undef if the file does not exist in the cache, or a BuildCache::Entry structure if it does. You can query the BuildCache::Entry structure to see what the build info is, or to copy the file into the current directory.

copy_check_md5

    my $md5;
    my $result = copy_check_md5("in", "out", \$md5, $setmode);

Assuming that the input file is atomically generated and removed, copy_check_md5 will either copy the file as-is or return undef with $! set, even if the input file is unlinked and/or re-created concurrently, even over NFS. Mode bits are copied as well if $setmode is true. copy_check_md5 will instead die if it detects that the input file is not being written atomically, or if it detects something that it can't explain.

If a Digest object is provided as a third argument, then the file's content is added to it. It may be modified even if the copy fails. See Digest(3pm).

A successful copy will return a 2-element array consisting of the size and modification time of the input file.

If the return value is an empty array, then $! is set as follows:

ENOENT

The input file was removed while it was being read.

ESTALE

The output file was removed while it was being written, or the directory containing the input file was removed.

Others

Many other errors are possible, such as EACCES, EINTR, EIO, EISDIR, ENFILE, EMFILE, EFBIG, ENOSPC, EROFS, EPIPE, ENAMETOOLONG, ENOSTR. In most cases, these are non-transient conditions that require manual intervention, and should therefore cause the program to terminate.
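The return convention and error classification described above can be illustrated with a simplified stand-in. The function simple_copy_with_digest is hypothetical and is not copy_check_md5: it performs none of the atomicity or NFS checks, does not copy mode bits, and uses only core Perl modules. It shows only the shape of the interface: a (size, mtime) list on success, an empty list with $! set on failure, and a caller that treats ENOENT and ESTALE as transient.

```perl
use strict;
use warnings;
use Digest::MD5;

# Simplified stand-in for copy_check_md5 (hypothetical; no atomicity or
# NFS checks). Returns (size, mtime) on success, or an empty list with $! set.
sub simple_copy_with_digest {
    my ($in, $out, $digest) = @_;
    open my $src, '<:raw', $in  or return;
    my ($size, $mtime) = (stat $src)[7, 9];
    open my $dst, '>:raw', $out or return;
    my $buf;
    while (1) {
        my $n = read $src, $buf, 65536;
        return unless defined $n;          # read error: $! is set
        last if $n == 0;                   # end of file
        $digest->add($buf) if $digest;     # digest may be modified even on failure
        print {$dst} $buf or return;
    }
    close $dst or return;
    return ($size, $mtime);
}

my $md5 = Digest::MD5->new;
my @result = simple_copy_with_digest('no-such-input-file', 'copy.out', $md5);
if (@result) {
    printf "copied %d bytes, md5 %s\n", $result[0], $md5->hexdigest;
} elsif ($!{ENOENT} || $!{ESTALE}) {
    warn "input vanished, treating as a cache miss: $!\n";   # transient
} else {
    die "copy failed: $!\n";               # likely needs manual intervention
}
```

The caller checks the %! hash rather than importing errno constants, so the sketch still compiles on platforms that do not define ESTALE.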

The BuildCache::Entry package

A BuildCache::Entry is an object returned by BuildCache::lookup_file. You can do the following with the object:

absolute_filename

   $bc_entry->absolute_filename

Returns the name of the file in the build cache.

copy_from_cache

  $bc_entry->copy_from_cache($output_finfo, $rule, \$reason);

Replaces the file in $output_finfo with the file from the cache, and updates all the FileInfo data structures to reflect this change. The build info signature is checked against the target file in the cache, and if $::md5check_bc is set, then the MD5 checksum is also verified.

Returns true if the file was successfully restored from the cache, false if not. (I think the only reason it wouldn't be successfully restored is that someone deleted the file from the cache between the time it was returned from lookup_file and the time copy_from_cache was invoked.) If it returns false, then $reason is set to a string that explains why. If $reason ends with '(OK)', then the failure could have been due to legitimate concurrent access to the build cache. On failure, the output target is unlinked.
