######################################################################

Archive::Tar::Wrapper 0.14
######################################################################

NAME

Archive::Tar::Wrapper - API wrapper around the 'tar' utility

SYNOPSIS

        use Archive::Tar::Wrapper;

        my $arch = Archive::Tar::Wrapper->new();

            # Open a tarball, expand it into a temporary directory
        $arch->read("archive.tgz");

            # Iterate over all entries in the archive
        $arch->list_reset(); # Reset Iterator
                             # Iterate through archive
        while(my $entry = $arch->list_next()) {
            my($tar_path, $phys_path) = @$entry;
            print "$tar_path\n";
        }

            # Get a huge list with all entries
        for my $entry (@{$arch->list_all()}) {
            my($tar_path, $real_path) = @$entry;
            print "Tarpath: $tar_path Tempfile: $real_path\n";
        }

            # Add a new entry
        $arch->add($logic_path, $file_or_stringref);

            # Remove an entry
        $arch->remove($logic_path);

            # Find the physical location of a temporary file
        my($tmp_path) = $arch->locate($tar_path);

            # Create a tarball
        $arch->write($tarfile, $compress);

DESCRIPTION

Archive::Tar::Wrapper is an API wrapper around the 'tar' command line utility. It never stores anything in memory, but works on temporary directory structures on disk instead. It provides a mapping between the logical paths in the tarball and the 'real' files in the temporary directory on disk.

It differs from Archive::Tar in two ways:

METHODS

my $arch = Archive::Tar::Wrapper->new()

        Constructor for the tar wrapper class. Finds the "tar" executable by
        searching "PATH" and returning the first hit. In case you want to
        use a different tar executable, you can specify it as a parameter:

            my $arch = Archive::Tar::Wrapper->new(tar => '/path/to/tar');

        Since "Archive::Tar::Wrapper" creates temporary directories to store
        tar data, the location of the temporary directory can be specified:

            my $arch = Archive::Tar::Wrapper->new(tmpdir => '/path/to/tmpdir');

        Tremendous performance increases can be achieved if the temporary
        directory is located on a ram disk. Check the "Using RAM Disks"
        section below for details.

        Additional options can be passed to the "tar" command by using the
        "tar_read_options" and "tar_write_options" parameters. Example:

             my $arch = Archive::Tar::Wrapper->new(
                           tar_read_options => "p"
                        );

        will use "tar xfp archive.tgz" to extract the tarball instead of
        just "tar xf archive.tgz". GNU tar supports even more options, which
        can be passed in via

             my $arch = Archive::Tar::Wrapper->new(
                            tar_gnu_read_options => ["--numeric-owner"],
                        );

        By default, the "list_*()" functions will return only file entries.
        Directories will be suppressed. To have "list_*()" return
        directories as well, use

             my $arch = Archive::Tar::Wrapper->new(
                           dirs  => 1
                        );

        If more files are added to a tarball than the command line can
        handle, "Archive::Tar::Wrapper" will switch from using the command

            tar cfv tarfile file1 file2 file3 ...

        to

            tar cfv tarfile -T filelist

        where "filelist" is a file containing all files to be added. The
        default threshold for this switch is 512 files, but it can be changed
        by setting the parameter "max_cmd_line_args":

             my $arch = Archive::Tar::Wrapper->new(
                 max_cmd_line_args  => 1024
             );
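
        The switch described above can be sketched in plain Perl. This is
        only an illustration under assumptions: "build_tar_args" is a
        hypothetical helper, not part of the module's API, and the exact
        comparison the module uses against "max_cmd_line_args" may differ.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper, not part of Archive::Tar::Wrapper's API:
# returns the argument list for a "tar cfv" call, falling back to
# "-T filelist" once the number of files exceeds $max_args.
sub build_tar_args {
    my ($tarfile, $files, $max_args) = @_;
    $max_args = 512 unless defined $max_args;

    # Few enough files: pass them directly on the command line.
    return ('cfv', $tarfile, @$files) if @$files <= $max_args;

    # Too many files: write one name per line to a temporary list
    # file and let tar read the names via -T.
    my ($fh, $filelist) = tempfile(UNLINK => 1);
    print {$fh} "$_\n" for @$files;
    close $fh or die "Cannot close $filelist: $!";

    return ('cfv', $tarfile, '-T', $filelist);
}

my @few  = build_tar_args('out.tar', [ 'a.txt', 'b.txt' ]);
my @many = build_tar_args('out.tar', [ map { "file$_" } 1 .. 600 ]);
print "few:  @few\n";
print "many: @many\n";
```

        Note that a list file with one name per line cannot represent file
        names that contain newlines.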

$arch->read("archive.tgz")

        "read()" opens the given tarball, expands it into a temporary
        directory and returns 1 on success and "undef" on failure. The
        temporary directory holding the tar data gets cleaned up when $arch
        goes out of scope.

        "read" handles both compressed and uncompressed files. To find out
        if a file is compressed or uncompressed, it tries to guess by
        extension, then by checking the first couple of bytes in the
        tarfile.
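
        A minimal sketch of such a two-step guess, assuming gzip is the
        compression in question. "looks_gzipped" is a hypothetical helper
        written for illustration; the module's own check may differ in
        detail.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the two-step guess; not the
# module's actual implementation.
sub looks_gzipped {
    my ($tarfile) = @_;

    # Step 1: trust a conventional extension (.gz or .tgz).
    return 1 if $tarfile =~ /\.t?gz$/i;

    # Step 2: fall back to the gzip magic number, the bytes 0x1f 0x8b
    # at the very start of the file.
    open my $fh, '<', $tarfile or die "Cannot open $tarfile: $!";
    binmode $fh;
    my $n = read($fh, my $magic, 2);
    close $fh;

    return (defined $n && $n == 2 && $magic eq "\x1f\x8b") ? 1 : 0;
}

# The extension alone decides here, so the file is never opened.
print looks_gzipped("archive.tgz") ? "compressed\n" : "uncompressed\n";
```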

        If only a limited number of files is needed from a tarball, they can
        be specified after the tarball name:

            $arch->read("archive.tgz", "path/file.dat", "path/sub/another.txt");

        The file names are passed unmodified to the "tar" command, so make
        sure that the file paths match exactly what's in the tarball;
        otherwise "read()" will fail.

$arch->list_reset()

        Resets the list iterator. To be used before the first call to
        "$arch->list_next()".

my $entry = $arch->list_next()

        Returns the next item in the tarfile. As in the SYNOPSIS loop, the
        item is a reference to an array of three scalars, which can be
        unpacked with "my($tar_path, $phys_path, $type) = @$entry": the
        relative path of the item in the tarfile, the physical path to the
        unpacked file or directory on disk, and the type of the entry
        (f=file, d=directory, l=symlink). Note that by default,
        Archive::Tar::Wrapper won't return directories unless the "dirs"
        parameter is set when running the constructor.

my $items = $arch->list_all()

        Returns a reference to a (possibly huge) array of items in the
        tarfile. Each item is a reference to an array, containing two
        elements: the relative path of the item in the tarfile and the
        physical path to the unpacked file or directory on disk.

        To iterate over the list, the following construct can be used:

                # Get a huge list with all entries
            for my $entry (@{$arch->list_all()}) {
                my($tar_path, $real_path) = @$entry;
                print "Tarpath: $tar_path Tempfile: $real_path\n";
            }

        If the list of items in the tarfile is big, use "list_reset()" and
        "list_next()" instead of "list_all".

$arch->add($logic_path, $file_or_stringref, [$options])

        Adds a new file to the tarball. $logic_path is the virtual path of
        the file within the tarball. $file_or_stringref is either a scalar
        holding the physical path of a file on disk to be transferred (i.e.
        copied) to the tarball, or a reference to a scalar whose content is
        used as the data of the file.

        If no additional parameters are given, permissions and user/group id
        settings of a file to be added are copied. If you want different
        settings, specify them in the options hash:

            $arch->add($logic_path, $stringref, 
                       { perm => 0755, uid => 123, gid => 10 });

        If $file_or_stringref is a reference to a Unicode string, the
        "binmode" option has to be set to make sure the string gets written
        as proper UTF-8 into the tarfile:

            $arch->add($logic_path, $stringref, { binmode => ":utf8" });
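
        Why the layer matters can be seen without the module. The sketch
        below uses a hypothetical helper, "write_string", that writes a
        string reference to a temporary file with or without a layer,
        roughly as "add()" is described to do, and reports the resulting
        size in bytes.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper, not part of the module: write the referenced
# string to a temporary file, optionally applying an I/O layer, and
# return the file size in bytes.
sub write_string {
    my ($stringref, $binmode) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);
    binmode $fh, $binmode if defined $binmode;
    print {$fh} $$stringref;
    close $fh or die "Cannot close $path: $!";
    return -s $path;
}

my $text = "caf\x{e9}";    # four characters, the last one is U+00E9
printf "no layer: %d bytes, :utf8 layer: %d bytes\n",
    write_string(\$text), write_string(\$text, ":utf8");
```

        With the ":utf8" layer, U+00E9 is encoded as two bytes, so the
        second file is one byte larger than the first on a default setup.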

$arch->remove($logic_path)

        Removes a file from the tarball. $logic_path is the virtual path of
        the file within the tarball.

$arch->locate($logic_path)

        Finds the physical location of a file, specified by $logic_path,
        which is the virtual path of the file within the tarball. Returns a
        path to the temporary file "Archive::Tar::Wrapper" created to
        manipulate the tarball on disk.

$arch->write($tarfile, $compress)

        Write out the tarball by tarring up all temporary files and
        directories and store it in $tarfile on disk. If $compress holds a
        true value, compression is used.
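
        A hedged round-trip sketch that combines the methods documented
        here: "add()" with string references, "write()" with compression,
        then "read()" and the list iterator. It assumes the module and a
        "tar" executable are installed; "roundtrip" and the file names
        are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Archive::Tar::Wrapper;

# Create a compressed tarball from in-memory strings, then read it
# back and return the tar paths found in it.
sub roundtrip {
    my ($tarfile, %content) = @_;

    my $out = Archive::Tar::Wrapper->new();
    for my $logic_path (sort keys %content) {
        # A scalar reference is taken as file data, not as a file name.
        $out->add($logic_path, \$content{$logic_path}, { perm => 0644 });
    }
    $out->write($tarfile, 1);    # true second argument: compress
    die "No tarball written to $tarfile" unless -s $tarfile;

    my $in = Archive::Tar::Wrapper->new();
    $in->read($tarfile) or die "Cannot read $tarfile";

    my @paths;
    $in->list_reset();
    while (my $entry = $in->list_next()) {
        my ($tar_path, $phys_path) = @$entry;
        push @paths, $tar_path;
    }
    @paths = sort @paths;
    return @paths;
}

my $dir   = tempdir(CLEANUP => 1);
my @paths = roundtrip(
    "$dir/demo.tgz",
    'docs/hello.txt' => "Hello\n",
    'docs/bye.txt'   => "Bye\n",
);
print "$_\n" for @paths;
```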

$arch->tardir()

        Return the directory the tarball was unpacked in. This is sometimes
        useful to play dirty tricks on "Archive::Tar::Wrapper" by
        mass-manipulating unpacked files before wrapping them back up into
        the tarball.
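
        One such trick might look like the sketch below, which appends a
        line to every ".txt" file under a directory. "stamp_text_files"
        is a hypothetical helper; in real use the directory passed in
        would be the value returned by "$arch->tardir()", followed by a
        call to "$arch->write()".

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: append $line to every regular *.txt file below
# $dir and return the number of files touched.
sub stamp_text_files {
    my ($dir, $line) = @_;
    my $count = 0;
    find(
        {
            no_chdir => 1,    # $_ holds the full path of each entry
            wanted   => sub {
                return unless -f $_ && /\.txt$/;
                open my $fh, '>>', $_ or die "Cannot append to $_: $!";
                print {$fh} $line;
                close $fh or die "Cannot close $_: $!";
                $count++;
            },
        },
        $dir
    );
    return $count;
}
```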

$arch->is_gnu()

        Checks if the tar executable is a GNU tar by running 'tar --version'
        and parsing the output for "GNU".
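
        The check can be approximated as follows. "version_is_gnu" is a
        hypothetical helper written for this sketch; it only looks for the
        string "GNU" in whatever "tar --version" prints and is not the
        module's own code.

```perl
use strict;
use warnings;

# Hypothetical helper: decide from the text printed by "tar --version"
# whether the executable identifies itself as GNU tar.
sub version_is_gnu {
    my ($version_output) = @_;
    return 0 unless defined $version_output;
    return $version_output =~ /GNU/ ? 1 : 0;
}

# Run the tar found in PATH. If tar is missing, the captured text is
# empty or a shell error message, neither of which mentions GNU.
my $output = `tar --version 2>&1`;
print version_is_gnu($output) ? "GNU tar\n" : "not GNU tar (or no tar found)\n";
```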

Using RAM Disks

On Linux, it's quite easy to create a RAM disk and achieve tremendous speedups while untarring or modifying a tarball. You can either create the RAM disk by hand by running

       # mkdir -p /mnt/myramdisk
       # mount -t tmpfs -o size=20m tmpfs /mnt/myramdisk

and then feeding the ramdisk as a temporary directory to Archive::Tar::Wrapper, like

       my $tar = Archive::Tar::Wrapper->new( tmpdir => '/mnt/myramdisk' );

or using Archive::Tar::Wrapper's built-in option 'ramdisk':

       my $tar = Archive::Tar::Wrapper->new( 
           ramdisk => { 
               type => 'tmpfs',
               size => '20m',   # 20 MB
           },
       );

The only drawback of the latter option is that creating the RAM disk needs to be performed as root, which often isn't desirable for security reasons. For this reason, Archive::Tar::Wrapper offers utility functions that mount the ramdisk, returning the temporary directory it's located in, and unmount it again:

          # Create new ramdisk (as root):
        my $tmpdir = Archive::Tar::Wrapper->ramdisk_mount(
            type => 'tmpfs',
            size => '20m',   # 20 MB
        );

          # Delete a ramdisk (as root):
        Archive::Tar::Wrapper->ramdisk_unmount();

Optionally, the "ramdisk_mount()" command accepts a "tmpdir" parameter pointing to a temporary directory for the ramdisk if you wish to set it yourself instead of letting Archive::Tar::Wrapper create it automatically.

KNOWN LIMITATIONS

BUGS

Archive::Tar::Wrapper doesn't currently handle filenames with embedded newlines.

LEGALESE

Copyright 2005 by Mike Schilli, all rights reserved. This program is free software, you can redistribute it and/or modify it under the same terms as Perl itself.

AUTHOR

2005, Mike Schilli <cpan@perlmeister.com>