Compress::BraceExpansion - create a human-readable compressed string suitable for shell brace expansion


Contained in the Compress-BraceExpansion distribution.


NAME

Compress::BraceExpansion - create a human-readable compressed string suitable for shell brace expansion

VERSION

This document describes Compress::BraceExpansion version 0.1.5. This is a beta release.

SYNOPSIS

    use Compress::BraceExpansion;

    # output: ab{c,d}
    print Compress::BraceExpansion->new( qw( abc abd ) )->shrink();

    # output: aabb{cc,dd}
    print Compress::BraceExpansion->new( qw( aabbcc aabbdd ) )->shrink();

    # output: aa{bb{cc,dd},eeff}
    print Compress::BraceExpansion->new( qw( aabbcc aabbdd aaeeff ) )->shrink();




DESCRIPTION

Shells such as bash and zsh have a feature called brace expansion. It allows users to specify an expression that generates a series of strings containing similar patterns. For example:

  $ echo a{b,c}
  ab ac

  $ echo aa{bb,xx}cc
  aabbcc aaxxcc

  $ echo a{b,x}c{d,y}e
  abcde abcye axcde axcye

  $ echo a{b,x{y,z}}c
  abc axyc axzc
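To make the expansion rules in these examples concrete, here is a minimal Perl sketch of a brace expander. The expand_braces function is a hypothetical helper written purely for illustration; it is not part of Compress::BraceExpansion, and it ignores shell details such as quoting, escaping, and sequence expressions like {1..5}. It also assumes that unbalanced or comma-free groups are simply left untouched.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only; not part of this module.
# Expands the first top-level brace group, then recurses on each result.
sub expand_braces {
    my ($expr) = @_;

    my $start = index $expr, '{';
    return ($expr) if $start < 0;    # no brace group at all

    # Find the matching '}' and the commas that sit at nesting depth 1.
    my ( $depth, $end, @commas ) = (0);
    for my $i ( $start .. length($expr) - 1 ) {
        my $char = substr $expr, $i, 1;
        if    ( $char eq '{' ) { $depth++ }
        elsif ( $char eq ',' ) { push @commas, $i if $depth == 1 }
        elsif ( $char eq '}' ) {
            next if --$depth;        # still inside a nested group
            $end = $i;
            last;
        }
    }

    # Assumption: unbalanced or comma-free groups are returned unchanged.
    return ($expr) unless defined $end && @commas;

    my $prefix = substr $expr, 0, $start;
    my $suffix = substr $expr, $end + 1;

    # Substitute each alternative in turn and expand whatever remains.
    my @results;
    my $from = $start + 1;
    for my $stop ( @commas, $end ) {
        my $alternative = substr $expr, $from, $stop - $from;
        push @results, expand_braces( $prefix . $alternative . $suffix );
        $from = $stop + 1;
    }
    return @results;
}

print join( ' ', expand_braces('a{b,x{y,z}}c') ), "\n";    # abc axyc axzc
```

Recursing on the whole substituted string keeps the output in the same left-to-right order as the shell examples.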

This module was designed to take a list of strings with similar patterns (e.g. the output of a shell expansion) and generate an un-expanded expression. Given a reasonably sized array of similar strings, this module will generate a single compressed string that can be comfortably parsed by a human.

The current algorithm only works for groups of input strings that start and/or end with similar characters. See the BUGS AND LIMITATIONS section for more details.
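As a rough illustration of what "brace compression" involves, the following sketch factors out shared prefixes with a simple character trie. This is not the algorithm used by Compress::BraceExpansion; prefix_compress and _render are hypothetical names, shared suffixes are not factored out, and input strings containing braces or commas are not escaped.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; not the module's implementation.
# Builds a character trie from the strings, then renders it as braces.
sub prefix_compress {
    my @strings = @_;
    return '' unless @strings;

    my %trie;
    for my $string (@strings) {
        my $node = \%trie;
        for my $char ( split //, $string ) {
            $node->{$char} //= {};
            $node = $node->{$char};
        }
        $node->{''} = 1;    # the empty key marks the end of a string
    }
    return _render( \%trie );
}

# Renders one trie node: a single branch is emitted inline, several
# branches become a comma-separated brace group.
sub _render {
    my ($node) = @_;
    my @branches;
    for my $key ( sort keys %$node ) {
        push @branches, $key eq '' ? '' : $key . _render( $node->{$key} );
    }
    return $branches[0] if @branches == 1;
    return '{' . join( ',', @branches ) . '}';
}

print prefix_compress(qw( aabbcc aabbdd aaeeff )), "\n";    # aa{bb{cc,dd},eeff}
```

Because only prefixes are shared, an input such as aabbcc aaxxcc comes out as aa{bbcc,xxcc} rather than the shorter aa{bb,xx}cc.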

WHY?

My initial motivation for writing this module was to reduce the number of characters needed to display a list of server names, e.g. to send in the subject of a text message to a pager/mobile phone. If I start with a long list of servers that follow a standard naming convention, e.g.:

    app-dc-srv01 app-dc-srv02 app-dc-srv03 app-dc-srv04 app-dc-srv05
    app-dc-srv06 app-dc-srv07 app-dc-srv08 app-dc-srv09 app-dc-srv10

After being run through this module, they can be displayed much more efficiently on a pager as:

    app-dc-srv{0{1,2,3,4,5,6,7,8,9},10}

The algorithm can also be useful for directories:

    /usr/local/{bin,etc,lib,man,sbin}




BRACE EXPANSION?

Despite the name, this module does not perform brace expansion. If it did, it probably should have been located in the Shell:: hierarchy. It attempts to do the opposite, which might be referred to as 'brace compression', hence its location in the Compress:: hierarchy. The strings it generates could be used in a shell, but are more likely to be useful as a (potentially) human-readable compressed string. I chose the name BraceExpansion since that's the common term, so hopefully it will be more recognizable than if it were named BraceCompression.

CONSTRUCTOR

new( )

Returns a reference to a new Compress::BraceExpansion object.

May be initialized with a hash of options:

    Compress::BraceExpansion->new( { strings => [ qw( abc abd ) ] } );

Or with an array ref:

    Compress::BraceExpansion->new( [ qw( abc abd ) ] );

Or with an array:

    Compress::BraceExpansion->new( qw( abc abd ) );

This is an inside-out Perl class. For more info, see "Perl Best Practices" by Damian Conway.
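The three calling conventions can be compared side by side, as in the short sketch below. It assumes the module is installed, and the expected result of ab{c,d} is taken from the SYNOPSIS rather than from an independent run.

```perl
use strict;
use warnings;
use Compress::BraceExpansion;

my @names = qw( abc abd );

# The three documented ways to pass the input strings to new().
my @objects = (
    Compress::BraceExpansion->new( { strings => [@names] } ),    # hash of options
    Compress::BraceExpansion->new( [@names] ),                   # array ref
    Compress::BraceExpansion->new(@names),                       # plain list
);

# Call shrink() exactly once per object and keep the results.
my @results = map { $_->shrink() } @objects;

# According to the SYNOPSIS, each line should read ab{c,d}.
print "$_\n" for @results;
```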

METHODS

shrink( )

Perform brace compression on strings. Returns a string that is suitable for brace expansion by the shell.

This method was not designed to be called multiple times on the same Compress::BraceExpansion object. If you call shrink() more than once on the same object, you're on your own.
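One way to stay within that restriction is to create a fresh object for every list. The sketch below assumes the module is installed; brace_compress is a hypothetical wrapper name, not something the module provides.

```perl
use strict;
use warnings;
use Compress::BraceExpansion;

# Hypothetical wrapper: builds a new object for each list, so shrink()
# is never called twice on the same object.
sub brace_compress {
    my @strings = @_;
    return Compress::BraceExpansion->new(@strings)->shrink();
}

# Expected values are those shown in the SYNOPSIS.
print brace_compress(qw( abc abd )),       "\n";    # ab{c,d}
print brace_compress(qw( aabbcc aabbdd )), "\n";    # aabb{cc,dd}
```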

enable_debug( )

Enable various internal data structures to be printed to stdout.

BUGS AND LIMITATIONS

The current algorithm is pretty ugly, and will only compress strings that start and/or end with similar text. I've been working on a new algorithm that uses a weighted trie.

If multiple identical strings are supplied as input, they will only be represented once in the resulting compressed string. For example, if "aaa aaa aab" were supplied as input, then shrink() would simply return "aa{a,b}".

This module has reasonably fast performance up to at least 1000 input strings. I've run several tests where I cut a 10k word slice from /usr/share/dict/words and have consistently achieved around 50% compression. However, even for strings that are very similar, the output rapidly loses human readability beyond a couple hundred characters.
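This document does not say how the compression figure was measured. As one possible definition, the sketch below compares the length of the compressed string with the length of the input strings joined by single spaces. The space_saved helper is hypothetical, and the sample data is the server list from the WHY? section.

```perl
use strict;
use warnings;

# Hypothetical helper: fraction of characters saved, measured against
# the input strings joined by single spaces (an assumed definition).
sub space_saved {
    my ( $compressed, @strings ) = @_;
    my $original_length = length( join ' ', @strings );
    return 0 unless $original_length;
    return 1 - length($compressed) / $original_length;
}

# app-dc-srv01 .. app-dc-srv10: 129 characters when joined by spaces.
my @servers    = map { sprintf 'app-dc-srv%02d', $_ } 1 .. 10;
my $compressed = 'app-dc-srv{0{1,2,3,4,5,6,7,8,9},10}';    # 35 characters

printf "%.0f%% saved\n", 100 * space_saved( $compressed, @servers );    # 73% saved
```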

Please report problems to VVu@geekfarm.org.

Patches and suggestions are welcome!

SEE ALSO

  - brace-compress - included command line script in scripts/ directory

  - http://www.gnu.org/software/bash/manual/bashref.html#SEC27

  - http://zsh.sourceforge.net/Doc/Release/zsh_13.html#SEC60




AUTHOR

Alex White <vvu@geekfarm.org>

LICENCE AND COPYRIGHT
