Hash::MultiKey - hashes whose keys can be multiple


Hash-MultiKey documentation Contained in the Hash-MultiKey distribution.

Index


Code Index:

NAME

Top

Hash::MultiKey - hashes whose keys can be multiple

SYNOPSIS

Top

  use Hash::MultiKey;

  # tie first
  tie %hmk, 'Hash::MultiKey';

  # store
  $hmk{['foo', 'bar', 'baz']} = 1;

  # fetch
  $v = $hmk{['foo', 'bar', 'baz']};

  # exists
  exists $hmk{['foo', 'bar', 'baz']}; # true

  # each
  while (($mk, $v) = each %hmk) {
      @keys = @$mk;
      # ...
  }

  # keys
  foreach $mk (keys %hmk) {
      @keys = @$mk;
      # ...
  }

  # values
  foreach $v (values %hmk) {
      $v =~ s/foo/bar/g; # alias, modifies value in %hmk
      # ...
  }

  # delete
  $rmed_value = delete $hmk{['foo', 'bar', 'baz']};

  # clear
  %hmk = ();

  # syntactic sugar, but see risks below
  $hmk{'foo', 'bar', 'baz', 'zoo'} = 2;

  # finally, untie
  untie %hmk;

DESCRIPTION

Top

Hash::MultiKey provides hashes that accept arrayrefs of strings as keys.

Two multi-keys are regarded as being equal if their contents are equal, there is no need to use the same reference to refer to the same hash entry:

    $hmk{['foo', 'bar', 'baz']} = 1;
    exists $hmk{['foo', 'bar', 'baz']}; # different arrayref, but true

A given hash can have multi-keys of different lengths:

    $hmk{['foo']}               = 1; # length 1
    $hmk{['foo', 'bar', 'baz']} = 3; # length 3, no problem

In addition, multi-keys cannot be empty:

    $hmk{[]} = 1; # ERROR

The next sections document how hash-related operations work in a multi-key hash. Some parts have been copied from standard documentation, since everything has standard semantics.

tie

Once you have tied a hash variable to Hash::MultiKey as in

    tie my (%hmk), 'Hash::MultiKey';

you've got a hash whose keys are arrayrefs of strings. Having that in mind everything works as expected.

store

Assignment is this easy:

    $hmk{['foo', 'bar', 'baz']} = 1;

fetch

So is fetching:

    $v = $hmk{['foo', 'bar', 'baz']};

exists

Testing for existence works as usual:

    exists $hmk{['foo', 'bar', 'baz']}; # true

Only whole multi-keys as they were used in assigments have entries. Sub-chains do not exist unless they were assigned some value.

For instance, ['foo'] is a sub-chain of ['foo', 'bar', 'baz'], but if it has no entry in %hmk so far

    exists $hmk{['foo']}; # false

each

As with everyday each(), when called in list context returns a 2-element list consisting of the key and value for the next element of the hash, so that you can iterate over it. When called in scalar context, returns only the key for the next element in the hash.

Remember keys are arrayrefs of strings here:

    while (($mk, $v) = each %hmk) {
        @keys = @$mk;
        # ...
    }

The order in which entries are returned is guaranteed to be the same one as either the keys() or values() function would produce on the same (unmodified) hash.

When the hash is entirely read, a null array is returned in list context (which when assigned produces a false (0) value), and undef in scalar context. The next call to each() after that will start iterating again.

There is a single iterator for each hash, shared by all each(), keys(), and values() function calls in the program.

Adding or deleting entries while we're iterating over the hash results in undefined behaviour. Nevertheless, it is always safe to delete the item most recently returned by each(), which means that the following code will work:

    while (($mk, $v) = each %hmk) {
        print "@$mk\n";
        delete $hmk{$mk}; # this is safe
    }

keys

Returns a list consisting of all the keys of the named hash. (In scalar context, returns the number of keys.) The keys are returned in an apparently random order. The actual random order is subject to change in future versions of perl, but it is guaranteed to be the same order as either the values() or each() function produces (given that the hash has not been modified). As a side effect, it resets hash's iterator.

Remember keys are arrayrefs of strings here:

    foreach $mk (keys %hmk) {
        @keys = @$mk;
        # ...
    }

There is a single iterator for each hash, shared by all each(), keys(), and values() function calls in the program.

The returned values are copies of the original keys in the hash, so modifying them will not affect the original hash. Compare values().

values

Returns a list consisting of all the values of the named hash. (In a scalar context, returns the number of values.) The values are returned in an apparently random order. The actual random order is subject to change in future versions of perl, but it is guaranteed to be the same order as either the keys() or each() function would produce on the same (unmodified) hash.

Note that the values are not copied, which means modifying them will modify the contents of the hash:

   s/foo/bar/g foreach values %hmk;       # modifies %hmk's values
   s/foo/bar/g foreach @hash{keys %hash}; # same

As a side effect, calling values() resets hash's internal iterator.

There is a single iterator for each hash, shared by all each(), keys(), and values() function calls in the program.

delete

Deletes the specified element(s) from the hash. Returns each element so deleted or the undefined value if there was no such element.

The following (inefficiently) deletes all the values of %hmk:

    foreach $mk (keys %hmk) {
        delete $hmk{$mk};
    }

And so do this:

    delete @hmk{keys %hmk};

But both methods are slower than just assigning the empty list to %hmk:

    %hmk = (); # clear %hmk, the efficient way

untie

Untie the variable when you're done:

    untie %hmk;

SYNTACTIC SUGAR

Top

Hash::MultiKey supports also this syntax:

    $hash{'see', '$;', 'in', 'perldoc', 'perlvar'} = 1;

If the key is a string instead of an arrayref the underlying code splits it using $; (see why in MOTIVATION) and from then on the key is an arrayref as any true multi-key. Thus, the assigment above is equivalent to

    $hash{['see', '$;', 'in', 'perldoc', 'perlvar']} = 1;

once it has been processed.

You don't need to split the string back while iterating with each() or keys(), it already comes as an arrayref of strings.

Nevertheless take into account that this is slower, and broken if any of the components contains $;. It is supported just for consistency's sake.

MOTIVATION

Top

Perl comes already with some support for hashes with multi-keys. As you surely know, if perl sees

    $hash{'foo', 'bar', 'baz'} = 1;

it joins ('foo', 'bar', 'baz') with $; to obtain the actual key, thus resulting in a string. Then you retrieve the components of the multi-key like this:

    while (($k, $v) = each %hash) {
        @keys = $k eq '' ? ('') : split /$;/, $k, -1;
        # ...
    }

Since $; is \034 by default, a non-printable character, this is often enough.

Sometimes, however, that's not the most convenient way to work with multi-keys. For instance, that magic join doesn't work with arrays:

    @array = ('foo', 'bar', 'baz');
    $hash{@array} = 1; # WARNING, @array evaluated in scalar context!

You could be dealing with binary data. Or you could be writing a public module that uses user input in such a hash and don't want to rely on input not coming with $;, or don't want to document such an obscure, gratuitous, and implementation dependent constraint.

In such cases, Hash::MultiKey can help.

AUTHORS

Top

Xavier Noria (FXN), Benjamin Goldberg (GOLDBB).

THANKS

Top

Iain Truskett (SPOON) kindly checked whether this module works in perl 5.005 and found out the use of "/" in pack(), introduced in perl 5.006, prevents that.

Philip Monsen reported some tests of Hash::MultiKey 0.05 failed with perl 5.8.4.

COPYRIGHT and LICENSE

Top

SEE ALSO

Top

perlvar, perltie


Hash-MultiKey documentation Contained in the Hash-MultiKey distribution.

package Hash::MultiKey;

use 5.006;
use strict;
use warnings;

use Carp;

use vars qw($VERSION);
$VERSION = '0.06';

# ---[ Implementation Overview ]----------------------------------------
#
# The first implementation of this module was based in an explicit tree.
# Right after its announcement in news:comp.lang.perl.modules Benjamin
# Goldberg suggested a radically different approach, far much simple and
# efficient. The current code is entirely based on his idea.
#
# Multi-key hashes are implemented now with a plain hash. There is no
# nesting involved.
#
# Lists of keys are converted to strings with pack():
#
#     $key = pack 'N' . ('w/a*' x @$keys), scalar(@$keys), @$keys;
#
# and that $key is what's used in the underlying hash. The first chunk
# stores the number of keys, to be used afterwards when we decode it.
# Then, pairs length_of_key/key follow.
#
# Conversely, to retrieve the original list of keys from a real key we
# use unpack():
#
#     $n = unpack 'N', $key;
#     [ unpack 'x4' . ('w/a*' x $n), $key ];
#
# Iteration is delegated to the iterator of the very hash.
#
# Knowing that the following code is crystal clear, so comments have
# been removed altogether.
#
# ----------------------------------------------------------------------


sub TIEHASH {
    bless {}, shift;
}

sub CLEAR {
    %{ shift() } = ();
}

sub FETCH {
    my ($self, $keys) = @_;
    $keys = [$keys eq '' ? ('') : split /$;/, $keys, -1] unless ref $keys eq 'ARRAY';
    @$keys or croak "Empty multi-key\n";
    $self->{pack 'N' . ('w/a*' x @$keys), scalar(@$keys), @$keys};
}

sub STORE {
    my ($self, $keys, $value) = @_;
    $keys = [$keys eq '' ? ('') : split /$;/, $keys, -1] unless ref $keys eq 'ARRAY';
    @$keys or croak "Empty multi-key\n";
    $self->{pack 'N' . ('w/a*' x @$keys), scalar(@$keys), @$keys} = $value;
}

sub DELETE {
    my ($self, $keys) = @_;
    $keys = [$keys eq '' ? ('') : split /$;/, $keys, -1] unless ref $keys eq 'ARRAY';
    @$keys or croak "Empty multi-key\n";
    delete $self->{pack 'N' . ('w/a*' x @$keys), scalar(@$keys), @$keys};
}

sub EXISTS {
    my ($self, $keys) = @_;
    $keys = [$keys eq '' ? ('') : split /$;/, $keys, -1] unless ref $keys eq 'ARRAY';
    @$keys or croak "Empty multi-key\n";
    exists $self->{pack 'N' . ('w/a*' x @$keys), scalar(@$keys), @$keys};
}

sub FIRSTKEY {
    my ($self) = @_;
    keys %$self; # reset iterator
    $self->NEXTKEY;
}

sub NEXTKEY {
    my ($self) = @_;
    defined(my $key = each %$self) or return;
    my $n = unpack 'N', $key;
    [ unpack 'x4' . ('w/a*' x $n), $key ];
}

sub SCALAR {
    my ($self) = @_;
    scalar %$self;
}

1;


__END__