| Number-Nary documentation | Contained in the Number-Nary distribution. |
Number::Nary - encode and decode numbers as n-ary strings
version 1.100311
This module lets you convert numbers into strings that encode the number using the digit set of your choice. For example, you could get routines to convert to and from hex like so:
my ($enc_hex, $dec_hex) = n_codec('0123456789ABCDEF');
my $hex = $enc_hex->(255); # sets $hex to FF
my $num = $dec_hex->('A0'); # sets $num to 160
This would be slow and stupid, since Perl already provides the means to easily and quickly convert between decimal and hex representations of numbers. Number::Nary's utility comes from the fact that it can encode into bases composed of arbitrary digit sets.
my ($enc, $dec) = n_codec('0123'); # base 4 (for working with nybbles?)
# base64
my ($enc, $dec) = n_codec(
join('', 'A' .. 'Z', 'a' .. 'z', 0 .. 9, '+', '/', '=')
);
my ($encode_sub, $decode_sub) = n_codec($digit_string, \%arg);
This routine returns a reference to a subroutine which will encode numbers into the given set of digits and a reference which will do the reverse operation.
The digits may be given as a string or an arrayref. This routine will croak if the set of digits contains repeated digits, or if there could be ambiguity in decoding a string of the given digits. (Number::Nary is overly aggressive about weeding out possibly ambiguous digit sets, for the sake of the author's sanity.)
The encode sub will croak if it is given input other than a non-negative integer.
The decode sub will croak if given a string that contains characters not in the digit string, or, for fixed-string digit sets, if the lenth of the string to decode is not a multiple of the length of the component digits.
Valid arguments to be passed in the second parameter are:
predecode - if given, this coderef will be used to preprocess strings
passed to the decoder
postencode - if given, this coderef will be used to postprocess strings
produced by the encoder
my $string = n_encode($value, $digit_string);
This encodes the given value into a string using the given digit string. It is
written in terms of n_codec, above, so it's not efficient at all for
multiple uses in one process.
my $number = n_decode($string, $digit_string);
This is the decoding equivalent to n_encode, above.
n_codec is exported by default. n_encode and n_decode are exported.
Pairs of routines to encode and decode may be imported by using the
codec_pair group as follows:
use Number::Nary -codec_pair => { digits => '01234567', -suffix => '8' };
my $encoded = encode8($number);
my $decoded = decode8($encoded);
For more information on this kind of exporting, see Sub::Exporter.
I originally used this system to produce unique worksheet names in Excel. I had a large report generating system that used Win32::OLE, and to keep track of what was where I'd Storable-digest the options used to produce each worksheet and then n-ary encode them into the set of characters that were valid in worksheet names. Working out that set of characters was by far the hardest part.
Thanks, Jesse Vincent. When I remarked, on IRC, that this would be trivial to do, he said, "Great. Would you mind doing it?" (Well, more or less.) It was a fun little distraction.
Mark Jason Dominus and Michael Peters offered some useful advice on how to weed out ambiguous digit sets, enabling me to allow digit sets made up of varying-length digits.
Math::BaseCalc is in the same problem space wth Number::Nary. It provides only an OO interface and does not reliably handle multicharacter digits or recognize ambiguous digit sets.
Ricardo Signes <rjbs@cpan.org>
This software is copyright (c) 2010 by Ricardo Signes.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
| Number-Nary documentation | Contained in the Number-Nary distribution. |
use 5.006; use warnings; use strict; package Number::Nary; our $VERSION = '1.100311'; # ABSTRACT: encode and decode numbers as n-ary strings use Carp qw(croak); use Scalar::Util 0.90 qw(reftype); use List::MoreUtils 0.09 qw(uniq); use UDCode (); use Sub::Exporter -setup => { exports => [ qw(n_codec n_encode n_decode) ], groups => { default => [ qw(n_codec) ], codec_pair => \&_generate_codec_pair, } }; sub _generate_codec_pair { my (undef, undef, $arg, undef) = @_; my $local_arg = {%$arg}; my $digits = delete $local_arg->{digits}; my %pair; @pair{qw(encode decode)} = n_codec($digits, $local_arg); return \%pair; } sub _split_len_iterator { my ($length) = @_; return sub { my ($string, $callback) = @_; my $places = length($string) / $length; croak "string length is not a multiple of digit length" unless $places == int $places; for my $position (1 .. $places) { my $digit = substr $string, (-$length * $position), $length; $callback->($digit, $position); } } } sub _split_digit_iterator { my ($digits) = @_; sub { my ($string, $callback) = @_; my @digits; ITER: while (length $string) { for (@$digits) { if (index($string, $_) == 0) { push @digits, substr($string, 0, length $_, ''); next ITER; } } croak "could not decompose string '$string'"; } for (1 .. @digits) { $callback->($digits[-$_], $_); } } } sub _set_iterator { my ($digits, $length_ref) = @_; croak "digit set is empty" unless @$digits; croak "digit set contains zero-length digit" if do { no warnings 'uninitialized'; grep { ! length $_ } @$digits }; croak "digit set contains repeated digits" if @$digits != uniq @$digits; my @lengths = uniq map { length } @$digits; return _split_len_iterator($lengths[0]) if @lengths == 1; croak "digit set may be ambiguous" if ! UDCode::is_udcode(@$digits); return _split_digit_iterator($digits); } sub n_codec { my ($digit_set, $arg) = @_; my @digits; if (ref $digit_set) { croak "digit set must be a string or arrayref" unless reftype $digit_set eq 'ARRAY'; @digits = @$digit_set; } else { @digits = split //, $digit_set; } my $iterator = _set_iterator(\@digits); my $encode_sub = sub { my ($value) = @_; croak "value isn't an non-negative integer" if not defined $value or $value !~ /\A\d+\z/; my $string = ''; if (@digits == 1) { $string = $digits[0] x $value; } else { while (1) { my $digit = $value % @digits; $value = int($value / @digits); $string = "$digits[$digit]$string"; last unless $value; } } $string = $arg->{postencode}->($string) if $arg->{postencode}; return $string; }; my %digit_value = do { my $i = 0; map { $_ => $i++ } @digits; }; my $decode_sub = sub { my ($string) = @_; return unless defined $string; $string = $arg->{predecode}->($string) if $arg->{predecode}; my $value = 0; $iterator->($string, sub { my ($digit, $position) = @_; croak "string to decode contains invalid digits" unless exists $digit_value{$digit}; # Stupid hack, but I'm just cramming unary support in here at the moment. # It can be polished up later, if needed. -- rjbs, 2009-11-22 $value += @digits == 1 ? 1 : ($digit_value{$digit} * @digits ** ($position++ - 1)); }); return $value; }; return ($encode_sub, $decode_sub); } # If you really can't stand using n_codec, you could memoize these. sub n_encode { (n_codec($_[1]))[0]->($_[0]) } sub n_decode { (n_codec($_[1]))[1]->($_[0]) } 1; # my ($encode_sub, $decode_sub) = n_codec('8675309'); # jennynary __END__