List::Search - fast searching of sorted lists


List-Search documentation Contained in the List-Search distribution.

NAME

List::Search - fast searching of sorted lists

SYNOPSIS

    use List::Search qw( list_search nlist_search custom_list_search );

    # Create a list to search
    my @list = sort qw( bravo charlie delta );

    # Search for a value, returns the index of first match
    print list_search( 'alpha',   \@list );    #  0
    print list_search( 'charlie', \@list );    #  1
    print list_search( 'zebra',   \@list );    #  -1

    # Search numerically
    my @numbers = sort { $a <=> $b } ( 10, 20, 100, 200, );
    print nlist_search( 20, \@numbers );       #  1

    # Search using some other comparison
    my $cmp_code = sub { lc( $_[0] ) cmp lc( $_[1] ) };
    my @custom_list = sort { $cmp_code->( $a, $b ) } qw( FOO bar BAZ bundy );
    print custom_list_search( $cmp_code, 'foo', \@custom_list );    #  3

DESCRIPTION

This module lets you quickly search a sorted list. It returns the index of the first entry that matches or, if there is no exact match, the index of the first entry that is greater than the search key.

For example, in the list my @list = qw( bob dave fred ); searching for dave returns 1, as $list[1] eq 'dave'. Searching for charles also returns 1, as dave is the first entry that is greater than charles.

If none of the entries match, that is, every entry is less than the search key, then -1 is returned. You can either check for this or use it as an index to get the last value in the list. Which approach you choose depends on what you are trying to do.

The actual searching is done using a binary search, so a list of n entries needs only O(log n) comparisons.
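
The return values described above can be illustrated with a minimal sketch. The subroutine first_at_or_after below is a hypothetical stand-in written for this example, not the module's own code; it assumes plain string comparison and only mirrors the documented behaviour (index of the first entry equal to or greater than the key, else -1).

```perl
use strict;
use warnings;

# Hypothetical stand-in for list_search: returns the index of the first
# element that is ge $key, or -1 if every element is lt $key.
sub first_at_or_after {
    my ( $key, $list ) = @_;
    my ( $low, $high ) = ( 0, scalar(@$list) );    # search window [low, high)

    while ( $low < $high ) {
        my $mid = int( ( $low + $high ) / 2 );
        if ( $list->[$mid] lt $key ) {
            $low = $mid + 1;    # answer lies to the right of $mid
        }
        else {
            $high = $mid;       # $mid itself may be the answer
        }
    }
    return $low < @$list ? $low : -1;
}

my @list = qw( bob dave fred );

my $exact   = first_at_or_after( 'dave',    \@list );    # exact match
my $nearest = first_at_or_after( 'charles', \@list );    # next greater entry
my $missing = first_at_or_after( 'zebra',   \@list );    # no entry ge key

print "$exact $nearest $missing\n";    # prints "1 1 -1"
```

Each pass through the loop halves the window, which is where the speed comes from. Note that in Perl $list[-1] is the last element, so a -1 result should be checked before it is used as an index unless the last entry is what you want.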

METHODS

  my $idx = list_search( $key, \@sorted_list );

Searches the list using cmp as the comparison operator. Returns the index of the first entry that is equal to or greater than $key. If every entry is less than $key, returns -1.

  my $idx = nlist_search( $key, \@sorted_list );

Searches the list using <=> as the comparison operator. Returns the index of the first entry that is equal to or greater than $key. If every entry is less than $key, returns -1.

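WARNING: I intend to change custom_list_search so that it accepts a block in the same way that sort does. This means that you will be able to use $a and $b as expected. Until then take care with this one: the comparison subroutine receives the two values to compare in $_[0] and $_[1], not in $a and $b.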

  my $cmp_sub = sub { $_[0] cmp $_[1] };
  my $idx = custom_list_search( $cmp_sub, $key, \@sorted_list );

Searches the list using the subroutine to compare the values. Returns the index of the first entry that is equal to or greater than $key. If every entry is less than $key, returns -1.

NOTE - the list must have been sorted using the same comparison, i.e.:

  my @sorted_list = sort { $cmp_sub->( $a, $b ) } @list;
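
To see why the NOTE matters, the following sketch sorts the same words in two ways and checks each result against the comparison subroutine. The helper is_sorted_by is hypothetical and exists only for this illustration; it is not part of List::Search.

```perl
use strict;
use warnings;

# Case-insensitive comparison, taking its arguments in $_[0] and $_[1].
my $cmp_sub = sub { lc( $_[0] ) cmp lc( $_[1] ) };

# Hypothetical helper: true if @$list is in non-decreasing order under $cmp.
sub is_sorted_by {
    my ( $cmp, $list ) = @_;
    for my $i ( 1 .. $#$list ) {
        return 0 if $cmp->( $list->[ $i - 1 ], $list->[$i] ) > 0;
    }
    return 1;
}

my @words  = qw( FOO bar BAZ bundy );
my @plain  = sort @words;                             # default string order
my @folded = sort { $cmp_sub->( $a, $b ) } @words;    # case-insensitive order

print "plain:  @plain\n";     # BAZ FOO bar bundy
print "folded: @folded\n";    # bar BAZ bundy FOO
print "plain sorted under cmp_sub?  ", is_sorted_by( $cmp_sub, \@plain ),  "\n";
print "folded sorted under cmp_sub? ", is_sorted_by( $cmp_sub, \@folded ), "\n";
```

The default sort puts upper-case letters before lower-case ones, so @plain is not in order as far as $cmp_sub is concerned, and a binary search over it using $cmp_sub could return the wrong index.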

list_contains, nlist_contains, custom_list_contains

    my $bool =  list_contains( $key, \@sorted_list );   # string sort
    my $bool = nlist_contains( $key, \@sorted_list );   # number sort

    my $bool = custom_list_contains( $cmp_sub_ref, $key, \@sorted_list );

Returns true if $key was found in the list, false otherwise.
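
A membership test of this kind can be layered on top of the search: find the first entry that is not less than the key, then check whether that entry is equal to it. The sketch below assumes numeric comparison; first_at_or_after_num and contains_num are hypothetical names used only here, standing in for nlist_search and nlist_contains.

```perl
use strict;
use warnings;

# Hypothetical stand-in for nlist_search: index of the first element
# that is >= $key, or -1 if there is none.
sub first_at_or_after_num {
    my ( $key, $list ) = @_;
    my ( $low, $high ) = ( 0, scalar(@$list) );
    while ( $low < $high ) {
        my $mid = int( ( $low + $high ) / 2 );
        if ( $list->[$mid] < $key ) { $low = $mid + 1 }
        else                        { $high = $mid }
    }
    return $low < @$list ? $low : -1;
}

# Membership test: the key is present only if the entry found equals it.
sub contains_num {
    my ( $key, $list ) = @_;
    my $idx = first_at_or_after_num( $key, $list );
    return 0 if $idx < 0;    # -1 would otherwise index the last element
    return $list->[$idx] == $key ? 1 : 0;
}

my @numbers = ( 10, 20, 100, 200 );
print contains_num( 20,  \@numbers ), "\n";    # 1
print contains_num( 30,  \@numbers ), "\n";    # 0
print contains_num( 500, \@numbers ), "\n";    # 0
print contains_num( 5,   [] ),        "\n";    # 0
```

Checking for -1 before indexing avoids comparing the key against the last element, or against undef when the list is empty.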

AUTHOR

Edmund von der Burg <evdb@ecclestoad.co.uk>

http://www.ecclestoad.co.uk

SEE ALSO

For fast sorting of lists try Sort::Key. For matching on not just the start of the item try Text::Match::FastAlternatives. For matching in an unsorted list try List::MoreUtils.

CREDITS

Sean Woolcock submitted several bug fixes, which were included in 0.3.

SVN ACCESS

You can access the latest (possibly unstable) code here:

http://dev.ecclestoad.co.uk/svn/cpan/List-Search

COPYRIGHT

use strict;
use warnings;

package List::Search;

our $VERSION = '0.3';

# Load Exporter explicitly so that the inherited import() is available.
require Exporter;
use vars qw(@ISA @EXPORT_OK);
@ISA       = qw(Exporter);
@EXPORT_OK = qw(
  list_search   nlist_search   custom_list_search
  list_contains nlist_contains custom_list_contains
);

sub list_search {
    my ( $key, $array_ref ) = @_;
    return custom_list_search( \&_alpha_sort, $key, $array_ref );
}

sub nlist_search {
    my ( $key, $array_ref ) = @_;
    return custom_list_search( \&_numeric_sort, $key, $array_ref );
}

sub custom_list_search {
    my ( $cmp_code, $key, $array_ref ) = @_;

    my $max_index = scalar(@$array_ref) - 1;
    return -1 if $max_index < 0;

    my $low  = 0;
    my $mid  = undef;
    my $high = $max_index;

    while ( $low <= $high ) {
        $mid = int( $low + ( ( $high - $low ) / 2 ) );
        my $mid_val = $array_ref->[$mid];

        my $cmp_result = $cmp_code->( $key, $mid_val );

        if ( $cmp_result > 0 ) {
            $low = $mid + 1;
        }
        else {
            $high = $mid - 1;
        }
    }

    # Look at the values here and work out what to return.

    # Perhaps there are no matches in the array. Test for "> 0" rather than
    # "== 1", as a custom comparison may return any positive number.
    return -1 if $cmp_code->( $key, $array_ref->[-1] ) > 0;

    # Perhaps $mid is just before the best match
    return $mid + 1 if $cmp_code->( $key, $array_ref->[$mid] ) > 0;

    # $mid is correct
    return $mid;
}

sub list_contains {
    my ( $key, $array_ref ) = @_;
    return custom_list_contains( \&_alpha_sort, $key, $array_ref );
}

sub nlist_contains {
    my ( $key, $array_ref ) = @_;
    return custom_list_contains( \&_numeric_sort, $key, $array_ref );
}

sub custom_list_contains {
    my ( $code, $key, $array_ref ) = @_;

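    # Get the index of the first entry that is equal to or greater than $key
    my $idx = custom_list_search( $code, $key, $array_ref );

    # No such entry (or an empty list), so $key cannot be in the array.
    # Without this check, -1 would index the last element.
    return 0 if $idx < 0;

    # Compare the key to the entry at that index
    my $cmp_result = $code->( $key, $array_ref->[$idx] );

    return $cmp_result == 0    # is there a difference?
      ? 1                      # there was no difference, so $key is in array
      : 0;                     # $key is not in array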
}

sub _alpha_sort   { $_[0] cmp $_[1]; }
sub _numeric_sort { $_[0] <=> $_[1]; }

1;