| Search-Tools documentation | view source | Contained in the Search-Tools distribution. |
Search::Tools::HeatMap - locate the best matches in a snippet extract
    use Search::Tools::Tokenizer;
    use Search::Tools::HeatMap;

    my $tokens = $self->tokenizer->tokenize( $my_string, qr/^(interesting)$/ );
    my $heatmap = Search::Tools::HeatMap->new(
        tokens       => $tokens,
        window_size  => 20,    # default
        as_sentences => 0,     # default
    );

    if ( $heatmap->has_spans ) {

        my $tokens_arr = $tokens->as_array;

        # stringify positions
        my @snips;
        for my $span ( @{ $heatmap->spans } ) {
            push( @snips, $span->{str} );
        }

        # keep at most $self->occur snippets
        my $occur_index = $self->occur - 1;
        if ( $#snips > $occur_index ) {
            @snips = @snips[ 0 .. $occur_index ];
        }

        printf( "%s\n", join( ' ... ', @snips ) );
    }
Search::Tools::HeatMap implements a simple algorithm for locating the densest clusters of unique, hot terms in a TokenList.
HeatMap is used internally by Snipper but documented here in case someone wants to abuse and/or improve it.
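To make the idea of a "densest cluster of unique, hot terms" concrete, here is a minimal sketch in plain Perl. It is not the module's implementation: densest_window() is a hypothetical helper, tokens are assumed to be plain strings rather than Search::Tools token objects, and the brute-force scan is only one simple way to score windows.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Search::Tools: return the window of at
# most $window_size tokens that contains the most unique "hot" terms.
# $tokens is an array ref of plain strings; $is_hot is a compiled regex.
sub densest_window {
    my ( $tokens, $is_hot, $window_size ) = @_;
    my $best = { start => 0, end => -1, unique => 0 };
    for my $start ( 0 .. $#$tokens ) {

        # a window is only worth scoring if it opens on a hot token
        next unless $tokens->[$start] =~ $is_hot;

        my $end = $start + $window_size - 1;
        $end = $#$tokens if $end > $#$tokens;

        # count distinct hot terms inside the window, case-insensitively
        my %seen;
        for my $i ( $start .. $end ) {
            my $t = lc $tokens->[$i];
            $seen{$t}++ if $t =~ $is_hot;
        }
        my $unique = scalar keys %seen;

        # the earliest window wins ties
        if ( $unique > $best->{unique} ) {
            $best = { start => $start, end => $end, unique => $unique };
        }
    }
    return $best;
}

my @tokens = split /\s+/,
    'the quick brown fox saw a lazy dog then the fox chased the dog away';
my $best = densest_window( \@tokens, qr/^(fox|dog)$/i, 4 );
print join( ' ', @tokens[ $best->{start} .. $best->{end} ] ), "\n";
# prints "dog then the fox"
```

The first window (starting at "fox") holds only one hot term, so the window starting at "dog" wins because it holds two distinct ones.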
new() creates a new HeatMap. The tokens argument may be either a Search::Tools::TokenList or a Search::Tools::TokenListPP object.
Builds the HeatMap object. Called internally by new().
window_size is the maximum width of a span. Defaults to 20 tokens, including the matches.
Set this in new(). You can read it back later if you need to, but the spans will already have been created by new().
as_sentences, if true, tries to match clusters at sentence boundaries. Default is false.
Set this in new().
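As an illustration of what matching at sentence boundaries can mean, the sketch below widens a span until it covers whole sentences. It is an assumption-laden example, not the module's code: snap_to_sentence() is a hypothetical helper, tokens are plain whitespace-split strings, a sentence is assumed to end at any token finishing with ".", "!" or "?", and the window_size limit is ignored.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Search::Tools: widen the span
# [$start, $end] so that it begins right after the previous
# sentence-ending token and stops on the next one.
sub snap_to_sentence {
    my ( $tokens, $start, $end ) = @_;

    # naive assumption: a token ends a sentence if it finishes with . ! or ?
    my $ends_sentence = qr/[.!?]$/;

    # walk left until the token before $start closes a sentence
    $start-- while $start > 0 && $tokens->[ $start - 1 ] !~ $ends_sentence;

    # walk right until the token at $end closes a sentence
    $end++ while $end < $#$tokens && $tokens->[$end] !~ $ends_sentence;

    return ( $start, $end );
}

my @tokens = split /\s+/,
    'It rained all day. The fox slept in its den. Then it left.';
my ( $s, $e ) = snap_to_sentence( \@tokens, 5, 6 );    # span "fox slept"
print join( ' ', @tokens[ $s .. $e ] ), "\n";
# prints "The fox slept in its den."
```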
spans() returns an array ref of matching clusters. Each span in the array is a hash ref with the following keys:
This item is available only if debug() is true.
has_spans() returns the number of spans found.
Peter Karman <karman at cpan dot org>
The idea of the HeatMap comes from KinoSearch, though the implementation here is original.
Please report any bugs or feature requests to bug-search-tools at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Search-Tools. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
You can find documentation for this module with the perldoc command.
    perldoc Search::Tools
Copyright 2009 by Peter Karman.
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
KinoSearch