Search::Tools::HeatMap - locate the best matches in a snippet extract


Contained in the Search-Tools distribution.

NAME

Search::Tools::HeatMap - locate the best matches in a snippet extract

SYNOPSIS

 use Search::Tools::Tokenizer;
 use Search::Tools::HeatMap;

 my $tokenizer = Search::Tools::Tokenizer->new;
 my $tokens = $tokenizer->tokenize( $my_string, qr/^(interesting)$/ );
 my $heatmap = Search::Tools::HeatMap->new(
     tokens         => $tokens,
     window_size    => 20,  # default
     as_sentences   => 0,   # default
 );

 if ( $heatmap->has_spans ) {

     # collect the string form of each span
     my @snips;
     for my $span ( @{ $heatmap->spans } ) {
         push( @snips, $span->{str} );
     }

     # keep at most $max_snips snippets; the value is illustrative and
     # stands in for a caller-supplied limit
     my $max_snips = 5;
     if ( @snips > $max_snips ) {
         @snips = @snips[ 0 .. $max_snips - 1 ];
     }
     printf("%s\n", join( ' ... ', @snips ));

 }

DESCRIPTION

Search::Tools::HeatMap implements a simple algorithm for locating the densest clusters of unique, hot terms in a TokenList.

HeatMap is used internally by Snipper but documented here in case someone wants to abuse and/or improve it.
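
The idea of finding the densest cluster of unique hot terms can be illustrated with a small sliding-window sketch in plain Perl. This is not the module's implementation: densest_window() is a hypothetical helper, tokens are plain strings rather than TokenList objects, and "heat" is assumed here to mean the number of distinct hot terms inside a window.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Search::Tools. Returns a hash ref with
# the start index, end index, and heat of the densest window, or undef
# when no token is hot.
#   $tokens      array ref of token strings
#   $is_hot      code ref that returns true for a hot token
#   $window_size maximum window width, in tokens
sub densest_window {
    my ( $tokens, $is_hot, $window_size ) = @_;
    my $best;
    my $last_start = @$tokens > $window_size ? @$tokens - $window_size : 0;
    for my $start ( 0 .. $last_start ) {
        my $end = $start + $window_size - 1;
        $end = $#$tokens if $end > $#$tokens;

        # assumption: heat is the count of distinct hot terms in the window
        my %seen;
        for my $tok ( @{$tokens}[ $start .. $end ] ) {
            $seen{ lc $tok }++ if $is_hot->($tok);
        }
        my $heat = keys %seen;
        next unless $heat;

        # keep the first window with the highest heat
        if ( !$best || $heat > $best->{heat} ) {
            $best = { start => $start, end => $end, heat => $heat };
        }
    }
    return $best;
}

my @tokens
    = qw( the quick brown fox jumps over the lazy dog while a red fox naps );
my $best
    = densest_window( \@tokens, sub { $_[0] =~ /^(?:fox|dog|red)$/i }, 5 );

printf "%d-%d heat=%d: %s\n", $best->{start}, $best->{end}, $best->{heat},
    join( ' ', @tokens[ $best->{start} .. $best->{end} ] )
    if $best;
```

Counting distinct terms rather than total matches means a window that repeats one term many times does not outrank a window that covers several different terms.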

METHODS

new( tokens => TokenList )

Create a new HeatMap. The TokenList object may be either a Search::Tools::TokenList or Search::Tools::TokenListPP object.

init

Builds the HeatMap object. Called internally by new().

window_size

The maximum width of a span, in tokens. Defaults to 20, counting the matching tokens themselves.

Set this in new(). You can read it afterwards, but by then new() has already created the spans.

as_sentences

Try to match clusters at sentence boundaries. Default is false.

Set this in new().

spans

Returns an array ref of matching clusters. Each span in the array is a hash ref with the following keys:

cluster
pos
heat
str
str_w_pos (available only if debug() is true)
unique
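
As a rough sketch of working with this structure, the following ranks spans by heat and keeps the strings of the hottest ones. The $spans data is a made-up stand-in for the return value of spans(); only the key names heat and str come from the list above, and treating heat as a number is an assumption. The documentation does not say what order spans() returns, so the sketch sorts explicitly.

```perl
use strict;
use warnings;

# Stand-in for the array ref returned by $heatmap->spans.
# All values are invented for illustration.
my $spans = [
    { heat => 1, str => 'a mildly interesting aside' },
    { heat => 3, str => 'the most interesting part' },
    { heat => 2, str => 'another interesting bit' },
];

# order the spans from hottest to coolest
my @by_heat = sort { $b->{heat} <=> $a->{heat} } @$spans;

# keep the strings of at most two spans
my $last = $#by_heat < 1 ? $#by_heat : 1;
my @top  = map { $_->{str} } @by_heat[ 0 .. $last ];

print join( ' ... ', @top ), "\n";
```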

has_spans

Returns the number of spans found.

AUTHOR

Peter Karman <karman at cpan dot org>

ACKNOWLEDGEMENTS

The idea of the HeatMap comes from KinoSearch, though the implementation here is original.

BUGS

Please report any bugs or feature requests to bug-search-tools at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Search-Tools. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Search::Tools

You can also look for information at:

* RT: CPAN's request tracker

http://rt.cpan.org/NoAuth/Bugs.html?Dist=Search-Tools

* AnnoCPAN: Annotated CPAN documentation

http://annocpan.org/dist/Search-Tools

* CPAN Ratings

http://cpanratings.perl.org/d/Search-Tools

* Search CPAN

http://search.cpan.org/dist/Search-Tools/

COPYRIGHT

SEE ALSO

KinoSearch

