Search::Tools::Query - objectified string for highlighting, snipping, etc.


Search-Tools documentation Contained in the Search-Tools distribution.

NAME

Search::Tools::Query - objectified string for highlighting, snipping, etc.

SYNOPSIS

 use Search::Tools::QueryParser;
 my $qparser  = Search::Tools::QueryParser->new;
 my $query    = $qparser->parse(q(the quick color:brown "fox jumped"));
 my $terms    = $query->terms; # ['quick', 'brown', '"fox jumped"']
 my $regex    = $query->regex_for($terms->[0]); # S::T::RegEx
 my $tree     = $query->tree; # the Search::Query::Dialect tree()
 print "$query\n";  # the quick color:brown "fox jumped"
 print $query->str . "\n"; # same thing




DESCRIPTION

METHODS

terms

Array ref of key words from the original query string. See Search::Tools::QueryParser for controls over ignore_fields() and tokenizing regex.

NOTE: Only positive words are extracted by QueryParser. In other words, if you search for:

 foo not bar

then only foo is returned. Likewise:

 +foo -bar

would return only foo.

str

The original string.
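
The query object also stringifies to this value, which is why the SYNOPSIS can print "$query" directly. A minimal sketch of that mechanism follows, using a hypothetical stand-in class (My::Query is not part of Search-Tools) with the same overload settings as the module source:

```perl
use strict;
use warnings;

# Hypothetical stand-in class; not part of Search-Tools.
package My::Query;
use overload
    '""'     => sub { $_[0]->{str} },    # stringify to the original string
    'bool'   => sub {1},                 # always true, even for an empty string
    fallback => 1;

sub new { my ( $class, %args ) = @_; return bless {%args}, $class; }
sub str { return $_[0]->{str} }

package main;

my $query = My::Query->new( str => 'the quick color:brown "fox jumped"' );
print "$query\n";           # interpolation calls the '""' handler
print $query->str, "\n";    # explicit accessor, same string
```

Because bool is overloaded to return 1, an object built from an empty string still tests as true, so "if ($query)" checks that the object exists rather than whether its string is empty.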

regex

A hash ref mapping each term to its Search::Tools::RegEx object.

dialect

The internal Search::Query::Dialect object. See tree() and str_clean() which delegate to the dialect object.

qp

The Search::Tools::QueryParser object used to generate the Query.

from_regexp_keywords( RegExp_Keywords_object )

Class method for easing backwards compatibility with the pre-0.24 API.

num_terms

Returns the number of terms().

tree

Returns the result of calling tree() on the internal Search::Query::Dialect object.

str_clean

Returns the result of calling stringify() on the internal Search::Query::Dialect object.

regex_for(term)

Returns a Search::Tools::RegEx object for term. Croaks if term is undefined or if the query has no regex for term.

regexp_for

Alias for regex_for(). The author has come to prefer "regex" instead of "regexp" because it's one less keystroke.

matches_text( text )

Returns the number of terms() that match text. Each term is counted at most once, no matter how many times it occurs in text.

matches_html( html )

Returns the number of terms() that match html. Each term is counted at most once, no matter how many times it occurs in html.
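
The counting done by matches_text() and matches_html() can be sketched without the module. In this illustration the %regex_for hash and count_matching_terms() are hypothetical names, and plain qr// patterns stand in for the patterns a Search::Tools::RegEx object would supply:

```perl
use strict;
use warnings;

# Hypothetical data: simple qr// patterns stand in for the real
# per-term patterns held by Search::Tools::RegEx objects.
my %regex_for = (
    quick => qr/\bquick\b/i,
    brown => qr/\bbrown\b/i,
    lazy  => qr/\blazy\b/i,
);
my @terms = qw( quick brown lazy );

sub count_matching_terms {
    my ($text) = @_;
    my $count = 0;
    for my $term (@terms) {

        # a successful match adds 1, however often the term occurs
        $count++ if $text =~ $regex_for{$term};
    }
    return $count;
}

my $n = count_matching_terms('The quick, quick brown fox');
print "$n\n";    # 2: "quick" and "brown" match, "lazy" does not
```

The result is a count of matching terms, not of total occurrences, so the repeated "quick" contributes only once.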

terms_as_regex([treat_phrases_as_singles])

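Returns all terms() as a single case-insensitive qr// regex, pipe-joined with "OR" logic. If treat_phrases_as_singles is true (the default), the words of a phrase are OR'd individually instead of being matched as a phrase.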
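
A standalone sketch of how such a regex is assembled, following the logic in the module source. The wildcard '*' and word-character class '\w' are assumed values here (in the module they come from the QueryParser object), and the phrase is written without surrounding quotes for simplicity:

```perl
use strict;
use warnings;

# Assumed values; in the module these come from the QueryParser object.
my $wildcard = '*';
my $wc       = '\w';

sub terms_as_regex {
    my ( $terms, $treat_phrases_as_singles ) = @_;
    $treat_phrases_as_singles = 1 unless defined $treat_phrases_as_singles;
    my $wild_esc = quotemeta($wildcard);
    my @re;
    for my $term (@$terms) {
        my $q = quotemeta($term);

        # turn the escaped wildcard into "zero or more word characters"
        $q =~ s/\\$wild_esc/[$wc]*/g;

        # split a phrase into OR'd words when requested
        $q =~ s/(\\ )+/|/g if $treat_phrases_as_singles;
        push @re, $q;
    }
    my $j = sprintf( '(%s)', join( '|', @re ) );
    return qr/$j/i;
}

my $re = terms_as_regex( [ 'quick', 'bro*', 'fox jumped' ] );
print "matched: $1\n" if 'A Brown dog' =~ $re;    # matched: Brown
```

Passing a defined false value as the second argument keeps a phrase intact, so 'fox jumped' would then match only when both words appear together.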

AUTHOR

Peter Karman <karman@cpan.org>

BUGS

Please report any bugs or feature requests to bug-search-tools at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Search-Tools. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Search::Tools




You can also look for information at:

* RT: CPAN's request tracker

http://rt.cpan.org/NoAuth/Bugs.html?Dist=Search-Tools

* AnnoCPAN: Annotated CPAN documentation

http://annocpan.org/dist/Search-Tools

* CPAN Ratings

http://cpanratings.perl.org/d/Search-Tools

* Search CPAN

http://search.cpan.org/dist/Search-Tools/

COPYRIGHT

SEE ALSO

Search::Query::Dialect


package Search::Tools::Query;
use strict;
use warnings;
use base qw( Search::Tools::Object );
use overload
    '""'     => sub { $_[0]->str; },
    'bool'   => sub {1},
    fallback => 1;
use Carp;
use Data::Dump qw( dump );
use Search::Tools::RegEx;
use Search::Tools::UTF8;

our $VERSION = '0.59';

__PACKAGE__->mk_ro_accessors(
    qw(
        terms
        dialect
        str
        regex
        qp
        )
);

# backcompat
sub from_regexp_keywords {
    my $proto = shift;
    my $class = ref($proto) || $proto;
    my $rekw  = shift or croak "RegExp::Keywords required";

    #dump $rekw;

    my $regex = {};
    for ( $rekw->keywords ) {
        my $rek = $rekw->{hash}->{$_} or croak "no Keyword object for $_";
        $regex->{$_} = Search::Tools::RegEx->new(
            plain     => $rek->plain,
            html      => $rek->html,
            is_phrase => $rek->phrase,
            term      => $rek->word,
        );
    }
    my $self = $class->new(
        terms => $rekw->{array},
        regex => $regex,
        str   => $rekw->{kw}->{query},
        qp    => $rekw->{kw},
    );
    return $self;
}

sub num_terms {
    return scalar @{ shift->{terms} };
}

sub tree {
    my $self = shift;
    return $self->dialect->tree();
}

sub str_clean {
    my $self = shift;
    return $self->dialect->stringify();
}

sub regex_for {
    my $self = shift;
    my $term = shift;
    unless ( defined $term ) {
        croak "term required";
    }
    my $regex = $self->{regex} or croak "regex not defined for query";
    if ( !exists $regex->{$term} ) {
        croak "no regex for $term";
    }
    return $regex->{$term};
}

*regexp_for = \&regex_for;

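# Count how many terms match the text in $_[0] for the given $style
# ('plain' or 'html'). Each term adds at most 1 to the count.
sub _matches {
    my $self  = shift;
    my $style = shift;
    my $count = 0;
    my $text  = to_utf8( $_[0] );    # convert once, not once per term
    for my $term ( @{ $self->{terms} } ) {
        my $regex = $self->{regex}->{$term}->{$style};
        $count++ if $text =~ m/$regex/;
    }
    return $count;
}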

sub matches_text {
    my $self = shift;
    my $text = shift;
    if ( !defined $text ) {
        croak "text required";
    }
    return $self->_matches( 'plain', $text );
}

sub matches_html {
    my $self = shift;
    my $html = shift;
    if ( !defined $html ) {
        croak "html required";
    }
    return $self->_matches( 'html', $html );
}

sub terms_as_regex {
    my $self                     = shift;
    my $treat_phrases_as_singles = shift;
    $treat_phrases_as_singles = 1 unless defined $treat_phrases_as_singles;
    my $wildcard = $self->qp->wildcard;
    my $wild_esc = quotemeta($wildcard);
    my $wc       = $self->qp->word_characters;
    my @re;
    for my $term ( @{ $self->{terms} } ) {

        my $q = quotemeta($term);    # quotemeta speeds up the match, too
                                     # even though we have to unquote below

        $q =~ s/\\$wild_esc/[$wc]*/g;    # wildcard match is very approximate

        # treat phrases like OR'd words
        # since that will just create more matches.
        # if hiliting later, the phrase will be treated as such.
        $q =~ s/(\\ )+/\|/g if $treat_phrases_as_singles;

        push( @re, $q );
    }

    my $j = sprintf( '(%s)', join( '|', @re ) );
    return qr/$j/i;
}

1;

__END__