Parse::Pyapp - PCFG Parser


Parse-Pyapp documentation  | view source Contained in the Parse-Pyapp distribution.


NAME

Parse::Pyapp - PCFG Parser

SYNOPSIS

  use Parse::Pyapp;

  my $parser = Parse::Pyapp->new();

  $parser->addrule($LHS, [ $RHS_1, $P_RHS_1 ], [ $RHS_2, $P_RHS_2 ]);

  $parser->addlex($LHS, [ $RHS_1, $P_RHS_1 ], [ $RHS_2, $P_RHS_2 ]);

  $parser->start($LHS);

  $parser->parse(@words) or print "Parse error\n";

DESCRIPTION

This module is a parser for probabilistic context-free grammars (PCFG), also known as stochastic context-free grammars (SCFG). You can use it to do stochastic parsing.
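For readers new to PCFGs: every rule carries a probability, and the probability of a parse tree is the product of the probabilities of the rules used to derive it. The following is a minimal sketch of that arithmetic only. It does not use Parse::Pyapp; the rule table and the helper name tree_prob are hypothetical and exist purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical rule table (not part of Parse::Pyapp):
# "LHS -> RHS" => probability.
my %prob = (
    'VP -> V NP' => 0.5,
    'NP -> N'    => 1.0,
    'V -> read'  => 1.0,
    'N -> book'  => 0.5,
);

# The probability of a derivation is the product of the
# probabilities of the rules it uses.
sub tree_prob {
    my @rules = @_;
    my $p = 1;
    for my $rule (@rules) {
        die "unknown rule: $rule\n" unless exists $prob{$rule};
        $p *= $prob{$rule};
    }
    return $p;
}

# Derivation of "read book" as a VP: 0.5 * 1.0 * 1.0 * 0.5
my $p = tree_prob('VP -> V NP', 'V -> read', 'NP -> N', 'N -> book');
print "P(tree) = $p\n";
```

When a sentence has several possible parse trees, a stochastic parser can use these products to rank them.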

USAGE

Initializing a parser

    $parser = Parse::Pyapp->new();

Adding lexicons

    $parser->addlex('N',
                    [ 'house', .5 ],
                    [ 'book', .5 ]
                    );

You can attach a semantic action to a lexicon. For instance,

    $parser->addlex('N',
                    [ 'house', .5 ],
                    [ 'book', .5 ],
                    sub { print $_[1] }
                    );

Parse::Pyapp passes the parser itself as the first argument and the matched lexical item as the second. The left-hand-side symbol can be accessed with $_[0]->{lhs}.

Adding rules

    $parser->addrule('VP',
                     [ 'V', 0.5 ],
                     [ 'V', 'NP', .5 ]
                     );

The first argument is the LHS symbol. It is followed by all the possible right-hand-side derivations, each given as an array reference whose last element is the probability of that derivation.

Similarly, you can hook semantic actions to the end of a derivation. For instance,

    $parser->addrule('VP',
                     [ 'V', 0.5, sub { print $_[1] } ],
                     [ 'V', 'NP', .5 ]
                     );

Parse::Pyapp passes the parser itself as the first argument and the corresponding tokens as the remaining arguments. The left-hand-side symbol can be accessed with $_[0]->{lhs}, and the right-hand-side POS tags with @{$_[0]->{pos}}.

Currently, this module does not check whether the probabilities of all the derivations of a non-terminal sum to 1.
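Since the module does not perform this check, you may want to validate a rule set yourself before adding it. The following is a minimal sketch of such a check. The helper name probs_sum_to_one and the tolerance are assumptions, not part of Parse::Pyapp; it only assumes the argument layout shown in the examples above, where the probability is the last non-code element of each array reference.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Parse::Pyapp): returns true if the
# probabilities of the given derivations sum to 1 within a tolerance.
sub probs_sum_to_one {
    my @derivations = @_;
    my $sum = 0;
    for my $d (@derivations) {
        # Skip a trailing semantic action passed as a separate argument.
        next unless ref($d) eq 'ARRAY';
        # Ignore a semantic action stored inside the derivation.
        my @items = grep { ref($_) ne 'CODE' } @$d;
        $sum += $items[-1];
    }
    return abs($sum - 1) < 1e-9;
}

my @vp = ( [ 'V', 0.5 ], [ 'V', 'NP', 0.5 ] );
print probs_sum_to_one(@vp)
    ? "VP: ok\n"
    : "VP: probabilities do not sum to 1\n";
```

If the check passes, the same list can be handed on unchanged, for example $parser->addrule('VP', @vp).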

Setting the starting symbol

    $parser->start('S');

Parsing a sentence

You need to tokenize the sentence yourself.

    $parser->parse(@words);

It returns a defined value if the sentence was parsed without error, and undef otherwise.
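Because tokenization is left to the caller, something has to turn a sentence into the list of words. The following is a minimal sketch of one way to do that. The helper name tokenize is hypothetical, and lowercasing and stripping punctuation are assumptions; whether they are appropriate depends on how your lexicon entries are written.

```perl
use strict;
use warnings;

# Hypothetical tokenizer (not part of Parse::Pyapp).
# Assumes lexicon entries are lowercase and contain no punctuation.
sub tokenize {
    my ($sentence) = @_;
    my $s = lc $sentence;
    $s =~ s/[^\w\s']//g;    # drop punctuation, keep apostrophes
    return split ' ', $s;   # split on runs of whitespace
}

my @words = tokenize('The boy reads a book.');
print join('|', @words), "\n";
```

The resulting @words can then be passed to $parser->parse(@words).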

CAVEATS

This is still an alpha version, and everything is subject to change. Use it with caution. Also, since it is written entirely in Perl, expect it to be slow.

TO DO

Grammar learning, lexical relations, structural modeling, yacc-like input, error handling, etc. There is a lot of room for improvement.

COPYRIGHT
