DBIx::MyParsePP - Pure-perl SQL parser based on MySQL grammar and lexer


DBIx-MyParsePP documentation Contained in the DBIx-MyParsePP distribution.

NAME

DBIx::MyParsePP - Pure-perl SQL parser based on MySQL grammar and lexer

SYNOPSIS

  use DBIx::MyParsePP;
  use Data::Dumper;

  my $parser = DBIx::MyParsePP->new();

  my $query = $parser->parse("SELECT 1");

  print Dumper $query;
  print $query->toString();

DESCRIPTION

DBIx::MyParsePP is a pure-Perl SQL parser that implements the MySQL grammar and lexer. The grammar was converted automatically from the original sql_yacc.yy file by removing all the C code. The lexer comes from sql_lex.cc, translated into Perl almost verbatim.

The grammar is converted into Perl form using Parse::Yapp.

CONSTRUCTOR

charset, version, sql_mode, client_capabilities and stmt_prepare_mode can be passed to the constructor as name/value pairs; an unrecognized name produces a warning. Load DBIx::MyParsePP::Lexer to bring in the required constants, and see its documentation for details.

METHODS

DBIx::MyParsePP provides parse(), which takes the string to be parsed and returns a DBIx::MyParsePP::Query object containing the result of the parse.

Queries can be reconstructed back into SQL by calling the toString() method.
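
As a minimal sketch of how a caller might tell a successful parse from a failed one: parse() returns a query object in both cases, so the check below relies on the root of the tree being left undefined after a syntax error. The getRoot() accessor and the helper name parse_status() are assumptions made for illustration and are not documented on this page; please check DBIx::MyParsePP::Query before relying on them.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether a statement parsed and, if so, echo it back.
sub parse_status {
    my ($sql) = @_;

    # Load the module lazily so the sketch degrades gracefully if it is missing.
    return "DBIx::MyParsePP is not installed"
        unless eval { require DBIx::MyParsePP; 1 };

    my $parser = DBIx::MyParsePP->new();
    my $query  = $parser->parse($sql);

    # Assumption: getRoot() returns the tree root, undefined when parsing failed.
    return "syntax error in: $sql" unless defined $query->getRoot();

    # toString() reconstructs SQL from the parsed query.
    return "parsed: " . $query->toString();
}

print parse_status("SELECT 1"), "\n";
print parse_status("SELECT FROM WHERE"), "\n";
```

Because loading the grammar is slow, it is probably worth creating one parser object and reusing it for many statements rather than constructing a new one per call as this short helper does.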

SPECIAL CONSIDERATIONS

The file containing the grammar, lib/DBIx/MyParsePP/Parser.pm, is about 5 megabytes in size and takes a while to load. Complex statements also take a while to parse; for example, the first Twins query from the MySQL manual parses at only a few queries per second per 1 GHz of CPU. If you require a full-speed parsing solution, please take a look at DBIx::MyParse, which requires a GCC compiler and produces more concise parse trees.

The parse trees produced by DBIx::MyParsePP contain one node for every grammar rule that has been matched, even rules that serve no useful purpose. Therefore, parsing even a simple statement such as SELECT 1 produces a tree dozens of levels deep. Please exercise caution when walking those trees recursively. The DBIx::MyParsePP::Rule module contains the extract() and shrink() methods, which are useful for dealing with the inherent complexity of the MySQL grammar.
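
To illustrate why such trees are deep and how pass-through rules can be collapsed, here is a self-contained sketch on a stand-in data structure. The nested-array layout, the rule names and the helpers collapse() and max_depth() are all hypothetical; this is not the real DBIx::MyParsePP::Rule representation, nor the module's implementation of shrink(). It only shows the general idea of dropping rules that wrap a single child and of measuring depth with an explicit stack instead of recursion.

```perl
use strict;
use warnings;

# Hypothetical stand-in tree: a node is [ name, child, ... ], a leaf is a string.
my $tree =
    [ 'query',
        [ 'statement',
            [ 'select',
                [ 'select_init',
                    'SELECT',
                    [ 'expr', [ 'simple_expr', '1' ] ],
                ],
            ],
        ],
    ];

# Replace every node that has exactly one child with that child.
# Recursive for brevity; fine for a small illustrative tree.
sub collapse {
    my ($node) = @_;
    return $node unless ref $node;              # leaf: nothing to collapse
    my ($name, @children) = @$node;
    @children = map { collapse($_) } @children;
    return $children[0] if @children == 1;      # pass-through rule: drop it
    return [ $name, @children ];
}

# Compute the depth of the tree iteratively, using an explicit stack.
sub max_depth {
    my ($root) = @_;
    my $max    = 0;
    my @stack  = ([ $root, 1 ]);
    while (my $item = pop @stack) {
        my ($node, $depth) = @$item;
        $max = $depth if $depth > $max;
        next unless ref $node;                  # leaves have no children
        my (undef, @children) = @$node;
        push @stack, map { [ $_, $depth + 1 ] } @children;
    }
    return $max;
}

print max_depth($tree), "\n";                   # prints 7
print max_depth(collapse($tree)), "\n";         # prints 2
```

On this toy input, the seven-level tree collapses to a single select_init node with two leaves, which is the kind of reduction that makes a parse tree practical to inspect.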

USING GRAMMARS FROM OTHER MYSQL VERSIONS

The package by default parses strings using the grammar from MySQL version 5.0.45. If you wish to use the grammar from a different version, you can use the bin/myconvpp.pl script to prepare the grammar:

	$ perl bin/myconvpp.pl --

SEE ALSO

For Yacc grammars, please see the Bison manual at:

	http://www.gnu.org/software/bison

For generating Yacc parsers in Perl, please see:

	http://search.cpan.org/~fdesar

For a full-speed C++-based parser that generates nicer parse trees, please see DBIx::MyParse.

AUTHOR

Philip Stoev, <philip@stoev.org>

COPYRIGHT AND LICENSE


package DBIx::MyParsePP;
use strict;

use DBIx::MyParsePP::Lexer;
use DBIx::MyParsePP::Parser;
use DBIx::MyParsePP::Query;


our $VERSION = '0.50';

use constant MYPARSEPP_YAPP			=> 0;
use constant MYPARSEPP_CHARSET			=> 1;
use constant MYPARSEPP_VERSION			=> 2;
use constant MYPARSEPP_SQL_MODE			=> 3;
use constant MYPARSEPP_CLIENT_CAPABILITIES	=> 4;
use constant MYPARSEPP_STMT_PREPARE_MODE	=> 5;

my %args = (
	charset		=> MYPARSEPP_CHARSET,
	version		=> MYPARSEPP_VERSION,
	sql_mode	=> MYPARSEPP_SQL_MODE,
	client_capabilities	=> MYPARSEPP_CLIENT_CAPABILITIES,
	stmt_prepare_mode	=> MYPARSEPP_STMT_PREPARE_MODE
);

1;

sub new {
	my $class = shift;
	my $parser = bless ([], $class );

	# Arguments arrive as name/value pairs; store each known one in its slot.
	my $max_arg = (scalar(@_) / 2) - 1;

	foreach my $i (0..$max_arg) {
		if (exists $args{$_[$i * 2]}) {
			$parser->[$args{$_[$i * 2]}] = $_[$i * 2 + 1];
		} else {
			warn("Unknown argument '$_[$i * 2]' to DBIx::MyParsePP->new()");
		}
	}

	my $yapp = DBIx::MyParsePP::Parser->new();
	$parser->[MYPARSEPP_YAPP] = $yapp;
	return $parser;
}

sub parse {
	my ($parser, $string) = @_;

	my $lexer = DBIx::MyParsePP::Lexer->new(
		string => $string,
		charset => $parser->[MYPARSEPP_CHARSET],
		version	 => $parser->[MYPARSEPP_VERSION],
		sql_mode => $parser->[MYPARSEPP_SQL_MODE],
		client_capabilities => $parser->[MYPARSEPP_CLIENT_CAPABILITIES],
		stmt_prepare_mode => $parser->[MYPARSEPP_STMT_PREPARE_MODE]
	);
		
	my $query = DBIx::MyParsePP::Query->new(
		lexer => $lexer
	);

	my $yapp = $parser->[MYPARSEPP_YAPP];
	my $result = $yapp->YYParse( yylex => sub { $lexer->yylex() }, yyerror => sub { $parser->error(@_, $query) } );

	if (defined $result) {
		$query->setRoot($result->[0]);
	}

	return $query;
}

sub error {
	my ($parser, $yapp, $query) = @_;
	$query->setActual($yapp->YYCurval);
	$query->setExpected($yapp->YYExpect);
}

1;

__END__