| MetaTrans documentation | Contained in the MetaTrans distribution. |
MetaTrans::Base - Abstract base class for creating meta-translator plug-ins
# This is not a working example. It serves for illustration only.
# For a working one see MetaTrans::UltralinguaNet source code.
package MetaTrans::MyPlugin;
use MetaTrans::Base;
use vars qw(@ISA);
@ISA = qw(MetaTrans::Base);
use HTTP::Request;
use URI::Escape;
sub new
{
my $class = shift;
my %options = @_;
$options{host_server} = "www.some-online-translator.com"
unless (defined $options{host_server});
my $self = new MetaTrans::Base(%options);
$self = bless $self, $class;
# supported translation directions:
# English <-> German
# English <-> French
# English <-> Spanish
$self->set_languages('eng', 'ger', 'fre', 'spa');
$self->set_dir_1_to_all('eng');
$self->set_dir_all_to_1('eng');
return $self;
}
sub create_request
{
my $self = shift;
my $expression = shift;
my $src_lang_code = shift;
my $dest_lang_code = shift;
# our-language-codes-to-server-language-codes conversion table
my %table = (eng => 'eng', ger => 'deu', fre => 'fra', spa => 'esp');
return new HTTP::Request('GET',
'http://www.some-online-translator.com/translate.cgi?' .
'expr=' . uri_escape($expression) . '&' .
'src=' . $table{$src_lang_code} . '&' .
'dst=' . $table{$dest_lang_code}
);
}
sub process_response
{
my $self = shift;
my $contents = shift;
# we don't care about these here, but
# in some cases we might need to care
my $src_lang_code = shift;
my $dest_lang_code = shift;
my @result;
while ($contents =~ m|
<td class="expr">([^<]*)</td>
<td class="trns">([^<]*)</td>
|gsix)
{
my $expression = $1;
my $translation = $2;
# add some $expression and $translation normalization code here
push @result, ($expression, $translation);
}
return @result;
}
1;
This class serves as a base for creating MetaTrans plug-ins,
especially those ones, which extract data from online translators.
Please see MetaTrans first. MetaTrans::Base already contains
many features a MetaTrans plug-in must have and makes creating
new plug-ins really easy.
To perform a translation using an online translator (e.g. http://www.ultralingua.net/) one needs to do two things:
To create a MetaTrans plug-in using MetaTrans::Base one
only needs to do a bit more. The first step is to derrive
from MetaTrans::Base and "override" following two abstract
methods:
Should return a HTTP::Request object to be used by LWP::UserAgent
for retrieving HTML output, which contains translation of $expression from
the language with $src_lang_code to the language with $dest_lang_code.
This basicaly emulates sending a form.
This method should extract translations from the HTML code ($contents) returned by webserver in response to the request. The translations must be returned in an array of following form:
(expression_1, translation_1, expression_2, translation_2, ...)
Character encoding must be UTF-8! In addition all expressions and their translations should be normalized in a way so that all the grammar and meaning information were in parenthesis or behind a semi-colon. For example, if you request a English to French translation of "dog" from the http://www.ultralingua.net/ translator, the first line of the result is
dog n. : 1. chien n.m.,f. chienne 2. pitou n.m. (Familier) (Québécisme)
The MetaTrans::UltralinguaNet module returns it as
('dog (n.)', 'chien (n.m.,f.)', 'dog (n.)', 'pitou (n.m.)')
The next step is specifying list of languages supported by the plug-in.
We have to say, which languages we are able to translate from and which to.
This can be done easily by calling appropriate methods inherrited from
MetaTrans::Base. Please see SPECIFYING SUPPORTED LANGUAGES.
The last step is setting the host_server attribute to the name of the
online translator used by the plug-in. See ATTRIBUTES.
The MetaTrans::UltralinguaNet source code should serve as a good example
on how to create a MetaTrans plug-in derrived from MetaTrans::Base.
This method constructs a new MetaTrans::Base object and returns it. Key/value pair arguments may be provided to set up the initial state. The following options correspond to attribute methods described below:
KEY DEFAULT --------------- ---------------- host_server 'unknown.server' script_name undef timeout 5 matching M_START match_at_bounds 1
Please note that as long as the MetaTrans::Base is an abstract class,
calling the constructor method only makes sense in the derrived classes.
Get/set the name of the online translator used by the plug-in. Is is only
used to inform the user where the translation comes from and hence can
be set to any meaningful value. It is a convention to set this to
the online translator base URL with the 'http://' stripped. For example,
the MetaTrans::UltralinguaNet sets host_server to
'www.ultralingua.net'.
Get/set the name of the script, which runs this plug-in as a command line
application. The script uses this to identify itself when printing usage.
If unset, the script name is extracted from $0 variable. See the run
method.
Get/set the time in seconds we want to wait for a reply from the online translator before timing out.
Get/set the way of matching the found translations to the searched expression.
Some online translators in addition to the translation of the searched
expression also return translations of related expressions. For example,
we want to translate "dog" from English to French and we also get
translations of "dog days" or "every dog has his day". If this is not what
we want we can help ourselves by setting matching to appropriate value:
Match only those expressions which are the same as the searched one. Matching is incasesensitive and ignores grammar information, i.e. everything in parenthesis or after semi-colon. The same applies bellow.
Examples:
'Dog' matches 'dog' (incasesensitive)
'Hund' matches 'Hund; r' (grammar information ignored)
'dog' does not match 'dog bite' (not an exact match)
Match those expressions which are prefixed with the searched expression.
Examples:
'Dog' matches 'dog bite' (incasesensitive)
'Hund' matches 'Hund is los'
'Hund' does not match 'bissiger Hund' ('Hund' is not a prefix)
Match those expressions which contain the searched expression, no matter where.
Examples:
'Big Dog' matches 'very big dog'
'big dog' does not match 'big angry dog' ('big dog' is not a substring)
Match those expressions which contain all the words of the searched expression.
Examples:
'big dog' matches 'big angry dog'
'big dog' does not match 'angry dog' (not all words are contained)
Return all without any filtering.
You can
use MetaTrans::Base qw(:match_consts);
to import matching constant names (M_EXACT, M_START, ...) into your
program's namespace.
Get/set the match-at-boundaries flag. Setting it to true value makes
matching behave in a slightly different way.
Subexpressions and words are matched at word boundaries only. In practice
this means that with matching set to M_WORDS the
expression "big dog"
won't be matched to "big angry doggie" while it would be with
match-at-boundaries set to false value. The same applies to
M_START and M_EXPR. The option has no effect when matching is set
to M_EXACT or M_ALL.
Get/set the default translation direction. May only be set to supported one, see SPECIFYING SUPPORTED LANGUAGES. Returns old value as an array of two language codes.
Every MetaTrans plug-in has to specify supported languages and translation
directions. MetaTrans::Base provides several methods for doing so. The
first step is specifying list of all languages, which appear on the left or
right side of any of supported translation directions. Consider your plug-in
supports following ones:
English -> French
English -> German
French -> Spanish
Then the list of supported languages is simply English, French, German and Spanish.
The arguments passed to particular methods need to be language codes, not language names. Please see MetaTrans::Languagues for a complete list.
Set supported languages to the ones specified by @language_codes. In the
above exapmle one would call:
$plugin->set_languages('eng', 'fre', 'ger', 'spa');
Add support for translating from language with $src_lang_code to language
with $dest_lang_code. Both languages need to be previously declared as
supported. The method returns true value on success, false value on error. To
specify we support directions from the above example we would simply call:
$plugin->set_dir_1_to_1('eng', 'fre');
$plugin->set_dir_1_to_1('eng', 'ger');
$plugin->set_dir_1_to_1('fre', 'spa');
Remove support for translating from language with $src_lang_code to language
with $dest_lang_code. Both languages need to be previously declared as
supported. The method returns true value on success, false value on error.
Add support for translating from language with $src_lang_code to all
languages whichs codes are in @dest_lang_codes. The direction from
$src_lang_code language to itself won't be set as supported even if
$src_lang_code is specified in @dest_lang_codes. However, calling
$plugin->set_dir_1_to_1($src_lang_code, $src_lang_code);
will do the job if this is what you want. It only results in warning messages
if some of the @dest_lang_codes are unsupported. Only the supported ones
will be used, others are ignored. The method returns number of directions
set as supported on (partial) success, 0 on error.
Example:
my @all_languages = ('eng', 'fre', 'ger', 'spa');
$plugin->set_languages(@all_languages);
$plugin->set_dir_1_to_spec('eng', @all_languages);
... will result in following supported translation directions:
English -> French
English -> German
English -> Spanish
This is just a shorter way for writting:
$plugin->set_dir_1_to_spec($src_lang_code, @all_codes);
where @all_codes is an array of codes of all supported languages.
This works exactly as set_dir_1_to_spec with reversed sides.
This is just a shorter way for writting:
$plugin->set_dir_spec_to_1($dest_lang_code, @all_codes);
where @all_codes is an array of codes of all supported languages.
Example:
my @src_lang_codes = ('ger', 'fre', 'spa');
$plugin->set_languages('eng', 'por', @src_lang_codes);
$plugin->set_dir_spec_to_1('eng', @src_lang_codes);
... will result in following supported translation directions:
German -> English
French -> English
Spanish -> English
But if we replaced the last line with
$plugin->set_dir_all_to_1('eng');
the result would have been:
Portuguese -> English
German -> English
French -> English
Spanish -> English
These are the methods MetaTrans expects every plug-in to provide. You only
need to worry about this if you are writting a plug-in from a scratch. If you
are derriving from MetaTrans::Base all these methods are inherited. They
make use of the abstract methods create_request and process_response,
attribute values and supported translation directions specified using
set_dir_* methods. If you only want to use MetaTrans::Base as a base
class for your plug-in you can stop reading here. Everything you need to know
was written above.
If you are writting a plug-in from a scratch you have to make sure it provides
all the methods with appropriate functionality specified in this section. In
addition, every MetaTrans plug-in has to provide attribute methods
as specified in ATTRIBUTES section.
Returns true value if the translation direction is supported from language with
$src_lang_code to language with $dest_lang_code, false value otherwise.
Returns a list of all language codes, which the plug-in is able to translate
from. For example, ('eng', 'fre') will be returned if supported translation
directions are:
English -> French
English -> Spanish
French -> Spanish
Returns a list of all language codes, which the plug-in is able to translate
to from the language with $src_lang_code. If called with 'eng' as an
parameter in the above example, returned value would be ('fre', 'spa').
Returns translation of $expression as an array of expression-translation
pairs in one string separated by " = " in UTF-8 character encoding.
An example output is:
("dog = chien", "dog = pitou", "dog days = canicule")
undef value is returned and an error printed if $src_lang_code
-> $dest_lang_code is an unsupported translation direction. 'timeout'
string is returned if timeout occurs when querying online translator,
'error' string is returned on any other error.
Default translation direction (see default_dir attribute) is used if
the method is called with first argument only.
This method is a very ugly hack, for which writting MetaTrans plug-ins from
a scratch is discouraged. See MetaTrans for more information on why this
it is required.
The get_trans_command method is expected to return an array containing
command, which if run using Proc::SyncExec::sync_popen_noshell function
will print translations of $expression from $src_lang_code language to
$dest_lang_code language (the first element of the array is the program
name, list of arguments follows). The command also needs to contain options
correspondent to current plug-in attribute values and ensure appropriate
behaviour. Each line of the output must correspond to one translation and
have following form:
expression = translation
In addition, the $append string, if specified, should be appendet to each
line of the output.
Returns true value if the $found_expr expression matches input expression
$in_expr when using M_EXACT matching options (see matching attribute).
Returns true value if the $found_expr expression matches input expression
$in_expr when using M_START matching options (see matching attribute).
The $at_bounds argument corresponds to the match_at_bounds attribute.
Returns true value if the $found_expr expression matches input expression
$in_expr when using M_EXPR matching options (see matching attribute).
The $at_bounds argument corresponds to the match_at_bounds attribute.
Returns true value if the $found_expr expression matches input expression
$in_expr when using M_WORDS matching options (see matching attribute).
The $at_bounds argument corresponds to the match_at_bounds attribute.
Returns the $expression with all the grammar and meaning information deleted
(everything in parantheses or behind a semicolon) in perl's internal UTF-8
format (see Encode).
Converts $string from $input_encoding to UTF-8 encoding. In addition all
HTML entities contained in the $string are converted to corresponding
UTF-8 characters. This may sometimes be very useful when writting the
process_response method.
Run the plug-in as a command line application. Very useful for testing and debugging. Try executing following script to see what this does:
#!perl
# load a plug-in class derrived from MetaTrans::Base
use MetaTrans::UltralinguaNet;
# instantiate an object
my $plugin = new MetaTrans::UltralinguaNet;
# run it
$plugin->run;
Please report any bugs or feature requests to
bug-metatrans@rt.cpan.org, or through the web interface at
http://rt.cpan.org. I will be notified, and then you'll automatically
be notified of progress on your bug as I make changes.
Jan Pomikalek, <xpomikal@fi.muni.cz>
Copyright 2004 Jan Pomikalek, All Rights Reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
| MetaTrans documentation | Contained in the MetaTrans distribution. |
package MetaTrans::Base; use strict; use warnings; use vars qw($VERSION @ISA @EXPORT_OK %EXPORT_TAGS %ENV); use Exporter; use MetaTrans::Languages qw(get_lang_by_code is_known_lang); use Carp; use Encode; use Getopt::Long; use HTML::Entities; use LWP::UserAgent; use HTTP::Response; $VERSION = do { my @r = (q$Revision: 1.3 $ =~ /\d+/g); sprintf "%d."."%02d", @r }; @ISA = qw(Exporter); @EXPORT_OK = qw(is_exact_match is_match_at_start is_match_expr is_match_words convert_to_utf8 M_EXACT M_START M_EXPR M_WORDS M_ALL); %EXPORT_TAGS = ( match_consts => [qw(M_EXACT M_START M_EXPR M_WORDS M_ALL)], match_funcs => [qw(is_exact_match is_match_at_start is_match_expr is_match_words)], ); # Expression matching types use constant M_EXACT => 1; # exact match use constant M_START => 2; # match at start use constant M_EXPR => 3; # match expression use constant M_WORDS => 4; # match words use constant M_ALL => 5; # match anything to anything
sub new { my $class = shift; my %options = @_; my $self = bless {}, $class; my %defaults = ( host_server => 'unknown.server', script_name => undef, timeout => 5, matching => M_START, match_at_bounds => 1, ); foreach my $attr (keys %defaults) { $self->{$attr} = $options{$attr} || $defaults{$attr}; } return $self; }
sub host_server { shift->_elem('host_server', @_); } sub script_name { shift->_elem('script_name', @_); } sub timeout { shift->_elem('timeout', @_); } sub match_at_bounds { shift->_elem('match_at_bounds', @_); } sub matching { my $self = shift; my $type = shift; my %ok = (M_EXACT, 1, M_START, 1, M_EXPR, 1, M_WORDS, 1, M_ALL, 1); my $old = $self->{matching}; if (defined $type) { exists $ok{$type} ? $self->{matching} = $type : carp "invalid matching type: '$type'"; } return $old; } sub default_dir { my $self = shift; my $src_lang_code = shift; my $dest_lang_code = shift; my @old_direction; if (defined @{$self->{direction}} && $self->is_supported_dir(@{$self->{direction}})) { @old_direction = @{$self->{direction}}; } else { # return `the first' supported translation direction OUTER: foreach my $src_lang_code (@{$self->{language_keys}}) { foreach my $dest_lang_code (@{$self->{language_keys}}) { if ($self->is_supported_dir($src_lang_code, $dest_lang_code)) { @old_direction = ($src_lang_code, $dest_lang_code); last OUTER; } } } } return @old_direction unless defined $src_lang_code && defined $dest_lang_code; if ($self->is_supported_dir($src_lang_code, $dest_lang_code)) { carp "not supported direction: '${src_lang_code}2${dest_lang_code}'"; return @old_direction; } @{$self->{direction}} = ($src_lang_code, $dest_lang_code); return @old_direction; }
sub set_languages { my $self = shift; my @language_codes = @_; foreach (@language_codes) { unless (is_known_lang($_)) { carp "unknown language code: '$_', ignoring it"; next; } ${$self->{languages}}{$_} = get_lang_by_code($_); push @{$self->{language_keys}}, $_; # to keep ordering } }
sub set_dir_1_to_1 { my $self = shift; my $src_lang_code = shift; my $dest_lang_code = shift; unless (${$self->{languages}}{$src_lang_code}) { carp "language '$src_lang_code' not supported, " . "not setting '${src_lang_code}2${dest_lang_code}'"; return 0; } unless (${$self->{languages}}{$dest_lang_code}) { carp "language '$dest_lang_code' not supported, " . "not setting '${src_lang_code}2${dest_lang_code}'"; return 0; } ${$self->{directions}}{$src_lang_code . "2" . $dest_lang_code} = 1; return 1; }
sub unset_dir_1_to_1 { my $self = shift; my $src_lang_code = shift; my $dest_lang_code = shift; unless (${$self->{languages}}{$src_lang_code}) { carp "language '$src_lang_code' not supported, " . "not unsetting '${src_lang_code}2${dest_lang_code}'"; return 0; } unless (${$self->{languages}}{$dest_lang_code}) { carp "language '$dest_lang_code' not supported, " . "not unsetting '${src_lang_code}2${dest_lang_code}'"; return 0; } undef ${$self->{directions}}{$src_lang_code . "2" . $dest_lang_code}; return 1; }
sub set_dir_1_to_spec { my $self = shift; my $src_lang_code = shift; my @dest_lang_codes = @_; my $set = 0; unless (${$self->{languages}}{$src_lang_code}) { carp "language '$src_lang_code' not supported"; return $set; } foreach my $dest_lang_code (@dest_lang_codes) { next if $dest_lang_code eq $src_lang_code; $set += $self->set_dir_1_to_1($src_lang_code, $dest_lang_code); } return $set; }
sub set_dir_1_to_all { my $self = shift; my $src_lang_code = shift; return $self->set_dir_1_to_spec($src_lang_code, @{$self->{language_keys}}); }
sub set_dir_spec_to_1 { my $self = shift; my $dest_lang_code = shift; my @src_lang_codes = @_; my $set = 0; unless (${$self->{languages}}{$dest_lang_code}) { carp "language '$dest_lang_code' not supported"; return $set; } foreach my $src_lang_code (@src_lang_codes) { next if $src_lang_code eq $dest_lang_code; $set += $self->set_dir_1_to_1($src_lang_code, $dest_lang_code); } return $set; }
sub set_dir_all_to_1 { my $self = shift; my $dest_lang_code = shift; return $self->set_dir_spec_to_1($dest_lang_code, @{$self->{language_keys}}); }
sub is_supported_dir { my $self = shift; my $src_lang_code = shift; my $dest_lang_code = shift; return ${$self->{directions}}{$src_lang_code . "2" . $dest_lang_code}; }
sub get_all_src_lang_codes { my $self = shift; my @result; OUTER: foreach my $src_lang_code (@{$self->{language_keys}}) { foreach my $dest_lang_code (@{$self->{language_keys}}) { if ($self->is_supported_dir($src_lang_code, $dest_lang_code)) { push @result, $src_lang_code; next OUTER; } } } return @result; }
sub get_dest_lang_codes_for_src_lang_code { my $self = shift; my $src_lang_code = shift; my @result; foreach my $dest_lang_code (@{$self->{language_keys}}) { push @result, $dest_lang_code if $self->is_supported_dir($src_lang_code, $dest_lang_code); } return @result; }
sub translate { my $self = shift; my $expression = shift; my $src_lang_code = shift; my $dest_lang_code = shift; unless (scalar(keys %{$self->{directions}}) > 0) { carp "no supported directions defined"; return 'error'; } ($src_lang_code, $dest_lang_code) = $self->default_dir unless (defined $src_lang_code && defined $dest_lang_code); unless ($self->is_supported_dir($src_lang_code, $dest_lang_code)) { carp "not supported direction: '${src_lang_code}2${dest_lang_code}'"; return 'error'; } my $ua = new LWP::UserAgent; $ua->cookie_jar({ file => "$ENV{HOME}/.metatrans.cookies.txt" }); $ua->timeout($self->{timeout}); # strip blanks $expression =~ s/\s+/ /g; $expression =~ s/^ //; $expression =~ s/ $//; my $request = $self->create_request($expression, $src_lang_code, $dest_lang_code); my $response = $ua->request($request); if ($response->is_error()) { if ($response->code =~ /50[03]/) { carp "timeout while translating '$expression'"; return 'timeout'; } else { carp "error (" . $response->code . ") while translating '$expression'"; return 'error'; } } my $content = $response->content(); my @processed = $self->process_response($content, $src_lang_code, $dest_lang_code); my @result; my $at_bounds = $self->{match_at_bounds}; while (@processed > 0) { my $left = shift @processed; my $right = shift @processed; next unless $self->{matching} == M_EXACT ? &is_exact_match($expression, $left) : $self->{matching} == M_START ? &is_match_at_start($expression, $left, $at_bounds) : $self->{matching} == M_EXPR ? &is_match_expr($expression, $left, $at_bounds) : $self->{matching} == M_WORDS ? &is_match_words($expression, $left, $at_bounds) : 1; push @result, "$left = $right"; } return @result; }
sub get_trans_command { my $self = shift; my $expression = shift; my $src_lang_code = shift; my $dest_lang_code = shift; my $append = shift; my $class = ref($self); # $append =~ s/"/\\"/g; # $expression =~ s/"/\\"/g; # my $command = "runtrans"; # $command.= " $class"; # $command.= " -t " . $self->{timeout}; # $command.= " -m " . ($self->{matching} == M_EXACT ? 'exact' : # $self->{matching} == M_START ? 'start' : # $self->{matching} == M_EXPR ? 'expr' : # $self->{matching} == M_WORDS ? 'words' : # 'all' ); # $command.= " -b " if $self->{match_at_bounds}; # $command.= " -d " . $src_lang_code . "2" . $dest_lang_code; # $command.= " -a \"$append\""; # $command.= " \"$expression\""; my @command; push @command, "runtrans", $class; push @command, "-t", $self->{timeout}; push @command, "-m", ($self->{matching} == M_EXACT ? 'exact' : $self->{matching} == M_START ? 'start' : $self->{matching} == M_EXPR ? 'expr' : $self->{matching} == M_WORDS ? 'words' : 'all' ); push @command, "-b" if $self->{match_at_bounds}; push @command, "-d", $src_lang_code . "2" . $dest_lang_code; push @command, "-a", $append; push @command, $expression; return @command; }
sub is_exact_match { my $in_expr = shift; my $found_expr = shift; return lc(&strip_grammar_info($in_expr)) eq lc(&strip_grammar_info($found_expr)); }
sub is_match_at_start { my $in_expr = shift; my $found_expr = shift; my $at_bounds = shift; my $in_stripped = &strip_grammar_info($in_expr); my $found_stripped = &strip_grammar_info($found_expr); return $at_bounds ? $found_stripped =~ /^\Q$in_stripped\E\b/g : $found_stripped =~ /^\Q$in_stripped\E/g ; }
sub is_match_expr { my $in_expr = shift; my $found_expr = shift; my $at_bounds = shift; my $in_stripped = &strip_grammar_info($in_expr); my $found_stripped = &strip_grammar_info($found_expr); return $at_bounds ? $found_stripped =~ /\b\Q$in_stripped\E\b/g : $found_stripped =~ /\Q$in_stripped\E/g ; }
sub is_match_words { my $in_expr = shift; my $found_expr = shift; my $at_bounds = shift; my $in_stripped = &strip_grammar_info($in_expr); my $found_stripped = &strip_grammar_info($found_expr); foreach my $word (split /\W+/, $in_stripped) { return undef unless $at_bounds ? $found_stripped =~ /\b\Q$word\E\b/g : $found_stripped =~ /\Q$word\E/g ; } return 1; }
sub strip_grammar_info { my $expr = shift; $expr = Encode::decode_utf8($expr) unless Encode::is_utf8($expr); $expr =~ s/\([^)]*\)//g; #$expr =~ s/, (r|e|s)\s*$//; $expr =~ s/;.*//; $expr =~ s/\W+/ /g; $expr =~ s/^ //; $expr =~ s/ $//; return $expr; }
sub convert_to_utf8 { my $input_encoding = shift; my $string = shift; $string = Encode::decode($input_encoding, $string); my $str_unescaped = HTML::Entities::decode_entities($string); # $str_escaped might be in Perl's internal format, need to encode it return Encode::is_utf8($str_unescaped) ? Encode::encode_utf8($str_unescaped) : $str_unescaped; }
sub run { my $self = shift; croak "no supported directions defined" unless (scalar(keys %{$self->{directions}}) > 0); my @options = $self->_get_options(); return if @options < 7; my ($timeout, $matching, $at_bounds, $src_lang_code, $dest_lang_code, $append, $help) = @options; if ($help || @ARGV == 0) { $self->_print_usage(); return; } $self->timeout($timeout); $self->match_at_bounds($at_bounds); $self->matching($matching); my $state; my $i = 0; foreach my $expr (@ARGV) { $i++; my @translations = $self->translate($expr, $src_lang_code, $dest_lang_code); if (@translations && $translations[0] !~ /=/) { $state = $translations[0]; next; } $state = "ok"; foreach my $trans (@translations) { print "$trans$append\n"; } print "\n" unless $i == @ARGV; } print $state . $append . "\n" if $append; }
################################################################################ # private methods # ################################################################################ sub _get_options { my $self = shift; my $timeout = $self->{timeout}; my $matching_str; my $matching = $self->{timeout}; my $at_bounds; my $direction; my $help; my $append = ''; Getopt::Long::Configure("bundling"); GetOptions( 't=i' => \$timeout, 'm=s' => \$matching_str, 'b' => \$at_bounds, 'd=s' => \$direction, 'a=s' => \$append, 'h' => \$help, ); if (defined $matching_str) { $matching_str eq 'exact' ? $matching = M_EXACT : $matching_str eq 'start' ? $matching = M_START : $matching_str eq 'expr' ? $matching = M_EXPR : $matching_str eq 'words' ? $matching = M_WORDS : $matching_str eq 'all' ? $matching = M_ALL : do { warn "invalid matching type: '$matching_str'\n"; return undef; } } if (defined $direction && $direction !~ /2/) { warn "invalid direction format: '$direction'\n"; return undef; } my ($src_lang_code, $dest_lang_code) = defined $direction ? split /2/, $direction : undef; return ($timeout, $matching, $at_bounds, $src_lang_code, $dest_lang_code, $append, $help); } sub _print_usage { my $self = shift; my $host = $self->{host_server}; my $script = $self->{script_name}; my $timeout = $self->{timeout}; my $matching = $self->{matching}; unless (defined $script) { $script = $0; $script =~ s|^.*/||; } my ($def_exact, $def_start, $def_expr, $def_words, $def_all) = ('', '', '', '', ''); my $def_str = '(def)'; $matching == M_EXACT ? $def_exact = $def_str : $matching == M_START ? $def_start = $def_str : $matching == M_EXPR ? $def_expr = $def_str : $matching == M_WORDS ? $def_words = $def_str : $def_all = $def_str ; my ($def_src_lang_code, $def_dest_lang_code) = $self->default_dir(); my ($wd, $wl, $wr) = $self->_get_column_widths(); my @dir_options; foreach my $src_lang_code (@{$self->{language_keys}}) { foreach my $dest_lang_code (@{$self->{language_keys}}) { next unless $self->is_supported_dir($src_lang_code, $dest_lang_code); my $dir_option = sprintf("%-${wd}s: %-${wl}s -> %-${wr}s", $src_lang_code . "2" . $dest_lang_code, ${$self->{languages}}{$src_lang_code}, ${$self->{languages}}{$dest_lang_code}); $dir_option .= " (default)" if ($src_lang_code eq $def_src_lang_code && $dest_lang_code eq $def_dest_lang_code); push @dir_options, $dir_option; } } my $indent = " "; my $directions = join("\n$indent", @dir_options); print <<EOF; Multilingual dictionary metasearcher for $host Usage: $script [options] expression [...]\ttranslate word(s) Options: -- expressions to be translated follow -t <timeout> wait for the response for <timeout> secs (default $timeout) -m <matching> set matching type exact: exact match only $def_exact start: match at start of the translated expr. only $def_start expr : match expr. anywhere in the translated expr. $def_expr words: match expr. words in the translated expr. $def_words all : match anything to anything $def_all -b match at word boundaries only -d <direction> set translation direction $directions -a <string> append <string> to each line of output -h print this help screen EOF } sub _get_column_widths { my $self = shift; my $max_dir_width = 0; my $max_lcol_width = 0; my $max_rcol_width = 0; foreach my $src_lang_code (@{$self->{language_keys}}) { foreach my $dest_lang_code (@{$self->{language_keys}}) { next unless $self->is_supported_dir($src_lang_code, $dest_lang_code); my $dir_width = length($src_lang_code . "2" . $dest_lang_code); my $lcol_width = length(${$self->{languages}}{$src_lang_code}); my $rcol_width = length(${$self->{languages}}{$dest_lang_code}); $max_dir_width = $dir_width if $dir_width > $max_dir_width; $max_lcol_width = $lcol_width if $lcol_width > $max_lcol_width; $max_rcol_width = $rcol_width if $rcol_width > $max_rcol_width; } } return ($max_dir_width, $max_lcol_width, $max_rcol_width); } # borrowed from LWP::MemberMixin sub _elem { my($self, $elem, $val) = @_; my $old = $self->{$elem}; $self->{$elem} = $val if defined $val; return $old; } 1; __END__