Template::Multilingual::Parser - Multilingual template parser


Template-Multilingual documentation Contained in the Template-Multilingual distribution.

NAME

Template::Multilingual::Parser - Multilingual template parser

SYNOPSIS

    use Template;
    use Template::Multilingual::Parser;

    my $parser = Template::Multilingual::Parser->new();
    my $template = Template->new(PARSER => $parser);
    $template->process('example.ttml', { language => 'en'});

DESCRIPTION

This subclass of Template Toolkit's Template::Parser parses multilingual templates: templates that contain text in several languages.

    <t>
      <en>Hello!</en>
      <fr>Bonjour !</fr>
    </t>

Use this module directly if you have subclassed Template; otherwise you may find it easier to use Template::Multilingual.

Language codes can be any string that matches \w+, but we suggest sticking to ISO 639, which provides 2-letter codes for common languages and 3-letter codes for many others.

METHODS

new(\%params)

The new() constructor creates and returns a reference to a new parser object. A reference to a hash may be supplied as a parameter to provide configuration values.

Parser objects are typically provided as the PARSER option to the Template constructor.

Configuration values are all valid Template::Parser superclass options, and one specific to this class:

LANGUAGE_VAR

The LANGUAGE_VAR option can be used to set the name of the template variable which contains the current language. Defaults to language.

  my $parser = Template::Multilingual::Parser->new({
     LANGUAGE_VAR => 'global.language',
  });

You will need to set this variable to the current language at request time, usually in the process() method of your Template subclass.

parse($text)

parse() is called by the Template Toolkit. It parses multilingual sections from the input text and translates them to Template Toolkit directives. The result is then passed to the Template::Parser superclass.
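As a rough illustration of that translation, the sketch below rebuilds only the first, exact-match stage of the directives generated for one section. The helper name exact_match_switch is hypothetical and not part of the module; the default [% and %] tags and the sorted key order are assumptions made to keep the output stable.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): builds the exact-match
# SWITCH that parse() emits for one multilingual section.
sub exact_match_switch {
    my ($section, $langvar, $start, $end) = @_;
    my $tt = "$start tm_f = 0; SWITCH $langvar $end";
    # Sorting is an assumption of this sketch, used for stable output;
    # the module iterates over the languages in hash order.
    for my $lang (sort keys %$section) {
        $tt .= "$start CASE '$lang' $end" . $section->{$lang};
    }
    # Default case: flag that no exact match was found.
    $tt .= "$start CASE; tm_f=1; END; $end";
    return $tt;
}

my $tt = exact_match_switch(
    { fr => 'Bonjour', en => 'Hello' }, 'language', '[%', '%]',
);
print "$tt\n";
```

The tm_f flag set in the default case is what the later fallback stages test, so this stage alone does not cover language variants.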

sections

Returns a reference to an array of tokenized sections. Each section is a reference to a hash with either a nolang key or a lang key.

A nolang key denotes text outside of any multilingual sections. The value is the text itself.

A lang key denotes text inside a multilingual section. The value is a reference to a hash, whose keys are language codes and values the corresponding text. For example, the following multilingual template:

  foo <t><fr>bonjour</fr><en>Hello</en></t> bar

will parse to the following sections:

  [ { nolang => 'foo ' },
    {   lang => { fr => 'bonjour', en => 'Hello' } },
    { nolang => ' bar' },
  ]
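The tokenization can be tried without Template Toolkit installed. The following is a minimal standalone sketch that reuses the two regular expressions from the module's _tokenize() routine; the function name tokenize is hypothetical, and the length test on bare text is a choice of this sketch.

```perl
use strict;
use warnings;

# Standalone sketch of the tokenization step; tokenize() is a
# hypothetical name, the regexes are those used by _tokenize().
sub tokenize {
    my ($text) = @_;
    my @sections;
    # split with a capture group: even indexes hold bare text,
    # odd indexes hold the body of a <t>...</t> section.
    my @tokens = split m!<t>(.*?)</t>!s, $text;
    for my $i (0 .. $#tokens) {
        my $token = $tokens[$i];
        if ($i % 2) {
            my %lang;
            while ($token =~ m!<([^<>]+)>(.*?)</\1>!gs) {
                $lang{$1} = $2;
            }
            push @sections, { lang => \%lang } if %lang;
        }
        else {
            # length test (sketch choice): skips empty text only
            push @sections, { nolang => $token } if length $token;
        }
    }
    return \@sections;
}

my $sections = tokenize('foo <t><fr>bonjour</fr><en>Hello</en></t> bar');
printf "%d sections\n", scalar @$sections;
print "fr: $sections->[1]{lang}{fr}\n";
```

With a real parser object, the same structure is available from $parser->sections after parse() has run.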

LANGUAGE SUBTAG HANDLING

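This module supports language subtags to express variants, e.g. "en_US" or "en-US". Language matching follows these rules, in order:

Exact match: the current language is one of the languages in the section.

Fallback to the primary language: if the current language is a variant such as "en_US" and the section has no exact match, the text for "en" is used.

Fallback to a variant: if there is still no match, the first variant in sorted order that shares the primary language is used. For example, "fr" or "fr_XX" selects the text for "fr_YY" when the section has neither "fr" nor "fr_XX".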
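The matching order can be sketched in plain Perl. This is an approximation derived from the directives that parse() generates, not code from the module; pick_text is a hypothetical helper, and the sample texts are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: picks the text for $language from one section
# (a hash of language code => text), mirroring the three lookups that
# parse() generates. Returns undef in scalar context when nothing matches.
sub pick_text {
    my ($language, $section) = @_;

    # 1. Exact match.
    return $section->{$language} if exists $section->{$language};

    # 2. Variant requested (en_US): fall back to the primary language (en).
    my ($primary) = $language =~ /^(\w+)[-_].*$/;
    return $section->{$primary}
        if defined $primary && exists $section->{$primary};

    # 3. Last resort: first variant, in sorted order, that shares the
    #    primary language (fr or fr_XX requested, only fr_YY available).
    my $wanted = defined $primary ? $primary : $language;
    for my $code (sort keys %$section) {
        my ($p) = $code =~ /^(\w+)[-_].*$/;
        return $section->{$code} if defined $p && $p eq $wanted;
    }
    return;
}

my %section = (
    en    => 'Hello',
    fr_CA => 'Bonjour (CA)',
    fr_FR => 'Bonjour (FR)',
);
print pick_text('en',    \%section), "\n";   # exact match
print pick_text('en_US', \%section), "\n";   # primary language fallback
print pick_text('fr',    \%section), "\n";   # first sorted variant
```

Because \w also matches an underscore, the greedy pattern treats everything before the last separator as the primary language, so a code with two subtags such as "zh_Hant_TW" would fall back to "zh_Hant" rather than "zh".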

AUTHOR

Eric Cholet, <cholet@logilune.com>

BUGS

Multilingual text sections cannot be used inside TT directives. The following is illegal and will trigger a TT syntax error:

    [% title = "<t><fr>Bonjour</fr><en>Hello</en></t>" %]

Use this instead:

    [% title = BLOCK %]<t><fr>Bonjour</fr><en>Hello</en></t>[% END %]




The TAG_STYLE, START_TAG and END_TAG configuration options are supported, but the TAGS directive is not.

Please report any bugs or feature requests to bug-template-multilingual@rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Template-Multilingual. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SEE ALSO

Template::Multilingual

ISO 639-2 Codes for the Representation of Names of Languages: http://www.loc.gov/standards/iso639-2/langcodes.html

COPYRIGHT & LICENSE


package Template::Multilingual::Parser;

use strict;
use base qw(Template::Parser);

our $VERSION = '1.00';

sub new
{
    my ($class, $options) = @_;
    my $self = $class->SUPER::new($options);
    $self->{_sections} = [];
    $self->{_langvar} = $options->{LANGUAGE_VAR} || 'language';

    my $style = $self->{ STYLE }->[-1];
    @$self{ qw(_start _end) } = @$style{ qw( START_TAG END_TAG  ) };
    for (qw( _start _end )) {
        $self->{$_} =~ s/\\([^\\])/$1/g;
    }

    return $self;
}

sub parse
{
    my ($self, $text) = @_;

    # isolate multilingual sections
    $self->_tokenize($text);

    # replace multilingual sections with TT directives
    my ($S, $E, $LANGVAR) = map $self->{$_}, qw(_start _end _langvar);

    # if language is a variant (en_US), create a template variable holding the fallback value (en)
    $text = "$S IF (tm_matches = $LANGVAR.match('^(\\w+)[-_].*\$')); tm_fb = tm_matches.0; END $E";

    for my $section (@{$self->{_sections}}) {
        if (defined $section->{nolang}) {   # a literal "0" is valid text
            $text .= $section->{nolang};
        }
        elsif (my $t = $section->{lang}) {
            my @languages = keys %$t;

            # first loop through languages: look for exact match
            $text .= "$S tm_f = 0; SWITCH $LANGVAR $E";
            for my $lang (@languages) {
                $text .= "$S CASE '$lang' $E" . $t->{$lang};
            }
            # add a default case to trigger fallback
            $text .= "$S CASE; tm_f=1; END; $E";

            # second loop: fallback to primary language (en_US matches en)
            $text .= "$S IF tm_fb AND tm_f; tm_f=0; SWITCH tm_fb; $E";
            for my $lang (@languages) {
                $text .= "$S CASE '$lang' $E" . $t->{$lang};
            }
            # add a default case to trigger the last-resort fallback:
            #   LANG is fr_XX or fr but the template has neither,
            #   so we try to fall back to fr_YY if present
            my %seen;
            my @fallbacks = map { /^(\w+)[-_].*$/ && !$seen{$_}++ ? [ $1 => $_] : () } sort @languages;
            if (@fallbacks) {
                # third loop: fallback to first available variant
                $text .= "$S CASE; tm_f=1; END; END; IF tm_f; SWITCH tm_fb || $LANGVAR; $E";
                for my $ref (@fallbacks) {
                    my ($lang, $variant) = @$ref;
                    $text .= "$S CASE '$lang' $E" . $t->{$variant};
                }
            }
            $text .= "$S END; END $E";
        }
    }
    return $self->SUPER::parse ($text);
}

sub _tokenize
{
    my ($self, $text) = @_;

    # extract all sections from the text
    $self->{_sections} = [];
    my @tokens = split m!<t>(.*?)</t>!s, $text;
    my $i = 0;
    for my $t (@tokens) {
        if ($i) {             # <t>...</t> multilingual section
            my %section;
            while ($t =~ m!<([^<>]+)>(.*?)</\1>!gs) {
                $section{$1} = $2;
            }
            push @{$self->{_sections}}, { lang => \%section }
                if %section;
        }
        else {                # bare text
            # test length rather than truth so that a literal "0" is kept
            push @{$self->{_sections}}, { nolang => $t }
                if defined $t && length $t;
        }
        $i = 1 - $i;
    }
}
sub sections { $_[0]->{_sections} }

1; # End of Template::Multilingual::Parser