Lingua::IT::Hyphenate - Italian word hyphenation


Lingua-IT-Hyphenate documentation Contained in the Lingua-IT-Hyphenate distribution.

Index


Code Index:

NAME

Top

Lingua::IT::Hyphenate - Italian word hyphenation

SYNOPSIS

Top

    use Lingua::IT::Hyphenate qw( sillabe );

    @syllabes = sillabe( @whatever );

DESCRIPTION

Top

The sillabe (italian for "syllabes") function, which can be exported to your main namespace, gets one or more text strings and returns an array of syllabes. Note that it destroys everything which is not considered part of a word (punctuation et alia); it leaves untouched numbers and the ' (single quote).

    print join("-", sillabe("salve, mondo!"));

    # returns: sal-ve-mon-do

The hyphenation algorithm doesn't take into account sequences of vowels for which there are no predictable rules (eg. it doesn't distinguish between italian "iato" and "dittongo"), so "aiuto" is hyphenated as "aiu-to" (instead of the correct "a-iu-to").

It's not complete, but at least it doesn't produce wrong results.

AUTHOR

Top

Aldo Calpini ( dada@perl.it ).


Lingua-IT-Hyphenate documentation Contained in the Lingua-IT-Hyphenate distribution.

package Lingua::IT::Hyphenate;

require Exporter;
@ISA = qw(Exporter);
@EXPORT_OK = qw(sillabe);

use vars qw( $VERSION $_V $_C );

$VERSION = "0.14";

$_V = "[aeiouàèéìòù]";
$_C = "[b-df-hj-np-tv-z]";

sub sillabe {
    my(@words) = split(/[^a-zA-Zàèéìòù'0-9]+/, join(" ", @_)); #'
    my @result = ();
    foreach $word (@words) {
#DEBUG  print "word $word --> ";
        $word =~ s/($_V)([bcfgptv][lr])/$1=$2/gi;
        $word =~ s/($_V)([cg]h)/$1=$2/gi;     
        $word =~ s/($_V)(gn)/$1=$2/gi;     
        $word =~ s/($_C)\1/$1=$1/gi;
        $word =~ s/(s$_C)/=$1/gi;
        1 while $word =~ s/($_V*$_C+$_V+)($_C$_V)/$1=$2/gi;
        1 while $word =~ s/($_V*$_C+$_V+$_C)($_C)/$1=$2/gi;
        $word =~ s/^($_V+$_C)($_C)/$1=$2/gi;
        $word =~ s/^($_V+)($_C$_V)/$1=$2/gi;
        $word =~ s/^=//;
        $word =~ s/=$//;
        $word =~ s/=+/=/g;
#DEBUG  print "$word\n";
        push(@result, split(/=/, $word));
    }
    return @result;
}

1;

__END__