Lingua::Han::Cantonese - Retrieve the Cantonese (GuangDongHua) romanization of Chinese characters (HanZi).


Lingua-Han-Cantonese documentation Contained in the Lingua-Han-Cantonese distribution.

NAME

Lingua::Han::Cantonese - Retrieve the Cantonese (GuangDongHua) romanization of Chinese characters (HanZi).

SYNOPSIS

  use Lingua::Han::Cantonese;

  my $h2p = Lingua::Han::Cantonese->new();
  print $h2p->han2Cantonese("我"); # ngo
  my @result = $h2p->han2Cantonese("爱你"); # @result = ('ngoi', 'nei');

  # pass tone => 1 to keep the tone digits
  $h2p = Lingua::Han::Cantonese->new(tone => 1);
  print $h2p->han2Cantonese("我"); # ngo5
  @result = $h2p->han2Cantonese("爱你"); # @result = ('ngoi3', 'nei5');
  print $h2p->han2Cantonese("林道"); # lam4dou3
  print $h2p->han2Cantonese("I love 余瑞华 a"); # i love jyu4seoi6waa4 a

DESCRIPTION

Retrieve the Cantonese (GuangDongHua) romanization of Chinese characters (HanZi).

RETURN VALUE

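han2Cantonese normally returns the Cantonese romanization of each Chinese character in its argument: a list with one element per character in list context, or those elements joined into a single string in scalar context. The bundled data covers more than 20,000 characters (from the Unicode.org Unihan.txt, version 4.1.0).

A character that is not Chinese (or has no entry in the data) is returned unchanged, apart from being lowercased.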
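
The return behaviour can be sketched without the module installed. This is only an illustration: `%table` is a hypothetical two-entry stand-in for the bundled data (its key format is an assumption), and `lookup` is an illustrative name, not part of the module's API.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the bundled data: hex code point => reading.
my %table = ( '6211' => 'ngo', '4F60' => 'nei' );

# Illustrative helper (not the module's API): one result per input character.
sub lookup {
    my ($text) = @_;
    my @result;
    for my $char (split //, $text) {
        my $code  = sprintf '%X', ord $char;
        my $value = $table{$code};
        # No entry in the table: pass the original character through.
        $value = $char unless defined $value;
        push @result, lc $value;
    }
    # List context gets one element per character; scalar context gets them joined.
    return wantarray ? @result : join('', @result);
}

# "\x{6211}" is U+6211 and "\x{4F60}" is U+4F60, the two characters in %table.
my @list   = lookup("\x{6211}\x{4F60}");   # list context
my $string = lookup("I \x{6211}");         # scalar context, mixed input
```

Here `@list` holds two elements, one per character, while `$string` is a single string in which the Latin letter and the space pass through (lowercased) next to the looked-up reading.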

OPTION

tone => 1|0

Defaults to 0. Set it to 1 to keep the tone digits in the returned romanization.
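
As a rough sketch of what the flag does to a single stored reading (assuming readings are stored with a trailing tone digit, as in the SYNOPSIS output), the hypothetical helper `format_reading` below strips the digits unless the tone flag is set. It is not part of the module's API.

```perl
use strict;
use warnings;

# Illustrative helper: format one stored reading according to the tone flag.
sub format_reading {
    my ($reading, $tone) = @_;
    $reading =~ s/\d//g unless $tone;   # drop tone digits when tone is off
    return lc $reading;
}

my $plain = format_reading('ngo5', 0);   # tone => 0 (the default)
my $toned = format_reading('ngo5', 1);   # tone => 1
```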

SEE ALSO

Unicode::Unihan, Lingua::Han::PinYin

AUTHOR

Fayland Lam, <fayland at gmail.com>

BUGS

Please report any bugs or feature requests to bug-lingua-han-cantonese at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Lingua-Han-Cantonese. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Lingua::Han::Cantonese

You can also look for information at:

* AnnoCPAN: Annotated CPAN documentation

http://annocpan.org/dist/Lingua-Han-Cantonese

* CPAN Ratings

http://cpanratings.perl.org/d/Lingua-Han-Cantonese

* RT: CPAN's request tracker

http://rt.cpan.org/NoAuth/Bugs.html?Dist=Lingua-Han-Cantonese

* Search CPAN

http://search.cpan.org/dist/Lingua-Han-Cantonese

ACKNOWLEDGEMENTS

COPYRIGHT & LICENSE

package Lingua::Han::Cantonese;

use warnings;
use strict;
our $VERSION = '0.07';

use File::Spec;
use Lingua::Han::Utils qw/Unihan_value/;

sub new {
	my $class = shift;
	# the data directory sits beside the module: .../Cantonese.pm -> .../Cantonese
	my $dir = __FILE__; $dir =~ s/\.pm$//;
	-d $dir or die "Directory $dir nonexistent!";
	my $self = { @_ };
	my %ct;
	my $file = File::Spec->catfile($dir, 'Cantonese.dat');
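	open(my $fh, '<', $file) or die "$file: $!";
	while (my $line = <$fh>) {
		# each line holds a code point and its Cantonese reading
		my ($uni, $ct) = split(/\s+/, $line);
		next unless defined $ct;	# skip blank or malformed lines
		$ct{$uni} = $ct;
	}
	close($fh);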
	$self->{'ct'} = \%ct;
	return bless $self => $class;
}

sub han2Cantonese {
	my ($self, $hanzi) = @_;
	
	my @code = Unihan_value($hanzi);

	my @result;
	foreach my $code (@code) {
		my $value = $self->{'ct'}->{$code};
		if (defined $value) {
			# drop the tone digits unless tone => 1 was requested
			$value =~ s/\d//g unless ($self->{'tone'});
		} else {
			# no entry (e.g. not a Chinese character): return the original character
			$value = pack("U*", hex $code);
		}
		push @result, lc $value;
	}
	
	return wantarray ? @result : join('', @result);

}

1;
__END__