| Lingua-ZH-MacChinese-Simplified documentation | Contained in the Lingua-ZH-MacChinese-Simplified distribution. |
Lingua::ZH::MacChinese::Simplified - transcoding between Mac OS Chinese Simplified encoding and Unicode
(1) using function names exported by default:
use Lingua::ZH::MacChinese::Simplified;
$wchar = decodeMacChineseSimp($octet);
$octet = encodeMacChineseSimp($wchar);
(2) using function names exported on request:
use Lingua::ZH::MacChinese::Simplified qw(decode encode);
$wchar = decode($octet);
$octet = encode($wchar);
(3) using function names fully qualified:
use Lingua::ZH::MacChinese::Simplified ();
$wchar = Lingua::ZH::MacChinese::Simplified::decode($octet);
$octet = Lingua::ZH::MacChinese::Simplified::encode($wchar);
# $wchar : a string in Perl's Unicode format
# $octet : a string in Mac OS Chinese Simplified encoding
This module provides transcoding from/to Mac OS Chinese Simplified encoding (denoted MacChineseSimp hereafter).
To ensure round-trip mapping, some characters map
from a single MacChineseSimp character
to a sequence of Unicode characters, and vice versa.
One such character is 0xA6D9 (MacChineseSimp), which maps to and from
0xFF0C+0xF87E (Unicode) for "FULLWIDTH COMMA for vertical text".
This module provides functions to transcode between MacChineseSimp and Unicode, without information loss for every MacChineseSimp character.
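The round-trip behaviour described above can be checked with a short sketch. It assumes the fully qualified decode() and encode() functions shown in the synopsis and uses the 0xA6D9 example from the text. The require is wrapped in eval only so that the script still runs where the module is not installed.

```perl
use strict;
use warnings;

# Byte pair cited in the text as mapping to a two-character Unicode sequence.
my $octet = "\xA6\xD9";

# Load the module only if it is available (an assumption of this sketch,
# so that it also runs on systems without the module).
if (eval { require Lingua::ZH::MacChinese::Simplified; 1 }) {
    my $wchar = Lingua::ZH::MacChinese::Simplified::decode($octet);
    my $back  = Lingua::ZH::MacChinese::Simplified::encode($wchar);

    # List the decoded code points, then report whether the bytes survived.
    my @points = map { sprintf('U+%04X', ord $_) } split //, $wchar;
    print "decoded: @points\n";
    print "round trip: ", ($back eq $octet ? "ok" : "differs"), "\n";
}
else {
    print "Lingua::ZH::MacChinese::Simplified is not installed\n";
}
```

If the mapping matches the description, the decoded list should contain two code points and the round trip should report "ok".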
$wchar = decode($octet)
$wchar = decode($handler, $octet)
$wchar = decodeMacChineseSimp($octet)
$wchar = decodeMacChineseSimp($handler, $octet)
Converts MacChineseSimp to Unicode.
decodeMacChineseSimp() is an alias for decode() exported by default.
If the $handler is not specified,
any MacChineseSimp character that is not mapped to Unicode is deleted.
If the $handler is a code reference,
the string returned from that coderef is inserted in its place.
If the $handler is a scalar reference,
the string (a PV) it refers to (the referent) is inserted in its place.
The first argument passed to the $handler coderef is
the unmapped MacChineseSimp character as a byte string (e.g. "\xFC\xFE").
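As a minimal sketch of the two handler forms for decoding, the following uses a hypothetical helper, visible_bytes(), that renders each unmapped byte as a \xHH escape. The helper name is not part of the module, and the input assumes that "\xFC\xFE" is unmapped, as in the example above.

```perl
use strict;
use warnings;

# Hypothetical handler (not part of the module): turn the unmapped
# byte string into visible \xHH escapes.
sub visible_bytes {
    my ($bytes) = @_;
    return join '', map { sprintf('\\x%02X', ord $_) } split //, $bytes;
}

# The handler can be exercised on its own.
print visible_bytes("\xFC\xFE"), "\n";

# Load the module only if it is available, so the sketch runs anywhere.
if (eval { require Lingua::ZH::MacChinese::Simplified; 1 }) {
    my $input = "ABC\xFC\xFE";

    # Code reference: the handler's return value replaces the unmapped character.
    print Lingua::ZH::MacChinese::Simplified::decode(\&visible_bytes, $input), "\n";

    # Scalar reference: a fixed string replaces the unmapped character.
    print Lingua::ZH::MacChinese::Simplified::decode(\"?", $input), "\n";
}
```

A coderef is the more flexible choice when the replacement should depend on the offending bytes; a scalar reference is enough for a fixed placeholder.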
$octet = encode($wchar)
$octet = encode($handler, $wchar)
$octet = encodeMacChineseSimp($wchar)
$octet = encodeMacChineseSimp($handler, $wchar)
Converts Unicode to MacChineseSimp.
encodeMacChineseSimp() is an alias for encode() exported by default.
If the $handler is not specified,
any Unicode character that is not mapped to MacChineseSimp is deleted.
If the $handler is a code reference,
the string returned from that coderef is inserted in its place.
If the $handler is a scalar reference,
the string (a PV) it refers to (the referent) is inserted in its place.
The first argument passed to the $handler coderef is
the Unicode code point (unsigned integer) of the unmapped character.
E.g.
sub hexNCR { sprintf("&#x%x;", shift) } # hexadecimal NCR
sub decNCR { sprintf("&#%d;" , shift) } # decimal NCR
print encodeMacChineseSimp("ABC\x{100}\x{10000}");
# "ABC"
print encodeMacChineseSimp(\"", "ABC\x{100}\x{10000}");
# "ABC"
print encodeMacChineseSimp(\"?", "ABC\x{100}\x{10000}");
# "ABC??"
print encodeMacChineseSimp(\&hexNCR, "ABC\x{100}\x{10000}");
# "ABC&#x100;&#x10000;"
print encodeMacChineseSimp(\&decNCR, "ABC\x{100}\x{10000}");
# "ABC&#256;&#65536;"
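Because the handler is an ordinary coderef, it can also be a closure that keeps state. The sketch below uses a hypothetical helper, make_recorder(), which is not part of the module: it substitutes a fixed string and records every unmapped code point it is given, so the caller can report what was lost.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): build a handler that
# records each unmapped code point and returns a fixed replacement.
sub make_recorder {
    my ($replacement, $seen) = @_;
    return sub {
        my ($code_point) = @_;
        push @$seen, sprintf('U+%04X', $code_point);
        return $replacement;
    };
}

my @unmapped;
my $handler = make_recorder('?', \@unmapped);

# Load the module only if it is available, so the sketch runs anywhere.
if (eval { require Lingua::ZH::MacChinese::Simplified; 1 }) {
    my $octet = Lingua::ZH::MacChinese::Simplified::encode(
        $handler, "ABC\x{100}\x{10000}");
    print "$octet\n";
    print "unmapped: @unmapped\n";
}
```

With the input used in the examples above, the recorded list would be expected to name the two code points that have no MacChineseSimp mapping.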
Sorry, the author does not work on Mac OS. Please let him know if you find anything wrong.
SADAHIRO Tomoyuki <SADAHIRO@cpan.org>
Copyright(C) 2003-2007, SADAHIRO Tomoyuki. Japan. All rights reserved.
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/CHINSIMP.TXT
http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/CORPCHAR.TXT
package Lingua::ZH::MacChinese::Simplified;

require 5.006001;

use strict;
use vars qw($VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS);

require Exporter;
require DynaLoader;

$VERSION = '0.04';
@ISA = qw(Exporter DynaLoader);
@EXPORT = qw(decodeMacChineseSimp encodeMacChineseSimp);
@EXPORT_OK = qw(decode encode);

bootstrap Lingua::ZH::MacChinese::Simplified $VERSION;
1;
__END__