If you read this file _as_is_, just ignore the funny characters you
see. It is written in the POD format (see perlpod manpage) which is
specially designed to be readable as is.
The following documentation is written in EUC-CN encoding.
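If you are viewing this file in an ordinary text editor, please ignore the odd markup characters in it.
This file is written in POD (Plain Old Documentation), a format specially designed to be readable as is.
For more information about the format, see the perlpod documentation.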
=encoding euc-cn
=head1 NAME
perlcn - Perl guide for Simplified Chinese
=head1 DESCRIPTION
Welcome to the world of Perl!
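Starting with version 5.8.0, Perl has comprehensive Unicode support,
and with it support for many encodings beyond the Latin-based ones; CJK (Chinese, Japanese, Korean) is one of them.
Unicode is an international standard that aims to cover all of the world's characters: the Western world, the Eastern world,
and everything in between (Greek, Syriac, Arabic, Hebrew, Indic,
Native American scripts, and so on). It also accommodates multiple operating systems and platforms (such as PC and Macintosh).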
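Perl itself operates on Unicode. This means that string data inside Perl can be represented
as Unicode, and that Perl's functions and operators (such as regular expression matching) can operate on Unicode as well.
To handle input and output of data stored in pre-Unicode encodings, Perl
provides the Encode module, which makes it easy to read and write data in legacy encodings.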
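The Encode extension supports the following Simplified Chinese encodings ('gb2312' is an alias for 'euc-cn'):
    euc-cn      Unix extended character set, commonly known as the GB (guobiao) code
    gb2312-raw  Raw (low-bit) GB2312 character table
    gb12345     Raw encoding of Traditional Chinese as used in mainland China
    iso-ir-165  GB2312 + GB6345 + GB8565 + additional characters
    cp936       Code page 936; can also be specified as 'GBK' (extended GB code)
    hz          7-bit escaped GB2312 encoding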
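As a minimal sketch of how the encodings listed above might be used directly, the following calls the `decode` and `encode` functions of the core Encode module. The variable names and the sample bytes (the EUC-CN form of the two-character word for "camel") are illustrative assumptions, not part of the original text.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Illustrative input: U+9A86 U+9A7C ("camel") as raw EUC-CN bytes.
my $euc_cn_bytes = "\xC2\xE6\xCD\xD5";

# decode() turns encoded bytes into a Perl character string.
my $chars = decode('euc-cn', $euc_cn_bytes);

# encode() turns the character string back into bytes in a target encoding.
my $utf8_bytes  = encode('UTF-8', $chars);
my $cp936_bytes = encode('cp936', $chars);

printf "characters: %d\n", length $chars;
printf "EUC-CN bytes: %d, UTF-8 bytes: %d\n",
    length $euc_cn_bytes, length $utf8_bytes;
```

Because GBK (cp936) extends GB2312, these two characters should encode to the same bytes in 'cp936' as in 'euc-cn'.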
For example, to convert a file in EUC-CN encoding to Unicode, just type the following command:
    perl -Mencoding=euc-cn,STDOUT,utf8 -pe1 < file.euc-cn > file.utf8
Perl also ships with "piconv", a character conversion utility written entirely in Perl. It is used as follows:
    piconv -f euc-cn -t utf8 < file.euc-cn > file.utf8
    piconv -f utf8 -t euc-cn < file.utf8 > file.euc-cn
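The same kind of conversion can also be sketched inside a program using PerlIO encoding layers. The helper name `euc_cn_to_utf8` and the in-memory sample data below are hypothetical, chosen only for illustration; this is one possible approach, not how piconv itself is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper: copy EUC-CN input to UTF-8 output, line by line.
sub euc_cn_to_utf8 {
    my ($in_fh, $out_fh) = @_;
    binmode $in_fh,  ':encoding(euc-cn)';   # decode bytes as they are read
    binmode $out_fh, ':encoding(UTF-8)';    # encode characters as they are written
    while (my $line = <$in_fh>) {
        print {$out_fh} $line;
    }
    return;
}

# Demonstration with in-memory filehandles instead of real files.
my $euc_cn_data = "\xC2\xE6\xCD\xD5\n";
my $utf8_data   = '';
open my $in,  '<', \$euc_cn_data or die "cannot open input: $!";
open my $out, '>', \$utf8_data   or die "cannot open output: $!";
euc_cn_to_utf8($in, $out);
close $out or die "cannot close output: $!";
close $in;
```

To work on real streams, as the commands above do, the helper could be called as `euc_cn_to_utf8(\*STDIN, \*STDOUT)`.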
In addition, with the encoding module you can easily write code that works in units of characters, as shown below:
    #!/usr/bin/env perl
    # Enable euc-cn string parsing; standard input and standard output are set to euc-cn
    use encoding 'euc-cn', STDIN => 'euc-cn', STDOUT => 'euc-cn';
    print length("骆驼");             #  2 (double quotes denote characters)
    print length('骆驼');             #  4 (single quotes denote bytes)
    print index("谆谆教诲", "蛔唤");  # -1 (does not contain this substring)
    print index('谆谆教诲', '蛔唤');  #  1 (starts at the second byte)
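In the last example, the second byte of the first "谆" combines with the first byte of the second "谆" to form the EUC-CN
code for "蛔"; the second byte of the second "谆" then combines with the first byte of "教" to form "唤".
Working in units of characters avoids this kind of false match, which used to be a common problem when matching EUC-CN text.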
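The difference between byte and character matching can also be reproduced without the encoding pragma, which is deprecated in recent Perl releases. The following is a minimal sketch that assumes the strings arrive as raw EUC-CN bytes and uses the core Encode module; the variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw EUC-CN bytes of the four-character string and the two-character
# substring used in the example above.
my $haystack_bytes = "\xD7\xBB\xD7\xBB\xBD\xCC\xBB\xE5";
my $needle_bytes   = "\xBB\xD7\xBB\xBD";

# Byte semantics: the search can match across character boundaries.
my $byte_pos = index($haystack_bytes, $needle_bytes);

# Character semantics: decode both strings first, then search.
my $haystack = decode('euc-cn', $haystack_bytes);
my $needle   = decode('euc-cn', $needle_bytes);
my $char_pos = index($haystack, $needle);

print "byte index: $byte_pos\n";        # 1
print "character index: $char_pos\n";   # -1
```

Decoding before searching means `index` compares whole characters, so the substring that exists only as a run of bytes straddling two characters is no longer found.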
=head2 Additional Chinese encodings
If you need more Chinese encodings, you can download the Encode::HanExtra module
from CPAN (L<http://www.cpan.org/>). It currently provides the following encoding:
    gb18030     Extended GB code, including Traditional Chinese
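In addition, the Encode::HanConvert module provides two encodings for converting between Simplified and Traditional Chinese:
    big5-simp   Converts between Big5 Traditional Chinese and Unicode Simplified Chinese
    gbk-trad    Converts between GBK Simplified Chinese and Unicode Traditional Chinese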
To convert between GBK and Big5, see the b2g.pl and g2b.pl programs included with that module,
or use the following in your own program:
    use Encode::HanConvert;
    $euc_cn = big5_to_gb($big5); # convert from Big5 to GBK
    $big5 = gb_to_big5($euc_cn); # convert from GBK to Big5
=head2 Further information
Please refer to the extensive documentation that ships with Perl (unfortunately, all of it written in English) to learn more about
Perl and about how to use Unicode. That said, external resources are plentiful:
=head2 Web sites offering Perl resources
=over 4
=item L<http://www.perl.com/>
The Perl home page (maintained by O'Reilly)
=item L<http://www.cpan.org/>
Perl ×ۺϵä²ØÍø (Comprehensive Perl Archive Network)
=item L<http://lists.perl.org/>
A list of Perl mailing lists
=back
=head2 Web sites for learning Perl
=over 4
=item L<http://www.oreilly.com.cn/index.php?func=booklist&cat=68>
O'Reilly Perl books in Simplified Chinese
=back
=head2 Perl user groups
=over 4
=item L<http://www.pm.org/groups/asia.html>
A list of Perl user groups in China
=back
=head2 Unicode-related web sites
=over 4
=item L<http://www.unicode.org/>
The Unicode Consortium (the body that defines the Unicode standard)
=item L<http://www.cl.cam.ac.uk/%7Emgk25/unicode.html>
UTF-8 and Unicode FAQ for Unix/Linux
=back
=head1 SEE ALSO
L<Encode>, L<Encode::CN>, L<encoding>, L<perluniintro>, L<perlunicode>
=head1 AUTHORS
Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>
Audrey Tang (唐凤) E<lt>audreyt@audreyt.orgE<gt>
=cut