| Revision history for Perl module Unicode::Collate. |
| |
| 0.89 Sat Mar 10 20:19:11 2012 |
| - avoid "use Test". |
| |
| 0.88 Mon Mar 5 21:56:13 2012 |
| - DUCET is updated (for Unicode 6.1.0) as Collate/allkeys.txt. |
| ! Please notice that allkeys.txt will be overwritten if you have had |
| other allkeys.txt already. |
| - U+9FCC is a new CJK unified ideograph. |
| - The default UCA_Version is 24. |
| - Locale/*.pl (except fr.pl) and CJK/Korean.pm are updated. |
| - modified tests: cjkrange.t, compatui.t, hangtype.t, loc_cjkc.t, |
| loc_es.t, loc_estr.t, overcjk0.t, overcjk1.t, version.t in t. |
| |
| 0.87 Sat Nov 26 17:01:42 2011 |
| - Now Locale/*.pl files are searched in @INC. (see [rt.cpan.org #72666]) |
| - added locale_version method to access the version number of Locale/*.pl. |
| |
| 0.86 Wed Nov 23 17:16:00 2011 |
| - tailored compatibility ideographs as well as unified ideographs for |
| the locales: ja, ko, zh__big5han, zh__gb2312han, zh__pinyin, zh__stroke. |
| - added loc_cjkc.t in t. |
| |
| 0.85 Sat Nov 19 20:01:57 2011 |
| - U::C::Locale newly supports locales: bn, sa. |
| - updated some locales to CLDR 2.0 : zh__pinyin, zh__stroke. |
| * supported compatibility decomposable characters and U+FDD0 indexes. |
| * updated CJK/Pinyin.pm and CJK/Stroke.pm. |
| - added loc_bn.t, loc_cjk.t, loc_sa.t in t. |
| |
| 0.84 Sun Nov 6 14:44:51 2011 |
| - U::C::Locale supports script codes. |
| - U::C::Locale newly supports locales: fa, sr_Latn, ur. |
| - added loc_fa.t, loc_srla.t, loc_ur.t in t. |
| |
| 0.83 Sun Oct 30 20:22:04 2011 |
| - mklocale: auto-generate equivalents for suppressed contractions. |
| * be.txt, bg.txt, kk.txt, mk.txt, ru.txt, sr.txt, uk.txt in data |
| are simplified. |
| * but no Locale/*.pl will be modified. |
| |
| 0.82 Sun Oct 30 10:03:48 2011 |
| - U::C::Locale newly supports locales: si, si__dictionary, |
| sv__reformed, ta, te, th, wae. |
| - added loc_si.t, loc_sidt.t, loc_svrf.t, loc_ta.t, loc_te.t, |
| loc_th.t, loc_wae.t in t. |
| - updated some locales to CLDR 2.0 : sk, sr, sv, uk. |
| - updated CJK/Pinyin.pm according to CLDR 2.0. |
| |
| 0.81 Sun Oct 23 21:32:36 2011 |
| - U::C::Locale newly supports locales: ml, mr, or, pa. |
| - added loc_ml.t, loc_mr.t, loc_or.t, loc_pa.t in t. |
| - updated some locales to CLDR 2.0 : mk, mt, nb, nn, ro, ru. |
| |
| 0.80 Sun Oct 9 21:00:21 2011 |
| - U::C::Locale newly supports locales: bs, hi, kn, kok, ln. |
| - added loc_bs.t, loc_hi.t, loc_kn.t, loc_kok.t, loc_ln.t in t. |
| - updated some locales to CLDR 2.0 : ha, hr, kk, lt. |
| |
| 0.79 Sun Oct 2 20:31:01 2011 |
| - pod: [rt.cpan.org #70241] Fix minor grammar error in manpage |
| by Harlan Lieberman-Berg. |
| - 'suppress' no longer affects contractions via 'entry'. |
| - U::C::Locale newly supports locales: as, fi__phonebook, gu. |
| - added loc_as.t, loc_fiph.t, loc_gu.t in t. |
| - updated some locales to CLDR 2.0 : ar, be, bg. |
| |
| 0.78 Mon Jul 25 21:29:50 2011 |
| - tried fixing the tarball with world writable files. |
| ( http://www.perlmonks.org/?node_id=731935 ) |
| |
| 0.77 Sun Jul 3 21:15:08 2011 |
| - xs: [perl #93470] [PATCH] consting in Collate.xs by Robin Barker. |
| |
| 0.76 Sun May 15 10:06:59 2011 |
| - updated CJK/Pinyin.pm and CJK/Stroke.pm according to CLDR 1.9.1. |
| (type='pinyin' alt='short' and type='stroke' alt='short' respectively) |
| |
| 0.75 Sat May 7 21:07:38 2011 |
| - supported ignore_level2 and rewrite. |
| - added iglevel2.t and rewrite.t in t. |
| |
| 0.74 Mon Mar 21 19:07:38 2011 |
| - removed sw (Swahili) collation according to CLDR 1.9. |
| (removed files: Collate/Locale/sw.pl and data/sw.txt) |
| - shifted primary weights of letters > Z for some languages. |
| (affected locales: da, fi, fo, kl, nb, nn, sv) |
| |
| 0.73 Sun Mar 6 13:24:22 2011 |
| - DUCET is updated (for Unicode 6.0.0) as Collate/allkeys.txt. |
| ! However no maint perl has supported Unicode 6.0.0 yet; |
| wait for 5.14, or try developing 5.13.7 or later. |
| ! Please notice that allkeys.txt will be overwritten if you have had |
| other allkeys.txt already. |
| - The default UCA_Version is 22. |
| - Locale/*.pl (except fr.pl and ko.pl) and CJK/Korean.pm are updated. |
| - test: compare allkeys.txt's version with Base_Unicode_Version |
| in t/default.t. |
| |
| 0.72 Sat Jan 22 17:28:32 2011 |
| - xs: fix mixing char* and U8*. |
| |
| 0.71 Tue Jan 18 22:29:44 2011 |
| - t/loc_test.t should not fail without Unicode::Normalize. |
| |
| 0.70 Sun Jan 16 20:31:07 2011 |
| - Now U::C::Locale->new will use the compiled DUCET via XS if available. |
| added some tests in t/loc_test.t. |
| |
| 0.69 Sat Jan 15 19:41:11 2011 |
| - clarified about XSUB. revised INSTALL in README. |
| - xs: flag passed to utf8n_to_uvuni(). |
| - doc and comments: [perl #81876] Fix typos by Peter J. Acklam. |
| |
| 0.68 Tue Nov 23 20:17:22 2010 |
| - doc: clarified about (backwards => [ ]) and (backwards => undef). |
| - separated t/backwds.t from t/test.t. |
| - added cjk_b5.t, cjk_gb.t, cjk_ja.t, cjk_ko.t, cjk_py.t, cjk_st.t in t |
| for CJK/*.pm without Locale.pm. |
| |
| 0.67 Sun Nov 14 11:38:59 2010 |
| - supported UCA_Version 22 for Unicode 6.0.0. |
| * 2B740..2B81D are new CJK unified ideographs. |
| * noncharacters (e.g. U+FFFF) should be overridable, not be ignored. |
| ! DUCET is NOT updated, as no maint perl supports Unicode 6.0.0. |
| Thus the default UCA_Version is still 20. |
| - added t/nonchar.t. |
| - improved discontiguous contractions of 3 or more characters. |
| (e.g. 0FB2 0F71 0F80 and 0FB3 0F71 0F80) |
| - auxiliary: now 'mklocale' also copes with Korean.pm according to DUCET. |
| |
| 0.66 Sun Nov 7 10:47:30 2010 |
| - U::C::Locale newly supports locale: ko. |
| - added Unicode::Collate::CJK::Korean for ko. |
| - added t/loc_ko.t. |
| - 12 compat. ideographs (e.g. U+FA0E) are treated as unified ideographs. |
| (though DUCET also does it, now Unicode::Collate does it without DUCET.) |
| - added t/compatui.t. |
| ! Ideographs Ext.B (U+20000..U+2A6D6) can be overridden with UCA_Version 8. |
| This is a long-standing behavior from Unicode::Collate 0.11 to 0.63. |
| A wrong fix at 0.64 should be abandoned. |
| |
| 0.65 Wed Nov 3 13:10:20 2010 |
| - U::C::Locale newly supports locale: zh and its some variants. |
| (zh__big5han, zh__gb2312han, zh__pinyin, zh__stroke) |
| - added Unicode::Collate::CJK::Big5 for zh__big5han. |
| - added Unicode::Collate::CJK::GB2312 for zh__gb2312han. |
| - added Unicode::Collate::CJK::Pinyin for zh__pinyin. |
| - added Unicode::Collate::CJK::Stroke for zh__stroke. |
| - added loc_zh.t, loc_zhb5.t, loc_zhgb.t, loc_zhpy.t, loc_zhst.t in t. |
| |
| 0.64 Sun Oct 31 14:17:29 2010 |
| - U::C::Locale newly supports locale: ja. |
| - added Unicode::Collate::CJK::JISX0208 for ja. |
| - added loc_ja.t, loc_jait.t, loc_japr.t in t. |
| - a subroutine specified in 'overrideCJK' or 'overrideHangul' is allowed |
| to return an integer or undef value. |
| - fix: Ideographs Ext.B (U+20000..U+2A6D6) are assigned in Unicode 3.1, |
| then 'overrideCJK' should not override them with UCA_Version 8. |
| !! sorry, this fix is based on a wrong idea. reverted at 0.66. !! |
| - separated t/overcjk0.t and t/overcjk1.t from t/override.t. |
| |
| 0.63 Sun Oct 10 22:13:21 2010 |
| - supported suppress contractions (see 'suppress' in POD). |
| - internal for 'hangul_terminator' in getSortKey(). |
| - U::C::Locale newly supports locales: be, bg, kk, mk, ru, sr. |
| - added loc_be.t, loc_bg.t, loc_cyrl.t, loc_kk.t, loc_mk.t, loc_ru.t, |
| loc_sr.t in t. |
| - added tailoring with U+0340 or U+0341 instead of U+0300 or U+0301. |
| (affected locales: hr, is, pl, se, to, wo) |
| |
| 0.62 Wed Oct 6 21:35:54 2010 |
| - U::C::Locale newly supports locales: ar, hu, hy, se, to, uk. |
| - added loc_ar.t, loc_hu.t, loc_hy.t, loc_se.t, loc_to.t, loc_uk.t in t. |
| - Vietnamese (vi): added tailoring for U+0340 and U+0341. |
| |
| 0.61 Sat Oct 2 11:41:29 2010 |
| - U::C::Locale newly supports locales: hr, ig, sq. |
| - added loc_hr.t, loc_ig.t, loc_sq.t in t. |
| - precomposed e-dot-below, o-dot-below, o-tilde are tailored as well. |
| (affected locales: et, yo) |
| - Vietnamese (vi): added contractions for non-blocked decompositions |
| * base + dot-below + mark such as a\x{323}\x{306}, \x{1EA1}\x{306} etc. |
| * base + tone + horn such as o\x{309}\x{31B}, \x{1ECF}\x{31B} etc. |
| |
| 0.60 Thu Sep 23 21:37:36 2010 |
| - bug fix: index() [and its friends including gmatch()] didn't remove |
| ignorable characters in the substring correctly. |
| Thanks for the bug report: |
| http://www.xray.mpe.mpg.de/mailing-lists/perl-unicode/2010-09/msg00014.html |
| |
| - U::C::Locale newly supports locales: de__phonebook, nso, om, tn, vi. |
| - added loc_de.t, loc_deph.t, loc_nso.t, loc_om.t, loc_tn.t, loc_vi.t in t. |
| - precomposed a-breve, a-circ, e-circ, o-circ are tailored as well. |
| (affected locales: ro, sk, sv) |
| |
| 0.59 Sun Sep 5 17:03:52 2010 |
| - U::C::Locale newly supports locales: az, fil, ha, lt, mt, tr, wo, yo. |
| - added loc_az.t, loc_fil.t, loc_ha.t, loc_lt.t, loc_mt.t, loc_tr.t, |
| loc_wo.t, loc_yo.t in t. |
| - precomposed a-uml, o-uml, and u-uml are tailored as well. |
| (affected locales: da, et, fi, fo, is, kl, nb, nn, sk, sv) |
| |
| 0.58 Sun Aug 29 19:56:50 2010 |
| - U::C::Locale newly supports locales: af, cy, da, fo, haw, is, kl, sw. |
| - added loc_af.t, loc_cy.t, loc_da.t, loc_fo.t, loc_haw.t, loc_is.t, |
| loc_kl.t, loc_sw.t in t. |
| |
| 0.57 Sun Aug 22 22:39:58 2010 |
| - U::C::Locale newly supports locales: ca, et, fi, lv, sk, sl. |
| - added loc_ca.t, loc_et.t, loc_fi.t, loc_lv.t, loc_sk.t, loc_sl.t in t. |
| |
| 0.56 Sun Aug 8 20:24:03 2010 |
| - Unicode::Collate::Locale newly supports locales: eo, nb, ro, sv. |
| - added loc_eo.t, loc_es.t, loc_estr.t, loc_nb.t, loc_ro.t, loc_sv.t in t. |
| ! renamed t/locale_{xy}.t to t/loc_{xy}.t (for safer 8.3 names) |
| (loc_cs.t, loc_fr.t, loc_nn.t, loc_pl.t, loc_test.t) |
| |
| 0.55 Sun Aug 1 21:21:23 2010 |
| - incorporated Unicode::Collate::Locale with some changes. see: |
| http://www.xray.mpe.mpg.de/mailing-lists/perl-unicode/2004-03/msg00030.html |
| - supported locales: cs, es, es__traditional, fr, nn, pl. |
| ! added t/locale*.t that uses DUCET. |
| (locale_cs.t, locale_fr.t, locale_nn.t, locale_pl.t, locale_test.t) |
| - data/*.txt and mklocale for preparation of Locale/*.pl from DUCET. |
| |
| 0.54 Sun Jul 25 21:37:04 2010 |
| - Now UCA Revision 20 (based on Unicode 5.2.0). |
| - DUCET is also updated (for Unicode 5.2.0) as Collate/allkeys.txt, |
| which *is required* to test this module. |
| ! Please notice that allkeys.txt will be overwritten if you have had |
| other allkeys.txt already. |
| - U+9FC4..U+9FCB and U+2A700..U+2B734 are new CJK unified ideographs. |
| - Many hangul jamo are assigned (affecting hangul_terminator). |
| |
| ! Now XSUB will be built by default. (XSUB needs a C compiler.) |
| To build pure perl, run disableXS before Makefile.PL. |
| ! DUCET will be compiled when XS is used. Explicit saying |
| <table => 'allkeys.txt'> (or using another table) will prevent |
| this module from using the compiled DUCET. |
| |
| ! added t/default.t that uses DUCET. |
| |
| 0.53 Sun Feb 14 20:46:27 2010 |
| - Now UCA Revision 18 (based on Unicode 5.1.0). |
| - DUCET is also updated (for Unicode 5.1.0) as Collate/allkeys.txt, |
| which is not required to test this module. |
| ! Please notice that allkeys.txt will be overwritten if you have had |
| other allkeys.txt already. |
| - U+9FBC..U+9FC3 are new CJK unified ideographs. |
| |
| 0.52 Thu Oct 13 21:51:09 2005 |
| - The Unicode::Collate->new method does not destroy user's $_ any longer. |
| (thanks to Jon Warbrick for bug report) |
| |
| 0.51 Sun May 29 20:21:19 2005 |
| - Added the latest DUCET (for Unicode 4.1.0) as Collate/allkeys.txt, |
| which is not required to test this module. |
| ! Please notice that allkeys.txt will be overwritten if you have had |
| other allkeys.txt already. |
| - Added INSTALL section in POD. |
| |
| 0.50 Sun May 8 20:26:39 2005 |
| - Now UCA Revision 14 (based on Unicode 4.1.0). |
| - Some tests are modified. |
| - Added cjkrange.t, ignor.t, override.t in t. |
| - Added META.yml. |
| |
| 0.40 Sat Apr 24 06:54:40 2004 |
| - Now a table file is searched in @INC. |
| |
| 0.33 Sat Dec 13 14:07:27 2003 |
| - documentation improvement: in "entry", "overrideHangul", etc. |
| |
| 0.32 Wed Dec 3 23:38:18 2003 |
| - A matching part from index(), match() etc. will include illegal |
| code points (as well as ignorable characters) following a grapheme. |
| - Contraction with illegal code point will be invalid. |
| - Added t/view.t. |
| - Added some tests in t/illegal.t. |
| - Separated t/altern.t and t/rearrang.t from t/test.t. |
| - modified XSUB internals. |
| |
| 0.31 Sun Nov 16 15:40:15 2003 |
| - Illegal code points (surrogate and noncharacter; they are definitely |
| ignorable) will be distinguished from NULL ("\0"); |
| but porting is not successful in the case of ((Pure Perl) and |
| (Perl 5.7.3 or before)). If perl 5.6.X is used, XSUB may help it |
| in place of broken CORE::unpack('U*') in older perl. |
| - added illegal.t and illegalp.t in t. |
| - added XSUB where some functions are implemented in XSUB. |
| Pure Perl is also supported. |
| |
| 0.30 Mon Oct 13 21:26:37 2003 |
| - fix: Completely ignorable in table should be able to be overridden |
| by non-ignorable in entry. |
| - fix: Maximum length for contraction must not be shortened |
| by a shorter contraction following in table and/or entry. |
| - added t/normal.t. |
| - some doc fixes |
| |
| 0.29 Mon Oct 13 12:18:23 2003 |
| - now UCA Version 11 (but no functionality is different from Version 9). |
| - supported hangul_terminator. |
| - fix: Base_Unicode_Version falsely returns Perl's Unicode version. |
| C4 in UTS #10 requires UTS's Unicode version. |
| - For variable weighting, 'variable' is recommended |
| and 'alternate' is deprecated. |
| - added version() method. |
| - added hangtype.t, trailwt.t, variable.t, and version.t in t. |
| |
| 0.28 Sat Sep 06 20:16:01 2003 |
| - Fixed another inconsistency under (normalization => undef): |
| Non-contiguous contraction is always neglected. |
| - Fixed: according to S2.1 in UTS #10, a blocked combining character |
| should not be contracted. One test in t/test.t was wrong, then removed. |
| - Added t/contract.t. |
| - (normalization => "prenormalized") is able to be used. |
| |
| 0.27 Sun Aug 31 22:23:17 2003 |
| some improvements: |
| - The maximum length of contracted CE was not checked (v0.22 to v0.26). |
| Collation of a large string including a first letter of a contraction |
| that is not a part of that contraction (say, 'c' of 'ca' |
| where 'ch' is defined) was too slow, inefficient. |
| - A form name for 'normalization', no longer restricted to |
| /^(?:NF)?K?[CD]\z/, will be allowed as long as |
| Unicode::Normalize::normalize() accepts it, since Unicode::Normalize |
| or UAX #15 may be changed/enhanced in future. |
| - When Hangul syllables are decomposed under <normalization => undef>, |
| contraction among jamo (LV, VT, LVT) derived from the same |
| Hangul syllable is allowed. |
| - Added t/hangul.t. |
| |
| 0.26 Sun Aug 03 22:23:17 2003 |
| - fix: an expansion in which a CE is level 3 ignorable and others are not |
| was wrongly made level 3 ignorable as a whole entry. |
| (In DUCET, some precomposed characters in Musical Symbols are so) |
| |
| 0.25 Mon Jun 06 23:20:17 2003 |
| - fix Makefile.PL. |
| - internal tweak (again): pack_U() and unpack_U(). |
| |
| 0.24 Thu Apr 02 23:12:54 2003 |
| - internal tweak for (?un)pack 'U'. |
| |
| 0.23 Wed Sep 04 19:25:20 2002 |
| - fix: scalar match() no longer returns an lvalue substr ref. |
| - fix: "Ignorable after variable" should be made level 3 ignorable |
| even if alternate => 'blanked'. |
| - Now a grapheme may contain trailing level 2, level 3, |
| and completely ignorable characters. |
| |
| 0.22 Mon Sep 02 23:15:14 2002 |
| - New File: t/index.t. |
| (The new t/test.t excludes tests for index.) |
| - tweak on index(). POSITION is supported. |
| - add match, gmatch, subst, gsubst methods. |
| - fix: ignorable after variable in 'shift'-variable weight. |
| |
| 0.21 Sat Aug 03 10:24:00 2002 |
| - upgrade keys.txt and t/test.t for UCA Version 9. |
| |
| 0.20 Fri Jul 26 02:15:25 2002 |
| - now UCA Version 9. |
| - U+FDD0..U+FDEF are new non-characters. |
| - fix: whitespace characters before @backwards etc. in a table file. |
| - now values for 'alternate', 'backwards', etc., |
| which are explicitly specified via new(), |
| are preferred to those specified in a table file. |
| |
| 0.12 Sun May 05 09:43:10 2002 |
| - add new methods, ->UCA_Version and ->Base_Unicode_Version. |
| - test fix: removed the needless requirement of Unicode::Normalize. |
| [reported by David Hand] |
| |
| 0.11 Fri May 03 02:28:10 2002 |
| - fix: now derived collation elements can be used for Hangul Jamo |
| when their weights are not defined. |
| [reported by Andreas J. Koenig] |
| - fix: rearrangements had not worked. |
| - mentioned pleblem on index() in BUGS. |
| - more documents, more tests. |
| - tag names for 'alternate' are case-insensitive (i.e. 'SHIFTed' etc.). |
| - The <undef> value for the keys "overrideCJK", "overrideHangul", |
| "rearrange" has a special behavior (different from default). |
| |
| 0.10 Tue Dec 11 23:26:42 2001 |
| - now you are allowed to use no table file. |
| - fix: fetching CE with two or more combining characters. |
| |
| 0.09 Sun Nov 11 17:02:40:18 2001 |
| - add the following methods: eq, ne, lt, le, gt, le. |
| - relies on &Unicode::Normalize::getCombinClass() |
| in place of %Unicode::Normalize::Combin |
| (the hash is not defined in the XS version of Unicode::Normalize). |
| then you should install Unicode::Normalize 0.10 or later. |
| - now independent of Lingua::KO::Hangul::Util |
| (this module does decomposition of Hangul syllables for itself) |
| |
| 0.08 Mon Aug 20 22:40:18 2001 |
| - add the index method. |
| |
| 0.07 Thu Aug 16 23:42:02 2001 |
| - rename the module name to Unicode::Collate. |
| |
| 0.06 Thu Aug 16 23:18:36 2001 |
| - add description of the getSortKey method. |
| |
| 0.05 Mon Aug 13 22:23:11 2001 |
| - bug fix: on the things of 4.2.1, UTR #10 |
| - getSortKey returns a string, but not an arrayref. |
| |
| 0.04 Mon Aug 13 22:23:11 2001 |
| - some bugs are fixed. |
| - some tailoring parameters are added. |
| |
| 0.03 Mon Aug 06 06:26:35 2001 |
| - modify README |
| |
| 0.02 Sun Aug 05 20:20:01 2001 |
| - some fix |
| |
| 0.01 Sun Jul 29 16:16:15 2001 |
| - original version; created by h2xs 1.21 |
| with options -A -X -n Sort::UCA |