
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
 "https://www.w3.org/TR/html4/loose.dtd">
 <html>
 <head>
   <meta name="generator" content=
   "HTML Tidy for HTML5 for Apple macOS version 5.6.0">
   <meta http-equiv="Content-Type" content=
   "text/html; charset=utf-8">
   <meta http-equiv="Content-Language" content="en-us">
   <link rel="stylesheet" href=
   "../reports.css" type="text/css">
   <title>UTS #35: Unicode Locale Data Markup Language</title>
   <style type="text/css">
   <!--
   .dtd {
         font-family: monospace;
         font-size: 90%;
         background-color: #CCCCFF;
         border-style: dotted;
         border-width: 1px;
   }

   .xmlExample {
         font-family: monospace;
         font-size: 80%
   }

   .blockedInherited {
         font-style: italic;
         font-weight: bold;
         border-style: dashed;
         border-width: 1px;
         background-color: #FF0000
   }

   .inherited {
         font-weight: bold;
         border-style: dashed;
         border-width: 1px;
         background-color: #00FF00
   }

   .element {
         font-weight: bold;
         color: red;
   }

   .attribute {
         font-weight: bold;
         color: maroon;
   }

   .attributeValue {
         font-weight: bold;
         color: blue;
   }

   li, p {
         margin-top: 0.5em;
         margin-bottom: 0.5em
   }

   h2, h3, h4, h5, table {
         margin-top: 1.5em;
         margin-bottom: 0.5em;
   }

   h5 {
         font-size: medium;
         font-style: italic
   }
   -->
   </style>
 </head>
 <body>
   <table class="header" width="100%">
     <tr>
       <td class="icon"><a href="https://unicode.org"><img alt=
       "[Unicode]" src="../logo60s2.gif"
       width="34" height="33" style=
       "vertical-align: middle; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px; border-top-width: 0px;"></a>&nbsp;
       <a class="bar" href=
       "https://www.unicode.org/reports/">Technical Reports</a></td>
     </tr>
     <tr>
       <td class="gray">&nbsp;</td>
     </tr>
   </table>
   <div class="body">
     <h2 style="text-align: center">Unicode Technical Standard #35</h2>
     <h1>Unicode Locale Data Markup Language (LDML)</h1>
     <!-- At least the first row of this header table should be identical across the parts of this UTS. -->
     <table border="1" cellpadding="2" cellspacing="0" class="wide">
       <tr>
         <td>Version</td>
         <td>38</td>
       </tr>
       <tr>
         <td>Editors</td>
         <td>Mark Davis (<a href="mailto:markdavis@google.com">markdavis@google.com</a>) and
         <a href="tr35.html#Acknowledgments">other CLDR committee members</a></td>
       </tr>
       <tr>
         <td>Date</td>
         <td>2020-10-23</td>
       </tr>
       <tr>
         <!-- This link must be made live when posting the final version but is disabled during proposed update stage. -->
         <td>This Version</td>
         <td>
 		<a href="https://www.unicode.org/reports/tr35/tr35-61/tr35.html">
 		https://www.unicode.org/reports/tr35/tr35-61/tr35.html</a></td>
       </tr>
       <tr>
         <td>Previous Version</td>
         <td>
 		<a href="https://www.unicode.org/reports/tr35/tr35-60/tr35.html">
 		https://www.unicode.org/reports/tr35/tr35-60/tr35.html</a></td>
       </tr>
       <tr>
         <td>Latest Version</td>
         <td><a href=
         "https://www.unicode.org/reports/tr35/">https://www.unicode.org/reports/tr35/</a></td>
       </tr>
       <tr>
         <td>Corrigenda</td>
         <td><a href=
         "http://unicode.org/cldr/corrigenda.html">http://unicode.org/cldr/corrigenda.html</a></td>
       </tr>
       <tr>
         <td>Latest Proposed Update</td>
         <td><a href=
         "https://www.unicode.org/reports/tr35/proposed.html">https://www.unicode.org/reports/tr35/proposed.html</a></td>
       </tr>
       <tr>
         <td>Namespace</td>
         <td><a href=
         "https://unicode.org/cldr/">https://unicode.org/cldr/</a></td>
       </tr>
       <tr>
         <td>DTDs</td>
         <td><a href="https://github.com/unicode-org/cldr/tree/maint/maint-38/common/dtd">
 		http://unicode.org/cldr/dtd/38/</a></td>
       </tr>
       <tr>
         <td>Revision</td>
         <td><a href="#Modifications">61</a></td>
       </tr>
     </table>
     <h3><i>Summary</i></h3>
     <p>This document describes an XML format (<i>vocabulary</i>)
     for the exchange of structured locale data. This format is used
     in the <a href="https://unicode.org/cldr/">Unicode Common Locale
     Data Repository</a>.</p>
     <h3><i>Status</i></h3>

     <!-- NOT YET APPROVED
                 <p>
                                 <i class="changed">This is a<b><font color="#ff3333">
                                 draft </font></b>document which may be updated, replaced, or superseded by
                                 other documents at any time. Publication does not imply endorsement
                                 by the Unicode Consortium. This is not a stable document; it is
                                 inappropriate to cite this document as other than a work in
                                 progress.
                         </i>
                 </p>
      END NOT YET APPROVED -->
     <!-- APPROVED -->
     <p><i>This document has been reviewed by Unicode members and
     other interested parties, and has been approved for publication
     by the Unicode Consortium. This is a stable document and may be
     used as reference material or cited as a normative reference by
     other specifications.</i></p>
     <!-- END APPROVED -->

     <blockquote>
       <p><i><b>A Unicode Technical Standard (UTS)</b> is an
       independent specification. Conformance to the Unicode
       Standard does not imply conformance to any UTS.</i></p>
     </blockquote>
     <p><i>Please submit corrigenda and other comments with the CLDR
     bug reporting form [<a href=
     "http://cldr.unicode.org/index/bug-reports">Bugs</a>]. Related
     information that is useful in understanding this document is
     found in the <a href="#References">References</a>. For the
     latest version of the Unicode Standard see [<a href=
     "https://www.unicode.org/versions/latest/">Unicode</a>]. For a
     list of current Unicode Technical Reports see [<a href=
     "https://www.unicode.org/reports/">Reports</a>]. For more
     information about versions of the Unicode Standard, see
     [<a href=
     "https://www.unicode.org/versions/">Versions</a>].</i></p><!-- This section of Parts should be identical in all of the parts of this UTS. -->
     <h2><a name="Parts" href="#Parts" id="Parts">Parts</a></h2>
     <p>The LDML specification is divided into the following
     parts:</p>
     <ul class="toc">
       <li>Part 1: <a href="tr35.html#Contents">Core</a> (languages,
       locales, basic structure)</li>
       <li>Part 2: <a href="tr35-general.html#Contents">General</a>
       (display names &amp; transforms, etc.)</li>
       <li>Part 3: <a href="tr35-numbers.html#Contents">Numbers</a>
       (number &amp; currency formatting)</li>
       <li>Part 4: <a href="tr35-dates.html#Contents">Dates</a>
       (date, time, time zone formatting)</li>
       <li>Part 5: <a href=
       "tr35-collation.html#Contents">Collation</a> (sorting,
       searching, grouping)</li>
       <li>Part 6: <a href=
       "tr35-info.html#Contents">Supplemental</a> (supplemental
       data)</li>
       <li>Part 7: <a href=
       "tr35-keyboards.html#Contents">Keyboards</a> (keyboard
       mappings)</li>
     </ul>
     <h2><a name="Contents" href="#Contents" id="Contents">Contents
     of Part 1, Core</a></h2>
     <!-- START Generated TOC: CheckHtmlFiles -->
     <ul class="toc">
       <li>1 <a href="#Introduction">Introduction</a>
         <ul class="toc">
           <li>1.1 <a href="#Conformance">Conformance</a></li>
         </ul>
       </li>
       <li>2 <a href="#Locale">What is a Locale?</a></li>
       <li>3 <a href="#Identifiers">Unicode Language and Locale
       Identifiers</a>
         <ul class="toc">
           <li>3.1 <a href="#Unicode_language_identifier">Unicode
           Language Identifier</a></li>
           <li>3.2 <a href="#Unicode_locale_identifier">Unicode
           Locale Identifier</a>
             <ul class='toc'>
               <li>3.2.1 <a href="#Canonical_Unicode_Locale_Identifiers">Canonical Unicode Locale Identifiers</a></li>
             </ul>
           </li>
           <li>3.3 <a href="#BCP_47_Conformance">BCP 47
           Conformance</a>
             <ul class="toc">
               <li>3.3.1 <a href=
               "#BCP_47_Language_Tag_Conversion">BCP 47 Language Tag
               Conversion</a></li>
             </ul>
           </li>
           <li>3.4 <a href="#Field_Definitions">Language Identifier
           Field Definitions</a>
             <ul class="toc">
               <li>Table: <a href=
               "#Language_Locale_Field_Definitions">Language
               Identifier Field Definitions</a></li>
             </ul>
           </li>
           <li>3.5 <a href="#Special_Codes">Special Codes</a>
             <ul class="toc">
               <li>3.5.1 <a href=
               "#Unknown_or_Invalid_Identifiers">Unknown or Invalid
               Identifiers</a></li>
               <li>3.5.2 <a href="#Numeric_Codes">Numeric
               Codes</a></li>
               <li>3.5.3 <a href="#Private_Use_Codes">Private Use
               Codes</a>
                 <ul class="toc">
                   <li>Table: <a href="#Private_Use_CLDR">Private
                   Use Codes in CLDR</a></li>
                 </ul>
               </li>
             </ul>
           </li>
           <li>3.6 <a href=
           "#Locale_Extension_Key_and_Type_Data">Unicode BCP 47 U
           Extension</a>
             <ul class="toc">
               <li>3.6.1 <a href="#Key_And_Type_Definitions_">Key
               And Type Definitions</a>
                 <ul class="toc">
                   <li>Table: <a href=
                   "#Key_Type_Definitions">Key/Type
                   Definitions</a></li>
                 </ul>
               </li>
               <li>3.6.2 <a href=
               "#Numbering%20System%20Data">Numbering System
               Data</a></li>
               <li>3.6.3 <a href="#Time_Zone_Identifiers">Time Zone
               Identifiers</a></li>
               <li>3.6.4 <a href=
               "#Unicode_Locale_Extension_Data_Files">U Extension
               Data Files</a></li>
               <li>3.6.5 <a href=
               "#Unicode_Subdivision_Codes">Subdivision Codes</a>
                 <ul class="toc">
                   <li>3.6.5.1 <a href="#Validity">Validity</a></li>
                 </ul>
               </li>
             </ul>
           </li>
           <li>3.7 <a href="#t_Extension">Unicode BCP 47 T
           Extension</a>
             <ul class="toc">
               <li>3.7.1 <a href="#Transformed_Content_Data_File">T
               Extension Data Files</a></li>
             </ul>
           </li>
           <li>3.8 <a href="#Compatibility_with_Older_Identifiers">
             Compatibility with Older Identifiers</a>
             <ul class="toc">
               <li>3.8.1 <a href="#Old_Locale_Extension_Syntax">Old
               Locale Extension Syntax</a>
                 <ul class="toc">
                   <li>Table: <a href=
                   "#Locale_Extension_Mappings">Locale Extension
                   Mappings</a></li>
                 </ul>
               </li>
               <li>3.8.2 <a href="#Legacy_Variants">Legacy
               Variants</a>
                 <ul class="toc">
                   <li>Table: <a href=
                   "#Legacy_Variant_Mappings">Legacy Variant
                   Mappings</a></li>
                 </ul>
               </li>
               <li>3.8.3 <a href="#Relation_to_OpenI18n">Relation to
               OpenI18n</a></li>
             </ul>
           </li>
           <li>3.9 <a href=
           "#Transmitting_Locale_Information">Transmitting Locale
           Information</a>
             <ul class="toc">
               <li>3.9.1 <a href=
               "#Message_Formatting_and_Exceptions">Message
               Formatting and Exceptions</a></li>
             </ul>
           </li>
           <li>3.10 <a href="#Language_and_Locale_IDs">Unicode
           Language and Locale IDs</a>
             <ul class="toc">
               <li>3.10.1 <a href="#Written_Language">Written
               Language</a></li>
               <li>3.10.2 <a href="#Hybrid_Locale">Hybrid Locale
               Identifiers</a></li>
             </ul>
           </li>
           <li>3.11 <a href="#Validity_Data">Validity Data</a></li>
         </ul>
       </li>
       <li>4 <a href="#Locale_Inheritance">Locale Inheritance and
       Matching</a>
         <ul class="toc">
           <li>4.1 <a href="#Lookup">Lookup</a>
             <ul class="toc">
               <li>4.1.1 <a href="#Bundle_vs_Item_Lookup">Bundle vs
               Item Lookup</a>
                 <ul class="toc">
                   <li>Table: <a href="#Lookup-Differences">Lookup
                   Differences</a></li>
                 </ul>
               </li>
               <li>4.1.2 <a href="#Multiple_Inheritance">Lateral
               Inheritance</a>
                 <ul class="toc">
                   <li>Table: <a href="#Count_Fallback_normal">Count
                   Fallback: normal</a></li>
                   <li>Table: <a href=
                   "#Count_Fallback_currency">Count Fallback:
                   currency</a></li>
                 </ul>
               </li>
               <li>4.1.3 <a href="#Parent_Locales">Parent
               Locales</a></li>
             </ul>
           </li>
           <li>4.2 <a href="#Inheritance_and_Validity">Inheritance
           and Validity</a>
             <ul class="toc">
               <li>4.2.1 <a href="#Definitions">Definitions</a></li>
               <li>4.2.2 <a href="#Resolved_Data_File">Resolved Data
               File</a></li>
               <li>4.2.3 <a href="#Valid_Data">Valid Data</a></li>
               <li>4.2.4 <a href=
               "#Checking_for_Draft_Status">Checking for Draft
               Status</a></li>
               <li>4.2.5 <a href=
               "#Keyword_and_Default_Resolution">Keyword and Default
               Resolution</a></li>
               <li>4.2.6 <a href=
               "#Inheritance_vs_Related">Inheritance vs Related
               Information</a></li>
             </ul>
           </li>
           <li>4.3 <a href="#Likely_Subtags">Likely Subtags</a></li>
           <li>4.4 <a href="#LanguageMatching">Language Matching</a>
             <ul class='toc'>
               <li>4.4.1 <a href=
               "#EnhancedLanguageMatching">Enhanced Language
               Matching</a></li>
             </ul>
           </li>
         </ul>
       </li>
       <li>5 <a href="#XML_Format">XML Format</a>
         <ul class="toc">
           <li>5.1 <a href="#Common_Elements">Common Elements</a>
             <ul class="toc">
               <li>5.1.1 <a href="#special">Element special</a>
                 <ul class="toc">
                   <li>5.1.1.1 <a href=
                   "#Sample_Special_Elements">Sample Special
                   Elements</a></li>
                 </ul>
               </li>
               <li>5.1.2 <a href="#Alias_Elements">Element alias</a>
                 <ul class="toc">
                   <li>Table: <a href=
                   "#Inheritance_with_source_locale_">Inheritance
                   with source="locale"</a></li>
                 </ul>
               </li>
               <li>5.1.3 <a href="#Element_displayName">Element
               displayName</a></li>
               <li>5.1.4 <a href="#Escaping_Characters">Escaping
               Characters</a></li>
             </ul>
           </li>
           <li>5.2 <a href="#Common_Attributes">Common
           Attributes</a>
             <ul class="toc">
               <li>5.2.1 <a href="#Attribute_type">Attribute
               type</a></li>
               <li>5.2.2 <a href="#Attribute_draft">Attribute
               draft</a></li>
               <li>5.2.3 <a href="#alt_attribute">Attribute
               alt</a></li>
             </ul>
           </li>
           <li>5.3 <a href="#Common_Structures">Common
           Structures</a>
             <ul class="toc">
               <li>5.3.1 <a href="#Date_Ranges">Date and Date
               Ranges</a></li>
               <li>5.3.2 <a href="#Text_Directionality">Text
               Directionality</a></li>
               <li>5.3.3 <a href="#Unicode_Sets">Unicode Sets</a>
                 <ul class="toc">
                   <li>5.3.3.1 <a href="#Lists_of_Code_Points">Lists
                   of Code Points</a></li>
                   <li>5.3.3.2 <a href="#Unicode_Properties">Unicode
                   Properties</a></li>
                   <li>5.3.3.3 <a href="#Boolean_Operations">Boolean
                   Operations</a></li>
                   <li>5.3.3.4 <a href=
                   "#UnicodeSet_Examples">UnicodeSet
                   Examples</a></li>
                 </ul>
               </li>
               <li>5.3.4 <a href="#String_Range">String
               Range</a></li>
             </ul>
           </li>
           <li>5.4 <a href="#Identity_Elements">Identity
           Elements</a></li>
           <li>5.5 <a href="#Valid_Attribute_Values">Valid Attribute
           Values</a></li>
           <li>5.6 <a href="#Canonical_Form">Canonical Form</a>
             <ul class="toc">
               <li>5.6.1 <a href="#Content">Content</a></li>
               <li>5.6.2 <a href="#Ordering">Ordering</a></li>
               <li>5.6.3 <a href="#Comments">Comments</a></li>
             </ul>
           </li>
           <li>5.7 <a href="#DTD_Annotations">DTD
           Annotations</a>
             <ul class='toc'>
               <li>5.7.1 <a href="#match_expressions" >Attribute Value Constraints</a></li>
             </ul>
           </li>
         </ul>
       </li>
       <li>6 <a href="#Property_Data">Property Data</a>
         <ul class="toc">
           <li>6.1 <a href="#Script_Metadata">Script
           Metadata</a></li>
           <li>6.2 <a href="#Extended_Pictographic">Extended
           Pictographic</a></li>
           <li>6.3 <a href="#Labels.txt">Labels.txt</a></li>
           <li>6.4 <a href="#Segmentation_Tests">Segmentation Tests</a></li>
         </ul>
       </li>
       <li>7 <a href="#Format_Parse_Issues">Issues in Formatting and
       Parsing</a>
         <ul class="toc">
           <li>7.1 <a href="#Lenient_Parsing">Lenient Parsing</a>
             <ul class="toc">
               <li>7.1.1 <a href="#Motivation">Motivation</a></li>
               <li>7.1.2 <a href="#Loose_Matching">Loose
               Matching</a></li>
             </ul>
           </li>
           <li>7.2 <a href="#Invalid_Patterns">Handling Invalid
           Patterns</a></li>
         </ul>
       </li>
       <li>Annex A <a href="#Deprecated_Structure">Deprecated
       Structure</a>
         <ul class="toc">
           <li>A.1 <a href="#Fallback_Elements">Element
           fallback</a></li>
           <li>A.2 <a href="#BCP47_Keyword_Mapping">BCP 47 Keyword
           Mapping</a></li>
           <li>A.3 <a href="#Choice_Patterns">Choice
           Patterns</a></li>
           <li>A.4 <a href="#Element_default">Element
           default</a></li>
           <li>A.5 <a href=
           "#Deprecated_Common_Attributes">Deprecated Common
           Attributes</a>
             <ul>
               <li>A.5.1 <a href="#Attribute_standard">Attribute
               standard</a></li>
               <li>A.5.2 <a href=
               "#Attribute_draft_nonLeaf">Attribute draft in
               non-leaf elements</a></li>
             </ul>
           </li>
           <li>A.6 <a href="#Element_base">Element base</a></li>
           <li>A.7 <a href="#Element_rules">Element rules</a></li>
           <li>A.8 <a href=
           "#Deprecated_subelements_of_dates">Deprecated subelements
           of &lt;dates&gt;</a></li>
           <li>A.9 <a href=
           "#Deprecated_subelements_of_calendars">Deprecated
           subelements of &lt;calendars&gt;</a></li>
           <li>A.10 <a href=
           "#Deprecated_subelements_of_timeZoneNames">Deprecated
           subelements of &lt;timeZoneNames&gt;</a></li>
           <li>A.11 <a href=
           "#Deprecated_subelements_of_zone_metazone">Deprecated
           subelements of &lt;zone&gt; and &lt;metazone&gt;</a></li>
           <li>A.12 <a href=
           "#Renamed_attribute_values_for_contextTransformUsage">Renamed
           attribute values for &lt;contextTransformUsage&gt;
           element</a></li>
           <li>A.13 <a href=
           "#Deprecated_subelements_of_segmentations">Deprecated
           subelements of &lt;segmentations&gt;</a></li>
           <li>A.14 <a href="#Element_cp">Element cp</a></li>
           <li>A.15 <a href="#validSubLocales">Attribute
           validSubLocales</a></li>
           <li>A.16 <a href="#postCodeElements">Elements
           postalCodeData, postCodeRegex</a></li>
           <li>A.17 <a href="#telephoneCodeData">Element
           telephoneCodeData</a></li>
         </ul>
       </li>
       <li>Annex B <a href="#Links_to_Other_Parts">Links to Other
       Parts</a>
         <ul class="toc">
           <li>Table: <a href="#Part_2_Links">Part 2 Links: General
           (display names &amp; transforms, etc.)</a></li>
           <li>Table: <a href="#Part_3_Links">Part 3 Links: Numbers
           (number &amp; currency formatting)</a></li>
           <li>Table: <a href="#Part_4_Links">Part 4 Links: Dates
           (date, time, time zone formatting)</a></li>
           <li>Table: <a href="#Part_5_Links">Part 5 Links:
           Collation (sorting, searching, grouping)</a></li>
           <li>Table: <a href="#Part_6_Links">Part 6 Links:
           Supplemental (supplemental data)</a></li>
           <li>Table: <a href="#Part_7_Links">Part 7 Links:
           Keyboards (keyboard mappings)</a></li>
         </ul>
       </li>
       <li>Annex C <a href="#LocaleId_Canonicalization">LocaleId Canonicalization</a></li>
       <li><a href="#References">References</a></li>
       <li><a href="#Acknowledgments">Acknowledgments</a></li>
       <li><a href="#Modifications">Modifications</a></li>
     </ul><!-- END Generated TOC: CheckHtmlFiles -->
     <h2><a name="Introduction" href="#Introduction" id=
     "Introduction">1 Introduction</a></h2>
     <p>Not long ago, computer systems were like separate worlds,
     isolated from one another. The internet and related events have
     changed all that. A single system can be built of many
     different components, hardware and software, all needing to
     work together. Many different technologies have been important
     in bridging the gaps; in the internationalization arena,
     Unicode has provided a lingua franca for communicating textual
     data. However, there remain differences in the locale data used
     by different systems.</p>
     <p>The best practice for internationalization is to store and
     communicate language-neutral data, and format that data for the
     client. This formatting can take place on any of a number of
     the components in a system; a server might format data based on
     the user's locale, or it could be that a client machine does
     the formatting. The same goes for parsing data, and
     locale-sensitive analysis of data.</p>
     <p>But there remain significant differences across systems and
     applications in the locale-sensitive data used for such
     formatting, parsing, and analysis. Many of those differences
     are simply gratuitous; all within acceptable limits for human
     beings, but yielding different results. In many other cases
     there are outright errors. Whatever the cause, the differences
     can cause discrepancies to creep into a heterogeneous system.
     This is especially serious in the case of collation
     (sort-order), where different collations cause not only
     ordering differences, but also different results of queries!
     That is, with a query of customers with names between "Abbot,
     Cosmo" and "Arnold, James", if different systems have different
     sort orders, different lists will be returned. (For comparisons
     across systems formatted as HTML tables, see [<a href=
     "#Comparisons">Comparisons</a>].)</p>
     <blockquote>
       <p class="note"><b>Note:</b> There are many different equally
       valid ways in which data can be judged to be "correct" for a
       particular locale. The goal for the common locale data is to
       make it as consistent as possible with existing locale data,
       and acceptable to users in that locale.</p>
     </blockquote>
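     <p>The range-query discrepancy described above can be
     reproduced in a few lines. The following is a minimal sketch,
     not part of LDML: the customer names are hypothetical, and the
     two sort keys (plain code point order and case-insensitive
     order) merely stand in for the differing collation rules of two
     systems.</p>

```python
# Hypothetical customer names (not from any real data set).
NAMES = ["Abbot, Cosmo", "abbott, Lou", "adams, Amy", "Arnold, James", "Baker, Tom"]

def code_point_key(name):
    """System A (assumed): compare strings by raw code point order."""
    return name

def case_insensitive_key(name):
    """System B (assumed): ignore case differences when comparing."""
    return name.casefold()

def names_between(names, low, high, key):
    """Return the names that sort between low and high (inclusive) under key."""
    lo, hi = key(low), key(high)
    return sorted((n for n in names if hi >= key(n) >= lo), key=key)

result_a = names_between(NAMES, "Abbot, Cosmo", "Arnold, James", code_point_key)
result_b = names_between(NAMES, "Abbot, Cosmo", "Arnold, James", case_insensitive_key)
print(result_a)
print(result_b)
```

     <p>Under these assumptions the first system returns two
     customers and the second returns four, although both ran the
     same query over the same data.</p>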
     <p>This document specifies an XML format for the communication
     of locale data: the Unicode Locale Data Markup Language (LDML).
     This provides a common format for systems to interchange locale
     data so that they can get the same results in the services
     provided by internationalization libraries. It also provides a
     standard format that can allow users to customize the behavior
     of a system. With it, for example, collation (sorting) rules
     can be exchanged, allowing two implementations to exchange a
     specification of tailored collation rules. Using the same
     specification, the two implementations will achieve the same
     results in comparing strings. Unicode LDML can also be used to
     let a user encapsulate specialized sorting behavior for a
     specific domain, or create a customized locale for a minority
     language. Unicode LDML is also used in the Unicode Common
     Locale Data Repository (CLDR). CLDR uses an open process for
     reconciling differences between the locale data used on
     different systems and validating the data, to produce a
     useful, common, consistent base of locale data.</p>
     <p>For more information, see the Common Locale Data Repository
     project page [<a href="#localeProject">LocaleProject</a>].</p>
     <p>As LDML is an interchange format, it was designed for ease
     of maintenance and simplicity of transformation into other
     formats, above efficiency of run-time lookup and use.
     Implementations should consider converting LDML data into a
     more compact format prior to use.</p>
     <h3><a name="Conformance" href="#Conformance" id=
     "Conformance">1.1 Conformance</a></h3>
     <p>There are many ways to use the Unicode LDML format and the
     data in CLDR, and the Unicode Consortium does not restrict the
     ways in which the format or data are used. However, an
     implementation may also claim conformance to LDML or to CLDR,
     as follows:</p>
     <p>&nbsp;</p>
     <p><i><b>UAX35-C1.</b></i> An implementation that claims
     conformance to this specification shall:</p>
     <ol>
       <li>Identify the sections of the specification that it
       conforms to.
         <ul>
           <li>For example, an implementation might claim
           conformance to all LDML features except for
           <i>transforms</i> and <i>segments</i>.</li>
         </ul>
       </li>
       <li>Interpret the relevant elements and attributes of LDML
       documents in accordance with the descriptions in those
       sections.
         <ul>
           <li>For example, an implementation that claims
           conformance to the date format patterns must interpret
           the characters in such patterns according to <a href=
           "tr35-dates.html#Date_Field_Symbol_Table">Date Field
           Symbol Table</a>.</li>
         </ul>
       </li>
       <li>Declare which types of CLDR data it uses.
         <ul>
           <li>For example, an implementation might declare that it
           only uses language names, and those with a <i>draft</i>
           status of <i>contributed</i> or <i>approved</i>.</li>
         </ul>
       </li>
     </ol>
     <p><i><b>UAX35-C2.</b></i> An implementation that claims
     conformance to Unicode locale or language identifiers
     shall:</p>
     <ol>
       <li>Specify whether Unicode locale extensions are
       allowed.</li>
       <li>Specify the canonical form used for identifiers in terms
       of casing and field separator characters.</li>
     </ol>
     <p>External specifications may also reference particular
     components of Unicode locale or language identifiers, such
     as:</p>
     <blockquote>
       <p><i>Field X can contain any Unicode region subtag values as
       given in Unicode Technical Standard #35: Unicode Locale Data
       Markup Language (LDML), excluding grouping codes.</i></p>
     </blockquote>
     <h2><a name="Locale" href="#Locale" id="Locale">2 What is a
     Locale?</a></h2>
     <p>Before diving into the XML structure, it is helpful to
     describe the model behind the structure. People do not have to
     subscribe to this model to use data in LDML, but they do need
     to understand it so that the data can be correctly translated
     into whatever model their implementation uses.</p>
     <p>The first issue is basic: <i>what is a locale?</i> In this
     model, a locale is an identifier (id) that refers to a set of
     user preferences that tend to be shared across significant
     swaths of the world. Traditionally, the data associated with
     this id provides support for formatting and parsing of dates,
     times, numbers, and currencies; for measurement units; for
     sort-order (collation); and for translated names of time zones,
     languages, countries, and scripts. The data can also include
     support for text boundaries (character, word, line, and
     sentence), text transformations (including transliterations),
     and other services.</p>
     <p>Locale data is not cast in stone: the data used on someone's
     machine generally may reflect the US format, for example, but
     preferences can typically be set to override particular items,
     such as setting the date format for 2002.03.15, or using metric
     or Imperial measurement units. In the abstract, locales are
     simply one of many sets of preferences that, say, a website may
     want to remember for a particular user. Depending on the
     application, it may want to also remember the user's time zone,
     preferred currency, preferred character set, smoker/non-smoker
     preference, meal preference (vegetarian, kosher, and so on),
     music preference, religion, party affiliation, favorite
     charity, and so on.</p>
     <p>Locale data in a system may also change over time: country
     boundaries change; governments (and currencies) come and go;
     committees impose new standards; bugs are found and fixed in
     the source data; and so on. Thus the data needs to be versioned
     for stability over time.</p>
     <p>In general terms, the locale id is a parameter that is
     supplied to a particular service (date formatting, sorting,
     spell-checking, and so on). The format in this document does
     not attempt to represent all the data that could conceivably be
     used by all possible services. Instead, it collects together
     data that is in common use in systems and internationalization
     libraries for basic services. The main difference among locales
     is in terms of language; there may also be some differences
     according to different countries or regions. However, the line
     between <i>locales</i> and <i>languages</i>, as commonly used
     in the industry, is rather fuzzy. Note also that the vast
     majority of the locale data in CLDR is in fact language data;
     all non-linguistic data is separated out into a separate tree.
     For more information, see <i><a href=
     "#Language_and_Locale_IDs">Section 3.10 Language and Locale
     IDs</a></i>.</p>
     <p>We will speak of data as being "in locale X". That does not
     imply that a locale <i>is</i> a collection of data; it is
     simply shorthand for "the set of data associated with the
     locale id X". Each individual piece of data is called a
     <i>resource</i> or <i>field</i>, and a tag indicating the key
     of the resource is called a <i>resource tag.</i></p>
     <h2><a name="Identifiers" href="#Identifiers" id=
     "Identifiers"></a> <a name=
     "Unicode_Language_and_Locale_Identifiers" href=
     "#Unicode_Language_and_Locale_Identifiers" id=
     "Unicode_Language_and_Locale_Identifiers">3 Unicode Language
     and Locale Identifiers</a></h2>
     <p>Unicode LDML uses stable identifiers based on [<a href=
     "#BCP47">BCP47</a>] for distinguishing among languages,
     locales, regions, currencies, time zones, transforms, and so
     on. There are many systems for identifiers for these entities.
     The Unicode LDML identifiers may not match the identifiers used
     on a particular target system. If so, some process of
     identifier translation may be required when using LDML
     data.</p>
     <p>The BCP 47 extensions (-u- and -t-) are described in
     <em>Section 3.6 <a href="#u_Extension">Unicode BCP 47 U
     Extension</a></em> and <em>Section 3.7 <a href=
     "#BCP47_T_Extension">Unicode BCP 47 T Extension</a></em>.</p>
     <h3><i><a name="Unicode_language_identifier" href=
     "#Unicode_language_identifier" id=
     "Unicode_language_identifier">3.1 Unicode Language
     Identifier</a></i></h3>
     <p>A <i>Unicode language identifier</i> has the following
 		structure (provided in EBNF (Perl-based)). The following table defines
     syntactically well-formed identifiers: they are not necessarily
     valid identifiers. For additional validity criteria, see the
     links on the right.</p>
     <table>
       <tr>
         <th>&nbsp;</th>
         <th>
           <div align="center">
             EBNF
           </div>
         </th>
         <th>
           <div align="center">
             Validity / Comments
           </div>
         </th>
       </tr>
       <tr>
         <td><code><a href="#unicode_language_id" name=
         "unicode_language_id" id=
         "unicode_language_id">unicode_language_id</a></code></td>
         <td><code>= "root"<br>
         | (unicode_language_subtag<br>
         &nbsp; &nbsp; (sep unicode_script_subtag)?<br>
         &nbsp; | unicode_script_subtag)<br>
         &nbsp; (sep unicode_region_subtag)?<br>
         &nbsp; (sep unicode_variant_subtag)* ;</code></td>
         <td>"root" is treated as a special
         <code>unicode_language_subtag</code></td>
       </tr>
       <tr>
         <td><code><a href="#unicode_language_subtag" name=
         "unicode_language_subtag" id=
         "unicode_language_subtag">unicode_language_subtag</a></code></td>
         <td><code>= alpha{2,3} | alpha{5,8};</code></td>
         <td><code><a href=
         '#unicode_language_subtag_validity'>validity</a><br>
         <a href=
         'https://github.com/unicode-org/cldr/blob/maint/maint-38/common/validity/language.xml'>
         latest-data</a></code></td>
       </tr>
       <tr>
         <td><code><a href="#unicode_script_subtag" name=
         "unicode_script_subtag" id=
         "unicode_script_subtag">unicode_script_subtag</a></code></td>
         <td><code>= alpha{4} ;</code></td>
         <td><code><a href=
         '#unicode_script_subtag_validity'>validity</a><br>
         <a href=
         'https://github.com/unicode-org/cldr/blob/maint/maint-38/common/validity/script.xml'>
         latest-data</a></code></td>
       </tr>
       <tr>
         <td><code><a href="#unicode_region_subtag" name=
         "unicode_region_subtag" id=
         "unicode_region_subtag">unicode_region_subtag</a></code></td>
         <td><code>= (alpha{2} | digit{3}) ;</code></td>
         <td><code><a href=
         '#unicode_region_subtag_validity'>validity</a><br>
         <a href=
         'https://github.com/unicode-org/cldr/blob/maint/maint-38/common/validity/region.xml'>
         latest-data</a></code></td>
       </tr>
       <tr>
         <td><code><a href="#unicode_variant_subtag" name=
         "unicode_variant_subtag" id=
         "unicode_variant_subtag">unicode_variant_subtag</a></code></td>
         <td><code>= (alphanum{5,8}<br>
         | digit alphanum{3}) ;</code></td>
         <td><code><a href=
         '#unicode_variant_subtag_validity'>validity</a><br>
         <a href=
         'https://github.com/unicode-org/cldr/blob/maint/maint-38/common/validity/variant.xml'>
         latest-data</a></code></td>
       </tr>
       <tr>
         <td><code>sep</code></td>
         <td><code>= [-_] ;</code></td>
       </tr>
       <tr>
         <td><code>digit</code></td>
         <td><code>= [0-9] ;</code></td>
       </tr>
       <tr>
         <td><code>alpha</code></td>
         <td><code>= [A-Z a-z] ;</code></td>
       </tr>
       <tr>
         <td><code>alphanum</code></td>
         <td><code>= [0-9 A-Z a-z] ;</code></td>
       </tr>
     </table>
     <p>The semantics of the various subtags are explained in
     <em>Section 3.4 <a href="#Field_Definitions">Language
     Identifier Field Definitions</a></em>; there are also direct
     links from <code><a href=
     "#unicode_language_subtag">unicode_language_subtag</a></code>,
     etc. While theoretically the <code><a href=
     "#unicode_language_subtag">unicode_language_subtag</a></code>
     may have more than 3 letters through the IANA registration
     process, in practice that has not occurred. The <code><a href=
     "#unicode_language_subtag">unicode_language_subtag</a></code>
     "und" may be omitted when there is a <code><a href=
     "#unicode_script_subtag">unicode_script_subtag</a></code>; for
     that reason <code><a href=
     "#unicode_language_subtag">unicode_language_subtag</a></code>
     values with 4 letters are not permitted. However, such
     <code><a href=
     "#unicode_language_id">unicode_language_id</a></code> values
     are not intended for general interchange, because they are not
     valid BCP 47 tags. Instead, they are intended for certain
     protocols such as the identification of transliterators or font
     ScriptLangTag values. For more information on language subtags
     with 4 letters, see <a href=
     "#Language_Tag_to_Locale_Identifier">BCP 47 Language Tag to
     Unicode BCP 47 Locale Identifier</a>.</p>
     <p>For example, "en-US" (American English), "en_GB" (British
     English), "es-419" (Latin American Spanish), and "uz-Cyrl"
     (Uzbek in Cyrillic) are all valid Unicode language
     identifiers.</p>
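     <p>As an illustration only, the EBNF above can be transcribed
     into a regular expression. The following Python sketch checks
     well-formedness, not validity; the names
     <code>LANGUAGE_ID</code> and
     <code>is_well_formed_language_id</code> are hypothetical and
     are not defined by this specification.</p>

```python
import re

# Building blocks transcribed from the EBNF table (hypothetical names).
ALPHA = "[A-Za-z]"
ALPHANUM = "[0-9A-Za-z]"
SEP = "[-_]"
LANGUAGE = f"(?:{ALPHA}{{2,3}}|{ALPHA}{{5,8}})"
SCRIPT = f"{ALPHA}{{4}}"
REGION = f"(?:{ALPHA}{{2}}|[0-9]{{3}})"
VARIANT = f"(?:{ALPHANUM}{{5,8}}|[0-9]{ALPHANUM}{{3}})"

# unicode_language_id: "root", or language (with optional script) or a
# bare script, followed by an optional region and any number of variants.
LANGUAGE_ID = re.compile(
    f"(?:root"
    f"|(?:{LANGUAGE}(?:{SEP}{SCRIPT})?|{SCRIPT})"
    f"(?:{SEP}{REGION})?"
    f"(?:{SEP}{VARIANT})*)"
)


def is_well_formed_language_id(tag: str) -> bool:
    """Return True if tag matches the unicode_language_id syntax."""
    return LANGUAGE_ID.fullmatch(tag) is not None


if __name__ == "__main__":
    for tag in ("en-US", "en_GB", "es-419", "uz-Cyrl", "en--US"):
        print(tag, is_well_formed_language_id(tag))
```

     <p>A tag accepted by this check may still be invalid; validity
     depends on the subtag data referenced in the right-hand column
     of the table.</p>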
     <h3><i><a name="Unicode_locale_identifier" href=
     "#Unicode_locale_identifier" id="Unicode_locale_identifier">3.2
     Unicode Locale Identifier</a></i></h3>
     <p>A <i>Unicode locale identifier</i> is composed of a Unicode
     language identifier plus (optional) locale extensions. It has
     the following structure. The semantics of the U and T
     extensions are explained in <em>Section 3.6 <a href=
     "#u_Extension">Unicode BCP 47 U Extension</a></em> and
     <em>Section 3.7 <a href="#BCP47_T_Extension">Unicode BCP 47 T
     Extension</a></em>. Other extensions and private use extensions
     are supported for pass-through. The following table defines
     syntactically <em>well-formed</em> identifiers: they are not
     necessarily <em>valid</em> identifiers. For additional validity
     criteria, see the links on the right. </p>
     <p>As is often the case, the complete syntactic constraints are not easily captured by EBNF, so there is a further condition: There cannot be more than one extension with the
 		  same singleton (-a-, …, -t-, -u-, …). Note that the private use extension (-x-) must
     come after all other extensions. </p>
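     <p>The singleton condition can be checked separately from the
     grammar. The following Python sketch assumes that the input is
     otherwise well-formed, so that any one-character subtag before
     a private use extension is a singleton; the function names are
     hypothetical.</p>

```python
import re


def singletons(locale_id: str) -> list:
    """Return the extension singletons of a locale id, lowercased, in order.

    Assumes the identifier is otherwise well-formed: outside of private
    use, only singletons are one character long.
    """
    subtags = re.split("[-_]", locale_id.lower())
    found = []
    for subtag in subtags[1:]:
        if len(subtag) == 1:
            found.append(subtag)
            if subtag == "x":
                break  # everything after -x- is private use
    return found


def has_unique_singletons(locale_id: str) -> bool:
    """Return True if no singleton occurs more than once."""
    found = singletons(locale_id)
    return len(found) == len(set(found))


if __name__ == "__main__":
    print(has_unique_singletons("de-DE-u-co-phonebk"))
    print(has_unique_singletons("en-u-ca-buddhist-U-nu-thai"))
```

     <p>Because scanning stops at the first "x", one-character
     subtags inside the private use extension are not counted as
     singletons.</p>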
     <table border="0">
       <tr>
         <th>&nbsp;</th>
         <th>
           <div align="center">
             EBNF
           </div>
         </th>
         <th>
           <div align="center">
             Validity
           </div>
         </th>
       </tr>
       <tr>
         <td><code><a href="#unicode_locale_id" name=
         "unicode_locale_id" id=
         "unicode_locale_id">unicode_locale_id</a></code></td>
         <td><code>= unicode_language_id<br>
         &nbsp; extensions*<br>
         &nbsp; pu_extensions? ;</code></td>
       </tr>
       <tr>
         <td><code><a href="#extensions" name="extensions" id=
         "extensions">extensions</a></code></td>
         <td><code>= unicode_locale_extensions<br>
         | transformed_extensions<br>
         | other_extensions ;</code></td>
       </tr>
       <tr>
         <td><code><a href="#unicode_locale_extensions" name=
         "unicode_locale_extensions" id=
         "unicode_locale_extensions">unicode_locale_extensions</a></code></td>
         <td><code>= sep [uU]<br>
         &nbsp; ((sep keyword)+<br>
         &nbsp; |(sep attribute)+ (sep keyword)*) ;</code></td>
       </tr>
       <tr>
         <td><code><a href="#transformed_extensions" name=
         "transformed_extensions" id=
         "transformed_extensions">transformed_extensions</a></code></td>
         <td><code>= sep [tT]<br>
         &nbsp; ((sep tlang (sep tfield)*)<br>
         &nbsp; | (sep tfield)+) ;</code></td>
       </tr>
       <tr>
         <td><code><a href="#pu_extensions" name="pu_extensions" id=
         "pu_extensions">pu_extensions</a></code></td>
         <td><code>= sep [xX]<br>
 		&nbsp; (sep alphanum{1,8})+ ;</code></td>
       </tr>
       <tr>
         <td><code><a href="#other_extensions" name=
         "other_extensions" id=
         "other_extensions">other_extensions</a></code></td>
         <td><code>= sep [alphanum-[tTuUxX]]<br>
         &nbsp; (sep alphanum{2,8})+ ;</code></td>
       </tr>
       <tr>
         <td><code>keyword</code><br>
			(Also known as <code>ufield</code>)</td>
         <td><code>= key (sep type)? ;</code></td>
       </tr>
       <tr>
         <td><code>key</code><br>
 			(Also known as <code>ukey</code>)</td>
         <td><code>= alphanum alpha ;</code><br>
           (Note that this is narrower than in [<a href="https://www.ietf.org/rfc/rfc6067.txt" title="https://www.ietf.org/rfc/rfc6067.txt">RFC6067</a>], so that it is disjoint with tkey.)</td>
         <td><code><a href="#Key_Type_Definitions">validity</a><br>
         <a href=
         'https://github.com/unicode-org/cldr/blob/maint/maint-38/common/bcp47'>latest-data</a></code></td>
       </tr>
       <tr>
         <td><code>type</code><br>
 			(Also known as <code>uvalue</code>)</td>
         <td><code>= alphanum{3,8}<br>
         &nbsp; (sep alphanum{3,8})* ;</code></td>
         <td><code><a href="#Key_Type_Definitions">validity</a><br>
         <a href=
         'https://github.com/unicode-org/cldr/blob/maint/maint-38/common/bcp47'>latest-data</a></code></td>
       </tr>
       <tr>
         <td><code>attribute</code></td>
         <td><code>= alphanum{3,8} ;</code></td>
       </tr>
       <tr>
         <td><code><a name="unicode_subdivision_id" href=
         "#unicode_subdivision_id" id=
         "unicode_subdivision_id">unicode_subdivision_id</a><a name=
         "unicode_subdivision_subtag" id=
         "unicode_subdivision_subtag"></a><a name=
         "subdivision_attribute" id=
         "subdivision_attribute"></a></code></td>
         <td><code>= <a href=
         "#unicode_region_subtag">unicode_region_subtag</a>
         unicode_subdivision_suffix ;</code></td>
         <td><code><a href=
         '#unicode_subdivision_subtag_validity'>validity</a><br>
         <a href=
         'https://github.com/unicode-org/cldr/blob/maint/maint-38/common/validity/subdivision.xml'>
         latest-data</a></code></td>
       </tr>
       <tr>
         <td><code>unicode_subdivision_suffix</code></td>
         <td><code>= alphanum{1,4} ;</code></td>
       </tr>
       <tr>
         <td><code><a name="unicode_measure_unit" href=
         "#unicode_measure_unit" id=
         "unicode_measure_unit">unicode_measure_unit</a></code></td>
         <td><code>= alphanum{3,8}<br>
         &nbsp; (sep alphanum{3,8})* ;</code></td>
         <td><code><a href='#Validity_Data'>validity</a><br>
         <a href=
         'https://github.com/unicode-org/cldr/blob/maint/maint-38/common/validity/unit.xml'>latest-data</a></code></td>
       </tr>
       <tr>
         <td><code>tlang</code></td>
         <td><code>= unicode_language_subtag<br>
         &nbsp; (sep unicode_script_subtag)?<br>
         &nbsp; (sep unicode_region_subtag)?<br>
         &nbsp; (sep unicode_variant_subtag)* ;</code></td>
       </tr>
       <tr>
         <td><code>tfield</code></td>
         <td><code>= tkey tvalue;</code></td>
         <td><code><a href="#BCP47_T_Extension">validity</a><br>
         <a href=
         'https://github.com/unicode-org/cldr/blob/maint/maint-38/common/bcp47'>latest-data</a></code></td>
       </tr>
       <tr>
         <td><code>tkey</code></td>
         <td><code>= alpha digit ;</code></td>
       </tr>
       <tr>
         <td><code>tvalue</code></td>
         <td><code>= (sep alphanum{3,8})+ ;</code></td>
       </tr>
     </table>
     <p>For historical reasons, this is called a Unicode locale
     identifier. However, it really functions (with few exceptions)
     as a <span class="st">language</span> identifier, and accesses
     <span class="st">language</span>-based data. Except where it
     would be unclear, this document uses the term "locale" data
     loosely to encompass both types of data: for more information,
     see <i><a href="#Language_and_Locale_IDs">Section 3.10 Language
     and Locale IDs</a></i>.</p>
     <p>As of the release of this specification, there were no
     other_extensions defined. The other_extensions are present in
     the syntax to allow implementations to preserve that
     information.</p>
     <p>As for terminology, the term <i>code</i> may also be used
     instead of "subtag", and "territory" instead of "region". The
     primary language subtag is also called the <i>base language
     code</i>. For example, the base language code for "en-US"
     (American English) is "en" (English). The <i>type</i> may also
     be referred to as a <i>value</i> or <i>key-value</i>.</p>
     <p>The identifiers can vary in case and in the separator
     characters. The "-" and "_" separators are treated as
     equivalent, although "-" is preferred.</p>
     <p>All identifier field values are case-insensitive. Although
     case distinctions do not carry any special meaning, an
     implementation of LDML should use the casing recommendations in
     [<a href="#BCP47">BCP47</a>], especially when a Unicode locale
     identifier is used for locale data exchange in software
     protocols.</p>
     <h4><a name="Canonical_Unicode_Locale_Identifiers" href="#Canonical_Unicode_Locale_Identifiers">3.2.1 Canonical Unicode Locale Identifiers</a></h4>
     <p>A <code><a href=
     "#unicode_locale_id">unicode_locale_id</a></code> has <em>canonical syntax</em> when:</p>
     <ul>
 		<li>It starts with a language subtag (those beginning with a script subtag are only for specialized use)</li>
       <li>Casing
         <ul>
         <li>Any script subtag is in title case (eg, Hant)</li>
         <li>Any region subtag is in uppercase (eg, DE)</li>
         <li>All other subtags are in lowercase (eg, en, fonipa)</li>
         </ul>
       </li>
       <li>Order
         <ul>
           <li>Any variants are in alphabetical order (eg, en-fonipa-scouse,
             not en-scouse-fonipa)</li>
           <li>Any extensions are in alphabetical order by their singleton
             (eg, en-t-xxx-u-yyy, not en-u-yyy-t-xxx)</li>
           <li>All attributes are sorted in alphabetical order.</li>
           <li>All keywords and tfields are sorted in alphabetical order of their keys, within their respective extensions.</li>
           <li>Any type or tfield value "true" is removed.</li>
         </ul>
       </li>
     </ul>
 	  <p>For example, the canonical form of
       "en-u-foo-bar-nu-thai-ca-buddhist-kk-true" is
       "en-u-bar-foo-ca-buddhist-kk-nu-thai". The attributes "foo" and
       "bar" in this example are provided only for illustration; no
       attribute subtags are defined by the current CLDR
       specification.</p>
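     <p>The casing and ordering rules can be sketched in Python as
     follows. This is a simplified illustration, not a conformant
     implementation: it assumes a well-formed identifier that starts
     with a language subtag and carries at most a -u- extension (no
     -t-, other, or private use extensions), and the function name
     <code>canonical_syntax</code> is hypothetical. It does not
     replace aliased subtags, so it yields canonical syntax rather
     than canonical form.</p>

```python
import re


def canonical_syntax(locale_id: str) -> str:
    """Sketch: canonical syntax for a language id plus an optional -u- extension."""
    subtags = re.split("[-_]", locale_id.lower())
    # Split off the -u- extension, if any (other extensions are not handled).
    if "u" in subtags:
        cut = subtags.index("u")
        lang, ext = subtags[:cut], subtags[cut + 1:]
    else:
        lang, ext = subtags, []

    # Language identifier: language, then optional script, region, variants.
    out = [lang[0]]
    rest = lang[1:]
    if rest and len(rest[0]) == 4 and rest[0].isalpha():
        out.append(rest.pop(0).title())   # script subtag in title case
    if rest and re.fullmatch("[a-z]{2}|[0-9]{3}", rest[0]):
        out.append(rest.pop(0).upper())   # region subtag in uppercase
    out.extend(sorted(rest))              # variants in alphabetical order

    if ext:
        # Attributes are the subtags before the first two-character key.
        attributes = []
        while ext and len(ext[0]) != 2:
            attributes.append(ext.pop(0))
        # Group the remaining subtags into key -> list of type subtags.
        keywords = {}
        key = None
        for subtag in ext:
            if len(subtag) == 2:
                key = subtag
                keywords[key] = []
            else:
                keywords[key].append(subtag)
        out.append("u")
        out.extend(sorted(attributes))
        for key in sorted(keywords):
            out.append(key)
            if keywords[key] != ["true"]:  # a type of "true" is removed
                out.extend(keywords[key])
    return "-".join(out)


if __name__ == "__main__":
    print(canonical_syntax("en-u-foo-bar-nu-thai-ca-buddhist-kk-true"))
```

     <p>Applied to the example above, the sketch returns
     "en-u-bar-foo-ca-buddhist-kk-nu-thai".</p>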
     <p><b>Note:</b> The current version of CLDR data uses some
     non-preferred <em>syntax</em> for backward compatibility. This might be
     changed in future CLDR releases.</p>
     <ul>
       <li>It uses uppercase letters for variant subtags, while the
       preferred forms are all lowercase.</li>
       <li>It uses "_" as the separator, while the preferred form of
       the separator is "-".</li>
       <li>It uses "root", while the preferred form is "und".</li>
     </ul>

     <p>A <code><a href=
     "#unicode_locale_id">unicode_locale_id</a></code> is in <em>canonical form</em> when it has canonical syntax and contains no aliased subtags. A <code><a href=
     "#unicode_locale_id">unicode_locale_id</a></code> can be transformed into canonical form according to <a href="#LocaleId_Canonicalization" >Annex C. LocaleId Canonicalization</a>.</p>


 	    <p>A <code><a href=
     "#unicode_locale_id">unicode_locale_id</a></code> is <em>maximal</em> when the  <code><a href=
     "#unicode_language_id">unicode_language_id</a></code> and tlang (if any) have been transformed by the Add Likely Subtags operation in <em>Section 4.3 <a href="#Likely_Subtags">Likely Subtags</a></em>, excluding &quot;und&quot;. </p>
	    <blockquote><em>Example:</em> the maximal form of ja-Kana-t-it is ja-Kana-JP-t-it-Latn-IT</blockquote>
 	    <p>Two <code><a href=
     "#unicode_locale_id">unicode_locale_ids</a></code> are <em>equivalent</em> when their maximal canonical forms are identical.</p>
 			    <blockquote>
 			      <p><em>Example:</em> &quot;IW-HEBR-u-ms-imperial&quot; ~ &quot;he-u-ms-uksystem&quot;</p>
 			    </blockquote>
 		<p>The equivalence relationship may change over time, such as when subtags are deprecated or likely subtag mappings change. For example, if two countries were to merge, then various subtags would become deprecated. These kinds of changes are generally very infrequent.</p>

     <h3><a name="BCP_47_Conformance" href="#BCP_47_Conformance" id=
     "BCP_47_Conformance">3.3 BCP 47 Conformance</a></h3>
     <p>Unicode language and locale identifiers inherit the design
     and the repertoire of subtags from [<a href="#BCP47">BCP47</a>]
     Language Tags. There are some extensions and restrictions made
     for the use of the Unicode locale identifier in CLDR:</p>
     <ul>
       <li>It does not allow for the full syntax of [<a href=
       "#BCP47">BCP47</a>]:
         <ul>
           <li>No extlang subtags are allowed (as in the BCP 47
           canonical form, see BCP 47 <a href=
           "https://tools.ietf.org/html/bcp47#section-4.5">Section
           4.5</a> and <a href=
           "https://tools.ietf.org/html/bcp47#section-3.1.7" target=
           "_blank">Section 3.1.7</a>)</li>
           <li>No irregular BCP 47 legacy language tags
           (marked as “Type: grandfathered” in BCP 47) are allowed
           (these are all deprecated in BCP 47)</li>
           <li>A tag must not start with the subtag "x": thus a
           <em>privateuse</em> (eg x-abc) can only be after a
           language subtag, like "und"</li>
         </ul>
       </li>
       <li>It allows for certain semantic additions and constraints:
         <ul>
           <li>Certain codes that are private-use in BCP-47 and ISO
           are given semantics by LDML</li>
           <li>Each macrolanguage has an identified primary
           encompassed language, which is treated as an alias for
           the macrolanguage, and thus is replaced when
           canonicalizing (as allowed by BCP 47, see <a href=
           "https://tools.ietf.org/html/bcp47#section-4.1.2">Section
           4.1.2</a>)</li>
         </ul>
       </li>
       <li>It allows certain syntax for backwards compatibility (not
       BCP 47-compatible):
         <ul>
           <li>The "_" character for field separator characters, as
           well as the "-" used in [<a href="#BCP47">BCP47</a>]
           (however, the canonical form is with "-")</li>
           <li>The subtag "root" to indicate the generic locale used
           as the parent of all languages in the CLDR data model
           ("und" can be used instead)</li>
           <li>The language tag may begin with a script subtag
           rather than a language subtag. This is specialized use
           only, and not required for CLDR conformance.</li>
         </ul>
       </li>
     </ul>
     <p>There are thus two subtypes of Unicode locale
     identifiers:</p>
     <ul>
       <li>the term <em>Unicode CLDR locale identifier</em> applies
       where the backwards compatibility syntax is used.</li>
       <li>the term <em>Unicode BCP 47 locale identifier</em>
       applies otherwise. A <em>Unicode BCP 47 locale
       identifier</em> is also a valid BCP 47 language tag.</li>
     </ul>
     <h4><a name="BCP_47_Language_Tag_Conversion" href=
     "#BCP_47_Language_Tag_Conversion" id=
     "BCP_47_Language_Tag_Conversion">3.3.1 BCP 47 Language Tag
     Conversion</a></h4>
     <p>The different identifiers can be converted to one another as
     described in this section.</p>
     <h5><a name="Language_Tag_to_Locale_Identifier" href=
     "#Language_Tag_to_Locale_Identifier" id=
     "Language_Tag_to_Locale_Identifier">BCP 47 Language Tag to
     Unicode BCP 47 Locale Identifier</a></h5>
     <p>A valid [<a href="#BCP47">BCP47</a>] language tag can be
     converted to a valid Unicode BCP 47 locale identifier according to <a href="#LocaleId_Canonicalization" >Annex C. LocaleId Canonicalization</a>.</p>

     <p>The result is a Unicode BCP 47 locale identifier, in
     canonical form. It is both a BCP 47 language tag and a Unicode
     locale identifier. Because the process maps from all BCP 47
     language tags into a subset of BCP 47 language tags, the format
     changes are not reversible, much as a lowercase transformation
     of the string “McGowan” is not reversible.</p><br>
     <p><em>Examples</em></p>
     <table>
       <tr>
         <th style='width:10em'>BCP 47 language tag</th>
         <th style='width:10em'>Unicode BCP 47 locale
         identifier</th>
         <th>Comments</th>
       </tr>
       <tr>
         <td><code>en-US</code></td>
         <td><code>en-US</code></td>
         <td>no changes</td>
       </tr>
       <tr>
         <td><code>iw-FX</code></td>
         <td><code>he-FR</code></td>
         <td>BCP 47 canonicalization [1]</td>
       </tr>
       <tr>
         <td><code>cmn-TW</code></td>
         <td><code>zh-TW</code></td>
         <td>language alias [2]</td>
       </tr>
       <tr>
         <td><code>zh-cmn-TW</code></td>
         <td><code>zh-TW</code></td>
         <td>BCP 47 canonicalization [1], then language alias
         [2]</td>
       </tr>
       <tr>
         <td><code>sr-CS</code></td>
         <td><code>sr-RS</code></td>
         <td>territory alias [3]</td>
       </tr>
       <tr>
         <td><code>sh</code></td>
         <td><code>sr-Latn</code></td>
         <td>multiple replacement subtags [2.1]</td>
       </tr>
       <tr>
         <td><code>sh-Cyrl</code></td>
         <td><code>sr-Cyrl</code></td>
         <td>no replacement with multiple replacement subtags [2.1
         doesn't apply]</td>
       </tr>
       <tr>
         <td><code>hy-SU</code></td>
         <td><code>hy-AM</code></td>
         <td>multiple territory values [3.2]<br>
         <code>&lt;territoryAlias type="SU" replacement="RU AM AZ BY
         EE GE KZ KG LV LT MD TJ TM UA UZ" …/&gt;</code></td>
       </tr>
       <tr>
         <td><code>i-enochian</code></td>
         <td><code>und-x-i-enochian</code></td>
         <td>prefix any legacy language tags
           (marked as “Type: grandfathered” in BCP 47) with "und-x-" [4]</td>
       </tr>
       <tr>
         <td><code>x-abc</code></td>
         <td><code>und-x-abc</code></td>
         <td>prefix with "und-", so that there is always a base
         language subtag [5]</td>
       </tr>
     </table>
     <p>&nbsp;</p>
     <h5><a name="Unicode_Locale_Identifier_CLDR_to_BCP_47" href=
     "#Unicode_Locale_Identifier_CLDR_to_BCP_47" id=
     "Unicode_Locale_Identifier_CLDR_to_BCP_47">Unicode Locale
     Identifier: CLDR to BCP 47</a></h5>
     <p>A Unicode CLDR locale identifier can be converted to a valid
     [<a href="#BCP47">BCP47</a>] language tag (which is also a
     Unicode BCP 47 locale identifier) by performing the following
     transformation.</p>
     <ol>
       <li>Replace the "_" separators with "-"</li>
       <li>Replace the special language identifier "root" with the
       BCP 47 primary language tag "und"</li>
       <li>Add an initial "und" primary language subtag if the first
       subtag is a script.</li>
     </ol>
     <p><em>Examples:</em></p>
     <table>
       <tr>
         <th style='width:10em'>Unicode CLDR locale identifier</th>
         <th style='width:10em'>BCP 47 language tag</th>
         <th>Comments</th>
       </tr>
       <tr>
         <td><code>en_US</code></td>
         <td><code>en-US</code></td>
         <td>change separator [1]</td>
       </tr>
       <tr>
         <td><code>de_DE_u_co_phonebk</code></td>
         <td><code>de-DE-u-co-phonebk</code></td>
         <td>change separator [1]</td>
       </tr>
       <tr>
         <td><code>root</code></td>
         <td><code>und</code></td>
         <td>change to "und" [2]</td>
       </tr>
       <tr>
         <td><code>root_u_cu_usd</code></td>
         <td><code>und-u-cu-usd</code></td>
         <td>change to "und" [1, 2]</td>
       </tr>
       <tr>
         <td><code>Latn_DE</code></td>
         <td><code>und-Latn-DE</code></td>
         <td>add "und" [1, 3]</td>
       </tr>
     </table><br>
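     <p>The three steps above can be sketched in Python as follows.
     The function name <code>cldr_to_bcp47</code> is hypothetical,
     and the sketch assumes a well-formed Unicode CLDR locale
     identifier as input; it does not change the case of any
     subtag.</p>

```python
import re


def cldr_to_bcp47(cldr_id: str) -> str:
    """Convert a Unicode CLDR locale identifier to a BCP 47 language tag."""
    # Step 1: replace the "_" separators with "-".
    subtags = cldr_id.replace("_", "-").split("-")
    if subtags[0].lower() == "root":
        # Step 2: replace "root" with "und".
        subtags[0] = "und"
    elif re.fullmatch("[A-Za-z]{4}", subtags[0]):
        # Step 3: the first subtag is a script, so prepend "und".
        subtags.insert(0, "und")
    return "-".join(subtags)


if __name__ == "__main__":
    for cldr_id in ("en_US", "root", "root_u_cu_usd", "Latn_DE"):
        print(cldr_id, cldr_to_bcp47(cldr_id))
```

     <p>The test for "root" comes first because "root" also has the
     four-letter shape of a script subtag.</p>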
     <h5><a name="Unicode_Locale_Identifier_BCP_47_to_CLDR" href=
     "#Unicode_Locale_Identifier_BCP_47_to_CLDR" id=
     "Unicode_Locale_Identifier_BCP_47_to_CLDR">Unicode Locale
     Identifier: BCP 47 to CLDR</a></h5>
     <p>A Unicode BCP 47 locale identifier can be transformed into a
     Unicode CLDR locale identifier by performing the following
     transformation.</p>
     <ol>
       <li>Change the separator to "_".</li>
       <li>Replace the primary language subtag "und" with "root"
       if no script, region, or variant subtags are present.</li>
     </ol>
     <p><em>Examples:</em></p>
     <table>
       <tr>
         <th style='width:10em'>BCP 47 language tag</th>
         <th style='width:10em'>Unicode CLDR locale identifier</th>
         <th>Comments</th>
       </tr>
       <tr>
         <td><code>en-US</code></td>
         <td><code>en_US</code></td>
         <td>changes separator [1]</td>
       </tr>
       <tr>
         <td><code>und</code></td>
         <td><code>root</code></td>
         <td>changes to "root", because no script, region, or
         variant tag is present [2]</td>
       </tr>
       <tr>
         <td><code>und-US</code></td>
         <td><code>und_US</code></td>
         <td>no change to "und", because a region subtag is present
         [1]</td>
       </tr>
       <tr>
         <td nowrap><code>und-u-cu-USD</code></td>
         <td nowrap><code>root_u_cu_usd</code></td>
         <td>changes to "root", because no script, region, or
         variant tag is present [1, 2]</td>
       </tr>
     </table>
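     <p>The reverse transformation can be sketched in the same way.
     The function name <code>bcp47_to_cldr</code> is hypothetical,
     and the sketch assumes a well-formed Unicode BCP 47 locale
     identifier. It relies on the fact that a subtag following
     "und" is either a script, region, or variant subtag or a
     one-character extension singleton. The sketch leaves the case
     of all subtags unchanged.</p>

```python
def bcp47_to_cldr(tag: str) -> str:
    """Convert a Unicode BCP 47 locale identifier to a CLDR locale identifier."""
    subtags = tag.split("-")
    # "und" becomes "root" only when no script, region, or variant follows,
    # that is, when "und" stands alone or is followed by a singleton.
    if subtags[0].lower() == "und" and (
        len(subtags) == 1 or len(subtags[1]) == 1
    ):
        subtags[0] = "root"
    # Change the separator to "_".
    return "_".join(subtags)


if __name__ == "__main__":
    for tag in ("en-US", "und", "und-US", "und-u-cu-usd"):
        print(tag, bcp47_to_cldr(tag))
```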
     <h3><a name="Field_Definitions" href="#Field_Definitions" id=
     "Field_Definitions">3.4 Language Identifier Field
     Definitions</a></h3>
     <p>Unicode language and locale identifier field values are
     provided in the following table. Note that some private-use BCP
     47 field values are given specific meanings in CLDR. While
     field values are based on [<a href="#BCP47">BCP47</a>] subtag
     values, their validity status in CLDR is specified by means of
     machine-readable files in the <a href=
     'https://github.com/unicode-org/cldr/releases/tag/latest/common/validity/'>common/validity/</a>
     subdirectory, such as language.xml. For the format of those
     files and more information, see <em><a href=
     '#Validity_Data'>Section 3.11 Validity Data</a></em>.</p>
     <table>
       <caption>
         <a name="Language_Locale_Field_Definitions" href=
         "#Language_Locale_Field_Definitions" id=
         "Language_Locale_Field_Definitions">Language Identifier
         Field Definitions</a>
       </caption>
       <tr>
         <th>Field</th>
         <th>Valid values</th>
       </tr>
       <tr>
         <td>
           <a href="#unicode_language_subtag_validity" name=
           "unicode_language_subtag_validity" id=
           "unicode_language_subtag_validity">unicode_language_subtag</a>
           <p>(also known as a <i>Unicode base language
           code</i>)</p>
         </td>
         <td>
           Subtags in the language.xml file (see <em>Section 3.11
           <a href="#Validity_Data">Validity Data</a></em> ). These
           are based on [<a href="#BCP47">BCP47</a>] subtag values
           marked as <b>Type: language</b>
           <p>ISO 639-3 introduces the notion of "macrolanguages",
           where certain ISO 639-1 or ISO 639-2 codes are given
           broad semantics, and additional codes are given for the
           narrower semantics. For backwards compatibility, Unicode
           language identifiers retain use of the narrower semantics
           for these codes. For example:</p>
           <table border="1" cellspacing="0" cellpadding="2" style=
           "margin: 0.5em">
             <tr>
               <th>For</th>
               <th>Use</th>
               <th><i>Not</i></th>
             </tr>
             <tr>
               <td>Standard Chinese (Mandarin)</td>
               <td><code>zh</code></td>
               <td><code>cmn</code></td>
             </tr>
             <tr>
               <td>Standard Arabic</td>
               <td><code>ar</code></td>
               <td><code>arb</code></td>
             </tr>
             <tr>
               <td>Standard Malay</td>
               <td><code>ms</code></td>
               <td><code>zsm</code></td>
             </tr>
             <tr>
               <td>Standard Swahili</td>
               <td><code>sw</code></td>
               <td><code>swh</code></td>
             </tr>
             <tr>
               <td>Standard Uzbek</td>
               <td><code>uz</code></td>
               <td><code>uzn</code></td>
             </tr>
             <tr>
               <td>Standard Konkani</td>
               <td><code>kok</code></td>
               <td><code>knn</code></td>
             </tr>
             <tr>
               <td>Northern Kurdish</td>
               <td><code>ku</code></td>
               <td><code>kmr</code></td>
             </tr>
           </table>
           <p>If a language subtag matches the type attribute of a
           languageAlias element, then the replacement value is used
           instead. For example, because "swh" occurs in
           <tt>&lt;languageAlias type="swh"
           replacement="sw"/&gt;</tt> , "sw" must be used instead of
           "swh". Thus Unicode language identifiers use "ar-EG" for
           Standard Arabic (Egypt), not "arb-EG"; they use "zh-TW"
           for Mandarin Chinese (Taiwan), not "cmn-TW".</p>
           <p>The private use codes listed as
           <strong>excluded</strong> in <em>Section 3.5.3 <a href=
           "#Private_Use_Codes">Private Use Codes</a></em> will never be
           given specific semantics in Unicode identifiers, and are
           thus safe for use for other purposes by other
           applications.</p>
           <p>The CLDR provides data for normalizing language/locale
           codes, including mapping overlong codes like "eng-840" or
           "eng-USA" to the correct code "en-US"; see the
           <strong><a href=
           "https://unicode-org.github.io/cldr-staging/charts/38/supplemental/aliases.html">
           Aliases</a></strong> Chart.</p>
           <p>The following are special language subtags:</p>
           <table class="simple" border="1" cellspacing="0"
           cellpadding="2">
             <tr>
               <td>&nbsp;</td>
               <td><strong>Name</strong></td>
               <td><strong>Comment</strong></td>
             </tr>
             <tr>
               <td><code>mis</code></td>
               <td>Uncoded languages</td>
               <td>The content is in a language that doesn't yet
               have an ISO 639 code.</td>
             </tr>
             <tr>
               <td><code>mul</code></td>
               <td>Multiple languages</td>
               <td>The content contains more than one language or
               text that is simultaneously in multiple languages
               (such as brand names).</td>
             </tr>
             <tr>
               <td><code>zxx</code></td>
               <td>No linguistic content</td>
               <td>The content is not in any particular language
               (such as images, symbols, etc.).</td>
             </tr>
           </table>
         </td>
       </tr>
       <tr>
         <td>
           <a href="#unicode_script_subtag_validity" name=
           "unicode_script_subtag_validity" id=
           "unicode_script_subtag_validity">unicode_script_subtag</a>
           <p>(also known as a <i>Unicode script code</i>)</p>
         </td>
         <td>
           Subtags in the script.xml file (see <em>Section 3.11
           <a href="#Validity_Data">Validity Data</a></em>). These
           are based on [<a href="#BCP47">BCP47</a>] subtag values
           marked as <b>Type: script</b>
           <p>In most cases the script is not necessary, since the
           language is customarily written in only a single script.
           Examples of cases where it is used are:</p>
           <table border="1" cellspacing="0" cellpadding="2" style=
           "margin: 0.5em">
             <tr>
               <td><code>az_Arab</code></td>
               <td>Azerbaijani in Arabic script</td>
             </tr>
             <tr>
               <td><code>az_Cyrl</code></td>
               <td>Azerbaijani in Cyrillic script</td>
             </tr>
             <tr>
               <td><code>az_Latn</code></td>
               <td>Azerbaijani in Latin script</td>
             </tr>
             <tr>
               <td><code>zh_Hans</code></td>
               <td>Chinese, in simplified script (=zh, zh-Hans,
               zh-CN, zh-Hans-CN)</td>
             </tr>
             <tr>
               <td><code>zh_Hant</code></td>
               <td>Chinese, in traditional script</td>
             </tr>
           </table>
           <p>Unicode identifiers give specific semantics to certain
           Unicode Script values. For more information, see also
           [<a href=
           "https://www.unicode.org/reports/tr41/#UAX24">UAX24</a>]:</p>
           <table cellspacing="0" cellpadding="2" border="1" style=
           "margin: 0.5em">
             <tr>
               <td><code>Qaag</code></td>
               <td>Zawgyi</td>
               <td colspan="2">Qaag is a special script code for
               identifying the non-standard use of Myanmar
               characters for display with the Zawgyi font. The
               purpose of the code is to enable migration to
               standard, interoperable use of Unicode by providing
               an identifier for Zawgyi for tagging text,
               applications, input methods, font tables,
               transformations, and other mechanisms used for
               migration.</td>
             </tr>
             <tr>
               <td><code>Qaai</code></td>
               <td>Inherited</td>
               <td colspan="2"><strong>deprecated</strong>: the
               <em>canonicalized</em> form is Zinh</td>
             </tr>
             <tr>
               <td><code>Zinh</code></td>
               <td>Inherited</td>
               <td colspan="2">&nbsp;</td>
             </tr>
             <tr>
               <td><code>Zsye</code></td>
               <td>Emoji Style</td>
               <td colspan="2">Prefer emoji style for characters
               that have both text and emoji styles available.</td>
             </tr>
             <tr>
               <td><code>Zsym</code></td>
               <td>Text Style</td>
               <td colspan="2">Prefer text style for characters that
               have both text and emoji styles available.</td>
             </tr>
             <tr>
               <td rowspan="7"><code>Zxxx</code></td>
               <td rowspan="7">Unwritten</td>
               <td colspan="2">Indicates spoken or otherwise
               unwritten content. For example:</td>
             </tr>
             <tr>
               <th>Sample(s)</th>
               <th>Description</th>
             </tr>
             <tr>
               <td>uz</td>
               <td>either written or spoken content</td>
             </tr>
             <tr>
               <td>uz-Latn <em>or</em> uz-Arab</td>
               <td>written-only content (particular script)</td>
             </tr>
             <tr>
               <td>uz-Zyyy</td>
               <td>written-only content (unspecified script)</td>
             </tr>
             <tr>
               <td>uz-Zxxx</td>
               <td>spoken-only content</td>
             </tr>
             <tr>
               <td>uz-Latn, uz-Zxxx</td>
               <td>both specific written and spoken content (using a
               <em>language list</em>)</td>
             </tr>
             <tr>
               <td><code>Zyyy</code></td>
               <td>Common</td>
               <td colspan="2">&nbsp;</td>
             </tr>
             <tr>
               <td><code>Zzzz</code></td>
               <td>Unknown</td>
               <td colspan="2">&nbsp;</td>
             </tr>
           </table>
           <p>The private use subtags listed as
           <strong>excluded</strong> in <em>Section 3.5.3 <a href=
           "#Private_Use_Codes">Private Use Codes</a></em> will never be
           given specific semantics in Unicode identifiers, and are
           thus safe for use for other purposes by other
           applications.</p>
         </td>
       </tr>
       <tr>
         <td>
           <a href="#unicode_region_subtag_validity" name=
           "unicode_region_subtag_validity" id=
           "unicode_region_subtag_validity">unicode_region_subtag</a>
           <p>(also known as a <i>Unicode region code</i> or a
           <i>Unicode territory code</i>)</p>
         </td>
         <td>
           Subtags in the region.xml file (see <em>Section 3.11
           <a href="#Validity_Data">Validity Data</a></em>). These
           are based on [<a href="#BCP47">BCP47</a>] subtag values
           marked as <b>Type: region</b>
           <p>Unicode identifiers give specific semantics to the
           following subtags:</p>
           <table border="1" cellspacing="0" cellpadding="2">
             <tr>
               <td>&nbsp;</td>
               <td><strong>Name</strong></td>
               <td><strong>Comment</strong></td>
               <td><strong>ISO 3166-1 status</strong></td>
             </tr>
             <tr>
               <td><code>QO</code></td>
               <td>Outlying Oceania</td>
               <td>countries in Oceania [009] that do not have a
               <a href=
               "https://unicode-org.github.io/cldr-staging/charts/38/supplemental/territory_containment_un_m_49.html">
               subcontinent</a>.</td>
               <td>private use</td>
             </tr>
             <tr>
               <td><code>QU</code></td>
               <td>European Union</td>
               <td><strong>deprecated</strong>: the
               <em>canonicalized</em> form is EU</td>
               <td>private use</td>
             </tr>
             <tr>
               <td><code>UK</code></td>
               <td>United Kingdom</td>
               <td><strong>deprecated</strong>: the
               <em>canonicalized</em> form is GB</td>
               <td>exceptionally reserved</td>
             </tr>
             <tr>
               <td><code>XA</code></td>
               <td>Pseudo-Accents</td>
               <td>special code indicating a derived testing locale
               with English text that has added accents and is lengthened</td>
               <td>private use</td>
             </tr>
             <tr>
               <td><code>XB</code></td>
               <td>Pseudo-Bidi</td>
               <td>special code indicating a derived testing locale
               with forced RTL English</td>
               <td>private use</td>
             </tr>
             <tr>
               <td><code>XK</code></td>
               <td>Kosovo</td>
               <td>industry practice</td>
               <td>private use</td>
             </tr>
             <tr>
               <td><code>ZZ</code></td>
               <td>Unknown or Invalid Territory</td>
               <td>used in APIs or as replacement for invalid
               code</td>
               <td>private use</td>
             </tr>
           </table>
           <p>The private use subtags listed as
           <strong>excluded</strong> in <em>Section 3.5.3 <a href=
           "#Private_Use_Codes">Private Use Codes</a></em> will normally
           never be given specific semantics in Unicode identifiers,
           and are thus safe for use for other purposes by other
           applications. However, LDML may follow widespread
           industry practice in the use of some of these codes, such
           as for XK.</p>
           <p>The CLDR provides data for normalizing
           territory/region codes, including mapping overlong codes
           like "eng-840" or "eng-USA" to the correct code
           "en-US".</p>
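           <p>The normalization of overlong codes described above might
           look like the following sketch. The two small dictionaries are
           illustrative assumptions standing in for the CLDR alias data,
           the function name <code>normalize_language_region</code> is
           hypothetical, and only tags with exactly a language and a
           region subtag are handled.</p>
<pre>
```python
# Illustrative stand-ins for CLDR alias data (not the complete lists).
LANGUAGE_ALIASES = {"eng": "en"}
TERRITORY_ALIASES = {"840": "US", "USA": "US", "UK": "GB", "QU": "EU"}


def normalize_language_region(tag):
    """Normalize a two-subtag language-region tag using the alias tables."""
    subtags = tag.replace("_", "-").split("-")
    if len(subtags) != 2:
        raise ValueError("this sketch handles only language-region tags")
    language = subtags[0].lower()
    region = subtags[1].upper()
    language = LANGUAGE_ALIASES.get(language, language)
    region = TERRITORY_ALIASES.get(region, region)
    return language + "-" + region


print(normalize_language_region("eng-840"))
print(normalize_language_region("eng-USA"))
```
</pre>
           <p>Both calls yield "en-US". The same lookup also maps the
           deprecated codes from the table above, so "en-UK" becomes
           "en-GB".</p>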
           <p>Special Codes:</p>
           <ul>
             <li>The territory code 'UK' has a special status in
             ISO, and is used for the domain name instead of GB. It
             is thus recognized by CLDR as being an alternate
             (unnormalized) form of 'GB'.</li>
             <li>The territory code '001' (the World) is used to
             indicate a standardized form, such as "ar-001" for
             Modern Standard Arabic.</li>
           </ul>
         </td>
       </tr>
       <tr>
         <td>
           <a href="#unicode_variant_subtag_validity" name=
           "unicode_variant_subtag_validity" id=
           "unicode_variant_subtag_validity">unicode_variant_subtag</a>
           <p>(also known as a <i>Unicode language variant
           code</i>)</p>
         </td>
         <td>
           Subtags in the variant.xml file (see <em>Section 3.11
           <a href="#Validity_Data">Validity Data</a></em>). These
           are based on [<a href="#BCP47">BCP47</a>] subtag values
           marked as <b>Type: variant</b>
           <p>CLDR provides data for normalizing variant codes.
           For the handling of the "POSIX" variant, see <i>Section
           3.8.2, <a href="#Legacy_Variants">Legacy
           Variants</a></i>.</p>
         </td>
       </tr>
     </table>
     <p><i>Examples:</i></p>
     <blockquote>
       <pre>en
 fr_BE
 zh-Hant-HK</pre>
     </blockquote>
     <p><em>Deprecated</em> codes—such as QU above—are valid, but
     strongly discouraged.</p>
     <p>A locale that only has a language subtag (and optionally a
     script subtag) is called a <i>language locale</i>; one with
     both language and territory subtags is called a <i>territory
     locale</i> (or <i>country locale</i>).</p>
     <h3><a name="Special_Codes" href="#Special_Codes" id=
     "Special_Codes">3.5 Special Codes</a></h3>
     <h4><a name="Unknown_or_Invalid_Identifiers" href=
     "#Unknown_or_Invalid_Identifiers" id=
     "Unknown_or_Invalid_Identifiers">3.5.1 Unknown or Invalid
     Identifiers</a></h4>
     <p>The following identifiers are used to indicate an unknown or
     invalid code in Unicode language and locale identifiers. For
     Unicode identifiers, the region code uses a private use ISO
     3166 code, and the time zone code uses an additional code; the
     others are defined by the relevant standards. When these codes
     are used in APIs connected with Unicode identifiers, the
     meaning is that either there was no identifier available, or
     that at some point an input identifier value was determined to
     be invalid or ill-formed.</p>
     <table border="1" cellspacing="0" cellpadding="4" style=
     "margin-top: 0.5em; margin-bottom: 0.5em" id="table4">
       <tr>
         <th>Code Type</th>
         <th>Value</th>
         <th>Description in Referenced Standards</th>
       </tr>
       <tr>
         <td>Language</td>
         <td><code>und</code></td>
         <td>Undetermined language, also used for “root”</td>
       </tr>
       <tr>
         <td>Script</td>
         <td><code>Zzzz</code></td>
         <td>Code for uncoded script, Unknown [<a href=
         "https://www.unicode.org/reports/tr41/#UAX24">UAX24</a>]</td>
       </tr>
       <tr>
         <td>Region&nbsp;&nbsp;</td>
         <td><code>ZZ</code></td>
         <td>Unknown or Invalid Territory</td>
       </tr>
       <tr>
         <td>Currency</td>
         <td><code>XXX</code></td>
         <td>The codes assigned for transactions where no currency
         is involved</td>
       </tr>
       <tr>
         <td>Time Zone</td>
         <td><code>unk</code></td>
         <td>Unknown or Invalid Time Zone</td>
       </tr>
       <tr>
         <td>Subdivision</td>
         <td><em>&lt;region&gt;</em>zzzz</td>
         <td>Unknown or Invalid Subdivision</td>
       </tr>
     </table>
     <p>When only the script or region is known, a locale ID
     uses "und" as the language subtag portion. Thus the locale
     tag "und_Grek" represents the Greek script; "und_US" represents
     the US territory.</p>
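     <p>As a small illustration of the rule above, a locale ID can be
     assembled with "und" filling in for a missing language. The
     function name <code>make_locale_id</code> and its parameters are
     hypothetical, chosen only for this sketch.</p>
<pre>
```python
def make_locale_id(language=None, script=None, region=None, sep="_"):
    """Build a locale ID, using "und" when the language is not known."""
    parts = [language or "und"]
    if script:
        parts.append(script)
    if region:
        parts.append(region)
    return sep.join(parts)


print(make_locale_id(script="Grek"))
print(make_locale_id(region="US"))
```
</pre>
     <p>These calls produce "und_Grek" and "und_US", the two examples
     given in the text.</p>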
     <h4><a name="Numeric_Codes" href="#Numeric_Codes" id=
     "Numeric_Codes">3.5.2 Numeric Codes</a></h4>
     <p>For region codes, ISO and the UN establish a mapping to
     three-letter codes and numeric codes. However, this does not
     extend to the private use codes, which are the codes 900-999
     (total: 100), and AAA-AAZ, QMA-QZZ, XAA-XZZ, and ZZA-ZZZ (total: 1092).
     Unicode identifiers supply a standard mapping to these: for the
     numeric codes, they use the top of the numeric private use
     range; for the 3-letter codes, they double the final letter.
     These are the resulting mappings for all of the private use
     region codes:</p>
     <table border="1" cellspacing="0" cellpadding="4" style=
     "margin-top: 0.5em; margin-bottom: 0.5em" id="table19">
       <tr>
         <th>Region</th>
         <th>UN/ISO Numeric</th>
         <th>ISO 3-Letter</th>
       </tr>
       <tr>
         <td><code>AA</code></td>
         <td><code>958</code></td>
         <td><code>AAA</code></td>
       </tr>
       <tr>
         <td><code>QM..QZ</code></td>
         <td><code>959..972</code></td>
         <td><code>QMM..QZZ</code></td>
       </tr>
       <tr>
         <td><code>XA..XZ</code></td>
         <td><code>973..998</code></td>
         <td><code>XAA..XZZ</code></td>
       </tr>
       <tr>
         <td><code>ZZ</code></td>
         <td><code>999</code></td>
         <td><code>ZZZ</code></td>
       </tr>
     </table>
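     <p>The table above can be reproduced with a short computation.
     The following is a sketch under the assumption that each range in
     the table is assigned in alphabetical order; the function name
     <code>private_use_region_codes</code> is hypothetical.</p>
<pre>
```python
import string

UPPER = string.ascii_uppercase
Q_LETTERS = UPPER[12:]  # "M" through "Z", the second letters of QM..QZ


def private_use_region_codes(region):
    """Return (numeric, three_letter) for a private use region code."""
    region = region.upper()
    if len(region) != 2:
        raise ValueError("expected a two-letter region code")
    first, second = region[0], region[1]
    if region == "AA":
        numeric = 958
    elif first == "Q" and second in Q_LETTERS:
        numeric = 959 + Q_LETTERS.index(second)
    elif first == "X" and second in UPPER:
        numeric = 973 + UPPER.index(second)
    elif region == "ZZ":
        numeric = 999
    else:
        raise ValueError(region + " is not a private use region code")
    # The three-letter form doubles the final letter.
    return str(numeric), region + second


print(private_use_region_codes("XK"))
```
</pre>
     <p>For example, XK maps to the numeric code 983 and the
     three-letter code XKK, and QO maps to 961 and QOO.</p>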
     <p>For script codes, ISO 15924 supplies a mapping (however, the
     numeric codes are not in common use):</p>
     <table border="1" cellspacing="0" cellpadding="4" style=
     "margin-top: 0.5em; margin-bottom: 0.5em" id="table21">
       <tr>
         <th>Script</th>
         <th>Numeric</th>
       </tr>
       <tr>
         <td><code>Qaaa..Qabx</code></td>
         <td><code>900..949</code></td>
       </tr>
     </table><br>
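     <p>The script range above can be computed in the same way. This
     sketch assumes that the 50 codes Qaaa..Qabx are numbered
     consecutively in alphabetical order; the function name
     <code>private_use_script_numeric</code> is hypothetical.</p>
<pre>
```python
import string

LOWER = string.ascii_lowercase


def private_use_script_numeric(script):
    """Return the numeric code (900..949) for a script in Qaaa..Qabx."""
    code = script.lower()
    if len(code) != 4 or not code.startswith("qa"):
        raise ValueError(script + " is not in Qaaa..Qabx")
    third, fourth = code[2], code[3]
    if third not in "ab" or fourth not in LOWER:
        raise ValueError(script + " is not in Qaaa..Qabx")
    # Position within the range, counting the last two letters in base 26.
    offset = LOWER.index(third) * 26 + LOWER.index(fourth)
    if offset not in range(50):
        raise ValueError(script + " is not in Qaaa..Qabx")
    return 900 + offset


print(private_use_script_numeric("Qaag"))
```
</pre>
     <p>Under that assumption, Qaaa maps to 900, Qaag to 906, and Qabx
     to 949.</p>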
     <h4>3.5.3 <a name="Private_Use_Codes" href="#Private_Use_Codes" id=
     "Private_Use_Codes">Private Use Codes</a></h4>
     <p>Private use codes fall into three groups.</p>
     <ul>
       <li><strong>defined:</strong> those that are given particular
       semantics currently in CLDR</li>
       <li><strong>reserved:</strong> those that may be given
       particular semantics in future versions of CLDR</li>
       <li><strong>excluded:</strong> those that will never be given
       particular CLDR semantics in the future, and thus can
       normally be used by applications without worrying about
       collisions. However, CLDR may follow widespread industry
       practice in the use of some of these codes, such as for XA,
       XB, and XK.</li>
     </ul>
     <table>
       <caption>
         <a name="Private_Use_CLDR" href="#Private_Use_CLDR" id=
         "Private_Use_CLDR">Private Use Codes in CLDR</a>
       </caption>
       <tr>
         <th>category</th>
         <th>status</th>
         <th>codes</th>
       </tr>
       <tr>
         <td rowspan="3">base language</td>
         <td>defined</td>
         <td>none</td>
       </tr>
       <tr>
         <td>reserved</td>
         <td>qaa..qfy</td>
       </tr>
       <tr>
         <td>excluded</td>
         <td>qfz..qtz</td>
       </tr>
       <tr>
         <td rowspan="3">script</td>
         <td>defined</td>
         <td>Qaai (obsolete), Qaag</td>
       </tr>
       <tr>
         <td>reserved</td>
         <td>Qaaa..Qaaf Qaah Qaaj..Qaap</td>
       </tr>
       <tr>
         <td>excluded</td>
         <td>Qaaq..Qabx</td>
       </tr>
       <tr>
         <td rowspan="3">region</td>
         <td>defined</td>
         <td>QO, QU, UK, XA, XB, XK, ZZ</td>
       </tr>
       <tr>
         <td>reserved</td>
         <td>AA QM..QN QP..QT QV..QZ</td>
       </tr>
       <tr>
         <td>excluded</td>
         <td>XC..XJ, XL..XZ</td>
       </tr>
       <tr>
         <td rowspan="3">timezone</td>
         <td>defined</td>
         <td>IANA: Etc/Unknown<br>
         bcp47: as listed in bcp47/timezone.xml</td>
       </tr>
       <tr>
         <td>reserved</td>
         <td>bcp47: all non-5 letter codes not starting with x</td>
       </tr>
       <tr>
         <td>excluded</td>
         <td>bcp47: all non-5 letter codes starting with x</td>
       </tr>
     </table>
     <p>See also <em>Section 3.5.1 <a href=
     "#Unknown_or_Invalid_Identifiers">Unknown or Invalid
     Identifiers</a></em>.</p>
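     <p>An application that wants to pick a private use region code
     for its own purposes could check the status of a candidate
     against the region rows of the table above. The sets below are
     transcribed from that table; the helper names
     <code>span</code> and <code>private_use_region_status</code> are
     hypothetical.</p>
<pre>
```python
def span(first, last):
    """Expand a range such as QP..QT into a set of two-letter codes."""
    prefix = first[0]
    return {prefix + chr(c) for c in range(ord(first[1]), ord(last[1]) + 1)}


# Region rows of the "Private Use Codes in CLDR" table.
DEFINED = {"QO", "QU", "UK", "XA", "XB", "XK", "ZZ"}
RESERVED = {"AA"} | span("QM", "QN") | span("QP", "QT") | span("QV", "QZ")
EXCLUDED = span("XC", "XJ") | span("XL", "XZ")


def private_use_region_status(region):
    """Return "defined", "reserved", "excluded", or None if not listed."""
    region = region.upper()
    if region in DEFINED:
        return "defined"
    if region in RESERVED:
        return "reserved"
    if region in EXCLUDED:
        return "excluded"
    return None


print(private_use_region_status("XK"))
print(private_use_region_status("XC"))
```
</pre>
     <p>Codes reported as "excluded" are the ones that can normally be
     used by applications without worrying about collisions.</p>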
     <h3><a name="Locale_Extension_Key_and_Type_Data" id=
     "Locale_Extension_Key_and_Type_Data"></a><a name="u_Extension"
     href="#u_Extension" id="u_Extension">3.6 Unicode BCP 47 U
     Extension</a></h3>
     <p>[<a href="#BCP47">BCP47</a>] Language Tags provides a
     mechanism for extending language tags for use in various
     applications by extension subtags. Each extension subtag is
     identified by a single alphanumeric character subtag assigned
     by IANA.</p>
     <p>The Unicode Consortium has registered and is the maintaining
     authority for two BCP 47 language tag extensions: the extension
     'u' for Unicode locale extension [<a href=
     "#RFC6067">RFC6067</a>] and extension 't' for transformed
     content [<a href="#RFC6497">RFC6497</a>]. The Unicode BCP 47
     extension data defines the complete list of valid subtags.</p>
     <p>These subtags are all in lowercase (the canonical
     casing for these subtags); however, subtags are
     case-insensitive and casing does not carry any specific
     meaning. All subtags within the Unicode extensions consist of
     two to eight alphanumeric characters and meet the
     rule <code>extension</code> in [<a href=
     "#BCP47">BCP47</a>].</p>
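     <p>The length and casing constraints above can be checked
     mechanically. The following sketch validates only the subtag
     shape (two to eight ASCII alphanumeric characters) and lowercases
     the result; it does not check that the subtags are valid keys or
     types. The function name
     <code>canonical_extension_subtags</code> is hypothetical.</p>
<pre>
```python
import re

# Two to eight ASCII letters or digits.
EXTENSION_SUBTAG = re.compile(r"[0-9A-Za-z]{2,8}")


def canonical_extension_subtags(extension):
    """Check the shape of each subtag and return the lowercase form."""
    subtags = extension.split("-")
    for subtag in subtags:
        if not EXTENSION_SUBTAG.fullmatch(subtag):
            raise ValueError("ill-formed extension subtag: " + repr(subtag))
    return "-".join(subtag.lower() for subtag in subtags)


print(canonical_extension_subtags("CA-Islamic-Civil"))
```
</pre>
     <p>The input here is the sequence of subtags that follows the
     extension singleton; "CA-Islamic-Civil" is returned as
     "ca-islamic-civil".</p>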
     <p><strong>The -u- Extension.</strong> The syntax of 'u'
     extension subtags is defined by the rule
     <code>unicode_locale_extensions</code> in <a href=
     "#Unicode_locale_identifier">Section 3.2 Unicode locale
     identifier</a>, except that the subtag separator
     <code>sep</code> must always be a hyphen '-' when the extension
     is used as part of a BCP 47 language tag.</p>
     <p>A 'u' extension may contain multiple <code>attribute</code>s
     or <code>keyword</code>s as defined in <a href=
     "#Unicode_locale_identifier">Section 3.2 Unicode locale
     identifier</a>. The canonical syntax is defined in <a href=
     "#Canonical_Unicode_Locale_Identifiers">Section 3.2.1 Canonical Unicode
     Locale Identifiers</a>.</p>
     <p><em>See also <a href=
     "http://cldr.unicode.org/index/bcp47-extension">Unicode
     Extensions for BCP 47</a> on the CLDR site.</em></p>
     <h4><a href="#Key_And_Type_Definitions_" name=
     "Key_And_Type_Definitions_" id=
     "Key_And_Type_Definitions_">3.6.1 Key And Type
     Definitions</a></h4>
     <p>The following chart contains a set of U extension key values
     that are currently available, with a description or sampling of
     the U extension type values. Each category is associated with
     an XML file in the bcp47 directory.</p>
     <p>For the complete list of valid keys and types defined for
     Unicode locale extensions, see <a href=
     "#Unicode_Locale_Extension_Data_Files">Section 3.6.4 U
     Extension Data Files</a>. For information on the process for
     adding new <i>key</i>/<i>type</i>, see [<a href=
     "#localeProject">LocaleProject</a>].</p>
     <p>Most type values are represented by a single subtag in the
     current version of CLDR. There are exceptions, such as types
     used for key "ca" (calendar) and "kr" (collation reordering).
     If the type is not included, then the type value "true" is
     assumed. Note that the default for a key with a possible "true"
     value is often "false", but not always. Note also that
     "true"/"True" is not a valid script code, since <a href=
     "https://www.unicode.org/iso15924/codelists.html">the ISO 15924
     Registration Authority has exceptionally reserved it</a>, which
     means that it will not be assigned for any purpose.</p>
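     <p>The handling of multi-subtag types and of an omitted type can
     be sketched as follows. This is a simplified illustration, not a
     conformant parser: it assumes the input is the well-formed
     sequence of subtags after the 'u' singleton, treats every
     two-character subtag as a key, and does not deal with duplicate
     keys or canonical ordering. The function name
     <code>parse_u_extension</code> is hypothetical, and "kn" is used
     only as an example of a key whose type may be omitted.</p>
<pre>
```python
def parse_u_extension(extension):
    """Split 'u' extension subtags into attributes and key/type pairs."""
    attributes = []
    keywords = {}
    current_key = None
    for subtag in extension.lower().split("-"):
        if len(subtag) == 2:
            # A two-character subtag starts a new keyword.
            current_key = subtag
            keywords[current_key] = []
        elif current_key is None:
            # Longer subtags before the first key are attributes.
            attributes.append(subtag)
        else:
            # Longer subtags after a key belong to that key's type.
            keywords[current_key].append(subtag)
    # A key without any type subtags gets the implied type "true".
    resolved = {key: "-".join(types) or "true" for key, types in keywords.items()}
    return attributes, resolved


print(parse_u_extension("kn-ca-islamic-civil"))
```
</pre>
     <p>Here the key "kn" has no type and therefore resolves to "true",
     while the two subtags after "ca" are joined into the single type
     value "islamic-civil".</p>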
     <p>The BCP 47 form for keys and types is the canonical form,
     and recommended. Other aliases are included for backwards
     compatibility.</p>
     <table>
       <caption>
         <a name="Key_Type_Definitions" href="#Key_Type_Definitions"
         id="Key_Type_Definitions">Key/Type Definitions</a>
       </caption>
       <tr>
         <th>key<br>
         (old key name)</th>
         <th>key description</th>
         <th>example type<br>
         (old type name)</th>
         <th>type description</th>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a href=
         "#UnicodeCalendarIdentifier" name=
         "UnicodeCalendarIdentifier" id=
         "UnicodeCalendarIdentifier">Unicode Calendar Identifier</a>
         defines a type of calendar. The valid values are those
         <em>name</em> attribute values in the <em>type</em>
         elements of key name="ca" in bcp47/<a target="_blank" href=
         "https://github.com/unicode-org/cldr/tree/latest/common/bcp47/calendar.xml">calendar.xml</a></strong>.</td>
       </tr>
       <tr>
         <td rowspan="10">"ca"<br>
         (calendar)</td>
         <td rowspan="10">Calendar algorithm<br>
         <br>
         <i>(For information on the calendar algorithms associated
         with the data used with these, see [<a href=
         "#Calendars">Calendars</a>].)</i></td>
         <td>"buddhist"</td>
         <td>Thai Buddhist calendar (same as Gregorian except for
         the year)</td>
       </tr>
       <tr>
         <td>"chinese"</td>
         <td>Traditional Chinese calendar</td>
       </tr>
       <tr>
         <td colspan="2">…</td>
       </tr>
       <tr>
         <td>"gregory"<br>
         (gregorian)</td>
         <td>Gregorian calendar</td>
       </tr>
       <tr>
         <td colspan="2">…</td>
       </tr>
       <tr>
         <td>"islamic"</td>
         <td>Islamic calendar</td>
       </tr>
       <tr>
         <td>"islamic-civil"</td>
         <td>Islamic calendar, tabular (intercalary years
         [2,5,7,10,13,16,18,21,24,26,29] - civil epoch)</td>
       </tr>
       <tr>
         <td>"islamic-umalqura"</td>
         <td>Islamic calendar, Umm al-Qura</td>
       </tr>
       <tr>
         <td colspan="2">…</td>
       </tr>
       <tr>
         <td colspan="2"><b>Note:</b> <i>Some calendar types are
         represented by two subtags. In such cases, the first subtag
         specifies a generic calendar type and the second subtag
         specifies a calendar algorithm variant. The CLDR uses
         generic calendar types (single subtag types) for tagging
         data when calendar algorithm variations within a generic
         calendar type are irrelevant. For example, type "islamic"
         is used for specifying Islamic calendar formatting data for
         all Islamic calendar types, including "islamic-civil" and
         "islamic-umalqura".</i></td>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a href=
         "#UnicodeCurrencyFormatIdentifier" name=
         "UnicodeCurrencyFormatIdentifier" id=
         "UnicodeCurrencyFormatIdentifier">Unicode Currency Format
         Identifier</a> defines a style for currency formatting. The
         valid values are those <em>name</em> attribute values in
         the <em>type</em> elements of key name="cf" in
         bcp47/<a target="_blank" href=
         "https://github.com/unicode-org/cldr/tree/latest/common/bcp47/currency.xml">currency.xml</a></strong>.</td>
       </tr>
       <tr>
         <td rowspan="2">"cf"</td>
         <td rowspan="2">Currency Format style</td>
         <td>"standard"</td>
         <td>Negative numbers use the minusSign symbol (the
         default).</td>
       </tr>
       <tr>
         <td>"account"</td>
         <td>Negative numbers use parentheses or equivalent.</td>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a href=
         "#UnicodeCollationIdentifier" name=
         "UnicodeCollationIdentifier" id=
         "UnicodeCollationIdentifier">Unicode Collation
         Identifier</a> defines a type of collation (sort order).
         The valid values are those <em>name</em> attribute values
         in the <em>type</em> elements of bcp47/<a target="_blank"
         href=
         "https://github.com/unicode-org/cldr/tree/latest/common/bcp47/collation.xml">collation.xml</a></strong>.</td>
       </tr>
       <tr>
         <td colspan="4"><i>For information on each collation
         setting parameter, from <strong>ka</strong> to
         <strong>vt</strong>, see <a href=
         "tr35-collation.html#Setting_Options">Setting
         Options</a>.</i></td>
       </tr>
       <tr>
         <td rowspan="9">"co"<br>
         (collation)</td>
         <td rowspan="9">Collation type</td>
         <td>"standard"</td>
         <td>The default ordering for each language. For root it is
         based on the [<a href="#DUCET">DUCET</a>] (Default Unicode
         Collation Element Table): see <em><a href=
         "tr35-collation.html#Root_Collation">Root
         Collation</a></em>. Each other locale is based on that,
         except for appropriate modifications to certain characters
         for that language.</td>
       </tr>
       <tr>
         <td>"search"</td>
         <td>A special collation type dedicated for string search—it
         is not used to determine the relative order of two strings,
         but only to determine whether they should be considered
         equivalent for the specified strength, using the string
         search matching rules appropriate for the language.
         Compared to the normal collator for the language, this may
         add or remove primary equivalences, may make additional
         characters ignorable or change secondary equivalences, and
         may modify contractions to allow matching within them,
         depending on the desired behavior. For example, in Czech,
         the distinction between ‘a’ and ‘á’ is secondary for normal
         collation, but primary for search; a search for ‘a’ should
         never match ‘á’ and vice versa. A search collator is
         normally used with strength set to PRIMARY or SECONDARY
         (should be SECONDARY if using “asymmetric” search as
         described in the [<a href=
         "https://www.unicode.org/reports/tr41/#UTS10">UCA</a>]
         section Asymmetric Search). The search collator in root
         supplies matching rules that are appropriate for most
         languages (and which are different than the root collation
         behavior); language-specific search collators may be
         provided to override the matching rules for a given
         language as necessary.</td>
       </tr>
       <tr>
         <td colspan="2">
           <p>Other keywords provide additional choices for certain
           locales; <i>they only have effect in certain
           locales.</i></p>
         </td>
       </tr>
       <tr>
         <td colspan="2">…</td>
       </tr>
       <tr>
         <td>"phonetic"</td>
         <td>Requests a phonetic variant if available, where text is
         sorted based on pronunciation. It may interleave different
         scripts, if multiple scripts are in common use.</td>
       </tr>
       <tr>
         <td>"pinyin"</td>
         <td>Pinyin ordering for Latin and for CJK characters; that
         is, an ordering for CJK characters based on a
         character-by-character transliteration into pinyin (used
         in Chinese).</td>
       </tr>
       <tr>
         <td>"reformed"</td>
         <td>Reformed collation (such as in Swedish)</td>
       </tr>
       <tr>
         <td>"searchjl"</td>
         <td>Special collation type for a modified string search in
         which a pattern consisting of a sequence of Hangul initial
         consonants (jamo lead consonants) will match a sequence of
         Hangul syllable characters whose initial consonants match
         the pattern. The jamo lead consonants can be represented
         using conjoining or compatibility jamo. This search
         collator is best used at SECONDARY strength with an
         "asymmetric" search as described in the [<a href=
         "https://www.unicode.org/reports/tr41/#UTS10">UCA</a>]
         section Asymmetric Search and obtained, for example, using
         ICU4C's usearch facility with attribute
         USEARCH_ELEMENT_COMPARISON set to value
         USEARCH_PATTERN_BASE_WEIGHT_IS_WILDCARD; this ensures that
         a full Hangul syllable in the search pattern will only
         match the same syllable in the searched text (instead of
         matching any syllable with the same initial consonant),
         while a Hangul initial consonant in the search pattern will
         match any Hangul syllable in the searched text with the
         same initial consonant.</td>
       </tr>
       <tr>
         <td colspan="2">…</td>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a href=
         "#UnicodeCurrencyIdentifier" name=
         "UnicodeCurrencyIdentifier" id=
         "UnicodeCurrencyIdentifier">Unicode Currency Identifier</a>
         defines a type of currency. The valid values are those
         <em>name</em> attribute values in the <em>type</em>
         elements of key name="cu" in bcp47/<a target="_blank" href=
         "https://github.com/unicode-org/cldr/tree/latest/common/bcp47/currency.xml">currency.xml</a>.</strong></td>
       </tr>
       <tr>
         <td>"cu"<br>
         (currency)</td>
         <td>Currency type</td>
         <td>
           <i>ISO 4217 code,</i>
           <p><i>plus others in common use</i></p>
         </td>
         <td>
           <p>Codes consisting of 3 ASCII letters that are or have
           been valid in ISO 4217, plus certain additional codes
           that are or have been in common use. The list of
           countries and time periods associated with each currency
           value is available in <a href=
           "tr35-numbers.html#Supplemental_Currency_Data">Supplemental
           Currency Data</a>, plus the default number of
           decimals.</p>
           <p>The XXX code is given a broader interpretation as
           <em>Unknown or Invalid Currency</em>.</p>
         </td>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a href=
         "#UnicodeDictionaryBreakExclusionIdentifier" name=
         "UnicodeDictionaryBreakExclusionIdentifier" id=
         "UnicodeDictionaryBreakExclusionIdentifier">Unicode Dictionary Break Exclusion Identifier</a>
         specifies scripts to be excluded from dictionary-based text break (for words and lines).
         The valid values consist of one or more items of type SCRIPT_CODE as specified in the
         <em>name</em> attribute value in the <em>type</em>
         element of key name="dx" in bcp47/<a target="_blank" href=
         "https://github.com/unicode-org/cldr/tree/latest/common/bcp47/segmentation.xml">segmentation.xml</a>.</strong></td>
       </tr>
       <tr>
         <td>"dx"</td>
         <td>Dictionary break script exclusions</td>
         <td>
           <i><code><a href="#unicode_script_subtag">unicode_script_subtag</a></code> values</i>
         </td>
         <td>
           <p>One or more items of type SCRIPT_CODE, which are valid
           <code><a href="#unicode_script_subtag">unicode_script_subtag</a></code> values.</p>
           <p>The code Zyyy (Common) can be specified to exclude all scripts, in which case
           it should be the only SCRIPT_CODE value specified.</p>
         </td>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a href=
         "#UnicodeEmojiPresentationStyleIdentifier" name=
         "UnicodeEmojiPresentationStyleIdentifier" id=
         "UnicodeEmojiPresentationStyleIdentifier">Unicode Emoji
         Presentation Style Identifier</a> specifies a request for
         the preferred emoji presentation style. This can be used as
         part of the value for an HTML lang attribute, for example
         <code>&lt;html lang="sr-Latn-u-em-emoji"&gt;</code>. The
         valid values are those <em>name</em> attribute values in
         the <em>type</em> elements of key name="em" in
         bcp47/<a target="_blank" href=
         "https://github.com/unicode-org/cldr/tree/latest/common/bcp47/variant.xml">variant.xml</a></strong>.</td>
       </tr>
       <tr>
         <td rowspan="3">"em"</td>
         <td rowspan="3">Emoji presentation style</td>
         <td>"emoji"</td>
         <td>Use an emoji presentation for emoji characters if
         possible.</td>
       </tr>
       <tr>
         <td>"text"</td>
         <td>Use a text presentation for emoji characters if
         possible.</td>
       </tr>
       <tr>
         <td>"default"</td>
         <td>Use the default presentation for emoji characters as
         specified in UTR #51 Section 4, <a href=
         "https://www.unicode.org/reports/tr51/#Presentation_Style">Presentation
         Style</a>.</td>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a href=
         "#UnicodeFirstDayIdentifier" name=
         "UnicodeFirstDayIdentifier" id=
         "UnicodeFirstDayIdentifier">Unicode First Day
         Identifier</a> defines the preferred first day of the week
         for calendar display. Specifying "fw" in a locale
         identifier overrides the default value specified by
         supplemental week data (see Part 4 Dates, section 4.3
         <a href="tr35-dates.html#Week_Data">Week Data</a>). The
         valid values are those <em>name</em> attribute values in
         the <em>type</em> elements of key name="fw" in
         bcp47/<a target="_blank" href=
         "https://github.com/unicode-org/cldr/tree/latest/common/bcp47/calendar.xml">calendar.xml</a></strong>.</td>
       </tr>
       <tr>
         <td rowspan="4">"fw"</td>
         <td rowspan="4">First day of week</td>
         <td>"sun"</td>
         <td>Sunday</td>
       </tr>
       <tr>
         <td>"mon"</td>
         <td>Monday</td>
       </tr>
       <tr>
         <td colspan="2">…</td>
       </tr>
       <tr>
         <td>"sat"</td>
         <td>Saturday</td>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a href=
         "#UnicodeHourCycleIdentifier" name=
         "UnicodeHourCycleIdentifier" id=
         "UnicodeHourCycleIdentifier">Unicode Hour Cycle
         Identifier</a> defines the preferred time cycle. Specifying
         "hc" in a locale identifier overrides the default value
         specified by supplemental time data (see Part 4 Dates,
         section 4.4 <a href="tr35-dates.html#Time_Data">Time
         Data</a>). The valid values are those <em>name</em>
         attribute values in the <em>type</em> elements of key
         name="hc" in bcp47/<a target="_blank" href=
         "https://github.com/unicode-org/cldr/tree/latest/common/bcp47/calendar.xml">calendar.xml</a></strong>.</td>
       </tr>
       <tr>
         <td rowspan="4">"hc"</td>
         <td rowspan="4">Hour cycle</td>
         <td>"h12"</td>
         <td>Hour system using 1–12; corresponds to 'h' in
         patterns</td>
       </tr>
       <tr>
         <td>"h23"</td>
         <td>Hour system using 0–23; corresponds to 'H' in
         patterns</td>
       </tr>
       <tr>
         <td>"h11"</td>
         <td>Hour system using 0–11; corresponds to 'K' in
         patterns</td>
       </tr>
       <tr>
         <td>"h24"</td>
         <td>Hour system using 1–24; corresponds to 'k' in
         patterns</td>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a href=
         "#UnicodeLineBreakStyleIdentifier" name=
         "UnicodeLineBreakStyleIdentifier" id=
         "UnicodeLineBreakStyleIdentifier">Unicode Line Break Style
         Identifier</a> defines a preferred line break style
         corresponding to the CSS level 3 <a href=
         "https://drafts.csswg.org/css-text/#line-break-property">line-break
         option</a>. Specifying "lb" in a locale identifier
         overrides the locale’s default style (which may correspond
         to "normal" or "strict"). The valid values are those
         <em>name</em> attribute values in the <em>type</em>
         elements of key name="lb" in bcp47/<a target="_blank" href=
         "https://github.com/unicode-org/cldr/tree/latest/common/bcp47/segmentation.xml">segmentation.xml</a></strong>.</td>
       </tr>
       <tr>
         <td rowspan="3">"lb"</td>
         <td rowspan="3">Line break style</td>
         <td>"strict"</td>
         <td>CSS level 3 line-break=strict, e.g. treat CJ as NS</td>
       </tr>
       <tr>
         <td>"normal"</td>
         <td>CSS level 3 line-break=normal, e.g. treat CJ as ID,
         break before hyphens for ja,zh</td>
       </tr>
       <tr>
         <td>"loose"</td>
         <td>CSS level 3 line-break=loose</td>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a href=
         "#UnicodeLineBreakWordIdentifier" name=
         "UnicodeLineBreakWordIdentifier" id=
         "UnicodeLineBreakWordIdentifier">Unicode Line Break Word
         Identifier</a> defines preferred line break word handling
         behavior corresponding to the CSS level 3 <a href=
         "https://drafts.csswg.org/css-text/#word-break-property">word-break
         option</a>. The valid values are those <em>name</em>
         attribute values in the <em>type</em> elements of key
         name="lw" in bcp47/<a target="_blank" href=
         "https://github.com/unicode-org/cldr/tree/latest/common/bcp47/segmentation.xml">segmentation.xml</a></strong>.</td>
       </tr>
       <tr>
         <td rowspan="3">"lw"</td>
         <td rowspan="3">Line break word handling</td>
         <td>"normal"</td>
         <td>CSS level 3 word-break=normal, normal script/language
         behavior for midword breaks</td>
       </tr>
       <tr>
         <td>"breakall"</td>
         <td>CSS level 3 word-break=break-all, allow midword breaks
         unless forbidden by lb setting</td>
       </tr>
       <tr>
         <td>"keepall"</td>
         <td>CSS level 3 word-break=keep-all, prohibit midword
         breaks except for dictionary breaks</td>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a href=
         "#UnicodeMeasurementSystemIdentifier" name=
         "UnicodeMeasurementSystemIdentifier" id=
         "UnicodeMeasurementSystemIdentifier">Unicode Measurement
         System Identifier</a> defines a preferred measurement
         system. Specifying "ms" in a locale identifier overrides
         the default value specified by supplemental measurement
         system data (see Part 2 General, section 5 <a href=
         "tr35-general.html#Measurement_System_Data">Measurement
         System Data</a>). The valid values are those <em>name</em>
         attribute values in the <em>type</em> elements of key
         name="ms" in bcp47/<a target="_blank" href=
         "https://github.com/unicode-org/cldr/tree/latest/common/bcp47/measure.xml">measure.xml</a></strong>.</td>
       </tr>
       <tr>
         <td rowspan="3">"ms"</td>
         <td rowspan="3">Measurement system</td>
         <td>"metric"</td>
         <td>Metric System</td>
       </tr>
       <tr>
         <td>"ussystem"</td>
         <td>US System of measurement: feet, pints, etc.; pints are
         16oz</td>
       </tr>
       <tr>
         <td>"uksystem"</td>
         <td>UK System of measurement: feet, pints, etc.; pints are
         20oz</td>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a href=
         "#UnicodeNumberSystemIdentifier" name=
         "UnicodeNumberSystemIdentifier" id=
         "UnicodeNumberSystemIdentifier">Unicode Number System
         Identifier</a> defines a type of number system. The valid
         values are those <em>name</em> attribute values in the
         <em>type</em> elements of bcp47/<a target="_blank" href=
         "https://github.com/unicode-org/cldr/tree/latest/common/bcp47/number.xml">number.xml</a>.</strong></td>
       </tr>
       <tr>
         <td rowspan="7">"nu"<br>
         (numbers)</td>
         <td rowspan="7">Numbering system</td>
         <td><i>Unicode script subtag</i></td>
         <td>
           <p>Four-letter types indicating the primary numbering
           system for the corresponding script represented in
           Unicode. Unless otherwise specified, it is a decimal
           numbering system using digits [:GeneralCategory=Nd:]. For
           example, "latn" refers to the ASCII / Western digits 0-9,
           while "taml" is an algorithmic (non-decimal) numbering
           system. (The code "tamldec" indicates the "modern
           Tamil decimal digits".)<br></p>
           <p class="note">For more information, see <a href=
           "tr35-numbers.html#Numbering_Systems">Numbering
           Systems</a>.</p>
         </td>
       </tr>
       <tr>
         <td>"arabext"</td>
         <td>Extended Arabic-Indic digits ("arab" means the base
         Arabic-Indic digits)</td>
       </tr>
       <tr>
         <td>"armnlow"</td>
         <td>Armenian lowercase numerals</td>
       </tr>
       <tr>
         <td colspan="2">…</td>
       </tr>
       <tr>
         <td>"roman"</td>
         <td>Roman numerals</td>
       </tr>
       <tr>
         <td>"romanlow"</td>
         <td>Roman lowercase numerals</td>
       </tr>
       <tr>
         <td>"tamldec"</td>
         <td>Modern Tamil decimal digits</td>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a href="#RegionOverride" name=
         "RegionOverride" id="RegionOverride">Region Override</a>
         specifies an alternate region to use for obtaining certain
         region-specific default values (those specified by the
         <a href="tr35-info.html#rgScope">&lt;rgScope&gt;</a>
         element), instead of using the region specified by the
         <a href="#unicode_region_subtag">unicode_region_subtag</a>
         in the Unicode Language Identifier (or inferred from the
         <a href=
         "#unicode_language_subtag">unicode_language_subtag</a>).</strong></td>
       </tr>
       <tr>
         <td rowspan="2">"rg"</td>
         <td rowspan="2">Region Override</td>
         <td>"uszzzz"<br>
         <br></td>
         <td rowspan="2">The value is a <a
         href= "#unicode_subdivision_id">unicode_subdivision_id</a>
         of type “unknown” or “regular”; this consists of a
         <a href=
         "#unicode_region_subtag">unicode_region_subtag</a> for a
         regular region (not a macroregion), suffixed
         either by “zzzz” (case is not
         significant) to designate the region
         as a whole, or by a unicode_subdivision_suffix to provide
         more specificity. For example, “en-GB-u-rg-uszzzz”
         represents a locale for British English but with
         region-specific defaults set to US for items such as
         default currency, default calendar and week data, default
         time cycle, and default measurement system and unit
         preferences.</td>
       </tr>
       <tr>
         <td>…</td>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a name=
         "unicode_subdivision_subtag_validity" id=
         "unicode_subdivision_subtag_validity"></a><a href=
         "#UnicodeSubdivisionIdentifier" name=
         "UnicodeSubdivisionIdentifier" id=
         "UnicodeSubdivisionIdentifier">Unicode Subdivision
         Identifier</a> defines a regional subdivision used for
         locales. The valid values are based on the
         <em>subdivisionContainment</em> element as described in
         <em>Section <a href="#Unicode_Subdivision_Codes">3.6.5
         Subdivision Codes</a></em>.</strong></td>
       </tr>
       <tr>
         <td rowspan="2">"sd"</td>
         <td rowspan="2">Regional Subdivision</td>
         <td>"gbsct"<br>
         <br></td>
         <td rowspan="2">A <a href=
         "#unicode_subdivision_id">unicode_subdivision_id</a>, which
         is a <a href=
         "#unicode_region_subtag">unicode_region_subtag</a>
         concatenated with a unicode_subdivision_suffix.<br>
         For example, <em>gbsct</em> is “gb”+“sct” (where sct
         represents the subdivision code for Scotland). Thus
         “en-GB-u-sd-gbsct” represents the language variant “English
         as used in Scotland”. And both “en-u-sd-usca” and
         “en-US-u-sd-usca” represent “English as used in
         California”. See <strong><em><a href=
         "#Unicode_Subdivision_Codes">3.6.5 Subdivision
         Codes</a></em></strong>.</td>
       </tr>
       <tr>
         <td>…</td>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a href=
         "#UnicodeSentenceBreakSuppressionsIdentifier" name=
         "UnicodeSentenceBreakSuppressionsIdentifier" id=
         "UnicodeSentenceBreakSuppressionsIdentifier">Unicode
         Sentence Break Suppressions Identifier</a> defines a set of
         data to be used for suppressing certain sentence breaks
         that would otherwise be found by UAX #29 rules. The valid
         values are those <em>name</em> attribute values in the
         <em>type</em> elements of key name="ss" in bcp47/<a target=
         "_blank" href=
         "https://github.com/unicode-org/cldr/tree/latest/common/bcp47/segmentation.xml">segmentation.xml</a></strong>.</td>
       </tr>
       <tr>
         <td rowspan="2">"ss"</td>
         <td rowspan="2">Sentence break suppressions</td>
         <td>"none"</td>
         <td>Don’t use sentence break suppressions data (the
         default).</td>
       </tr>
       <tr>
         <td>"standard"</td>
         <td>Use sentence break suppressions data of type
         "standard"</td>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a href=
         "#UnicodeTimezoneIdentifier" name=
         "UnicodeTimezoneIdentifier" id=
         "UnicodeTimezoneIdentifier">Unicode Timezone Identifier</a>
         defines a timezone. The valid values are those name
         attribute values in the <em>type</em> elements of
         bcp47/<a target="_blank" href=
         "https://github.com/unicode-org/cldr/tree/latest/common/bcp47/timezone.xml">timezone.xml</a>.</strong></td>
       </tr>
       <tr>
         <td>"tz"<br>
         (timezone)</td>
         <td>Time zone</td>
         <td><i>Unicode short time zone IDs</i></td>
         <td>
           <p>Short identifiers defined in terms of a TZ time zone
           database [<a href="#Olson">Olson</a>] identifier in the
           file common/bcp47/timezone.xml, plus a few extra
           values.</p>
           <p>For more information, see <a href=
           "#Time_Zone_Identifiers">Section 3.6.3 Time Zone
           Identifiers</a>.</p>
           <p>CLDR provides data for normalizing timezone codes.</p>
         </td>
       </tr>
       <tr>
         <td colspan="4"><strong>A <a href=
         "#UnicodeVariantIdentifier" name="UnicodeVariantIdentifier"
         id="UnicodeVariantIdentifier">Unicode Variant
         Identifier</a> defines a special variant used for locales.
         The valid values are those name attribute values in the
         <em>type</em> elements of bcp47/<a target="_blank" href=
         "https://github.com/unicode-org/cldr/tree/latest/common/bcp47/variant.xml">variant.xml</a>.</strong></td>
       </tr>
       <tr>
         <td>"va"</td>
         <td>Common variant type</td>
         <td>"posix"</td>
         <td>POSIX style locale variant. About handling of the
         "POSIX" variant see <i>Section 3.8.2, <a href=
         "#Legacy_Variants">Legacy Variants</a></i>.</td>
       </tr>
     </table>
     <p>For more information on the allowed keys and types, see the
     specific elements below, and <a href=
     "#Unicode_Locale_Extension_Data_Files">Section 3.6.4 U
     Extension Data Files</a>.</p>
     <p>Additional keys or types might be added in future versions.
     Implementations of LDML should be robust enough to handle any
     syntactically valid key or type values.</p>
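     <p>As a rough illustration of that robustness, the following
     Python sketch splits the "u" extension of a locale identifier
     into keys and their type subtags without consulting a list of
     known keys. The function name <code>parse_u_keywords</code> is
     hypothetical; the sketch assumes a well-formed identifier,
     skips any attributes that precede the first key, and does not
     handle a private-use "x" sequence that happens to contain a
     "u" subtag.</p>

```python
def parse_u_keywords(locale_id):
    """Return a dict mapping each 'u' extension key to its list of type subtags."""
    subtags = locale_id.lower().replace("_", "-").split("-")
    keywords = {}
    try:
        start = subtags.index("u") + 1
    except ValueError:
        return keywords  # no 'u' extension present
    key = None
    for subtag in subtags[start:]:
        if len(subtag) == 1:
            break  # another singleton starts a different extension
        if len(subtag) == 2:
            key = subtag  # two-character subtags are keys
            keywords[key] = []
        elif key is not None:
            keywords[key].append(subtag)  # longer subtags belong to the current key
        # Subtags seen before the first key are attributes; ignored here.
    return keywords
```

     <p>For example,
     <code>parse_u_keywords("en-GB-u-rg-uszzzz")</code> yields a
     mapping from "rg" to the single type subtag "uszzzz". A key
     that the caller does not recognize is returned in the same
     way instead of being rejected.</p>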
     <h4><a href="#Numbering%20System%20Data" name=
     "Numbering System Data">3.6.2 Numbering System Data</a></h4>
     <p>LDML supports multiple numbering systems. The identifiers
     for those numbering systems are defined in the file
     <strong>bcp47/number.xml</strong>. For example, for the 'trunk'
     version of the data see <a href=
     "https://github.com/unicode-org/cldr/releases/tag/latest/common/bcp47/number.xml">
     bcp47/number.xml</a>.<br></p>
     <p>Details about those numbering systems are defined in
     <strong>supplemental/numberingSystems.xml</strong>. For
     example, for the 'trunk' version of the data see <a href=
     "https://github.com/unicode-org/cldr/releases/tag/latest/common/supplemental/numberingSystems.xml">
     supplemental/numberingSystems.xml</a>.<br></p>
     <p>LDML makes certain stability guarantees on this
     data:&nbsp;<br></p>
     <ol>
       <li>Like other BCP 47 identifiers, once a numeric identifier
       is added to <strong>bcp47/number.xml</strong> or
       <strong>numberingSystems.xml</strong>, it will never be
       removed from either of those files.</li>
       <li>If an identifier has type="numeric" in
       numberingSystems.xml, then
         <ol>
           <li>It is a decimal, positional numbering system with an
           attribute digits=X, where X is a string with the 10
           digits in order used by the numbering system.</li>
           <li>The values of the type and digits will never
           change.</li>
         </ol>
       </li>
     </ol>
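     <p>Because a type="numeric" system is decimal and positional
     with exactly 10 digits in order, a formatter can map each
     decimal digit through the digits string. The following Python
     sketch shows the idea for non-negative integers only. The
     function name and the <code>ARAB_DIGITS</code> constant are
     illustrative; the constant is built from the code points
     U+0660..U+0669 (Arabic-Indic digits) instead of being read
     from numberingSystems.xml.</p>

```python
def format_with_digits(value, digits):
    """Render a non-negative integer using a 10-digit decimal numbering system."""
    if len(digits) != 10:
        raise ValueError("a numeric numbering system needs exactly 10 digits")
    text = str(value)
    if not text.isdigit():
        raise ValueError("this sketch only handles non-negative integers")
    # Each ASCII digit indexes the digit string, which is ordered 0 through 9.
    return "".join(digits[int(ch)] for ch in text)


# Hypothetical stand-in for the digits attribute of one numbering system:
# the Arabic-Indic digits U+0660..U+0669, built from their code points.
ARAB_DIGITS = "".join(chr(0x0660 + i) for i in range(10))
```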
     <h4><a href="#Time_Zone_Identifiers" name=
     "Time_Zone_Identifiers" id="Time_Zone_Identifiers">3.6.3 Time
     Zone Identifiers</a></h4>
     <p>LDML inherits time zone IDs from the tz database [<a href=
     "#Olson">Olson</a>]. Because these IDs from the tz database do
     not satisfy the BCP 47 language subtag syntax requirements,
     CLDR defines short identifiers for use in the Unicode
     locale extension. The short identifiers are defined in the file
     <strong>common/bcp47/timezone.xml</strong>.</p>
     <p>The short identifiers use UN/LOCODE [<a href=
     "#LOCODE">LOCODE</a>] (excluding a space character) codes where
     possible. For example, the short identifier for
     "America/Los_Angeles" is "uslax" (the LOCODE for Los Angeles,
     US is "US LAX"). Identifiers of length not equal to 5 are used
     where there is no corresponding UN/LOCODE, such as "usnavajo"
     for "America/Shiprock", or "utcw01" for "Etc/GMT+1", so that
     they do not overlap with future UN/LOCODE.</p>
     <p>Although the first two letters of a short identifier may
     match an ISO 3166 two-letter country code, a user should not
     assume that the time zone belongs to the country. The first two
     letters in an identifier of length not equal to 5 have no
     meaning. Also, the identifiers are stabilized, meaning that
     they will not change no matter what changes happen in the base
     standard. So if Hawaii leaves the US and joins Canada as a new
     province, the short time zone identifier "ushnl" would not
     change in CLDR even if the UN/LOCODE changes to "cahnl" or
     something else.</p>
     <p>There is a special code "unk" for an Unknown or Invalid time
     zone. This can be expressed in the tz database style ID
     "Etc/Unknown", although it is not defined in the tz
     database.</p>
     <p><b>Stability of Time Zone Identifiers</b></p>
     <p>Although the short time zone identifiers are guaranteed to
     be stable, the preferred IDs in the tz database (such as those
     found in the <strong>zone.tab</strong> file) might change from
     time to time. For example, "Asia/Calcutta" was replaced with
     "Asia/Kolkata" and moved to the <strong>backward</strong> file
     in the tz database. Because CLDR contains locale data using a
     time zone ID from the tz database as the key, stability of the
     IDs is critical.</p>
     <p>To maintain the stability of "long" IDs (for those inherited
     from the tz database), a special rule is applied to the
     <i>alias</i> attribute in the &lt;type&gt; element for "tz":
     the first "long" ID is the CLDR canonical "long" time zone
     ID.</p>
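     <p>A minimal Python sketch of this rule, assuming the
     <i>alias</i> attribute has already been read from a
     &lt;type&gt; element for "tz". The function name is
     hypothetical, and the sample value is the alias list of the
     "inccu" entry.</p>

```python
def split_tz_aliases(alias):
    """Split a tz alias attribute into (canonical long ID, other aliases)."""
    long_ids = alias.split()
    if not long_ids:
        raise ValueError("alias attribute is empty")
    # The first listed long ID is the CLDR canonical one; the rest are aliases.
    return long_ids[0], long_ids[1:]


# Attribute values of the "inccu" entry, held in a plain dict for illustration.
inccu = {"name": "inccu", "alias": "Asia/Calcutta Asia/Kolkata"}
canonical, others = split_tz_aliases(inccu["alias"])
```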
     <p>For example:</p>
     <blockquote>
       &lt;type name="inccu" alias="Asia/Calcutta Asia/Kolkata"
       description="Kolkata, India"/&gt;
     </blockquote>
     <p>The above &lt;type&gt; element defines the short time zone ID
     "inccu" (for use in the Unicode locale extension), the
     corresponding <em>CLDR canonical "long" ID</em>
     "Asia/Calcutta", and an alias "Asia/Kolkata".</p>
     <h4><a href="#Unicode_Locale_Extension_Data_Files" name=
     "Unicode_Locale_Extension_Data_Files" id=
     "Unicode_Locale_Extension_Data_Files">3.6.4 U Extension Data
     Files</a></h4>
     <p>The 'u' extension data is stored in multiple XML files
     located under the common/bcp47 directory in CLDR. Each file
     contains the locale extension key/type values and their
     backward compatibility mappings appropriate for a particular
     domain. <a href=
     "https://github.com/unicode-org/cldr/releases/tag/latest/common/bcp47/collation.xml">
     common/bcp47/collation.xml</a> contains key/type values for
     collation, including optional collation parameters and valid
     type values for each key.</p>
     <p>The 't' extension data is stored in <a href=
     "https://github.com/unicode-org/cldr/releases/tag/latest/common/bcp47/transform.xml">
     common/bcp47/transform.xml</a>.</p>
     <p class="dtd">&lt;!ELEMENT keyword ( key* )&gt;</p>
     <p class="dtd">&lt;!ELEMENT key ( type* )&gt;<br>
     &lt;!ATTLIST key extension NMTOKEN #IMPLIED&gt;<br>
     &lt;!ATTLIST key name NMTOKEN #REQUIRED&gt;<br>
     &lt;!ATTLIST key description CDATA #IMPLIED&gt;<br>
     &lt;!ATTLIST key deprecated ( true | false ) "false"&gt;<br>
     &lt;!ATTLIST key preferred NMTOKEN #IMPLIED&gt;<br>
     &lt;!ATTLIST key alias NMTOKEN #IMPLIED&gt;<br>
     &lt;!ATTLIST key valueType (single | multiple | incremental |
     any) #IMPLIED &gt;<br>
     &lt;!ATTLIST key since CDATA #IMPLIED&gt;</p>
     <p class="dtd">&lt;!ELEMENT type EMPTY&gt;<br>
     &lt;!ATTLIST type name NMTOKEN #REQUIRED&gt;<br>
     &lt;!ATTLIST type description CDATA #IMPLIED&gt;<br>
     &lt;!ATTLIST type deprecated ( true | false ) "false"&gt;<br>
     &lt;!ATTLIST type preferred NMTOKEN #IMPLIED&gt;<br>
     &lt;!ATTLIST type alias CDATA #IMPLIED&gt;<br>
     &lt;!ATTLIST type since CDATA #IMPLIED&gt;</p>
     <p class="dtd">&lt;!ELEMENT attribute EMPTY&gt;<br>
     &lt;!ATTLIST attribute name NMTOKEN #REQUIRED&gt;<br>
     &lt;!ATTLIST attribute description CDATA #IMPLIED&gt;<br>
     &lt;!ATTLIST attribute deprecated ( true | false )
     "false"&gt;<br>
     &lt;!ATTLIST attribute preferred NMTOKEN #IMPLIED&gt;<br>
     &lt;!ATTLIST attribute since CDATA #IMPLIED&gt;</p>
     <p>The extension attribute in the &lt;key&gt; element specifies the
     BCP 47 language tag extension type. The default value of the
     extension attribute is "u" (Unicode locale extension). The
     &lt;type&gt; element is only applicable to the enclosing
     &lt;key&gt;.</p>
     <p>In the Unicode locale extension 'u' and 't' data files, the
     common attributes for the &lt;key&gt;, &lt;type&gt; and
     &lt;attribute&gt; elements are as follows:</p>
     <dl>
       <dt><b>name</b></dt>
       <dd>
         <p>The key or type name used by Unicode locale extension
         with <a href="#Unicode_locale_identifier">'u' extension
         syntax</a> or the 't' extension syntax. When <i>alias</i>
         below is absent, this name can also be used with the old
         style <a href="#Old_Locale_Extension_Syntax">"@key=type"
         syntax</a>.</p>
         <p>Most type names are <strong>literal type names</strong>,
         which match exactly the same value. All of these have at
         least one lowercase letter, such as "buddhist". There are a
         small number of <strong>indirect type names</strong>, such
         as "RG_KEY_VALUE". These have no lowercase letters. The
         interpretation of each one is listed below.</p>
         <h5><a name="CODEPOINTS" href="#CODEPOINTS" id=
         "CODEPOINTS">CODEPOINTS</a></h5>
         <p>The type name <strong>"CODEPOINTS"</strong> is reserved
         for a variable representing Unicode code point(s). The
         syntax is:</p>
         <table border="0">
           <tr>
             <th>&nbsp;</th>
             <th>
               <div align="center">
                 EBNF
               </div>
             </th>
           </tr>
           <tr>
             <td>
               <pre>codepoints</pre>
             </td>
             <td>
               <pre>= codepoint (sep codepoint)?</pre>
             </td>
           </tr>
           <tr>
             <td>
               <pre>codepoint</pre>
             </td>
             <td>
               <pre>= [0-9 A-F a-f]{4,6}</pre>
             </td>
           </tr>
         </table>
         <p>In addition, no codepoint may exceed 10FFFF. For
         example, "00A0", "300b", "10D40C" and "00C1-00E1" are
         valid, but "A0", "U060C" and "110000" are not.</p>
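         <p>These checks can be sketched in Python as follows. The
         function name is hypothetical, and the separator is assumed
         to be a hyphen, as in the "00C1-00E1" example. The sketch
         accepts any number of code points, which is an assumption
         that is more permissive than the grammar as printed.</p>

```python
import re

# One code point: 4 to 6 hexadecimal digits, upper or lower case.
_CODEPOINT = re.compile(r"[0-9A-Fa-f]{4,6}")


def is_valid_codepoints(value, sep="-"):
    """Check a CODEPOINTS type value against the syntax and the 10FFFF limit."""
    for part in value.split(sep):
        if not _CODEPOINT.fullmatch(part):
            return False  # wrong length or a non-hexadecimal character
        if int(part, 16) > 0x10FFFF:
            return False  # beyond the last Unicode code point
    return True
```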
         <p>In the current version of CLDR, the type "CODEPOINTS" is
         only used for the deprecated locale extension key "vt"
         (variableTop). The subtags forming the type for "vt"
         represent an arbitrary string of characters. There is no
         formal limit on the number of characters, although in
         practice anything above 1 will be rare, and anything
         longer than 4 might be useless. Repetition is allowed: for
         example, 0061-0061 ("aa") is a valid type value for "vt",
         since the sequence may be a collating element. Order is
         vital: 0061-0062 ("ab") is different from 0062-0061 ("ba").
         Note that for variableTop any character sequence must be a
         contraction which yields exactly one primary weight.</p>
         <p>For example,</p>
         <blockquote>
           <p><strong>en-u-vt-00A4</strong> : this indicates
           English, with any characters sorting at or below " ¤" (at
           a primary level) considered Variable.</p>
         </blockquote>
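         <p>To see which characters a "vt" type value names, each
         hyphen-separated subtag can be read as a hexadecimal code
         point. A minimal Python sketch, in which the function name
         is hypothetical and the value is assumed to have already
         been checked against the CODEPOINTS syntax:</p>

```python
def decode_codepoints(value, sep="-"):
    """Turn a CODEPOINTS type value into the string of characters it names."""
    # Order and repetition are preserved, so "0061-0062" and "0062-0061" differ.
    return "".join(chr(int(part, 16)) for part in value.split(sep))
```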
         <p>By default in UCA, variable characters are ignored in
         sorting at a primary, secondary, and tertiary level. But in
         CLDR, they are not ignorable by default. For more
         information, see <a href=
         "tr35-collation.html#Setting_Options">Collation: Section
         3.3 <em>Setting Options</em></a>.</p>
         <h5><a name="REORDER_CODE" href="#REORDER_CODE" id=
         "REORDER_CODE">REORDER_CODE</a></h5>
         <p>The type name <strong>"REORDER_CODE"</strong> is
         reserved for reordering block names (e.g. "latn", "digit"
         and "others") defined in the <i><a href=
         "tr35-collation.html#Root_Collation">Root
         Collation</a></i>. The type "REORDER_CODE" is used for
         locale extension key "kr" (colReorder). The value of type
         for "kr" is represented by one or more reordering block
         names such as "latn-digit". For more information, see
         <a href="tr35-collation.html#Script_Reordering">Collation:
         Section 3.12 <em>Collation Reordering</em></a>.</p>
         <h5><a name="RG_KEY_VALUE" href="#RG_KEY_VALUE" id=
         "RG_KEY_VALUE">RG_KEY_VALUE</a></h5>
         <p>The type name <strong>"RG_KEY_VALUE"</strong> is
         reserved for region codes in the format required by the
         "rg" key; this is a subdivision
         code with idStatus='unknown' or 'regular' from the
         idValidity data in common/validity/subdivision.xml.</p>
         <h5><a name="SCRIPT_CODE" href="#SCRIPT_CODE" id=
         "SCRIPT_CODE">SCRIPT_CODE</a></h5>
         <p>The type name <strong>"SCRIPT_CODE"</strong> is
         reserved for <code><a href="#unicode_script_subtag">unicode_script_subtag</a></code>
         values (e.g. "thai", "laoo").
         The type "SCRIPT_CODE" is used for locale extension key "dx".
         The value of type for "dx" is represented by one or more SCRIPT_CODEs,
         such as "thai-laoo".</p>
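         <p>A syntactic check for a "dx" value might look like the
         following Python sketch. It assumes that a script subtag is
         four ASCII letters, treats the recommendation that Zyyy
         appear alone as a hard rule, and does not verify that each
         code is an actual script code. The function name is
         hypothetical.</p>

```python
import re

# Assumed shape of a script subtag: exactly four ASCII letters.
_SCRIPT = re.compile(r"[A-Za-z]{4}")


def is_valid_dx_value(value):
    """Check the shape of a 'dx' type value such as 'thai-laoo'."""
    codes = value.lower().split("-")
    if not all(_SCRIPT.fullmatch(code) for code in codes):
        return False
    # Zyyy (Common) excludes all scripts, so it is expected to stand alone.
    if "zyyy" in codes and len(codes) != 1:
        return False
    return True
```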
         <h5><a name="SUBDIVISION_CODE" href="#SUBDIVISION_CODE" id=
         "SUBDIVISION_CODE">SUBDIVISION_CODE</a></h5>
         <p>The type name <strong>"SUBDIVISION_CODE"</strong> is
         reserved for subdivision codes in the format required by
         the "sd" key; this is a subdivision code from the
         idValidity data in common/validity/subdivision.xml,
         excluding those with idStatus='unknown'. Codes with
         idStatus='deprecated' should not be generated, and those
         with idStatus='private_use' are only to be used with prior
         agreement.</p>
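         <p>The values used with "rg" and "sd" share one shape: a
         region subtag followed by a suffix. The Python sketch below
         splits such a value, assuming the region part is two
         letters or three digits and the suffix is one to four
         alphanumeric characters. Those length limits and the
         function names are assumptions of this sketch, and it does
         not consult the idValidity data.</p>

```python
import re

# Assumed shape: a region subtag (2 letters or 3 digits) followed by a
# suffix of 1 to 4 alphanumeric characters.
_SUBDIVISION_ID = re.compile(r"([a-z]{2}|[0-9]{3})([a-z0-9]{1,4})")


def split_subdivision_id(value):
    """Split a subdivision ID into (region, suffix); case is not significant."""
    match = _SUBDIVISION_ID.fullmatch(value.lower())
    if match is None:
        raise ValueError("not shaped like a subdivision ID: " + repr(value))
    region, suffix = match.groups()
    return region.upper(), suffix


def is_whole_region(value):
    """True if the suffix is 'zzzz', which designates the region as a whole."""
    return split_subdivision_id(value)[1] == "zzzz"
```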
         <h5><a name="PRIVATE_USE" href="#PRIVATE_USE" id=
         "PRIVATE_USE">PRIVATE_USE</a></h5>
         <p>The type name <strong>"PRIVATE_USE"</strong> is reserved
         for private use types. A valid type value is composed of
         one or more subtags separated by hyphens and each subtag
         consists of three to eight ASCII alphanumeric characters.
         In the current version of CLDR,
         <strong>"PRIVATE_USE"</strong> is only used for transform
         extension "x0".</p>
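         <p>The syntax of a PRIVATE_USE type value can be checked
         with a single regular expression, as in this Python sketch.
         The function name is hypothetical, and only the subtag
         shape is checked.</p>

```python
import re

# One or more hyphen-separated subtags of 3 to 8 ASCII alphanumerics each.
_PRIVATE_USE = re.compile(r"[0-9A-Za-z]{3,8}(-[0-9A-Za-z]{3,8})*")


def is_valid_private_use(value):
    """Check that a value has the subtag shape of a PRIVATE_USE type."""
    return _PRIVATE_USE.fullmatch(value) is not None
```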
       </dd>
       <dt><b>valueType</b></dt>
       <dd>
         <p>The valueType attribute indicates how many subtags are
         valid for a given key:</p>
         <table class='simple' width="100%" border="1">
           <tbody>
             <tr>
               <th>single</th>
               <td>Either exactly one type value, or no type value
               (but only if the value of "true" would be valid).
               This is the default if no valueType attribute is
               present.</td>
             </tr>
             <tr>
               <th>incremental</th>
               <td>Multiple type values are allowed, but only if a
               prefix is also present, and the sequence is
               explicitly listed. Each successive type value
               indicates a refinement of its prefix. For
               example:<br>
               &lt;key name="ca" description="Calendar algorithm
               key" <strong>valueType="incremental"</strong>&gt;<br>
               &nbsp;&nbsp;&lt;type name="islamic"
               description="Islamic calendar"/&gt;<br>
               &nbsp;&nbsp;&lt;type name="islamic-umalqura"
               description="Islamic calendar, Umm al-Qura"/&gt;<br>
               Thus <em>ca-islamic-umalqura</em> is valid. However,
               <em>ca-gregory-japanese</em> is not valid, because
               "gregory-japanese" is not listed as a type.</td>
             </tr>
             <tr>
               <th>multiple</th>
               <td>Multiple type values are allowed, but each may
               only occur once. For example:<br>
               &lt;key name="kr" description="Collation reorder
               codes" <strong>valueType="multiple"</strong>&gt;<br>
               &nbsp;&nbsp;&lt;type name="REORDER_CODE" …/&gt;</td>
             </tr>
             <tr>
               <th>any</th>
               <td>Any number of type values are allowed, with none
               of the above restrictions. For example:<br>
               &lt;key extension="t" name="x0" description="Private
               use transform type key."
               <strong>valueType="any"</strong>&gt;<br>
               &nbsp;&nbsp;&lt;type name="PRIVATE_USE" …/&gt;</td>
             </tr>
           </tbody>
         </table>
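<p>The four settings above can be sketched as a small checker. This is an illustrative Python sketch, not CLDR or ICU code: the function name <code>check_value</code> and the small <code>listed</code> sets are hypothetical, indirect types such as REORDER_CODE would need their own predicate instead of a set lookup, and the case where a "true" value is omitted is not modelled.</p>

```python
def check_value(value_type, subtags, listed):
    """Return True if the type subtags satisfy the given valueType.

    value_type: "single", "incremental", "multiple" or "any"
    subtags:    type subtags following the key, e.g. ["islamic", "umalqura"]
    listed:     set of type names listed under the key (hypothetical excerpt)
    """
    if not subtags:
        return False  # the omitted-"true" case is not modelled here
    if value_type == "single":
        return len(subtags) == 1 and subtags[0] in listed
    if value_type == "incremental":
        # Every prefix, and the full hyphenated sequence, must be listed.
        prefixes = ["-".join(subtags[:i]) for i in range(1, len(subtags) + 1)]
        return all(p in listed for p in prefixes)
    if value_type == "multiple":
        # Each value must be listed and may occur only once.
        unique = len(set(subtags)) == len(subtags)
        return unique and all(s in listed for s in subtags)
    if value_type == "any":
        return all(s in listed for s in subtags)
    raise ValueError("unknown valueType: " + value_type)
```

<p>With a <code>listed</code> set containing "gregory", "japanese", "islamic" and "islamic-umalqura", the sketch accepts the subtags of <em>ca-islamic-umalqura</em> and rejects those of <em>ca-gregory-japanese</em>, matching the example in the table.</p>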
       </dd>
       <dt><b>description</b></dt>
       <dd>
         <p>The description of the key, type or attribute element.
         There is also some informative text about certain keys and
         types in the Section 3.5 <a href=
         "#Key_And_Type_Definitions_">Key And Type
         Definitions</a>.</p>
       </dd>
       <dt><b>deprecated</b></dt>
       <dd>
         <p>The deprecation status of the key, type or attribute
         element. The value "true" indicates the element is
         deprecated and no longer used in the version of CLDR. The
         default value is "false".</p>
       </dd>
       <dt><b>preferred</b></dt>
       <dd>
         <p>The preferred value of the deprecated key, type or
         attribute element. When a key, type or attribute element is
         deprecated, this attribute is used for specifying a new
         canonical form if available.</p>
       </dd>
       <dt><b>alias</b> (Not applicable to &lt;attribute&gt;)</dt>
       <dd>
         <p>The BCP 47 form is the canonical form, and recommended.
         Other aliases are included only for backwards
         compatibility.</p>
       </dd>
       <dd><em>Example:</em></dd>
       <dd>
         <p>&lt;type name="phonebk"
         <strong>alias="phonebook"</strong> description="Phonebook
         style ordering (such as in German)"/&gt;<br></p>The
         preferred term, and the only one to be used in BCP 47, is
         the name: in this example, "phonebk".<br>
       </dd>
       <dd>
         <p>The alias is a key or type name used by Unicode locale
         extensions with the old <a href=
         "#Old_Locale_Extension_Syntax">"@key=type" syntax</a>. The
         attribute value for type may contain multiple names
         delimited by ASCII space characters. Of those aliases, the
         first name is the preferred value.</p>
       </dd>
       <dt><b>since</b></dt>
       <dd>The version of CLDR in which this key or type was
       introduced. Absence of this attribute value implies the key
       or type was available in CLDR 1.7.2.</dd>
     </dl>
     <p><em>Note: There are no values defined for the locale
     extension attribute in the current CLDR release.</em></p>
     <p>For example,</p>
     <pre>
 &lt;key name="co" alias="collation" description="Collation type key"&gt;
   &lt;type name="pinyin" description="Pinyin ordering for Latin and for CJK characters (used in Chinese)"/&gt;
 &lt;/key&gt;

 &lt;key name="ka" alias="colAlternate" description="Collation parameter key for alternate handling"&gt;
   &lt;type name="noignore" alias="non-ignorable" description="Variable collation elements are not reset to ignorable"/&gt;
   &lt;type name="shifted" description="Variable collation elements are reset to zero at levels one through three"/&gt;
 &lt;/key&gt;

 &lt;key name="tz" alias="timezone"&gt;
   ...
   &lt;type name="aumel" alias="Australia/Melbourne Australia/Victoria" description="Melbourne, Australia"/&gt;
   &lt;type name="aumqi" alias="Antarctica/Macquarie" description="Macquarie Island Station, Macquarie Island" since="1.8.1"/&gt;
   ...
 &lt;/key&gt;
     </pre>The data above indicates:
     <ul>
       <li>type "pinyin" is valid for key "co", thus "u-co-pinyin"
       is a valid Unicode locale extension.</li>
       <li>type "pinyin" is not valid for key "ka", thus
       "u-ka-pinyin" is not a valid Unicode locale extension.</li>
       <li>type "pinyin" has no <i>alias</i>, so
       "zh@collation=pinyin" is a valid Unicode locale identifier
       according to the old syntax.</li>
       <li>type "noignore" has an alias attribute, so
       "en@colAlternate=noignore" is not a valid Unicode locale
       identifier according to the old syntax.</li>
       <li>type "aumel" is valid for key "tz", supported by CLDR
       1.7.2 (default value) or later versions.</li>
       <li>type "aumqi" is valid for key "tz", supported by CLDR
       1.8.1 or later versions.</li>
     </ul>
     <p>It is strongly recommended that all API methods accept all
     possible aliases for keywords and types, but generate the
     canonical form. For example, "ar-u-ca-islamicc" would be
     equivalent to "ar-u-ca-islamic-civil" on input, but the latter
     should be output. The one exception is where an alias would
     only be well-formed with the old syntax, such as "gregorian"
     (for "gregory").</p>
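<p>The "accept aliases, generate the canonical form" recommendation might look roughly like the following Python sketch. The function name and the one-entry alias table are hypothetical stand-ins for data read from the bcp47 files, and the parsing assumes that -u- is the last extension in the tag.</p>

```python
# Hypothetical one-entry excerpt of alias data: (key, alias) -> canonical type.
TYPE_ALIASES = {("ca", "islamicc"): "islamic-civil"}

def canonicalize_u_types(tag):
    """Accept aliased type values in a -u- extension, emit canonical ones.

    Simplifying assumption: -u- is the last extension in the tag.
    """
    subtags = tag.split("-")
    lowered = [s.lower() for s in subtags]
    if "u" not in lowered:
        return tag
    start = lowered.index("u")
    out = subtags[:start] + ["u"]
    pairs = []  # [key, [type subtags]] in input order
    for subtag in lowered[start + 1:]:
        if len(subtag) == 2:
            pairs.append([subtag, []])   # a two-character subtag starts a key
        elif pairs:
            pairs[-1][1].append(subtag)  # longer subtags extend the current type
        else:
            out.append(subtag)           # attribute before the first key
    for key, values in pairs:
        value = "-".join(values)
        out.append(key)
        canonical = TYPE_ALIASES.get((key, value), value)
        if canonical:
            out.append(canonical)
    return "-".join(out)
```

<p>Both "ar-u-ca-islamicc" and "ar-u-ca-islamic-civil" come out as "ar-u-ca-islamic-civil". Aliases that are only well-formed in the old syntax, such as "gregorian", are deliberately absent from the table.</p>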
     <h4><a href="#Unicode_Subdivision_Codes" name=
     "Unicode_Subdivision_Codes" id=
     "Unicode_Subdivision_Codes">3.6.5 Subdivision Codes</a></h4>
     <p>The subdivision codes designate a subdivision of a country
     or region. They are called various names, such as a
     <em>state</em> in the United States, or a <em>province</em> in
     Canada. The codes in CLDR are based on ISO 3166-2 subdivision
     codes. The ISO codes have a region code followed by a hyphen,
     then a suffix consisting of 1..3 ASCII letters or digits.</p>
     <p>The CLDR codes are designed to work in a <a href=
     '#unicode_locale_id'>unicode_locale_id</a> (BCP47), and are
     thus all lowercase, with no hyphen. For example, the following
     are valid, and mean “English as used in California, USA”.</p>
     <ul>
       <li>en-u-sd-<strong>usca</strong></li>
       <li>en-US-u-sd-<strong>usca</strong></li>
     </ul>
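<p>As a rough illustration of how the two forms relate, the following Python sketch derives the CLDR form from an ISO 3166-2 code by dropping the hyphen and lowercasing. The function name is hypothetical, and the result is only well-formed: whether it is valid still depends on the validity data in subdivision.xml.</p>

```python
def iso_to_cldr_subdivision(iso_code):
    """Convert an ISO 3166-2 code such as "US-CA" to the CLDR form "usca"."""
    region, sep, suffix = iso_code.partition("-")
    well_formed = (
        sep == "-"
        and region.isascii() and region.isalnum()
        and suffix.isascii() and suffix.isalnum()
        and len(suffix) in (1, 2, 3)  # ISO suffixes have 1..3 characters
    )
    if not well_formed:
        raise ValueError("not in ISO 3166-2 format: " + iso_code)
    return (region + suffix).lower()
```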
     <p>CLDR has additional subdivision codes. These may start with
     a 3-digit region code or use a suffix of 4 ASCII letters or
     digits, so they will not collide with the ISO codes.
     Subdivision codes for unknown values are the region code plus
     "zzzz", such as "uszzzz" for an unknown subdivision of the US.
     Other codes may be added for stability.</p>
     <p>Like BCP 47, CLDR requires stable codes, which are not
     guaranteed for ISO 3166-2 (nor have the ISO 3166-2 codes been
     stable in the past). If an ISO 3166-2 code is removed, it
     remains valid (though marked as deprecated) in CLDR. If an ISO
     3166-2 code is reused (for the same region), then CLDR will
     define a new equivalent code using one of these 4-character
     suffixes.</p>
     <h5><a name="Validity" href="#Validity" id="Validity">3.6.5.1
     Validity</a></h5>
     <p>A <a href=
     "#unicode_subdivision_id">unicode_subdivision_id</a> is only
     valid when it is present in the subdivision.xml file as
     described in <em>Section 3.11 <a href="#Validity_Data">Validity
     Data</a></em>. The data is in a compressed form, and thus needs
     to be expanded before such a test is made.</p>
     <p><em>Examples:<br></em></p>
     <ul>
       <li><strong>usca</strong> is valid — there is an
       <strong>id</strong>
       element<code>&lt;id&nbsp;type="subdivision"…&gt;… usca
       …&lt;/id&gt;</code></li>
       <li><strong>ussct</strong> is invalid — there is no
       <strong>id</strong> element
       <code>&lt;id&nbsp;type="subdivision"…&gt;… ussct
       …&lt;/id&gt;</code></li>
     </ul>
     <p>If a <a href='#unicode_locale_id'>unicode_locale_id</a>
     contains both a <a href=
     "#unicode_region_subtag">unicode_region_subtag</a> and a
     <a href="#unicode_subdivision_id">unicode_subdivision_id</a>,
     it is only valid if the <a href=
     "#unicode_subdivision_id">unicode_subdivision_id</a> starts
     with the <a href=
     "#unicode_region_subtag">unicode_region_subtag</a>
     (case-insensitively).<br></p>
     <p>It is recommended that a <a href=
     '#unicode_locale_id'>unicode_locale_id</a> contain a <a href=
     "#unicode_region_subtag">unicode_region_subtag</a> if it
     contains a <a href=
     "#unicode_subdivision_id">unicode_subdivision_id</a> and the
     region would not be added by adding likely subtags. That
     produces better behavior if the <a href=
     "#unicode_subdivision_id">unicode_subdivision_id</a> is ignored
     by an implementation or if the language tag is truncated.</p>
     <p>Examples:<br></p>
     <ul>
       <li>en-<strong>US</strong>-u-sd-<strong>us</strong>ca is
       valid — the region "US" matches the first part of "usca"</li>
       <li>en-u-sd-<strong>us</strong>ca is valid — it still works
       after adding likely subtags.</li>
       <li>en-<strong>CA</strong>-u-sd-<strong>gb</strong>sct is
       invalid — the region "CA" does not match the first part of
       "gbsct". An implementation should disregard the subdivision
       id (or return an error).</li>
       <li>en-u-sd-<strong>gb</strong>sct is valid but not
       recommended — an implementation that ignores the <a href=
       "#unicode_subdivision_id">unicode_subdivision_id</a> can get
       the wrong fallback behavior, or could add likely subtags and
       get the invalid
       en<strong>-Latn-US</strong>-u-sd-<strong>gb</strong>sct</li>
     </ul>
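<p>The region-prefix rule illustrated by these examples can be sketched as follows. This Python sketch is an assumption-laden simplification, not a full locale parser: the function name is hypothetical, the region is taken to be the first 2-letter or 3-digit subtag after the language, and membership in subdivision.xml is not checked.</p>

```python
def subdivision_matches_region(tag):
    """Check that the -sd- value starts with the region subtag, if both exist.

    Returns True when there is nothing to compare (no region or no "sd").
    """
    subtags = tag.lower().split("-")
    region = None
    for subtag in subtags[1:]:
        if len(subtag) == 1:
            break  # a singleton starts the extensions
        is_alpha2 = len(subtag) == 2 and subtag.isalpha()
        is_digit3 = len(subtag) == 3 and subtag.isdigit()
        if is_alpha2 or is_digit3:
            region = subtag
            break
    if region is None or "u" not in subtags:
        return True
    # Only look for the "sd" key inside the -u- extension.
    ext = subtags[subtags.index("u") + 1:]
    if "sd" not in ext:
        return True
    i = ext.index("sd")
    value = ext[i + 1:i + 2]
    return bool(value) and value[0].startswith(region)
```

<p>The sketch only reports the mismatch. Whether to disregard the subdivision id or return an error is left to the caller, as described above.</p>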
     <p>In version 28.0, the subdivisions in the validity files used
     the ISO format, uppercase with a hyphen separating two
     components, instead of the BCP 47 format.</p>
     <h3><a name="t_Extension" id="t_Extension"></a><a name=
     "BCP47_T_Extension" href="#BCP47_T_Extension" id=
     "BCP47_T_Extension">3.7 Unicode BCP 47 T Extension</a></h3>
     <p>The Unicode Consortium has registered and is the maintaining
     authority for two BCP 47 language tag extensions: the extension
     'u' for Unicode locale extension [<a href=
     "#RFC6067">RFC6067</a>] and extension 't' for transformed
     content [<a href="#RFC6497">RFC6497</a>]. The Unicode BCP 47
     extension data defines the complete list of valid subtags.
     While the title of the RFC is “Transformed Content”, the
     abstract makes it clear that the scope is broader than the term
     "transformed" might indicate to a casual
     reader:&nbsp;“including content that has been transliterated,
     transcribed, or translated, or&nbsp;<em>in some other way
     influenced by the source. It also provides for additional
     information used for identification.</em>”</p>
     <p><strong>The -t- Extension.</strong> The syntax of 't'
     extension subtags is defined by the rule
     <code>unicode_locale_extensions</code> in <a href=
     "#Unicode_locale_identifier"><em>Section 3.2 Unicode locale
     identifier</em></a>, except that the subtag separator
     <code>sep</code> must always be a hyphen '-' when the extension
     is used as part of a BCP 47 language tag. For information about
     the registration process, meaning, and usage of the 't'
     extension, see [<a href="#RFC6497">RFC6497</a>].</p>
     <p>These subtags are all in lowercase (that is the canonical
     casing for these subtags); however, subtags are
     case-insensitive and casing does not carry any specific
     meaning. All subtags within the Unicode extensions consist of
     two to eight alphanumeric characters and meet the
     rule <code>extension</code> in [<a href=
     "#BCP47">BCP47</a>].</p>
     <p>The following keys are defined for the -t- extension:</p>
     <table class='simple'>
       <tbody>
         <tr>
           <th>Keys</th>
           <th>Description</th>
           <th>Values in latest release</th>
         </tr>
         <tr>
           <td>m0</td>
           <td><strong>Transform extension mechanism:</strong> to
           reference an authority or rules for a type of
           transformation</td>
           <td><a href=
           "https://github.com/unicode-org/cldr/releases/tag/latest/common/bcp47/transform.xml">
           transform.xml</a></td>
         </tr>
         <tr>
           <td nowrap>s0, d0</td>
           <td><strong>Transform source/destination:</strong> for
           non-languages/scripts, such as fullwidth-halfwidth
           conversion.</td>
           <td><a href=
           "https://github.com/unicode-org/cldr/releases/tag/latest/common/bcp47/transform-destination.xml">
           transform-destination.xml</a></td>
         </tr>
         <tr>
           <td>i0</td>
           <td><strong>Input Method Engine transform:</strong> Used
           to indicate an input method transformation, such as one
           used by a client-side input method. The first subfield in
           a sequence would typically be a 'platform' or vendor
           designation.</td>
           <td><a href=
           "https://github.com/unicode-org/cldr/releases/tag/latest/common/bcp47/transform_ime.xml">
           transform_ime.xml</a></td>
         </tr>
         <tr>
           <td>k0</td>
           <td><strong>Keyboard transform:</strong> Used to indicate
           a keyboard transformation, such as one used by a
           client-side virtual keyboard. The first subfield in a
           sequence would typically be a 'platform' designation,
           representing the platform that the keyboard is intended
           for. The keyboard might or might not correspond to a
           keyboard mapping shipped by the vendor for the platform.
           One or more subsequent fields may occur, but are only
           added where needed to distinguish from others.</td>
           <td><a href=
           "https://github.com/unicode-org/cldr/releases/tag/latest/common/bcp47/transform_keyboard.xml">
           transform_keyboard.xml</a></td>
         </tr>
         <tr>
           <td>t0</td>
           <td><strong>Machine Translation:</strong> Used to
           indicate content that has been machine translated, or a
           request for a particular type of machine translation of
           content. The first subfield in a sequence would typically
           be a 'platform' or vendor designation.</td>
           <td><a href=
           "https://github.com/unicode-org/cldr/releases/tag/latest/common/bcp47/transform_mt.xml">
           transform_mt.xml</a></td>
         </tr>
         <tr>
           <td nowrap>h0</td>
           <td><strong>Hybrid Locale Identifiers:</strong> h0 with
           the value 'hybrid' indicates that the -t- value is a
           language that is mixed into the main language tag to form
           a hybrid. For more information, and examples, see
           <em>Section 3.10.2 <a href="#Hybrid_Locale">Hybrid Locale
           Identifiers</a>.</em></td>
           <td><a href=
           "https://github.com/unicode-org/cldr/releases/tag/latest/common/bcp47/transform_hybrid.xml">
           transform_hybrid.xml</a></td>
         </tr>
         <tr>
           <td>x0</td>
           <td><strong>Private use transform</strong></td>
           <td><a href=
           "https://github.com/unicode-org/cldr/releases/tag/latest/common/bcp47/transform_private_use.xml">
           transform_private_use.xml</a></td>
         </tr>
       </tbody>
     </table>
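<p>To make the structure of a -t- extension concrete, here is a minimal Python sketch that splits it into the source language tag and its fields. It assumes that a field separator subtag is a letter followed by a digit (as with the keys in the table above) and that the extension runs to the next singleton or the end of the tag. The function name is hypothetical, and no validation against the data files is attempted.</p>

```python
def parse_t_extension(tag):
    """Split the -t- extension into (source tag, {field separator: value}).

    Returns None if the tag has no -t- extension.
    """
    subtags = tag.lower().split("-")
    if "t" not in subtags:
        return None
    source, fields, key = [], {}, None
    for subtag in subtags[subtags.index("t") + 1:]:
        if len(subtag) == 1:
            break  # next extension or private use begins
        if len(subtag) == 2 and subtag[0].isalpha() and subtag[1].isdigit():
            key = subtag                # field separator such as "m0"
            fields[key] = []
        elif key is None:
            source.append(subtag)       # part of the source language tag
        else:
            fields[key].append(subtag)  # field subtag for the current key
    return "-".join(source), {k: "-".join(v) for k, v in fields.items()}
```

<p>For an input such as "und-Cyrl-t-und-latn-m0-ungegn-2007", the sketch yields the source tag "und-latn" and the single field "m0" with the value "ungegn-2007".</p>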
     <h4><a href="#Transformed_Content_Data_File" name=
     "Transformed_Content_Data_File" id=
     "Transformed_Content_Data_File">3.7.1 T Extension Data
     Files</a></h4>
     <p>The overall structure of the data files is similar to that of
     the U Extension, with the following exceptions.</p>
     <p>In the transformed content 't' data file, the name attribute
     in a &lt;key&gt; element defines a valid field separator
     subtag. The name attribute in an enclosed &lt;type&gt; element
     defines a valid field subtag for the field separator subtag.
     For example:</p>
     <pre>
 &lt;key extension="t" name="m0"
     description="Transform extension mechanism"&gt;
         &lt;type name="ungegn"
                 description="United Nations Group of Experts on Geographical Names"
       since="21"/&gt;
 &lt;/key&gt;
 </pre>The data above indicates:
     <ul>
       <li>"m0" is a valid field separator for the transformed
       content extension 't'.</li>
       <li>field subtag "ungegn" is valid for field separator
       "m0".</li>
       <li>field subtag "ungegn" was introduced in CLDR 21.</li>
     </ul>
     <p>The attributes are:</p>
     <dl>
       <dt><b>name</b></dt>
       <dd>The name of the mechanism, limited to 3-8 characters (or
       sequences of them). Any indirect type names are listed in
       3.6.4 <a href="#Unicode_Locale_Extension_Data_Files">U
       Extension Data Files</a>.</dd>
       <dt><b>description</b></dt>
       <dd>A description of the name, with all and only that
       information necessary to distinguish one name from
       others with which it might be confused. Descriptions
       are not intended to provide general background
       information.</dd>
       <dt><b>since</b></dt>
       <dd>Indicates the first version of CLDR where the name
       appears. (Required for new items.)</dd>
       <dt><b>alias</b></dt>
       <dd>Alternative name, not limited in number of characters.
       Aliases are intended for compatibility, not to provide all
       possible alternate names or designations.
       <em>(Optional)</em></dd>
     </dl>
     <p>For information about the registration process, meaning, and
     usage of the 't' extension, see [<a href=
     "#RFC6497">RFC6497</a>].</p>
     <h3><a name="Compatibility_with_Older_Identifiers" href=
     "#Compatibility_with_Older_Identifiers" id=
     "Compatibility_with_Older_Identifiers">3.8 Compatibility with
     Older Identifiers</a></h3>
     <p>LDML versions before 1.7.2 used a slightly different syntax for
     variant subtags and locale extensions. Implementations of LDML
     may provide backward-compatible identifier support as described
     in the following sections.</p>
     <h4><a name="Old_Locale_Extension_Syntax" href=
     "#Old_Locale_Extension_Syntax" id=
     "Old_Locale_Extension_Syntax">3.8.1 Old Locale Extension
     Syntax</a></h4>
     <p>The LDML 1.7 and older specifications used a different syntax for
     representing Unicode locale extensions. The previous definition
     of Unicode locale extensions had the following structure:</p>
     <table border="0">
       <tr>
         <th>&nbsp;</th>
         <th>
           <div align="center">
             EBNF
           </div>
         </th>
       </tr>
       <tr>
         <td>old_unicode_locale_extensions</td>
         <td>
           <pre>= "@" old_key "=" old_type
  (";" old_key "=" old_type)*</pre>
         </td>
       </tr>
     </table>
     <p>The new specification mandates keys to be two alphanumeric
     characters and types to be three to eight alphanumeric
     characters. As a result, new codes were assigned to all
     existing keys and some types. For example, a new key "co"
     replaced the previous key "collation", a new type "phonebk"
     replaced the previous type "phonebook". However, the existing
     collation type "big5han" already satisfied the new requirement,
     so no new type code was assigned to the type. All new keys and
     types introduced after LDML 1.7 satisfy the new requirement, so
     they do not have aliases dedicated for the old syntax, except
     time zone types. The conversion between old types and new types
     can be done regardless of key, with one known exception (old
     type "traditional" is mapped to new type "trad" for collation
     and "traditio" for numbering system), and this relationship
     will be maintained in the future versions unless otherwise
     noted.</p>
     <p>The new specification introduced a new field
     <code>attribute</code> in addition to key/type pairs in the
     Unicode locale extension. When it is necessary to map a new
     Unicode locale identifier with <code>attribute</code> field to
     a well-formed old locale identifier, a special key name
     <i>attribute</i> with the value of entire
     <code>attribute</code> subtags in the new identifier is used.
     For example, a new identifier
     <code>ja-u-xxx-yyy-ca-japanese</code> is mapped to an old
     identifier <code>ja@attribute=xxx-yyy;calendar=japanese</code>.</p>
     <p>The chart below shows some example mappings between the new
     syntax and the old syntax.</p>
     <table>
       <caption>
         <a name="Locale_Extension_Mappings" href=
         "#Locale_Extension_Mappings" id=
         "Locale_Extension_Mappings">Locale Extension Mappings</a>
       </caption>
       <tr>
         <th>Old (LDML 1.7 or older)</th>
         <th>New</th>
       </tr>
       <tr>
         <td>de_DE@collation=phonebook</td>
         <td>de_DE_u_co_phonebk</td>
       </tr>
       <tr>
         <td>zh_Hant_TW@collation=big5han</td>
         <td>zh_Hant_TW_u_co_big5han</td>
       </tr>
       <tr>
         <td>th_TH@calendar=gregorian;numbers=thai</td>
         <td>th_TH_u_ca_gregory_nu_thai</td>
       </tr>
       <tr>
         <td>en_US_POSIX@timezone=America/Los_Angeles</td>
         <td>en_US_u_tz_uslax_va_posix</td>
       </tr>
     </table>
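<p>The first three rows of the chart can be reproduced with a short Python sketch. The alias tables below are hypothetical excerpts limited to the entries shown in the chart, the function name is invented for illustration, and the output uses hyphens as in a BCP 47 language tag rather than the underscores shown in the chart. Time zone types and the POSIX variant are not handled here.</p>

```python
# Hypothetical excerpts of the alias data, limited to the chart above.
KEY_ALIASES = {"collation": "co", "calendar": "ca", "numbers": "nu"}
TYPE_ALIASES = {"phonebook": "phonebk", "gregorian": "gregory"}

def old_to_new(old_id):
    """Convert an old "@key=type;..." identifier to the -u- extension form."""
    base, _, keywords = old_id.partition("@")
    new_id = base.replace("_", "-")
    if not keywords:
        return new_id
    pairs = []
    for keyword in keywords.split(";"):
        old_key, _, old_type = keyword.partition("=")
        key = KEY_ALIASES[old_key]  # KeyError signals an unknown old key
        # Types that already satisfy the new syntax have no alias entry.
        pairs.append((key, TYPE_ALIASES.get(old_type, old_type).lower()))
    pairs.sort()  # emit keys in alphabetical order, as in the chart
    return new_id + "-u-" + "-".join(k + "-" + t for k, t in pairs)
```

<p>The type table is looked up without regard to the key, which matches the statement above that the conversion is key-independent. The known exception, "traditional", would need a key-specific entry.</p>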
     <p>Where the old API is supplied with a BCP 47 language code, or
     vice versa, the recommendation is to:</p>
     <ol>
       <li>Have all methods that take the old syntax also take the
       new syntax, interpreted correctly. For example,
       "zh-TW-u-co-pinyin" and "zh_TW@collation=pinyin" would both
       be interpreted as meaning the same.</li>
       <li>Have all methods (both for old and new syntax) accept all
       possible aliases for keywords and types. For example,
       "ar-u-ca-islamicc" would be equivalent to
       "ar-u-ca-islamic-civil".
         <ul>
           <li>The one exception is where an alias would only be
           well-formed with the old syntax, such as "gregorian" (for
           "gregory").</li>
         </ul>
       </li>
       <li>Where an API cannot successfully accept the alternate
       syntax, throw an exception (or otherwise indicate an error)
       so that people can detect that they are using the wrong
       method (or wrong input).</li>
       <li>Provide a method that tests a purported locale ID string
       to determine its status:
         <ol>
           <li><strong>well-formed</strong> - syntactically
           correct</li>
           <li><strong>valid</strong> - well-formed and only uses
           registered language subtags, extensions, keywords,
           types...</li>
           <li><strong>canonical</strong> - valid and no deprecated
           codes or structure.</li>
         </ol>
       </li>
     </ol>
     <h4><a name="Legacy_Variants" href="#Legacy_Variants" id=
     "Legacy_Variants">3.8.2 Legacy Variants</a></h4>
     <p>Old LDML specifications allowed codes other than registered
     [<a href="#BCP47">BCP47</a>] variant subtags to be used in Unicode
     language and locale identifiers for representing variations of
     locale data. Unicode locale identifiers including such variant
     codes can be converted to the new [<a href="#BCP47">BCP47</a>]
     compatible identifiers by following the descriptions below:</p>
     <table>
       <caption>
         <a name="Legacy_Variant_Mappings" href=
         "#Legacy_Variant_Mappings" id=
         "Legacy_Variant_Mappings">Legacy Variant Mappings</a>
       </caption>
       <tr>
         <th>Variant Code</th>
         <th>Description</th>
       </tr>
       <tr>
         <td>AALAND</td>
         <td>Åland, variant of "sv" Swedish used in Finland. Use
         "sv_AX" to indicate this.</td>
       </tr>
       <tr>
         <td>BOKMAL</td>
         <td>Bokmål, variant of "no" Norwegian. Use primary language
         subtag "nb" to indicate this.</td>
       </tr>
       <tr>
         <td>NYNORSK</td>
         <td>Nynorsk, variant of "no" Norwegian. Use primary
         language subtag "nn" to indicate this.</td>
       </tr>
       <tr>
         <td>POSIX</td>
         <td>POSIX variation of locale data. Use Unicode locale
         extension "-u-va-posix" to indicate this.</td>
       </tr>
       <tr>
         <td>POLYTONI</td>
         <td>Polytonic, variant of "el" Greek. Use [<a href=
         "#BCP47">BCP47</a>] variant subtag "polyton" to indicate
         this.</td>
       </tr>
       <tr>
         <td>SAAHO</td>
         <td>The Saaho variant of Afar. Use primary language subtag
         "ssy" to indicate this.</td>
       </tr>
     </table>
     <p>When converting to old syntax, the Unicode locale extension
     "-u-va-posix" should be converted to the "POSIX" variant,
     <i>not</i> to old extension syntax like "@va=posix". This is an
     exception: The other mappings above should not be reversed.</p>
     <p>Examples:</p>
     <ul>
       <li>en_US_POSIX ↔ en-US-u-va-posix</li>
       <li>en_US_POSIX@colNumeric=yes ↔ en-US-u-kn-va-posix</li>
       <li>en-US-POSIX-u-kn-true → en-US-u-kn-va-posix</li>
       <li>en-US-POSIX-u-kn-va-posix → en-US-u-kn-va-posix</li>
     </ul>
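<p>Part of the Legacy Variant Mappings table can be expressed as a small conversion routine. The Python sketch below is illustrative only: the function and table names are hypothetical, old-style "@key=type" keywords are not handled, and the AALAND mapping is omitted because replacing the region subtag needs a fuller parser.</p>

```python
# Legacy variants that are replaced by a primary language subtag.
LANGUAGE_FOR_VARIANT = {"BOKMAL": "nb", "NYNORSK": "nn", "SAAHO": "ssy"}

def convert_legacy_variants(old_id):
    """Convert e.g. "en_US_POSIX" to "en-US-u-va-posix"."""
    subtags = old_id.split("_")
    out = [subtags[0].lower()]
    extension = []
    for subtag in subtags[1:]:
        upper = subtag.upper()
        if upper in LANGUAGE_FOR_VARIANT:
            out[0] = LANGUAGE_FOR_VARIANT[upper]  # replace the language subtag
        elif upper == "POSIX":
            extension = ["u", "va", "posix"]      # becomes a locale extension
        elif upper == "POLYTONI":
            out.append("polyton")                 # registered BCP 47 variant
        else:
            out.append(subtag)                    # script, region, other variants
    return "-".join(out + extension)
```

<p>Only the POSIX mapping is meant to be reversible. A converter back to the old syntax would turn "-u-va-posix" into the "POSIX" variant and leave the other mappings alone.</p>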
     <h4><a name="Relation_to_OpenI18n" href="#Relation_to_OpenI18n"
     id="Relation_to_OpenI18n">3.8.3 Relation to OpenI18n</a></h4>
     <p>The locale id format generally follows the description in
     the <i>OpenI18N Locale Naming Guideline</i> [<a href=
     "#NamingGuideline">NamingGuideline</a>], with some
     enhancements. The main differences from those guidelines
     are that the locale id:</p>
     <ol type="a">
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">does not
       include a charset, since the data in LDML format always
       provides a representation of all Unicode characters (the
       repository is stored in UTF-8, although that can be
       transcoded to other encodings as well),</li>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">adds the
       ability to have a variant, as in Java,</li>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">adds the
       ability to discriminate the written language by script (or
       script variant).</li>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">is a
       superset of [<a href="#BCP47">BCP47</a>] codes.</li>
     </ol>
     <h3><a name="Transmitting_Locale_Information" href=
     "#Transmitting_Locale_Information" id=
     "Transmitting_Locale_Information">3.9 Transmitting Locale
     Information</a></h3>
     <p>In a world of on-demand software components, with arbitrary
     connections between those components, it is important to get a
     sense of where localization should be done, and how to transmit
     enough information so that it can be done at that appropriate
     place. End-users need to get messages localized to their
     languages, messages that not only contain a translation of
     text, but also contain variables such as date, time, number
     formats, and currencies formatted according to the users'
     conventions. The strategy for doing the so-called <i>JIT
     localization</i> is made up of two parts:</p>
     <ol>
       <li>Store and transmit <i>neutral-format</i> data wherever
       possible.
         <ul>
           <li>Neutral-format data is data that is kept in a
           standard format, no matter what the local user's
           environment is. Neutral-format is also (loosely) called
           <i>binary data</i>, even though it actually could be
           represented in many different ways, including a textual
           representation such as in XML.</li>
           <li>Such data should use accepted standards where
           possible, such as for currency codes.</li>
           <li>Textual data should also be in a uniform character
           set (Unicode/10646) to avoid possible data corruption
           problems when converting between encodings.</li>
         </ul>
       </li>
       <li>Localize that data as "<i>close</i>" to the end-user as
       possible.</li>
     </ol>
     <p>There are a number of advantages to this strategy. The
     longer the data is kept in a neutral format, the more flexible
     the entire system is. On a practical level, if transmitted data
     is neutral-format, then it is much easier to manipulate the
     data, debug the processing of the data, and maintain the
     software connections between components.</p>
     <p>Once data has been localized into a given language, it can
     be quite difficult to programmatically convert that data into
     another format, if required. This is especially true if the
     data contains a mixture of translated text and formatted
     variables. Once information has been localized into, say,
     Romanian, it is much more difficult to localize that data into,
     say, French. Parsing is more difficult than formatting, and may
     run up against different ambiguities in interpreting text that
     has been localized, even if the original translated message
     text is available (which it may not be).</p>
     <p>Moreover, the closer we are to the end user, the more we know
     about that user's preferred formats. If we format dates, for
     example, at the user's machine, then it can easily take into
     account any customizations that the user has specified. If the
     formatting is done elsewhere, either we have to transmit
     whatever user customizations are in play, or we only transmit
     the user's locale code, which may only approximate the desired
     format. Thus the closer the localization is to the end user,
     the less we need to ship all of the user's preferences around
     to all the places that localization could possibly need to be
     done.</p>
     <p>Even though localization should be done as close to the
     end-user as possible, there will be cases where different
     components need to be aware of whatever settings are
     appropriate for doing the localization. Thus information such
     as a locale code or time zone needs to be communicated between
     different components.</p>
     <h4><a name="Message_Formatting_and_Exceptions" href=
     "#Message_Formatting_and_Exceptions" id=
     "Message_Formatting_and_Exceptions">3.9.1 Message Formatting
     and Exceptions</a></h4>
     <p>Windows (<a href=
     "https://msdn.microsoft.com/en-us/library/ms679351.aspx">FormatMessage</a>,
     <a href=
     "https://msdn.microsoft.com/en-us/library/aa331875.aspx">String.Format</a>),
     Java (<a href=
     "https://docs.oracle.com/javase/7/docs/api/java/text/MessageFormat.html">MessageFormat</a>)
     and ICU (<a href=
     "http://www.icu-project.org/apiref/icu4c/classMessageFormat.html">MessageFormat</a>,
     <a href=
     "http://www.icu-project.org/apiref/icu4c/umsg_8h.html">umsg</a>)
     all provide methods of formatting variables (dates, times, etc.)
     and inserting them at arbitrary positions in a string. This
     avoids the manual string concatenation that causes severe
     problems for localization. The question is, where to do this?
     It is especially important since the original code site that
     originates a particular message may be far down in the bowels
     of a component, and passed up to the top of the component with
     an exception. So we will take that case as representative of
     this class of issues.</p>
     <p>There are circumstances where the message can be
     communicated with a language-neutral code, such as a numeric
     error code or mnemonic string key, that is understood outside
     of the component. If there are arguments that need to accompany
     that message, such as a number of files or a datetime, those
     need to accompany that code so that when the localization is
     finally done at some point, the full information can
     be presented to the end-user. This is the best case for
     localization.</p>
     <p>More often, the exact messages that could originate from
     within the component are not known outside of the component
     itself; or at least they may not be known by the component that
     is finally displaying text to the user. In such a case, the
     information as to the user's locale needs to be communicated in
     some way to the component that is doing the localization. That
     locale information does not necessarily need to be communicated
     deep within the component; ideally, any exceptions should
     bundle up some language-neutral message ID, plus the arguments
     needed to format the message (for example, datetime), but not
     do the localization at the throw site. This approach has the
     advantages noted above for JIT localization.</p>
     <p>In addition, exceptions are often caught at a higher level;
     they do not end up being displayed to any end-user at all. By
     avoiding the localization at the throw site, we also avoid the
     cost of doing formatting when that formatting is not really
     necessary.
     In fact, in many running programs most of the exceptions that
     are thrown at a low level never end up being presented to an
     end-user, so this can have considerable performance
     benefits.</p>
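     <p>The approach described above can be sketched as follows. This
     is a minimal illustration in Python, not a prescribed API: the
     class LocalizableError, the CATALOGS table, and the message ID
     FILES_MISSING are all hypothetical names, and a real system
     would use a plural-aware formatter such as MessageFormat rather
     than simple string substitution.</p>

```python
class LocalizableError(Exception):
    """Carries a language-neutral message ID plus the raw arguments."""

    def __init__(self, message_id, **arguments):
        super().__init__(message_id)
        self.message_id = message_id
        self.arguments = arguments


# Hypothetical per-locale message catalogs, known only to the display layer.
CATALOGS = {
    "en": {"FILES_MISSING": "{count} files could not be found."},
    "fr": {"FILES_MISSING": "{count} fichiers sont introuvables."},
}


def count_files(paths):
    """Low-level code: raises without doing any localization."""
    missing = [p for p in paths if p.startswith("missing")]
    if missing:
        # Only the ID and the unformatted arguments travel with the exception.
        raise LocalizableError("FILES_MISSING", count=len(missing))
    return len(paths)


def render(error, locale):
    """Display layer: formats the message only when it is actually shown."""
    pattern = CATALOGS[locale][error.message_id]
    return pattern.format(**error.arguments)
```

     <p>If the exception is caught and handled at a higher level,
     render is never called and no formatting cost is incurred.</p>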
     <h3><a name="Language_and_Locale_IDs" href=
     "#Language_and_Locale_IDs" id="Language_and_Locale_IDs">3.10
     Unicode Language and Locale IDs</a></h3>
     <p>People have very slippery notions of what distinguishes a
     language code from a locale code. The problem is that both
     are somewhat nebulous concepts.</p>
     <p>In practice, many people use [<a href="#BCP47">BCP47</a>]
     codes to mean locale codes instead of strictly language codes.
     It is easy to see why this came about; because [<a href=
     "#BCP47">BCP47</a>] includes an explicit region (territory)
     code, for most people it was sufficient for use as a locale
     code as well. For example, when typical web software receives
     a [<a href="#BCP47">BCP47</a>] code, it will use it as a
     locale code. Other typical software will do the same: in
     practice, language codes and locale codes are treated
     interchangeably. Some people recommend distinguishing on the
     basis of "-" versus "_" (for example, <i>zh-TW</i> for language
     code, <i>zh_TW</i> for locale code), but in practice that does
     not work because of the free variation out in the world in the
     use of these separators. Notice that Windows, for example, uses
     "-" as a separator in its locale codes. So pragmatically one is
     forced to treat "-" and "_" as equivalent when interpreting
     either one on input.</p>
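     <p>A minimal sketch of treating the two separators as equivalent
     on input is shown here. The function names are illustrative
     only, and the case-insensitive comparison is an added
     assumption based on the case-insensitivity of BCP 47 tags.</p>

```python
def split_locale_code(code):
    """Split a code into subtags, accepting either '-' or '_' as separator."""
    return code.replace("_", "-").split("-")


def same_code(a, b):
    """Compare two codes, ignoring separator choice and letter case."""
    normalized_a = [subtag.lower() for subtag in split_locale_code(a)]
    normalized_b = [subtag.lower() for subtag in split_locale_code(b)]
    return normalized_a == normalized_b
```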
     <p>Another reason for the conflation of these codes is that
     <i>very</i> little data in most systems is distinguished by
     region alone; currency codes and measurement systems being some
     of the few. Sometimes date or number formats are mentioned as
     regional, but that really does not make much sense. If people
     see the sentence "You will have to adjust the value to
     १,२३४.५६७ from ૭૧,૨૩૪.૫૬" (using Indic digits), they would say
     that sentence is simply not English. Number format is far more
     closely associated with language than it is with region. The
     same is true for date formats: people would never expect to see
     intermixed a date in the format "2003年4月1日" (using Kanji) in
     text purporting to be purely English. There are regional
     differences in date and number format — differences which can
     be important — but those are different in kind than other
     language differences between regions.</p>
     <p>As far as we are concerned — <i>as a completely practical
     matter</i> — two languages are different if they require
     substantially different localized resources. Distinctions
     according to spoken form are important in some contexts, but
     the written form is by far and away the most important issue
     for data interchange. Unfortunately, this is not the principle
     used in [<a href="#ISO639">ISO639</a>], which has the fairly
     unproductive notion (for data interchange) that only spoken
     language matters (it is also not completely consistent about
     this, however).</p>
     <p>[<a href="#BCP47">BCP47</a>] <i><b>can</b></i> express a
     difference if the use of written languages happens to
     correspond to region boundaries expressed as [<a href=
     "#ISO3166">ISO3166</a>] region codes, and has recently added
     codes that allow it to express some important cases that are
     not distinguished by [<a href="#ISO3166">ISO3166</a>] codes.
     These written languages include simplified and traditional
     Chinese (both used in Hong Kong S.A.R.); Serbian in Latin
     script; Azerbaijani in Arabic script, and so on.</p>
     <p>Notice also that <i>currency codes</i> are different than
     <i>currency localizations</i>. The currency localizations
     should largely be in the language-based resource bundles, not
     in the territory-based resource bundles. Thus, the resource
     bundle <i>en</i> contains the localized mappings in English for
     a range of different currency codes: USD → US$, RUR → Rub, AUD
     → $A and so on. Of course, some currency symbols are used for
     more than one currency, and in such cases specializations
     appear in the territory-based bundles. Continuing the example,
     <i>en_US</i> would have USD → $, while <i>en_AU</i> would have
     AUD → $. (In protocols, the currency codes should always
     accompany any currency amounts; otherwise the data is
     ambiguous, and software is forced to use the user's territory
     to guess at the currency. For some informal discussion of this,
     see <a href=
     "http://source.icu-project.org/repos/icu/icuhtml/trunk/design/jit_localization.html">
     JIT Localization</a>.)</p>
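     <p>The split between language-based and territory-based bundles
     for currency symbols can be illustrated with a small lookup
     sketch. The CURRENCY_SYMBOLS table holds only the example
     mappings given above, and falling back to the currency code
     itself when no symbol is found is an assumption made for this
     illustration.</p>

```python
# Example data only: the language bundle holds the general mappings,
# and territory bundles hold the specializations.
CURRENCY_SYMBOLS = {
    "en": {"USD": "US$", "RUR": "Rub", "AUD": "$A"},
    "en_US": {"USD": "$"},
    "en_AU": {"AUD": "$"},
}


def currency_symbol(locale, code):
    """Look up a symbol in the territory bundle first, then the language bundle."""
    while True:
        symbol = CURRENCY_SYMBOLS.get(locale, {}).get(code)
        if symbol is not None:
            return symbol
        if "_" not in locale:
            # Assumption for this sketch: show the code when nothing is localized.
            return code
        locale = locale.rsplit("_", 1)[0]
```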
     <h4><a name="Written_Language" href="#Written_Language" id=
     "Written_Language">3.10.1 Written Language</a></h4>
     <p>Criteria for what makes a written language should be purely
     pragmatic; <i>what would copy-editors say?</i> If one gave them
     text like the following, they would respond that it is far from
     acceptable English for publication, and ask for it to be
     redone:</p>
     <ol>
       <li type="A">"Theatre Center News: The date of the last
       version of this document was 2003年3月20日. A copy can be
       obtained for $50,0 or 1.234,57 грн. We would like to
       acknowledge contributions by the following authors (in
       alphabetical order): Alaa Ghoneim, Behdad Esfahbod, Ahmed
       Talaat, Eric Mader, Asmus Freytag, Avery Bishop, and Doug
       Felt."</li>
     </ol>
     <p>So one would change it to either B or C below, depending on
     which orthographic variant of English was the target for the
     publication:</p>
     <ol type="A" start="2">
       <li>"Theater Center News: The date of the last version of
       this document was 3/20/2003. A copy can be obtained for
       $50.00 or 1,234.57 Ukrainian Hryvni. We would like to
       acknowledge contributions by the following authors (in
       alphabetical order): Alaa Ghoneim, Ahmed Talaat, Asmus
       Freytag, Avery Bishop, Behdad Esfahbod, Doug Felt, Eric
       Mader."</li>
       <li>"Theatre Centre News: The date of the last version of
       this document was 20/3/2003. A copy can be obtained for
       $50.00 or 1,234.57 Ukrainian Hryvni. We would like to
       acknowledge contributions by the following authors (in
       alphabetical order): Alaa Ghoneim, Ahmed Talaat, Asmus
       Freytag, Avery Bishop, Behdad Esfahbod, Doug Felt, Eric
       Mader."</li>
     </ol>
     <p>Clearly there are many acceptable variations on this text.
     For example, copy editors might still quibble with the use of
     first versus last name sorting in the list, but clearly the
     first list was <i>not</i> acceptable English alphabetical
     order. And in quoting a name, like "Theatre Centre News", one
     may leave it in the source orthography even if it differs from
     the publication target orthography. And so on. However, just as
     clearly, there are limits on what is acceptable English, and
     "2003年3月20日", for example, is <i>not</i>.</p>
     <p>Note that the language of locale data may differ from the
     language of localized software or web sites, when those latter
     are not localized into the user's preferred language. In such
     cases, the kind of incongruous juxtapositions described above
     may well appear, but this situation is usually preferable to
     forcing unfamiliar date or number formats on the user as
     well.</p>
     <h4><a name="Hybrid_Locale" href="#Hybrid_Locale" id=
     "Hybrid_Locale">3.10.2 Hybrid Locale Identifiers</a></h4>
     <p>Hybrid locales have intermixed content from 2 (or more)
     languages, often with one language's grammatical structure
     applied to words in another. These are commonly referred to
     with portmanteau words such as&nbsp;<em>Franglais, <a href=
     "https://en.oxforddictionaries.com/definition/spanglish">Spanglish</a></em>
     or <em>Denglish</em>. Hybrid locales do&nbsp;<em>not</em>
     reference text simply containing two languages: a book of
     parallel text containing English and French, such as the
     following, is not Franglais:</p>
     <table style='margin-left:2em; margin-right:2em'>
       <tbody>
         <tr>
           <td width='50%' style='font-family:serif'>On the 24th of
           May, 1863, my uncle, Professor Liedenbrock, rushed into
           his little house, No. 19 Königstrasse, one of the oldest
           streets in the oldest portion of the city of
           Hamburg…</td>
           <td style='font-family:serif'>Le 24 mai 1863, un
           dimanche, mon oncle, le professeur Lidenbrock, revint
           précipitamment vers sa petite maison située au numéro 19
           de Königstrasse, l’une des plus anciennes rues du vieux
           quartier de Hambourg…</td>
         </tr>
       </tbody>
     </table>
     <p>While text in a document can be tagged as partly in one
     language and partly in another, that is not the same as having a
     hybrid locale. There is a difference between having a Spanglish
     document, and a Spanish document that has some passages quoted
     in English. Fine-grained tagging doesn't handle grammatical
     combinations like Denglisch “<a href=
     "https://www.duden.de/rechtschreibung/downloaden">gedownloadet</a>”,
     which is neither English nor German — similarly the Franglais
     “<a href=
     'https://www.le-dictionnaire.com/definition.php?mot=downloader'>downloadé</a>”.
     More importantly, it doesn’t work for the very common use case
     for a <a href="#unicode_locale_id">unicode_locale_id</a>:
     <i>locale selection</i>.</p>
     <p>To communicate requests for localized content and
     internationalization services, locales are used. When people
     pick a language from a menu, internally they are picking a
     locale (en-GB, es-419, etc.). To allow an application to
     support Spanglish or Hinglish locale selection, <a href=
     "#unicode_locale_id">unicode_locale_id</a>s can represent
     hybrid locales using the T extension key-value 'h0-hybrid'.
     (For more information on the T extension, see <em>Section 3.7
     <a href="#t_Extension">Unicode BCP 47 T
     Extension</a>.</em>)</p>
     <p>Examples:</p>
     <table class='simple'>
       <tbody>
         <tr>
           <td>hi-t-<u>en-h0-hybrid</u></td>
           <td>Hinglish</td>
           <td>Hindi-English hybrid locale</td>
         </tr>
         <tr>
           <td>ta-t-<u>en-h0-hybrid</u></td>
           <td>Tanglish</td>
           <td>Tamil-English hybrid locale</td>
         </tr>
         <tr>
           <td>bn-t-<u>en-h0-hybrid</u></td>
           <td>Banglish</td>
           <td>Bangla-English hybrid locale</td>
         </tr>
         <tr>
           <td colspan="3">…</td>
         </tr>
         <tr>
           <td>en-t-<u>hi-h0-hybrid</u></td>
           <td>Hinglish</td>
           <td>English-Hindi hybrid locale</td>
         </tr>
         <tr>
           <td>en-t-<u>zh-h0-hybrid</u></td>
           <td>Chinglish</td>
           <td>English-Chinese hybrid locale</td>
         </tr>
         <tr>
           <td colspan="3">…</td>
         </tr>
       </tbody>
     </table>
     <blockquote>
       <p><em>Note: The <a href=
       "#unicode_language_id">unicode_language_id</a> should be the
       language used as the ‘scaffold’: for the fallback locale for
       internationalization services, typically used for more of the
       core vocabulary/structure in the content. Thus Hinglish
       should be represented as hi-t-en-h0-hybrid where Hindi is the
       scaffold, and as en-t-hi-h0-hybrid where English is.</em></p>
     </blockquote>
     <p>The value of -t- is a full <em><a href=
     "#unicode_language_id">unicode_language_id</a></em>, and can
     contain subtags for script or region where it is important to
     include them, as in the following. It may be useful in order to
     emphasize the script, even where it is the default script for
     the language, if it is not the same as the script of the main
     language tag.</p>
     <table class='simple'>
       <tbody>
         <tr>
           <td>ru-t-<u>en-latn-gb-h0-hybrid</u></td>
           <td>Runglish</td>
           <td>Russian with an admixture of British English in Latin
           script</td>
         </tr>
         <tr>
           <td>ru-t-<u>en-cyrl-gb-h0-hybrid</u></td>
           <td>Runglish</td>
           <td>Russian with an admixture of British English in
           Cyrillic script</td>
         </tr>
       </tbody>
     </table>
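     <p>The structure of the hybrid identifiers in the tables above
     can be illustrated with a simplified parser. This is a sketch
     only: the function name parse_hybrid is hypothetical, it
     assumes a well-formed identifier with hyphen separators, and it
     does not handle private-use sequences or validate subtags.</p>

```python
import re

# T-extension field keys consist of a letter followed by a digit, such as "h0".
TKEY = re.compile(r"^[a-z][0-9]$")


def parse_hybrid(locale_id):
    """Return (scaffold, admixture) for a hybrid locale id, else None."""
    subtags = locale_id.lower().split("-")
    if "t" not in subtags:
        return None
    t_index = subtags.index("t")
    scaffold = "-".join(subtags[:t_index])
    rest = subtags[t_index + 1:]
    # The language id inside the T extension runs up to the first field key.
    key_index = next((i for i, s in enumerate(rest) if TKEY.match(s)), len(rest))
    admixture = "-".join(rest[:key_index])
    fields = rest[key_index:]
    # A hybrid locale carries the key "h0" immediately followed by "hybrid".
    is_hybrid = any(k == "h0" and v == "hybrid" for k, v in zip(fields, fields[1:]))
    if not is_hybrid or not admixture:
        return None
    return scaffold, admixture
```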
     <p>Should there ever be strong need for hybrids of more than
     two languages or for other purposes such as hybrid languages as
     the source of translated content, additional structure could be
     added.</p>
     <h3><a name="Validity_Data" href="#Validity_Data" id=
     "Validity_Data">3.11 Validity Data</a></h3>
     <p class='dtd'>&lt;!ELEMENT idValidity (id*) &gt;<br>
     &lt;!ELEMENT id ( #PCDATA ) &gt;<br>
     &lt;!ATTLIST id type NMTOKEN #REQUIRED &gt;<br>
     &lt;!ATTLIST id idStatus NMTOKEN #REQUIRED &gt;</p>
     <p>The directory <a href=
     'https://github.com/unicode-org/cldr/releases/tag/latest/common/validity/'>common/validity</a>
     contains machine-readable data for validating the language,
     region, script, and variant subtags, as well as currency,
     subdivisions and measure units. Each file contains a number of
     subtags with the following <strong>idStatus</strong>
     values:</p>
     <ul>
       <li><strong>regular</strong> — the standard codes used for
       the specific type of subtag</li>
       <li><strong>special</strong> — certain exceptional language
       codes like 'mul' <em>(languages only)</em></li>
       <li><strong>unknown</strong> — the code used to indicate the
       "unknown", "undetermined" or "invalid" values. For more
       information, see <em>Section 3.5.1 <a href=
       "#Unknown_or_Invalid_Identifiers">Unknown or Invalid
       Identifiers</a></em>.</li>
       <li>
         <strong>macroregion</strong> — the standard codes that are
         macroregions <em>(for regions only).</em>
         <ul>
           <li>Note that some two-letter region codes are
           macroregions, and (in the future) some three-digit codes
           may be regular codes.</li>
           <li>For details as to which regions are contained within
           which macroregions, see the
           <strong>&lt;containment&gt;</strong> element of the
           supplemental data.</li>
         </ul>
       </li>
       <li><strong>deprecated</strong> — codes that should not be
       used. The <strong>&lt;alias&gt;</strong> element in the
       supplementalMeta file contains more information about these
       codes, and which codes should be used instead.</li>
       <li><strong>private_use</strong> — codes that, for CLDR, are
       considered private use. Note that some private-use codes in a
       source standard such as BCP47 have defined CLDR semantics, and
       are considered regular codes. For more information, see
       <em>Section 3.5.3 <a href=
       "#Private_Use_Codes">Private Use Codes</a>.</em></li>
       <li><strong>reserved</strong> — codes that are private use in
       a source standard, but are reserved for future use as regular
       codes by CLDR.</li>
     </ul>
     <p>The list of subtags for each idStatus uses a compact format
     as a space-delimited list of StringRanges, as defined in
     <em>Section <a href="#String_Range">5.3.4 String
     Range</a>.</em> The separator for each StringRange is a
     "~".</p>
     <p>Each measure unit is a sequence of subtags, such as
     “angle-arc-minute”. The first subtag provides a general
     “category” of the unit.</p>
     <p>In version 28.0, the subdivisions in the validity files used
     the ISO format, uppercase with a hyphen separating two
     components, instead of the BCP 47 format.</p>
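     <p>The compact StringRange format used for the subtag lists in
     the validity files can be expanded along the following lines.
     This sketch assumes the interpretation given in Section 5.3.4,
     in which the string after the separator replaces a suffix of
     the same length in the first string and each position ranges
     independently; that section remains the authoritative
     definition. The function names are illustrative.</p>

```python
from itertools import product


def expand_string_range(item, separator="~"):
    """Expand one StringRange such as 'ab~d' into the strings it denotes."""
    if separator not in item:
        return [item]
    start, end = item.split(separator, 1)
    if not 0 < len(end) <= len(start):
        raise ValueError("invalid string range: " + item)
    split_at = len(start) - len(end)
    prefix, suffix = start[:split_at], start[split_at:]
    ranges = []
    for low, high in zip(suffix, end):
        if ord(low) > ord(high):
            raise ValueError("invalid string range: " + item)
        ranges.append([chr(c) for c in range(ord(low), ord(high) + 1)])
    # Every combination of the per-position ranges, appended to the prefix.
    return [prefix + "".join(combo) for combo in product(*ranges)]


def expand_validity_list(text):
    """Expand a space-delimited list of StringRanges."""
    result = []
    for item in text.split():
        result.extend(expand_string_range(item))
    return result
```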
     <h2><a name="Locale_Inheritance" href="#Locale_Inheritance" id=
     "Locale_Inheritance">4 Locale Inheritance and Matching</a></h2>
     <p>The XML format relies on an inheritance model, whereby the
     resources are collected into <i>bundles</i>, and the bundles
     organized into a tree. Data for the many Spanish locales does
     not need to be duplicated across all of the countries having
     Spanish as a national language. Instead, common data is
     collected in the Spanish language locale, and territory locales
     only need to supply differences. The parent of all of the
     language locales is a generic locale known as <i>root</i>.
     Wherever possible, the resources in the root are language &amp;
     territory neutral. For example, the collation (sorting) order
     in the root is based on the [<a href="#DUCET">DUCET</a>]
     (see <em><a href="tr35-collation.html#Root_Collation">Root
     Collation</a></em>). Since English language collation has the
     same ordering as the root locale, the 'en' locale data does not
     need to supply any collation data, nor do the 'en_US', 'en_GB'
     or any of the various other locales that use English.</p>
     <p>Given a particular locale id "en_US_someVariant", the search
     chain for a particular resource is the following.</p>
     <blockquote>
       <pre>en_US_someVariant
 en_US
 en
 root</pre>
     </blockquote>
     <p><em>The inheritance is often not simple truncation, as will
     be seen later in this section.</em></p>
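     <p>For the simple case shown above, the search chain can be
     computed by truncation, as in the following sketch. The
     function name is illustrative, and the sketch deliberately
     ignores the cases where inheritance is not simple
     truncation.</p>

```python
def truncation_chain(locale_id):
    """Return the search chain produced by dropping trailing subtags."""
    chain = []
    current = locale_id
    while current:
        chain.append(current)
        # Drop the last subtag; yields "" once only the language is left.
        current = current.rpartition("_")[0]
    chain.append("root")
    return chain
```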
     <p>If a type and key are supplied in the locale id, then
     logically the chain from that id to the root is searched for a
     resource tag with a given type, all the way up to root. If no
     resource is found with that tag and type, then the chain is
     searched again without the type.</p>
     <p>Thus the data for any given locale will only contain
     resources that are different from the parent locale. For
     example, most territory locales will inherit the bulk of their
     data from the language locale: "en" will contain the bulk of
     the data, while "en_IE" will only contain a few items like
     currency.
     All data that is inherited from a parent is presumed to be
     valid, just as valid as if it were physically present in the
     file. This provides for much smaller resource bundles, and much
     simpler (and less error-prone) maintenance. At the script or
     region level, the "primary" child locale will be empty, since
     its parent will contain all of the appropriate resources for
     it. For more information see <i>CLDR Information : Section 9.3
     <a href="tr35-info.html#Default_Content">Default
     Content</a>.</i></p>
     <p>Certain data items depend only on the region specified in a
     locale id (by a <a href=
     "#unicode_region_subtag_validity">unicode_region_subtag</a> or
     an “rg” <a href="#RegionOverride">Region Override</a> key),
     and are obtained from supplemental data rather than through
     locale resources. For example:</p>
     <ul>
       <li>The currency for the specified region (see <a href=
       "tr35-numbers.html#Supplemental_Currency_Data">Supplemental
       Currency Data</a>)</li>
       <li>The measurement system for the specified region (see
       <a href=
       "tr35-general.html#Measurement_System_Data">Measurement
       System Data</a>)</li>
       <li>The week conventions for the specified region (see
       <a href="tr35-dates.html#Week_Data">Week Data</a>)</li>
     </ul>
     <p>(For more information on the specific items handled this
     way, see <a href=
     "tr35-info.html#Territory_Based_Preferences">Territory-Based
     Preferences</a>.) These items will be correct for the specified
     region regardless of whether a locale bundle actually exists
     with the same combination of language and region as in the
     locale id. For example, suppose data is requested for the
     locale id "fr_US" and there is no bundle for that combination.
     Data obtained via locale inheritance, such as currency patterns
     and currency symbols, will be obtained from the parent locale
     "fr". However, currency amounts would be formatted by default
     using US dollars, just displayed in the manner governed by the
     locale "fr". When a locale id does not specify a region, the
     region-specific items such as those above are obtained from the
     likely region for the locale (obtained via <a href=
     "#Likely_Subtags">Likely Subtags</a>).</p>
     <p>For the relationship between Inheritance, DefaultContent,
     LikelySubtags, and LocaleMatching, see Section 4.2.6 <a href=
     "tr35.html#Inheritance_vs_Related">Inheritance vs Related
     Information</a>.</p>
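     <p>The handling of region-dependent items described above, such
     as the default currency for "fr_US", can be sketched as
     follows. The two tables are tiny hypothetical excerpts standing
     in for the supplemental currency data and the likely subtags
     data, and the sketch ignores the “rg” Region Override key.</p>

```python
# Hypothetical excerpts of supplemental data, for illustration only.
REGION_CURRENCY = {"US": "USD", "FR": "EUR"}
LIKELY_REGION = {"fr": "FR", "en": "US"}


def region_of(locale_id):
    """Return the explicit region subtag, or the likely region if absent."""
    parts = locale_id.split("_")
    for part in parts[1:]:
        is_alpha2 = len(part) == 2 and part.isalpha()
        is_digit3 = len(part) == 3 and part.isdigit()
        if is_alpha2 or is_digit3:
            return part.upper()
    return LIKELY_REGION[parts[0]]


def default_currency(locale_id):
    """The currency depends only on the region, not on any locale bundle."""
    return REGION_CURRENCY[region_of(locale_id)]
```

     <p>Here "fr_US" yields USD even though no "fr_US" bundle exists,
     while patterns and symbols would still be inherited from
     "fr".</p>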
     <h3><a href="#Lookup" name="Lookup" id="Lookup">4.1
     Lookup</a></h3>
     <p>If a language has more than one script in customary modern
     use, then the CLDR file structure in common/main follows the
     following model:</p>
     <blockquote>
       <p>lang<br>
       lang_script<br>
       lang_script_region<br>
       lang_region <i>(aliases to lang_script_region)</i></p>
     </blockquote>
     <h4><a href="#Bundle_vs_Item_Lookup" name=
     "Bundle_vs_Item_Lookup" id="Bundle_vs_Item_Lookup">4.1.1 Bundle
     vs Item Lookup</a></h4>
     <p>There are actually two different kinds of inheritance
     fallback: <em>resource&nbsp;bundle&nbsp;lookup</em> and
     <em>resource&nbsp;item&nbsp;lookup</em>. For the former, a
     process is looking to find the first, best resource bundle it
     can; for the latter, it is fallback&nbsp;within&nbsp;bundles on
     individual items, like the translated name for the region "CN"
     in Breton.</p>
     <p>These are closely related, but distinct, processes. They are
     illustrated in the table <a href="#Lookup-Differences">Lookup
     Differences</a>, where "key" stands for zero or more key/type
     pairs. Logically speaking, when looking up an item for a given
     locale, you first do a resource bundle lookup to find the best
     bundle for the locale, then you do an inherited item lookup
     starting with that resource bundle.</p>
     <p>The table <a href="#Lookup-Differences">Lookup
     Differences</a> uses the naïve resource bundle lookup for
     illustration. More sophisticated systems will get far better
     results for resource bundle lookup if they use the algorithm
     described in <em>Section 4.4 <a href=
     "#LanguageMatching">Language Matching</a></em>. That algorithm
     takes into account both the user’s desired locale(s) and the
     application’s supported locales, in order to get the best
     match.</p>
     <p>If the naïve resource bundle lookup is used, the desired
     locale needs to be canonicalized using 4.3 <a href=
     "#Likely_Subtags">Likely Subtags</a> and the supplemental alias
     information, so that locales that CLDR considers identical are
     treated as such. Thus eng-Latn-GB should be mapped to en-GB,
     and cmn-TW mapped to zh-Hant-TW.</p>
     <p>For the purposes of CLDR, everything with the &lt;ldml&gt;
     dtd is treated logically as if it is one resource bundle, even
     if the implementation separates data into separate physical
     resource bundles. For example, suppose that there is a main XML
     file for Nama (naq), but there are no &lt;unit&gt; elements for
     it because the units are all inherited from root. If the
     &lt;unit&gt; elements are separated into a separate data tree
     for modularity in the implementation, the Nama &lt;unit&gt;
     resource bundle would be empty. However, for purposes of
     resource bundle lookup, the lookup still stops at naq.xml.</p>
     <div id="iqaw2" style="margin-top: 0px; margin-bottom: 0px;">
       <table class='simple' id="a1bn" border="1" cellpadding="3"
       cellspacing="0">
         <caption>
           <a href="#Lookup-Differences" name="Lookup-Differences"
           id="Lookup-Differences">Lookup Differences</a>
         </caption>
         <tbody id="iqaw3">
           <tr id="x40y0">
             <th id="x40y1" style="vertical-align: top;" nowrap>
             Lookup Type</th>
             <th id="x40y3" style="vertical-align: top;" nowrap>
             Example</th>
             <th id="x40y5" style="vertical-align: top;">
             Comments</th>
           </tr>
           <tr id="iqaw4">
             <td id="iqaw5" style="vertical-align: top;" nowrap>
               <p id="rkc40"><strong>Resource bundle</strong>
               lookup</p>
             </td>
             <td id="iqaw7" style="vertical-align: top;" nowrap>
               <p>se-FI&nbsp;→</p>
               <p>se&nbsp; →</p>
               <p><em>default-locale*&nbsp;&nbsp;→</em></p>
               <p>root</p>
             </td>
             <td id="rkc41" style="vertical-align: top;">
               <p>* The default-locale may have its own inheritance
               chain; for example, it may be "en-GB&nbsp;→&nbsp;en".
               In that case, the chain is expanded by inserting the
               default-locale chain, resulting in:</p>
               <blockquote>
                 <p>se-FI →</p>
                 <p>se →</p>
                 <p>fi →</p>
                 <p><em>en-GB →</em></p>
                 <p><em>en →</em></p>
                 <p>root</p>
               </blockquote>
             </td>
           </tr>
           <tr id="iqaw9">
             <td id="iqaw10" style="vertical-align: top;" nowrap>
               <p><strong>Inherited item</strong> lookup</p>
             </td>
             <td id="iqaw12" style="vertical-align: top;" nowrap>
               <p>se-FI+key&nbsp;→</p>
               <p>se+key →</p>
               <p><em>root_alias*+key&nbsp;</em></p>
               <p>→&nbsp;root+key</p>
             </td>
             <td id="rkc43" style="vertical-align: top;">
               <p>* If there is a root_alias to another key or
               locale, then insert that entire chain. For example,
               suppose that months for another calendar system have
               a root alias to Gregorian months. In that case, the
               root alias would change the key, and retry from se-FI
               downward. This can happen multiple times.</p>
               <blockquote>
                 <p>se-FI+key&nbsp;→</p>
                 <p>se+key →</p>
                 <p>root_alias*+key →</p>
                 <p><em>se-FI+key2&nbsp;→</em></p>
                 <p><em>se+key2 →</em></p>
                 <p>root_alias*+key2 →</p>
                 <p>root+key2</p>
               </blockquote>
             </td>
           </tr>
         </tbody>
       </table>
     </div>
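     <p>The inherited item lookup in the table above, including the
     retry triggered by a root alias, can be sketched as follows.
     All data here is hypothetical: the keys "months/buddhist" and
     "months/gregorian", the bundle contents, and the explicit
     PARENTS table are stand-ins chosen to mirror the example in the
     table.</p>

```python
# Hypothetical bundles; only "se" and "root" supply Gregorian months.
BUNDLES = {
    "se-FI": {},
    "se": {"months/gregorian": "se Gregorian months"},
    "root": {"months/gregorian": "root Gregorian months"},
}
PARENTS = {"se-FI": "se", "se": "root"}
# A root alias redirects one key to another.
ROOT_ALIASES = {"months/buddhist": "months/gregorian"}


def lookup_item(locale, key):
    """Walk the parent chain; on a root alias, restart with the new key."""
    seen = set()
    while True:
        current = locale
        while current is not None:
            bundle = BUNDLES.get(current, {})
            if key in bundle:
                return bundle[key]
            current = PARENTS.get(current)
        # Reached root without a value: follow a root alias if there is one.
        if key not in ROOT_ALIASES or key in seen:
            return None
        seen.add(key)
        key = ROOT_ALIASES[key]
```

     <p>Note that the retry starts again from the requested locale,
     so a value in "se" is preferred over the one in root.</p>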
     <p>Both the resource bundle inheritance and the inherited item
     inheritance use the parentLocale data, where available, instead
     of simple truncation.</p>
     <p>The fallback is a bit different for these two cases;
     internal aliases and keys are not involved in the bundle
     lookup, and the default locale is not involved in the item
     lookup. If the default-locale were used in the resource-item
     lookup, then strange results would occur. For example, suppose
     that the default locale is Swedish, and there is a Nama locale
     but no specific inherited item for collation. If the
     default-locale were used in resource-item lookup, it would
     produce odd and unexpected results for Nama sorting.</p>
     <p>The default locale is not even always used in resource
     bundle inheritance. For the following services, the fallback is
     always directly to the root locale rather than through default
     locale.</p>
     <ul>
       <li>collation</li>
       <li>break iteration</li>
       <li>case mapping</li>
       <li>transliteration
         <ul>
           <li>The lookup for transliteration is yet more
           complicated because of the interplay of source and target
           locales: see <em>Part 2 General, Section
           10.1&nbsp;<a href=
           "https://www.unicode.org/reports/tr35/tr35-general.html#Inheritance">Inheritance.</a></em></li>
         </ul>
       </li>
     </ul>
     <p>Thus if there is no Akan locale, for example, asking for a
     collation for Akan should produce the root collation, <em>not
     the Swedish collation.</em></p>
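     <p>The naïve resource bundle lookup, together with the services
     that bypass the default locale, can be sketched as follows. The
     set of available bundles is hypothetical, the chain is built by
     simple truncation rather than from parentLocale data, and the
     additional complications of transliteration lookup are not
     modeled.</p>

```python
# Hypothetical set of bundles that exist in some implementation.
AVAILABLE = {"se", "fi", "en", "en-GB", "sv", "root"}

# Services whose fallback goes directly to root.
NO_DEFAULT_FALLBACK = {"collation", "break iteration", "case mapping",
                       "transliteration"}


def truncation_chain(locale):
    """Chain produced by dropping trailing subtags, without root."""
    chain = []
    while locale:
        chain.append(locale)
        locale = locale.rpartition("-")[0]
    return chain


def find_bundle(requested, default_locale, service=None):
    """Return the first available bundle along the lookup chain."""
    candidates = truncation_chain(requested)
    if service not in NO_DEFAULT_FALLBACK:
        # General case: insert the default locale's own chain before root.
        candidates += truncation_chain(default_locale)
    candidates.append("root")
    for candidate in candidates:
        if candidate in AVAILABLE:
            return candidate
    return None
```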
     <p>The inherited item lookup must remain stable, because the
     resources are built with a certain fallback in mind; changing
     the core fallback order can render the bundle structure
     incoherent.</p>
     <p>Resource bundle lookup, on the other hand, is more flexible;
     changes in the view of the "best" match between the input
     request and the output bundle are more tolerable when they
     represent overall improvements for users. For more information,
     see
     <i><a href="#Fallback_Elements">A.1 Element
     fallback</a></i>.</p>
     <p>Where the LDML inheritance relationship does not match a
     target system, such as POSIX, the data logically should be
     fully resolved in converting to a format for use by that
     system, by adding <i>all</i> inherited data to each locale data
     set.</p>
     <p>For a more complete description of how inheritance applies
     to data, and the use of keywords, see <i><a href=
     "#Inheritance_and_Validity">Section 4.2 Inheritance</a></i>.</p>
     <p>The locale data does not contain general character
     properties that are derived from the <i>Unicode Character
     Database</i> [<a href=
     "https://unicode.org/reports/tr41/#UAX44">UAX44</a>]. That data
     being common across locales, it is not duplicated in the
     bundles. Constructing a POSIX locale from the CLDR data
     requires use of UCD data. In addition, POSIX locales may also
     specify the character encoding, which requires the data to be
     transformed into that target encoding.</p>
     <p><b>Warning:</b> If a locale has a different script than its
     parent (for example, sr_Latn), then special attention must be
     paid to make sure that all inheritance is covered. For example,
     auxiliary exemplar characters may need to be empty ("[]") to
     block inheritance.</p>
     <p><strong>Empty Override:</strong> There is one special value
     reserved in LDML to indicate that a child locale is to have no
     value for a path, even if the parent locale has a value for
     that path. That value is "∅∅∅". For example, if there is no
     phrase for "two days ago" in a language, that can be indicated
     with:</p>
     <pre>&lt;field type="day"&gt;
   &lt;relative type="-2"&gt;∅∅∅&lt;/relative&gt;
 </pre>
     <h4><a name="Multiple_Inheritance" id=
     "Multiple_Inheritance"></a><a name="Lateral_Inheritance" href=
     "#Lateral_Inheritance" id="Lateral_Inheritance">4.1.2 Lateral
     Inheritance</a></h4>
     <p>In the following instances, resources may inherit from
     within the same locale, <em>before inheriting from the parent</em>. </p>

     <table border="1" cellpadding="3" cellspacing=
     "0" class='simple' >
       <tbody>
         <tr>
           <th nowrap style="vertical-align: top;">Element</th>
           <th nowrap style="vertical-align: top;">Source</th>
           <th nowrap style="vertical-align: top;">Context</th>
         </tr>
         <tr>
           <td  style="vertical-align: top;">currency/pattern</td>
           <td   style="vertical-align: top;">currencyFormat</td>
           <td   style="vertical-align: top;">numberSystem = defaultNumberingSystem, unless otherwise specified*<br>
             currencyFormatLength  type=none, unless otherwise specified<br>
             currencyFormat type=&quot;standard&quot;, unless otherwise specified</td>
         </tr>
         <tr>
           <td  style="vertical-align: top;">currency/decimal</td>
           <td   style="vertical-align: top;">symbols/decimal</td>
           <td  style="vertical-align: top;">numberSystem = defaultNumberingSystem, unless otherwise specified</td>
         </tr>
         <tr>
           <td  style="vertical-align: top;">currency/group</td>
           <td  style="vertical-align: top;">symbols/group</td>
           <td style="vertical-align: top;">numberSystem = defaultNumberingSystem, unless otherwise specified</td>
         </tr>
       </tbody>
     </table>
     <p>* The &quot;unless otherwise specified&quot; clause is for when an API or other context indicates a different choice, such as <span style="vertical-align: top;">currencyFormat type=&quot;accounting&quot;</span>.    </p>
     <p>For example, with /currency[@type=&quot;CVE&quot;], the decimal symbol for almost all locales is the value from symbols/decimal, but for pt_CV it is explicitly &lt;decimal&gt;$&lt;/decimal&gt;.</p>
     <p>&nbsp;</p>
     <p>The following attributes use lateral inheritance for all elements with the DTD root = ldml, except where otherwise noted. The process is applied recursively.</p>
     <table border="1" cellpadding="3" cellspacing=
     "0" class='simple' >
       <tbody>
         <tr>
           <th nowrap style="vertical-align: top;">Attribute</th>
           <th nowrap style="vertical-align: top;">Fallback</th>
           <th nowrap style="vertical-align: top;">Exception Elements</th>
         </tr>
         <tr>
           <td  style="vertical-align: top;">case</td>
           <td   style="vertical-align: top;">&quot;nominative&quot; → ∅</td>
           <td   style="vertical-align: top;">caseMinimalPairs</td>
         </tr>
         <tr>
           <td  style="vertical-align: top;">gender</td>
           <td   style="vertical-align: top;">default_gender(locale) → ∅</td>
           <td   style="vertical-align: top;">genderMinimalPairs</td>
         </tr>
         <tr>
           <td  style="vertical-align: top;">count</td>
           <td  style="vertical-align: top;">plural_rules(locale, x)  → &quot;other&quot;  → ∅</td>
           <td  style="vertical-align: top;">minDays, pluralMinimalPairs</td>
         </tr>
         <tr>
           <td  style="vertical-align: top;">ordinal</td>
           <td  style="vertical-align: top;">plural_rules(locale, x)  → &quot;other&quot;  → ∅</td>
           <td  style="vertical-align: top;">ordinalMinimalPairs</td>
         </tr>
       </tbody>
     </table>
     <p>The gender fallback is to neuter if the locale has a neuter gender, otherwise masculine. This may be extended in the future if necessary. See also <a href="tr35-general.html#Grammatical_Features">Part 2, Section 15, Grammatical Features</a>.</p>

     <p>For example,    if there is no value for a path, and that path has a
       [@count="x"] attribute and value, then:</p>
     <ol>
       <li>If &quot;x&quot; is numeric, the path falls back to the path with [@count=«the plural rules category for x for that locale»], within the same locale.
         <ol>
           <li>For example, [@count="0"] falls back to [@count="other"] for English, while for French it falls back to [@count="one"].</li>
         </ol>
       </li>
       <li>If "x" is anything but "other", it falls back to
         a path [@count="other"], within the same locale.</li>
       <li>If &quot;x&quot; is &quot;other&quot;,
        it falls back to the path
       that is completely missing the count item, within the same locale.</li>
       <li>If there is no value for that path in the same locale, the same
       process is used for the original path in the parent locale.</li>
     </ol>
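     <p>The four steps above can be sketched in code. The following is a minimal illustration, not the CLDR implementation: the names <code>count_chain</code>, <code>lookup</code>, <code>english</code> and <code>french</code> are hypothetical, the plural-rule functions are toy stand-ins that cover only the cases mentioned above, and locale data is modeled as plain dictionaries keyed by path.</p>

```python
from typing import Callable, Dict, List, Optional, Tuple


def count_chain(count: str, plural_category: Callable[[str], str]) -> List[Optional[str]]:
    """Count values tried within one locale; None means 'no count attribute'."""
    chain: List[Optional[str]] = [count]
    if count.isdigit():
        # Step 1: a numeric count falls back to its plural category.
        chain.append(plural_category(count))
    if chain[-1] != "other":
        # Step 2: anything but "other" falls back to "other".
        chain.append("other")
    # Step 3: "other" falls back to the path without the count attribute.
    chain.append(None)
    return chain


def lookup(
    data: Dict[str, Dict[str, str]],
    locales: List[str],
    base: str,
    count: str,
    plural_category: Callable[[str], str],
) -> Optional[Tuple[str, str, str]]:
    """Step 4: try the whole chain in each locale, from child to root."""
    chain = count_chain(count, plural_category)
    for locale in locales:
        for c in chain:
            path = base if c is None else f'{base}[@count="{c}"]'
            value = data.get(locale, {}).get(path)
            if value is not None:
                return locale, path, value
    return None


# Toy plural rules (assumption): only the behavior for 0 and 1 is modeled.
def english(n: str) -> str:
    return "one" if n == "1" else "other"


def french(n: str) -> str:
    return "one" if n in ("0", "1") else "other"


# Placeholder data, not real CLDR values.
DATA = {
    "fr": {'displayName[@count="other"]': "fr-other-value"},
    "root": {"displayName": "root-value"},
}
```

     <p>With this toy data, <code>lookup(DATA, ["fr_CA", "fr", "root"], "displayName", "0", french)</code> tries every count variant in fr_CA first and only then moves to fr, where the <code>[@count="other"]</code> path supplies the value.</p>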

 	  	    <p>A path may have multiple attributes with lateral inheritance. In such a case, all of the combinations are tried, and in the order supplied above. For example (this is the very worst case):</p>
             <p>/compoundUnitPattern1[@count=&quot;few&quot;][@gender=&quot;feminine&quot;][@case=&quot;accusative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@count=&quot;few&quot;][@gender=&quot;feminine&quot;][@case=&quot;nominative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@count=&quot;few&quot;][@gender=&quot;feminine&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@count=&quot;few&quot;][@gender=&quot;neuter&quot;][@case=&quot;accusative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@count=&quot;few&quot;][@gender=&quot;neuter&quot;][@case=&quot;nominative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@count=&quot;few&quot;][@gender=&quot;neuter&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@count=&quot;few&quot;][@case=&quot;accusative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@count=&quot;few&quot;][@case=&quot;nominative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@count=&quot;few&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>&nbsp;</p>
             <p>/compoundUnitPattern1[@count=&quot;other&quot;][@gender=&quot;feminine&quot;][@case=&quot;accusative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@count=&quot;other&quot;][@gender=&quot;feminine&quot;][@case=&quot;nominative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@count=&quot;other&quot;][@gender=&quot;feminine&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@count=&quot;other&quot;][@gender=&quot;neuter&quot;][@case=&quot;accusative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@count=&quot;other&quot;][@gender=&quot;neuter&quot;][@case=&quot;nominative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@count=&quot;other&quot;][@gender=&quot;neuter&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@count=&quot;other&quot;][@case=&quot;accusative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@count=&quot;other&quot;][@case=&quot;nominative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@count=&quot;other&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>&nbsp;</p>
             <p>/compoundUnitPattern1[@gender=&quot;feminine&quot;][@case=&quot;accusative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@gender=&quot;feminine&quot;][@case=&quot;nominative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@gender=&quot;feminine&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@gender=&quot;neuter&quot;][@case=&quot;accusative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@gender=&quot;neuter&quot;][@case=&quot;nominative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@gender=&quot;neuter&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@case=&quot;accusative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1[@case=&quot;nominative&quot;]<span style="vertical-align: top;"> →</span></p>
             <p>/compoundUnitPattern1</p>
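     <p>The sequence above is the Cartesian product of the three per-attribute fallback chains, with count varying slowest and case varying fastest. A minimal sketch of how such a sequence could be generated follows; the function name <code>lateral_paths</code> and its parameters are hypothetical, and the default gender is passed in as an assumption rather than derived from locale data.</p>

```python
from itertools import product
from typing import List, Optional


def _chain(value: str, fallback: str) -> List[Optional[str]]:
    """value -> fallback -> None, where None means the attribute is omitted."""
    chain: List[Optional[str]] = [value]
    if value != fallback:
        chain.append(fallback)
    chain.append(None)
    return chain


def lateral_paths(base: str, count: str, gender: str, case: str,
                  default_gender: str = "neuter") -> List[str]:
    """All paths tried within one locale, in lookup order."""
    counts = _chain(count, "other")
    genders = _chain(gender, default_gender)
    cases = _chain(case, "nominative")
    paths = []
    # itertools.product varies the last iterable fastest, which matches the
    # order listed above: case first, then gender, then count.
    for c, g, k in product(counts, genders, cases):
        path = base
        if c is not None:
            path += f'[@count="{c}"]'
        if g is not None:
            path += f'[@gender="{g}"]'
        if k is not None:
            path += f'[@case="{k}"]'
        paths.append(path)
    return paths


PATHS = lateral_paths("/compoundUnitPattern1", "few", "feminine", "accusative")
```

     <p>For the worst case shown above, the list has 27 entries, beginning with the fully qualified path and ending with the bare <code>/compoundUnitPattern1</code>. A numeric count would add one more link to the count chain, as described in the steps for [@count="x"].</p>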

     <p>&nbsp;</p>
     <p><em>Examples:</em></p>
     <table class='simple' border="1" cellpadding="3" cellspacing=
     "0" id="a1bn3">
       <caption>
         <a name="Count_Fallback_normal" href=
         "#Count_Fallback_normal" id="Count_Fallback_normal">Count
         Fallback: normal</a>
       </caption>
       <tbody>
         <tr>
           <th nowrap style="vertical-align: top;">Locale</th>
           <th nowrap style="vertical-align: top;">Path</th>
         </tr>
         <tr>
           <td nowrap style="vertical-align: top;">fr-CA</td>
           <td nowrap id="iqaw" style="vertical-align: top;">
           <code>//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="x"]</strong></code></td>
         </tr>
         <tr>
           <td nowrap style="vertical-align: top;">fr-CA</td>
           <td nowrap id="iqaw16" style="vertical-align: top;">
           <code>//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="other"]</strong></code></td>
         </tr>
         <tr>
           <td nowrap style="vertical-align: top;">fr</td>
           <td nowrap id="iqaw19" style="vertical-align: top;">
           <code>//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="x"]</strong></code></td>
         </tr>
         <tr>
           <td nowrap style="vertical-align: top;">fr</td>
           <td nowrap id="iqaw18" style="vertical-align: top;">
           <code>//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="other"]</strong></code></td>
         </tr>
         <tr>
           <td nowrap style="vertical-align: top;">root</td>
           <td nowrap id="iqaw21" style="vertical-align: top;">
           <code>//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="x"]</strong></code></td>
         </tr>
         <tr>
           <td nowrap style="vertical-align: top;">root</td>
           <td nowrap id="iqaw20" style="vertical-align: top;">
           <code>//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="other"]</strong></code></td>
         </tr>
       </tbody>
     </table>
     <p>Note that there may be an alias in root that changes the
     path and starts again from the requested locale, such as:</p>
     <p><code>&lt;unitLength type="<strong>narrow</strong>"&gt;<br>
     &nbsp;&nbsp;&nbsp;&lt;alias source="locale"
     path="../unitLength[@type='<strong>short</strong>']"/&gt;<br>
     &lt;/unitLength&gt;</code></p>
     <table class='simple' border="1" cellpadding="3" cellspacing=
     "0" id="a1bn2">
       <caption>
         <a name="Count_Fallback_currency" href=
         "#Count_Fallback_currency" id=
         "Count_Fallback_currency">Count Fallback: currency</a>
       </caption>
       <tbody>
         <tr>
           <th nowrap style="vertical-align: top;">Locale</th>
           <th nowrap style="vertical-align: top;">Path</th>
         </tr>
         <tr>
           <td nowrap style="vertical-align: top;">fr-CA</td>
           <td nowrap id="iqaw11" style="vertical-align: top;">
           <code>//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="x"]</strong></code></td>
         </tr>
         <tr>
           <td nowrap style="vertical-align: top;">fr-CA</td>
           <td nowrap id="iqaw6" style="vertical-align: top;">
           <code>//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="other"]</strong></code></td>
         </tr>
         <tr>
           <td nowrap style="vertical-align: top;">fr-CA</td>
           <td nowrap id="iqaw8" style="vertical-align: top;">
           <code>//ldml/numbers/currencies/currency[@type="CAD"]/displayName</code></td>
         </tr>
         <tr>
           <td nowrap style="vertical-align: top;">fr</td>
           <td nowrap id="iqaw15" style="vertical-align: top;">
           <code>//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="x"]</strong></code></td>
         </tr>
         <tr>
           <td nowrap style="vertical-align: top;">fr</td>
           <td nowrap id="iqaw14" style="vertical-align: top;">
           <code>//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="other"]</strong></code></td>
         </tr>
         <tr>
           <td nowrap style="vertical-align: top;">fr</td>
           <td nowrap id="iqaw13" style="vertical-align: top;">
           <code>//ldml/numbers/currencies/currency[@type="CAD"]/displayName</code></td>
         </tr>
         <tr>
           <td nowrap style="vertical-align: top;">root</td>
           <td nowrap id="iqaw25" style="vertical-align: top;">
           <code>//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="x"]</strong></code></td>
         </tr>
         <tr>
           <td nowrap style="vertical-align: top;">root</td>
           <td nowrap id="iqaw24" style="vertical-align: top;">
           <code>//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="other"]</strong></code></td>
         </tr>
         <tr>
           <td nowrap style="vertical-align: top;">root</td>
           <td nowrap id="iqaw23" style="vertical-align: top;">
           <code>//ldml/numbers/currencies/currency[@type="CAD"]/displayName</code></td>
         </tr>
       </tbody>
     </table><br>
     <h4><a name="Parent_Locales" href="#Parent_Locales" id=
     "Parent_Locales">4.1.3 Parent Locales</a></h4>
     <p class="dtd">&lt;!ELEMENT parentLocales ( parentLocale* )
     &gt;<br>
     &lt;!ELEMENT parentLocale EMPTY &gt;<br>
     &lt;!ATTLIST parentLocale parent NMTOKEN #REQUIRED &gt;<br>
     &lt;!ATTLIST parentLocale locales NMTOKENS #REQUIRED &gt;</p>
     <p>In some cases, the normal truncation inheritance does not
     function well. This happens when:</p>
     <ol>
       <li>The child locale is of a different script. In this case,
       mixing elements from the parent into the child data results
       in a mishmash.</li>
       <li>A large number of child locales behave similarly, and
       differently from the truncation parent.</li>
     </ol>
     <p>The <span class="element">parentLocale</span> element is
     used to override the normal inheritance when accessing CLDR
     data.</p>
     <p>For case 1, the children are script locales, and the parent
     is "root". For example:</p>
     <pre>
     &lt;parentLocale parent="root" locales="az_Cyrl ha_Arab … zh_Hant"/&gt;</pre>
     <p>For case 2, the children and parent share the same primary
     language, but the region is changed. For example:</p>
     <pre>
     &lt;parentLocale parent="es_419" locales="es_AR es_BO … es_UY es_VE"/&gt;</pre>
     <p>Collation data, however, is an exception. Since collation
     rules do not truly inherit data from the parent, the
     parentLocale element is not necessary and not used for
     collation. Thus, for a locale like zh_Hant in the example
     above, the parentLocale element would dictate the parent as
     "root" when referring to main locale data, but for collation
     data, the parent locale would still be "zh", even though the
     parentLocale element is present for that locale.</p>
     <p>Since parentLocale information is not localizable on a per
     locale basis, the parentLocale information is contained in
     CLDR’s <a href="tr35-info.html">supplemental data.</a></p>
     <p>When a <span class="element">parentLocale</span> element is
     used to override normal inheritance, the following invariants
     must always be true:</p>
     <ol>
       <li>If X is the parentLocale of Y, then either X is the root
       locale, or X has the same base language code as Y. For
       example, the parent of "en" cannot be "fr", and the parent of
       "en_YY" cannot be "fr" or "fr_XX".</li>
       <li>If X is the parentLocale of Y, Y must not be a base
       language locale. For example, the parent of "en" cannot be
       "en_XX".</li>
       <li>There can never be cycles, such as: X parent of Y ...
       parent of X.</li>
     </ol>
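     <p>As an illustration of how the override interacts with truncation, and of how the three invariants might be checked, here is a minimal sketch. The function names are hypothetical, the override table contains only entries taken from the examples above, and locale IDs are assumed to use underscore separators with no extensions.</p>

```python
from typing import Dict, List, Optional

# Subset of the examples above; the real list lives in supplemental data.
OVERRIDES = {"az_Cyrl": "root", "ha_Arab": "root", "zh_Hant": "root",
             "es_AR": "es_419", "es_BO": "es_419"}


def parent(locale: str, overrides: Dict[str, str]) -> Optional[str]:
    """Explicit parentLocale if present, else truncation, else root."""
    if locale == "root":
        return None
    if locale in overrides:
        return overrides[locale]
    return locale.rsplit("_", 1)[0] if "_" in locale else "root"


def inheritance_chain(locale: str, overrides: Dict[str, str]) -> List[str]:
    """Locales consulted for main data, from the requested locale to root."""
    chain = []
    current: Optional[str] = locale
    while current is not None:
        chain.append(current)
        current = parent(current, overrides)
    return chain


def check_invariants(overrides: Dict[str, str]) -> List[str]:
    """Return a list of human-readable violations (empty if none)."""
    errors = []
    for child, par in overrides.items():
        # Invariant 1: parent is root or shares the base language.
        if par != "root" and par.split("_")[0] != child.split("_")[0]:
            errors.append(f"{child}: parent {par} has a different base language")
        # Invariant 2: a base language locale cannot have an explicit parent.
        if "_" not in child:
            errors.append(f"{child}: base language locale has an explicit parent")
    # Invariant 3: no cycles when following parents.
    for child in overrides:
        seen = set()
        current: Optional[str] = child
        while current is not None:
            if current in seen:
                errors.append(f"{child}: cycle through {current}")
                break
            seen.add(current)
            current = parent(current, overrides)
    return errors
```

     <p>Passing an empty override table models the collation exception described above: <code>inheritance_chain("zh_Hant", {})</code> goes through "zh", while the same call with <code>OVERRIDES</code> goes directly to "root".</p>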
     <h3><a name="Inheritance_and_Validity" href=
     "#Inheritance_and_Validity" id="Inheritance_and_Validity">4.2
     Inheritance and Validity</a></h3>
     <p>The following describes in more detail how to determine the
     exact inheritance of elements, and the validity of a given
     element in LDML.</p>
     <h4><a name="Definitions" href="#Definitions" id=
     "Definitions">4.2.1 Definitions</a></h4>
     <p><i>Blocking</i> elements are those whose subelements do not
     inherit from parent locales. For example, a &lt;collation&gt;
     element is a blocking element: everything in a
     &lt;collation&gt; element is treated as a single lump of data,
     as far as inheritance is concerned. For more information, see
     <a href="#Valid_Attribute_Values">Section 5.5 Valid Attribute
     Values</a>.</p>
     <p>Attributes that serve to distinguish multiple elements at
     the same level are called <i>distinguishing</i> attributes. For
     example, the <i>type</i> attribute distinguishes different
     elements in lists of translations, such as:</p>
     <pre>&lt;language type="aa"&gt;Afar&lt;/language&gt;
 &lt;language type="ab"&gt;Abkhazian&lt;/language&gt;</pre>
     <p>Distinguishing attributes affect inheritance; two elements
     with different distinguishing attributes are treated as
     different for purposes of inheritance. For more information,
     see <a href="#Valid_Attribute_Values">Section 5.5 Valid
     Attribute Values</a>. Other attributes are called
     nondistinguishing (or informational) attributes. These carry
     separate information, and do not affect inheritance.</p>
     <p>For any element in an XML file, <i>an element chain</i> is a
     resolved [<a href="#XPath">XPath</a>] leading from the root to
     an element, with attributes on each element in alphabetical
     order. So in, say, <a href=
     "https://github.com/unicode-org/cldr/blob/master/common/main/el.xml">https://github.com/unicode-org/cldr/blob/master/common/main/el.xml</a>
     we may have:</p>
     <pre>&lt;ldml&gt;
   &lt;identity&gt;
     &lt;version number="1.1" /&gt;
     &lt;language type="el" /&gt;
   &lt;/identity&gt;
   &lt;localeDisplayNames&gt;
     &lt;languages&gt;
       &lt;language type="ar"&gt;Αραβικά&lt;/language&gt;
 ...</pre>
     <p>Which gives the following element chains (among others):</p>
     <ul>
       <li>//ldml/identity/version[@number="1.1"]</li>
       <li>
       //ldml/localeDisplayNames/languages/language[@type="ar"]</li>
     </ul>
     <p>An element chain A is an <i>extension</i> of an element
     chain B if B is equivalent to an initial portion of A. For
     example, #2 below is an extension of #1. (Equivalent, depending
     on the tree, may not be "identical to". See below for an
     example.)</p>
     <ol>
       <li>//ldml/localeDisplayNames</li>
       <li>
       //ldml/localeDisplayNames/languages/language[@type="ar"]</li>
     </ol>
     <p>An LDML file can be thought of as an ordered list of
     <i>element pairs</i>: &lt;element chain, data&gt;, where the
     element chains are all the chains for the end-nodes. (This
     works because of restrictions on the structure of LDML,
     including that it does not allow mixed content.) The ordering
     is the ordering that the element chains are found in the file,
     and thus determined by the DTD.</p>
     <p>For example, some of those pairs would be the following.
     Notice that the first has the null string as element
     contents.</p>
     <ul>
       <li><b>&lt;</b>//ldml/identity/version[@number="1.1"]<b>,</b>
       ""<b>&gt;</b></li>
       <li>
       <b>&lt;</b>//ldml/localeDisplayNames/languages/language[@type="ar"]<b>,</b>
       "Αραβικά"<b>&gt;</b></li>
     </ul>
     <blockquote>
       <p><b>Note:</b> There are two exceptions to this:</p>
       <ol>
         <li>Blocking nodes and their contents are treated as a
         single end node.</li>
         <li>In terms of computing inheritance, the element pair
         consists of the element chain plus all distinguishing
         attributes; the value consists of the value (if any) plus
         any nondistinguishing attributes.</li>
       </ol>
       <blockquote>
         <p>Thus instead of the element pair being (a) below, it is
         (b):</p>
         <ol type="a">
           <li>
           <b>&lt;</b>//ldml/dates/calendars/calendar[@type='gregorian']/week/weekendStart[@day='sun'][@time='00:00']<b>,</b><br>

           <b>""&gt;</b></li>
           <li>
           <b>&lt;</b>//ldml/dates/calendars/calendar[@type='gregorian']/week/weekendStart<b>,</b><br>

           [@day='sun'][@time='00:00']<b>&gt;</b></li>
         </ol>
       </blockquote>
     </blockquote>
     <p>Two LDML element chains are <i>equivalent</i> when they
     would be identical if all attributes and their values were
     removed — except for distinguishing attributes. Thus the
     following are equivalent:</p>
     <ul>
       <li>
       <code>//ldml/localeDisplayNames/languages/language[@type="ar"]</code></li>
       <li>
       <code>//ldml/localeDisplayNames/languages/language[@type="ar"][@draft="unconfirmed"]</code></li>
     </ul>
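     <p>Equivalence and extension can be tested mechanically once the set of distinguishing attributes is known. The sketch below assumes a small, illustrative set (<code>type</code> and <code>count</code>); the authoritative set is not given in this section, and the function names are hypothetical. Attributes are assumed to be already in alphabetical order, and attribute values are assumed not to contain "/".</p>

```python
import re

# Assumption: illustrative subset only, not the full list used by CLDR.
DISTINGUISHING = frozenset({"type", "count"})

# Matches [@name="value"] or [@name='value'].
_ATTR = re.compile(r'\[@([\w-]+)=(["\'])(.*?)\2\]')


def canonical(chain: str, distinguishing=DISTINGUISHING) -> str:
    """Drop nondistinguishing attributes and normalize quotes."""
    def repl(match: "re.Match[str]") -> str:
        name, value = match.group(1), match.group(3)
        return f'[@{name}="{value}"]' if name in distinguishing else ""
    return _ATTR.sub(repl, chain)


def equivalent(a: str, b: str) -> bool:
    """True if the chains match once nondistinguishing attributes are removed."""
    return canonical(a) == canonical(b)


def is_extension(a: str, b: str) -> bool:
    """True if b is equivalent to an initial portion of a.

    In this sketch an equivalent chain counts as a trivial extension.
    """
    a_parts = canonical(a).split("/")
    b_parts = canonical(b).split("/")
    return len(b_parts) <= len(a_parts) and a_parts[:len(b_parts)] == b_parts


AR = '//ldml/localeDisplayNames/languages/language[@type="ar"]'
AR_DRAFT = AR + '[@draft="unconfirmed"]'
```

     <p>Here <code>equivalent(AR, AR_DRAFT)</code> holds because <code>draft</code> is not in the assumed distinguishing set, and <code>is_extension(AR, "//ldml/localeDisplayNames")</code> holds because the shorter chain is an initial portion of the longer one.</p>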
     <p>For any locale ID, a <i>locale chain</i> is an ordered list
     starting with the root and leading down to the ID. For
     example:</p>
     <blockquote>
       <p>&lt;root, de, de_DE, de_DE_xxx&gt;</p>
     </blockquote>
     <h4><a name="Resolved_Data_File" href="#Resolved_Data_File" id=
     "Resolved_Data_File">4.2.2 Resolved Data File</a></h4>
     <p>To produce a fully resolved locale data file from CLDR for a
     locale ID L, you start with L, and successively add unique
     items from the parent locales until you get up to root. More
     formally, this can be expressed as the following procedure.</p>
     <ol>
       <li>Let Result be initially L.</li>
       <li>For each Li in the locale chain for L, starting at L and
       going up to root:
         <ol>
           <li>Let Temp be a copy of the pairs in the LDML file for
           Li</li>
           <li>Replace each alias in Temp by the resolved list of
           pairs it points to.
             <ol>
               <li>The resolved list of pairs is obtained by
               recursively applying this procedure.</li>
               <li>That alias now blocks any inheritance from the
               parent. (See <i><a href="#Common_Elements">Section
               5.1 Common Elements</a></i> for an example.)</li>
             </ol>
           </li>
           <li>For each element pair P in Temp:
             <ol>
               <li>If P does not contain a blocking element, and
               Result does not have an element pair Q with an
               equivalent element chain, add P to Result.</li>
             </ol>
           </li>
         </ol>
       </li>
     </ol>
     <p><b>Notes:</b></p>
     <ul>
       <li>When adding an element pair to a result, it has to go in
       the right order for it to be valid according to the DTD.</li>
       <li>The identity element and its children are unaffected by
       resolution.</li>
       <li>The LDML data must be constructed so as to avoid
       circularity in step 2.2.</li>
     </ul>
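     <p>A simplified sketch of the resolution procedure is shown below. It is an illustration under stated assumptions, not the CLDR tooling: each file is modeled as a dictionary from element chain to value, chains are assumed to be already reduced to their distinguishing attributes, alias replacement (step 2.2) is omitted, DTD ordering of the result is not enforced, and the <code>is_blocked</code> predicate is a hypothetical stand-in for the test "P contains a blocking element".</p>

```python
from typing import Callable, Dict, List


def resolve(
    locale_chain: List[str],
    files: Dict[str, Dict[str, str]],
    is_blocked: Callable[[str], bool] = lambda chain: False,
) -> Dict[str, str]:
    """Merge pairs from the requested locale up to root.

    locale_chain is ordered from the requested locale to root, for
    example ["de_DE", "de", "root"].
    """
    # Step 1: the result starts as the requested locale's own pairs.
    result = dict(files.get(locale_chain[0], {}))
    # Step 2: walk up the parents, adding only pairs not yet present.
    for locale in locale_chain[1:]:
        for chain, value in files.get(locale, {}).items():
            if not is_blocked(chain) and chain not in result:
                result[chain] = value
    return result


# Placeholder data, not real CLDR values.
FILES = {
    "de_DE": {"//ldml/a": "de_DE-a"},
    "de": {"//ldml/a": "de-a", "//ldml/b": "de-b",
           "//ldml/collations/x": "de-x"},
    "root": {"//ldml/a": "root-a", "//ldml/b": "root-b", "//ldml/c": "root-c"},
}

RESOLVED = resolve(["de_DE", "de", "root"], FILES,
                   is_blocked=lambda chain: "/collations/" in chain)
```

     <p>The more specific locale always wins: <code>RESOLVED</code> takes <code>//ldml/a</code> from de_DE, <code>//ldml/b</code> from de, and <code>//ldml/c</code> from root, while the pair under the assumed blocking element is not inherited.</p>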
     <h4><a name="Valid_Data" href="#Valid_Data" id=
     "Valid_Data">4.2.3 Valid Data</a></h4>
     <p>The attribute <i>draft="x"</i> in LDML means that the data
     has not been approved by the subcommittee. (For more
     information, see <a href=
     "http://cldr.unicode.org/index/process">Process</a>). However,
     some data that is not explicitly marked as <i>draft</i> may be
     implicitly <i>draft</i>, either because it inherits it from a
     parent, or from an enclosing element.</p>
     <p><b>Example 2.</b> Suppose that new locale data is added for
     af (Afrikaans). To indicate that all of the data is
     <i>unconfirmed</i>, the attribute can be added to the top
     level.</p>
     <p><code>&lt;ldml version="1.1" draft="unconfirmed"&gt;<br>
     &nbsp;&lt;identity&gt;<br>
     &nbsp; &lt;version number="1.1" /&gt;<br>
     &nbsp; &lt;language type="af" /&gt;<br>
     &nbsp;&lt;/identity&gt;<br>
     &nbsp;&lt;characters&gt;...&lt;/characters&gt;<br>
     &nbsp;&lt;localeDisplayNames&gt;...&lt;/localeDisplayNames&gt;<br>

     &lt;/ldml&gt;</code></p>
     <p>Any data can be added to that file, and the status will all
     be draft=<i>unconfirmed</i>. Once an item is vetted—<i>whether
     it is inherited or explicitly in the file</i>—then its status
     can be changed to <i>approved</i>. This can be done either by
     leaving draft="unconfirmed" on the enclosing element and
     marking the child with draft="approved", such as:</p>
     <p><code>&lt;ldml version="1.1" draft="unconfirmed"&gt;<br>
     &nbsp;&lt;identity&gt;<br>
     &nbsp; &lt;version number="1.1" /&gt;<br>
     &nbsp; &lt;language type="af" /&gt;<br>
     &nbsp;&lt;/identity&gt;<br>
     &nbsp;&lt;characters
     draft="approved"&gt;...&lt;/characters&gt;<br>
     &nbsp;&lt;localeDisplayNames&gt;...&lt;/localeDisplayNames&gt;<br>

     &nbsp;&lt;dates/&gt;<br>
     &nbsp;&lt;numbers/&gt;<br>
     &nbsp;&lt;collations/&gt;<br>
     &lt;/ldml&gt;</code></p>
     <p>However, normally the draft attributes should be
     canonicalized, which means they are pushed down to leaf nodes
     as described in <i><a href="#Canonical_Form">Section 5.6
     Canonical Form</a></i>. If an LDML file has draft
     attributes that are not on leaf nodes, the file should be
     interpreted as if it were the canonicalized version of that
     file.</p>
     <p>More formally, here is how to determine whether data for an
     element chain E is implicitly or explicitly draft, given a
     locale L. Items 1, 2, and 4 are simply formalizations of
     what is in LDML already. Item 3 adds the new element.</p>
     <h4><a name="Checking_for_Draft_Status" href=
     "#Checking_for_Draft_Status" id=
     "Checking_for_Draft_Status">4.2.4 Checking for Draft
     Status</a></h4>
     <ol>
       <li>
         <b>Parent Locale Inheritance</b>
         <ol>
           <li>Walk through the locale chain until you find a locale
           ID L' with a data file D. (L' may equal L).</li>
           <li>Produce the fully resolved data file D' for D.</li>
           <li>In D', find the first element pair whose element
           chain E' is either equivalent to or an extension of
           E.</li>
           <li>If there is no such E', return <i>true</i></li>
           <li>If E' is not equivalent to E, truncate E' to the
           length of E.</li>
         </ol>
       </li>
       <li>
         <b>Enclosing Element Inheritance</b>
         <ol>
           <li>Walk through the elements in E', from back to front.
             <ol>
               <li>If you ever encounter draft=<i>x</i>, return
               <i>x</i></li>
             </ol>
           </li>
           <li>If L' = L, return <i>false</i></li>
         </ol>
       </li>
       <li>
         <b>Missing File Inheritance</b>
         <ol>
           <li>Otherwise, walk again through the elements in E',
           from back to front.
             <ol>
               <li>If you encounter a validSubLocales attribute
               (deprecated):
                 <ol>
                   <li>If L is in the attribute value, return
                   <i>false</i></li>
                   <li>Otherwise return <i>true</i></li>
                 </ol>
               </li>
             </ol>
           </li>
         </ol>
       </li>
       <li>
         <b>Otherwise</b>
         <ol>
           <li>Return <i>true</i></li>
         </ol>
       </li>
     </ol>
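     <p>The core of items 2 and 4 can be sketched as follows. This is a deliberately reduced illustration: it assumes that item 1 has already produced the truncated chain E' and the locale L' whose file supplied it, it skips the deprecated validSubLocales handling in item 3, and it models E' as a list of (element name, attributes) pairs. The function name and data shape are hypothetical.</p>

```python
from typing import Dict, List, Tuple, Union

Element = Tuple[str, Dict[str, str]]


def draft_status(chain: List[Element], requested: str,
                 file_locale: str) -> Union[str, bool]:
    """Return the draft value x, or False (not draft), or True (draft)."""
    # Item 2.1: walk the elements from back to front; the innermost
    # draft attribute wins.
    for _name, attrs in reversed(chain):
        if "draft" in attrs:
            return attrs["draft"]
    # Item 2.2: data found in the requested locale's own file is not draft.
    if file_locale == requested:
        return False
    # Item 4: otherwise the data is treated as draft.
    return True


CHARACTERS = [("ldml", {"draft": "unconfirmed"}),
              ("characters", {"draft": "approved"})]
DISPLAY_NAMES = [("ldml", {"draft": "unconfirmed"}),
                 ("localeDisplayNames", {})]
```

     <p>Applied to the Afrikaans example above, <code>draft_status(CHARACTERS, "af", "af")</code> yields "approved" because the child element overrides the enclosing one, while <code>draft_status(DISPLAY_NAMES, "af", "af")</code> yields "unconfirmed" from the top-level element.</p>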
     <p>The validSubLocales in the most specific locale file (the one
     farthest from root) "wins" through the full resolution step (data
     from more specific files replacing data from less specific
     ones).</p>
     <h4><a name="Keyword_and_Default_Resolution" href=
     "#Keyword_and_Default_Resolution" id=
     "Keyword_and_Default_Resolution">4.2.5 Keyword and Default
     Resolution</a></h4>
     <p>When accessing data based on keywords, the following process
     is used. Consider the following example:</p>
     <ul>
       <li>The locale 'de' has collation types A, B, C, and no
       &lt;default&gt; element</li>
       <li>The locale 'de_CH' has &lt;default type='B'&gt;</li>
     </ul>
     <p>Here are the searches for various combinations.</p>
     <table class='simple' border="1" cellpadding="0" cellspacing=
     "0">
       <tr>
         <td><strong>User Input</strong></td>
         <td><strong>Lookup in Locale</strong></td>
         <td><strong>For</strong></td>
         <td><strong>Comment</strong></td>
       </tr>
       <tr>
         <td rowspan="3">de_CH<br>
         <em>no keyword</em></td>
         <td>de_CH</td>
         <td>default collation type</td>
         <td>finds "B"</td>
       </tr>
       <tr>
         <td>de_CH</td>
         <td>collation type=B</td>
         <td>not found</td>
       </tr>
       <tr>
         <td>de</td>
         <td>collation type=B</td>
         <td><em>found</em></td>
       </tr>
       <tr>
         <td rowspan="4">de<br>
         <em>no keyword</em></td>
         <td>de</td>
         <td>default collation type</td>
         <td>not found</td>
       </tr>
       <tr>
         <td>root</td>
         <td>default collation type</td>
         <td>finds "standard"</td>
       </tr>
       <tr>
         <td>de</td>
         <td>collation type=standard</td>
         <td>not found</td>
       </tr>
       <tr>
         <td>root</td>
         <td>collation type=standard</td>
         <td><i>found</i></td>
       </tr>
       <tr>
         <td>de_u_co_A</td>
         <td>de</td>
         <td>collation type=A</td>
         <td><i>found</i></td>
       </tr>
       <tr>
         <td rowspan="2">de_u_co_standard</td>
         <td>de</td>
         <td>collation type=standard</td>
         <td>not found</td>
       </tr>
       <tr>
         <td>root</td>
         <td>collation type=standard</td>
         <td><i>found</i></td>
       </tr>
       <tr>
         <td rowspan="6">de_u_co_foobar</td>
         <td>de</td>
         <td>collation type=foobar</td>
         <td>not found</td>
       </tr>
       <tr>
         <td>root</td>
         <td>collation type=foobar</td>
         <td>not found, starts looking for default</td>
       </tr>
       <tr>
         <td>de</td>
         <td>default collation type</td>
         <td>not found</td>
       </tr>
       <tr>
         <td>root</td>
         <td>default collation type</td>
         <td>finds "standard"</td>
       </tr>
       <tr>
         <td>de</td>
         <td>collation type=standard</td>
         <td>not found</td>
       </tr>
       <tr>
         <td>root</td>
         <td>collation type=standard</td>
         <td><i>found</i></td>
       </tr>
     </table>
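     <p>The searches in the table above can be sketched in code. The
     following is a minimal Python sketch, not part of the
     specification: the <code>PARENT</code> and <code>DATA</code>
     tables and all function names are hypothetical, and the toy data
     mirrors only the 'de' / 'de_CH' example.</p>

```python
# Hypothetical toy data: 'de' has types A, B, C and no <default>;
# 'de_CH' has only <default type='B'>; root has the type 'standard'.
PARENT = {"de_CH": "de", "de": "root", "root": None}
DATA = {
    "de_CH": {"default": "B", "types": set()},
    "de": {"default": None, "types": {"A", "B", "C"}},
    "root": {"default": "standard", "types": {"standard"}},
}


def chain(locale):
    """Yield the locale followed by its ancestors, ending at root."""
    while locale is not None:
        yield locale
        locale = PARENT[locale]


def find_type(locale, ctype):
    """Return the first locale in the parent chain that defines ctype."""
    for loc in chain(locale):
        if ctype in DATA[loc]["types"]:
            return loc
    return None


def find_default(locale):
    """Return the first <default> value found in the parent chain."""
    for loc in chain(locale):
        if DATA[loc]["default"] is not None:
            return DATA[loc]["default"]
    return None


def resolve(locale, keyword=None):
    """Return (collation type, locale in which that type was found)."""
    if keyword is not None:
        found = find_type(locale, keyword)
        if found is not None:
            return keyword, found
    # No keyword, or the requested type does not exist: use the default.
    default = find_default(locale)
    return default, find_type(locale, default)
```

     <p>With this toy data, <code>resolve("de", "foobar")</code>
     first fails to find "foobar" in 'de' and root, then falls back to
     the default found in root, following the same sequence as the
     last block of rows in the table.</p>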
     <p>Examples of "search" collator lookup; 'de' has a
     language-specific version, but 'en' does not:</p>
     <table class='simple' border="1" cellpadding="0" cellspacing=
     "0">
       <tr>
         <td><strong>User Input</strong></td>
         <td><strong>Lookup in Locale</strong></td>
         <td><strong>For</strong></td>
         <td><strong>Comment</strong></td>
       </tr>
       <tr>
         <td rowspan="2">de_CH_u_co_search</td>
         <td>de_CH</td>
         <td>collation type=search</td>
         <td>not found</td>
       </tr>
       <tr>
         <td>de</td>
         <td>collation type=search</td>
         <td><i>found</i></td>
       </tr>
       <tr>
         <td rowspan="3">en_US_u_co_search</td>
         <td>en_US</td>
         <td>collation type=search</td>
         <td>not found</td>
       </tr>
       <tr>
         <td>en</td>
         <td>collation type=search</td>
         <td>not found</td>
       </tr>
       <tr>
         <td>root</td>
         <td>collation type=search</td>
         <td><i>found</i></td>
       </tr>
     </table>
     <p>Examples of lookup for Chinese collation types. Note:</p>
     <ul>
       <li>All of the Chinese-specific collation types are provided
       in the 'zh' locale</li>
       <li>For 'zh' the &lt;default&gt; element specifies "pinyin";
       for 'zh_Hant' the &lt;default&gt; element specifies "stroke".
       However any of the available Chinese collation types can be
       explicitly requested for any Chinese locale.</li>
     </ul>
     <table class='simple' border="1" cellpadding="0" cellspacing=
     "0">
       <tr>
         <td><strong>User Input</strong></td>
         <td><strong>Lookup in Locale</strong></td>
         <td><strong>For</strong></td>
         <td><strong>Comment</strong></td>
       </tr>
       <tr>
         <td rowspan="3">zh_Hant<br>
         <em>no keyword</em></td>
         <td>zh_Hant</td>
         <td>default collation type</td>
         <td>finds "stroke"</td>
       </tr>
       <tr>
         <td>zh_Hant</td>
         <td>collation type=stroke</td>
         <td>not found</td>
       </tr>
       <tr>
         <td>zh</td>
         <td>collation type=stroke</td>
         <td><i>found</i></td>
       </tr>
       <tr>
         <td rowspan="3">zh_Hant_HK_u_co_pinyin</td>
         <td>zh_Hant_HK</td>
         <td>collation type=pinyin</td>
         <td>not found</td>
       </tr>
       <tr>
         <td>zh_Hant</td>
         <td>collation type=pinyin</td>
         <td>not found</td>
       </tr>
       <tr>
         <td>zh</td>
         <td>collation type=pinyin</td>
         <td><i>found</i></td>
       </tr>
       <tr>
         <td rowspan="2">zh<br>
         <em>no keyword</em></td>
         <td>zh</td>
         <td>default collation type</td>
         <td>finds "pinyin"</td>
       </tr>
       <tr>
         <td>zh</td>
         <td>collation type=pinyin</td>
         <td><i>found</i></td>
       </tr>
     </table>
     <blockquote>
       <p><b>Note:</b> It is an invariant that the default in root
       for a given element must always be a value that exists in
       root. So you cannot have the following in root:</p>
     </blockquote>
     <p><code>&lt;someElements&gt;<br>
     &nbsp; &lt;default type='a'/&gt;<br>
     &nbsp; &lt;someElement type='b'&gt;...&lt;/someElement&gt;<br>
     &nbsp; &lt;someElement type='c'&gt;...&lt;/someElement&gt;<br>
     <b>&nbsp; &lt;!-- no 'a' --&gt;</b><br>
     &lt;/someElements&gt;</code></p>
     <p>For identifiers, such as language codes, script codes,
     region codes, variant codes, types, keywords, currency symbols
     or currency display names, the default value is the identifier
     itself whenever no value is found in the root. Thus if there
     is no display name for the region code 'QA' in root, then the
     display name is simply 'QA'.</p>
     <h4><a name="Inheritance_vs_Related" href=
     "#Inheritance_vs_Related" id="Inheritance_vs_Related">4.2.6
     Inheritance vs Related Information</a></h4>
     <p>There are related types of data and processing that are easy
     to confuse:</p>
     <table class='simple'>
       <tr>
         <td rowspan="4">
           <p><strong>Inheritance</strong></p>
         </td>
         <td colspan="2">Part of the internal mechanism used by CLDR
         to organize and manage locale data. This is used to share
         common resources, and ease maintenance, and provide the
         best fallback behavior in the absence of data. <em>Should
         not be used for locale matching or likely
         subtags.</em></td>
       </tr>
       <tr>
         <td><em>Example:</em></td>
         <td>parent(en_AU) ⇒ en_001<br>
         parent(en_001) ⇒ en<br>
         parent(en) ⇒ root</td>
       </tr>
       <tr>
         <td><em>Data:</em></td>
         <td>supplementalData.xml &lt;parentLocale&gt;</td>
       </tr>
       <tr>
         <td><em>Spec:</em></td>
         <td><strong>Section <a href="#Inheritance_and_Validity">4.2
         Inheritance and Validity</a></strong></td>
       </tr>
       <tr>
         <td rowspan="4"><strong>DefaultContent</strong></td>
         <td colspan="2">Part of the internal mechanism used by CLDR
         to manage locale data. A particular sublocale is designated
         the defaultContent for a parent, so that the parent
         exhibits consistent behavior. <em>Should not be used for
         locale matching or likely subtags.</em></td>
       </tr>
       <tr>
         <td><em>Example:</em></td>
         <td>addLikelySubtags(sr-ME) ⇒ sr-Latn-ME,
         minimize(de-Latn-DE) ⇒ de</td>
       </tr>
       <tr>
         <td><em>Data:</em></td>
         <td>supplementalMetadata.xml &lt;defaultContent&gt;</td>
       </tr>
       <tr>
         <td><em>Spec:</em></td>
         <td><strong>Part 6: Section 9.3&nbsp;<a href=
         "tr35-info.html#Default_Content">Default
         Content</a></strong></td>
       </tr>
       <tr>
         <td rowspan="4"><strong>LikelySubtags</strong></td>
         <td colspan="2">Provides most likely full subtag (script
         and region) in the absence of other information. A core
         component of LocaleMatching.</td>
       </tr>
       <tr>
         <td><em>Example:</em></td>
         <td>addLikelySubtags(zh) ⇒ zh-Hans-CN<br>
         addLikelySubtags(zh-TW) ⇒ zh-Hant-TW<br>
         minimize(zh-Hant, favorRegion) ⇒ zh-TW</td>
       </tr>
       <tr>
         <td><em>Data:</em></td>
         <td>likelySubtags.xml &lt;likelySubtags&gt;</td>
       </tr>
       <tr>
         <td><em>Spec:</em></td>
         <td><strong>Section <a href="#Likely_Subtags">4.3 Likely
         Subtags</a></strong></td>
       </tr>
       <tr>
         <td rowspan="4"><strong>LocaleMatching</strong></td>
         <td colspan="2">Provides the best match for the user’s
         language(s) among an application’s supported
         languages.</td>
       </tr>
       <tr>
         <td><em>Example:</em></td>
         <td>bestLocale(userLangs=&lt;en, fr&gt;,
         appLangs=&lt;fr-CA, ru&gt;) ⇒ fr-CA</td>
       </tr>
       <tr>
         <td><em>Data:</em></td>
         <td>languageInfo.xml &lt;languageMatching&gt;</td>
       </tr>
       <tr>
         <td><em>Spec:</em></td>
         <td><strong>Section <a href="#LanguageMatching">4.4
         Language Matching</a></strong></td>
       </tr>
     </table>
     <h3><a name="Likely_Subtags" href="#Likely_Subtags" id=
     "Likely_Subtags">4.3 Likely Subtags</a></h3>
     <p class="dtd">&lt;!ELEMENT likelySubtag EMPTY &gt;<br>
     &lt;!ATTLIST likelySubtag from NMTOKEN #REQUIRED&gt;<br>
     &lt;!ATTLIST likelySubtag to NMTOKEN #REQUIRED&gt;</p>
     <p>There are a number of situations where it is useful to be
     able to find the most likely language, script, or region. For
     example, given the language "zh" and the region "TW", what is
     the most likely script? Given the script "Thai" what is the
     most likely language or region? Given the region TW, what is
     the most likely language and script?</p>
     <p>Conversely, given a locale, it is useful to find out which
     fields (language, script, or region) may be superfluous, in the
     sense that they contain the likely tags. For example, "en_Latn"
     can be simplified down to "en" since "Latn" is the likely
     script for "en"; "ja_Jpan_JP" can be simplified down to
     "ja".</p>
     <p>The <i>likelySubtag</i> supplemental data provides default
     information for computing these values. This data is based on
     the default content data, the population data, and the
     suppress-script data in [<a href="#BCP47">BCP47</a>]. It is
     heuristically derived, and may change over time.</p>
     <p>For the relationship between Inheritance, DefaultContent,
     LikelySubtags, and LocaleMatching, see <strong><em>Section
     4.2.6 <a href="tr35.html#Inheritance_vs_Related">Inheritance vs
     Related Information</a></em></strong>.</p>
     <p>To look up data in the table, see if a locale matches one of
     the <b>from</b> attribute values. If so, fetch the
     corresponding <b>to</b> attribute value. For example, the
     Chinese data looks like the following:</p>
     <blockquote>
       <p class="example">&lt;likelySubtag from="zh"
       to="zh_Hans_CN"/&gt;<br>
       &lt;likelySubtag from="zh_HK" to="zh_Hant_HK"/&gt;<br>
       &lt;likelySubtag from="zh_Hani" to="zh_Hani_CN"/&gt;<br>
       &lt;likelySubtag from="zh_Hant" to="zh_Hant_TW"/&gt;<br>
       &lt;likelySubtag from="zh_MO" to="zh_Hant_MO"/&gt;<br>
       &lt;likelySubtag from="zh_TW" to="zh_Hant_TW"/&gt;</p>
     </blockquote>
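     <p>As an illustration, the elements above can be loaded into a
     simple lookup table. This is a minimal Python sketch using the
     standard library XML parser; the <code>SAMPLE</code> string and
     the function name are hypothetical, and real data would be read
     from likelySubtags.xml.</p>

```python
import xml.etree.ElementTree as ET

# The Chinese entries from the example, wrapped in a <likelySubtags> element.
SAMPLE = """<likelySubtags>
  <likelySubtag from="zh" to="zh_Hans_CN"/>
  <likelySubtag from="zh_HK" to="zh_Hant_HK"/>
  <likelySubtag from="zh_Hani" to="zh_Hani_CN"/>
  <likelySubtag from="zh_Hant" to="zh_Hant_TW"/>
  <likelySubtag from="zh_MO" to="zh_Hant_MO"/>
  <likelySubtag from="zh_TW" to="zh_Hant_TW"/>
</likelySubtags>"""


def load_likely_subtags(xml_text):
    """Map each 'from' attribute value to its 'to' attribute value."""
    root = ET.fromstring(xml_text)
    return {el.get("from"): el.get("to") for el in root.iter("likelySubtag")}


table = load_likely_subtags(SAMPLE)
```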
     <p>So looking up "zh_TW" returns "zh_Hant_TW", while looking up
     "zh" returns "zh_Hans_CN".</p>
     <p>In more detail, the data is designed to be used in the
     following operations.</p>
     <p>Note that as of CLDR v24, any field present in the 'from'
     field is also present in the 'to' field, so an input field
     will not change in the "Add Likely Subtags" operation. The data and
     operations can also be used with language tags using [<a href=
     "#BCP47">BCP47</a>] syntax, with the appropriate changes. In
     addition, certain common 'denormalized' language subtags such
     as 'iw' (for 'he') may occur in both the 'from' and 'to'
     fields. This allows for implementations that use those
     denormalized subtags to use the data with only minor changes to
     the operations.</p>
     <p>An implementation may choose to exclude language tags with the language subtag &quot;und&quot; from the following operation. In such a case, only the canonicalization is done. An implementation can declare that it is doing the exclusion, or can take a parameter that controls whether or not to do it.</p>
     <p>&nbsp;</p>
     <p><i><b>Add Likely Subtags:</b></i> <em>Given a source locale
     X, to return a locale Y where the empty subtags have been
     filled in by the most likely subtags.</em> This is written as X
     ⇒ Y ("X maximizes to Y").</p>
     <p>A subtag is called <em>empty</em> if it is a missing script
     or region subtag, or it is a base language subtag with the
     value "und". In the description below, a subscript on a subtag
     <em>x</em> indicates which tag it is from:
     <em>x<sub>s</sub></em> is in the source,
     <em>x<sub>m</sub></em> is in a match, and <em>x<sub>r</sub></em>
     is in the final result.</p>
     <p>This operation is performed in the following way.</p>
     <ol>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">
         <strong>Canonicalize.</strong>
         <ol>
           <li>Make sure the input locale is in canonical form: uses
           the right separator, and has the right casing.</li>
           <li style="margin-top: 0.5em; margin-bottom: 0.5em">
           Replace any deprecated subtags with their canonical
           values using the &lt;alias&gt; data in supplemental
           metadata. Use the first value in the replacement list, if
           it exists. Language tag replacements may have multiple
           parts, such as "sh" ➞ "sr_Latn" or "mo" ➞ "ro_MD". In such
           a case, the original script and/or region are retained if
           there is one. Thus "sh_Arab_AQ" ➞ "sr_Arab_AQ", not
           "sr_Latn_AQ".</li>
           <li>If the tag is a legacy language tag
           (marked as “Type: grandfathered” in BCP 47; see &lt;variable
           id="$grandfathered" type="choice"&gt; in the supplemental
           data), then return it.</li>
           <li>Remove the script code 'Zzzz' and the region code
           'ZZ' if they occur.</li>
           <li>Get the components of the cleaned-up source tag
           <em>(language<sub>s</sub>, script<sub>s</sub>,</em> and
           <em>region<sub>s</sub></em>), plus any variants and
           extensions.</li>
         </ol>
       </li>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">
         <strong>Lookup.</strong> Look up each of the following in
         order, and stop on the first match:
         <ol>
           <li style="margin-top: 0.5em; margin-bottom: 0.5em">
           <em>language<sub>s</sub>_script<sub>s</sub>_region<sub>s</sub></em></li>
           <li style="margin-top: 0.5em; margin-bottom: 0.5em">
           <em>language<sub>s</sub>_region<sub>s</sub></em></li>
           <li style="margin-top: 0.5em; margin-bottom: 0.5em">
           <em>language<sub>s</sub>_script<sub>s</sub></em></li>
           <li style="margin-top: 0.5em; margin-bottom: 0.5em">
           <em>language<sub>s</sub></em></li>
           <li>und<em>_script<sub>s</sub></em></li>
         </ol>
       </li>
       <li>
         <strong>Return</strong>
         <ol>
           <li>If there is no match, either return
             <ol>
               <li>an error value, or</li>
               <li>the match for "und" (in APIs where a valid
               language tag is required).</li>
             </ol>
           </li>
           <li>Otherwise there is a match = <span style=
           "margin-top: 0.5em; margin-bottom: 0.5em"><em>language<sub>m</sub>_script<sub>m</sub>_region<sub>m</sub></em></span></li>
           <li>Let x<sub>r</sub> = x<sub>s</sub> if x<sub>s</sub> is
           not empty, and x<sub>m</sub> otherwise.</li>
           <li>Return the language tag composed of
           <em>language<sub>r</sub> _ script<sub>r</sub> _
           region<sub>r</sub></em> + variants + extensions.</li>
         </ol>
       </li>
     </ol>
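     <p>The steps above can be sketched as follows. This is a
     minimal Python sketch under stated assumptions, not a reference
     implementation: the <code>LIKELY</code> table holds only the
     entries quoted in this section, the function names are
     hypothetical, and alias replacement, legacy tags, variants, and
     extensions are omitted.</p>

```python
# Toy subset of the likelySubtag data quoted in this section.
LIKELY = {
    "zh": "zh_Hans_CN",
    "zh_HK": "zh_Hant_HK",
    "zh_Hani": "zh_Hani_CN",
    "zh_Hant": "zh_Hant_TW",
    "zh_MO": "zh_Hant_MO",
    "zh_TW": "zh_Hant_TW",
    "und_TW": "zh_Hant_TW",
}


def parse(tag):
    """Step 1 (simplified): fix separator and casing, drop Zzzz and ZZ,
    and return (language, script, region) with "" for missing subtags."""
    parts = tag.replace("-", "_").split("_")
    lang, script, region = parts[0].lower(), "", ""
    for p in parts[1:]:
        if len(p) == 4 and p.isalpha():
            script = p.title()
        elif (len(p) == 2 and p.isalpha()) or (len(p) == 3 and p.isdigit()):
            region = p.upper()
    if script == "Zzzz":
        script = ""
    if region == "ZZ":
        region = ""
    return lang, script, region


def join(*subtags):
    """Join the non-empty subtags with underscores."""
    return "_".join(s for s in subtags if s)


def add_likely_subtags(tag):
    lang, script, region = parse(tag)
    # Step 2: try the keys in the listed order, stopping on the first match.
    candidates = [
        join(lang, script, region),
        join(lang, region),
        join(lang, script),
        lang,
        join("und", script),
    ]
    for key in candidates:
        if key in LIKELY:
            m_lang, m_script, m_region = parse(LIKELY[key])
            break
    else:
        return None  # Step 3.1: no match, signal an error value
    # Step 3.3: keep non-empty source subtags, fill the rest from the match.
    r_lang = m_lang if lang == "und" else lang
    return join(r_lang, script or m_script, region or m_region)
```

     <p>Because empty subtags are dropped when the candidate keys are
     built, some candidates coincide (for "zh_SG" the first two keys
     are identical); a production implementation would skip such
     repeats.</p>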
     <p>The lookup can be optimized. For example, if any of the tags
     in Step 2 are the same as previous ones in that list, they do
     not need to be tested.</p>
     <p><i>Example:</i></p>
     <ul>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">
         <p>Input is ZH-ZZZZ-SG.</p>
       </li>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">
         <p>Normalize to zh_SG.</p>
       </li>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">
         <p>Lookup in table. No match.</p>
       </li>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">
         <p>Lookup zh, and get the match (zh_Hans_CN). Substitute
         SG, and return zh_Hans_SG.</p>
       </li>
     </ul>
     <p>To find the most likely language for a country, or language
     for a script, use "und" as the language subtag. For example,
     looking up "und_TW" returns zh_Hant_TW.</p>
     <p>A goal of the algorithm is that if X ⇒ Y, and X' results
     from replacing an empty subtag in X by the corresponding
     subtag in Y, then X' ⇒ Y. For example, if und_AF ⇒ fa_Arab_AF,
     then:</p>
     <ul>
       <li>fa_Arab_AF ⇒ fa_Arab_AF</li>
       <li>und_Arab_AF ⇒ fa_Arab_AF</li>
       <li>fa_AF ⇒ fa_Arab_AF</li>
     </ul>
     <p>There are a small number of exceptions to this goal in the
     current data, where X ∈ {und_Bopo, und_Brai, und_Cakm,
     und_Limb, und_Shaw}.</p>
     <p><b><i>Remove</i></b> <i><b>Likely Subtags:</b> Given a
     locale, remove any fields that Add Likely Subtags would
     add.</i></p>
     <p>The reverse operation removes fields that would be added by
     the first operation.</p>
     <ol>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">First get
       max = AddLikelySubtags(inputLocale). If an error is signaled,
       return it.</li>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">Remove
       the variants from max.</li>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">Get the
 	      components of the max (<em>language<sub>max</sub></em>,
 	      <em>script<sub>max</sub></em>, <em>region<sub>max</sub></em>).</li>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">Then for
       <i>trial</i> in {<em>language<sub>max</sub></em>,
 	      <em>language<sub>max</sub>_region<sub>max</sub></em>,
 	      <em>language<sub>max</sub>_script<sub>max</sub></em>}
         <ul>
           <li style="margin-top: 0.5em; margin-bottom: 0.5em">If
           AddLikelySubtags(<i>trial</i>) = max, then return
           <i>trial</i> + variants.</li>
         </ul>
       </li>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">If you do
       not get a match, return max + variants.</li>
     </ol>
     <p>Example:</p>
     <ul>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">
         <p>Input is zh_Hant. Maximize to get zh_Hant_TW.</p>
       </li>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">
         <p>zh =&gt; zh_Hans_CN. No match, so continue.</p>
       </li>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">
         <p>zh_TW =&gt; zh_Hant_TW. Matches, so return zh_TW.</p>
       </li>
     </ul>
     <p>A variant of this favors the script over the region, thus
     using {language, language_script, language_region} in the
     above. If that variant is used, then the result in this example
     would be zh_Hant instead of zh_TW.</p>
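     <p>The Remove Likely Subtags steps, including the
     script-favoring variant, can be sketched as follows. This is a
     minimal Python sketch: <code>maximize</code> is a simplified
     stand-in for Add Likely Subtags that only accepts canonical
     <code>language[_Script][_REGION]</code> input, the
     <code>LIKELY</code> table holds only three entries from this
     section, the <code>favor_script</code> parameter name is
     hypothetical, and variants are not handled.</p>

```python
LIKELY = {"zh": "zh_Hans_CN", "zh_Hant": "zh_Hant_TW", "zh_TW": "zh_Hant_TW"}


def join(*subtags):
    """Join the non-empty subtags with underscores."""
    return "_".join(s for s in subtags if s)


def maximize(tag):
    """Simplified Add Likely Subtags for canonical input tags."""
    parts = tag.split("_")
    lang = parts[0]
    script = next((p for p in parts[1:] if len(p) == 4), "")
    region = next((p for p in parts[1:] if len(p) != 4), "")
    keys = (join(lang, script, region), join(lang, region),
            join(lang, script), lang, join("und", script))
    for key in keys:
        if key in LIKELY:
            m_lang, m_script, m_region = LIKELY[key].split("_")
            r_lang = m_lang if lang == "und" else lang
            return join(r_lang, script or m_script, region or m_region)
    return None


def minimize(tag, favor_script=False):
    max_tag = maximize(tag)
    if max_tag is None:
        return None  # propagate the error from Add Likely Subtags
    lang, script, region = max_tag.split("_")
    trials = [lang, join(lang, region), join(lang, script)]
    if favor_script:
        # Variant: try language_script before language_region.
        trials[1], trials[2] = trials[2], trials[1]
    for trial in trials:
        if maximize(trial) == max_tag:
            return trial
    return max_tag
```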
     <h3><a name="LanguageMatching" href="#LanguageMatching" id=
     "LanguageMatching">4.4 Language Matching</a></h3>
     <p class="dtd">&lt;!ELEMENT languageMatching ( languageMatches*
     ) &gt;<br>
     &lt;!ELEMENT languageMatches ( paradigmLocales*,
     matchVariable*, languageMatch* ) &gt;<br>
     &lt;!ATTLIST languageMatches type NMTOKEN #REQUIRED &gt;</p>
     <p class="dtd">&lt;!ELEMENT languageMatch EMPTY &gt;<br>
     &lt;!ATTLIST languageMatch desired CDATA #REQUIRED &gt;<br>
     &lt;!ATTLIST languageMatch supported CDATA #REQUIRED &gt;<br>
     &lt;!ATTLIST languageMatch percent NMTOKEN #REQUIRED &gt;<br>
     &lt;!ATTLIST languageMatch distance NMTOKEN #IMPLIED &gt;<br>
     &lt;!ATTLIST languageMatch oneway ( true | false ) #IMPLIED
     &gt;</p>
     <p class="dtd">&lt;!ELEMENT paradigmLocales EMPTY &gt;<br>
     &lt;!ATTLIST paradigmLocales locales NMTOKENS #REQUIRED
     &gt;</p>
     <p>Implementers are often faced with the issue of how to match
     the user's requested languages with their product's supported
     languages. For example, suppose that a product supports {ja-JP,
     de, zh-TW}. If the user understands written American English,
     German, French, Swiss German, and Italian, then
     <strong>de</strong> would be the best match; if s/he
     understands only Chinese (zh), then zh-TW would be the best
     match.</p>
     <p>The standard truncation-fallback algorithm does not work
     well when faced with the complexities of natural language. The
     language matching data is designed to fill that gap. Stated in
     those terms, language matching can have the effect of a more
     complex fallback, such as:</p>
     <p>sr-Cyrl-RS<br>
     sr-Cyrl<br>
     sr-Latn-RS<br>
     sr-Latn<br>
     sr<br>
     hr-Latn<br>
     hr</p>
     <p>Language matching is used to find the best supported locale
     ID given a requested list of languages. The requested list
     could come from different sources, such as the user's
     list of preferred languages in the OS Settings, or from a
     browser Accept-Language list. For example, if my native tongue
     is English, I can understand Swiss German and German, my French
     is rusty but usable, and Italian basic, ideally an
     implementation would allow me to select {gsw, de, fr} as my
     preferred list of languages, skipping Italian because my
     comprehension is not good enough for arbitrary content.</p>
     <p>Language Matching can also be used to get fallback data
     elements. In many cases, there may not be full data for a
     particular locale. For example, for a Breton speaker, the best
     fallback if data is unavailable might be French. That is,
     suppose we have found a Breton bundle, but it does not contain
     translation for the key "CN" (for the country China). It is
     best to return "chine", rather than falling back to the value
     in a default language such as Russian and getting "Китай".&nbsp; The
     language matching data can be used to get the closest fallback
     locales (of those supported) to a given language.</p>
     <p>For the relationship between Inheritance, DefaultContent,
     LikelySubtags, and LocaleMatching, see <strong><em>Section
     4.2.6 <a href="tr35.html#Inheritance_vs_Related">Inheritance vs
     Related Information</a></em></strong>.</p>
     <p>When such fallback is used for inherited item lookup, the
     normal order of inheritance is used,
     except that before using any data from <strong>root</strong>,
     the data for the fallback locales would be used if available.
     Language matching does not interact with the fallback of
     resources&nbsp;<em>within the locale-parent chain</em>. For
     example, suppose that we are looking for the value for a
     particular path <strong>P</strong> in <strong>nb-NO</strong>.
     In the absence of aliases, normally the following lookup is
     used.</p>
     <blockquote>
       <p><strong>nb-NO</strong> → <strong>nb</strong> →
       <strong>root</strong></p>
     </blockquote>
     <p>That is, we first look in <strong>nb-NO</strong>. If there
     is no value for <strong>P</strong> there, then we look in
     <strong>nb</strong>. If there is no value for
     <strong>P</strong> there, we return the value for
     <strong>P</strong> in root (or a code value, if there is
     nothing there). Remember that if there is an alias element
     along this path, then the lookup may restart with a different
     path in <strong>nb-NO</strong> (or another locale).</p>
     <p>However, suppose that <strong>nb-NO</strong> has the
     fallback values <strong>[nn da sv en]</strong>, derived from
     language matching. In that case, an implementation <em>may</em>
     progressively look up each of the listed locales, with the
     appropriate substitutions, returning the first value that is
     not found in <strong>root</strong>. This follows roughly the
     following pseudocode:</p>
     <ul>
       <li>value = lookup(P, nb-NO); if (locationFound != root)
       return value;</li>
       <li>value = lookup(P, nn-NO); if (locationFound != root)
       return value;</li>
       <li>value = lookup(P, da-NO); if (locationFound != root)
       return value;</li>
       <li>value = lookup(P, sv-NO); if (locationFound != root)
       return value;</li>
       <li>value = lookup(P, en-NO); return value;</li>
     </ul>
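     <p>The pseudocode above can be made concrete as follows. This
     is a minimal Python sketch: the nested-dictionary data layout
     and the function names are hypothetical, the parent chain uses
     plain truncation (ignoring &lt;parentLocale&gt; overrides and
     aliases), and a missing value is reported as <code>None</code>
     rather than as a code value.</p>

```python
def parent(locale):
    """Truncation parent of a locale; root has no parent."""
    if locale == "root":
        return None
    return locale.rsplit("-", 1)[0] if "-" in locale else "root"


def lookup(data, path, locale):
    """Walk the parent chain; return (value, locale where it was found)."""
    while locale is not None:
        value = data.get(locale, {}).get(path)
        if value is not None:
            return value, locale
        locale = parent(locale)
    return None, None


def lookup_with_fallback(data, path, locale, fallback_languages):
    """Try the locale, then each fallback language with the same
    remaining subtags, returning the first value not found in root."""
    _, _, rest = locale.partition("-")
    candidates = [locale] + [
        f"{lang}-{rest}" if rest else lang for lang in fallback_languages
    ]
    for candidate in candidates[:-1]:
        value, found = lookup(data, path, candidate)
        if found not in ("root", None):
            return value
    # Last candidate: accept whatever is found, including a root value.
    value, _ = lookup(data, path, candidates[-1])
    return value
```

     <p>For example, with data only in <code>da</code> and
     <code>root</code>, a lookup for nb-NO with the fallback list
     [nn, da, sv, en] skips the root value reached through nb-NO and
     nn-NO and returns the value found in <code>da</code>.</p>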
     <p>The locales in the fallback list are not used recursively.
     For example, for the lookup of a path in nb-NO, if
     <strong>fr</strong> were a fallback value for
     <strong>da</strong>, it would not matter for the above process.
     Only the original language matters.</p>
     <p>The language matching data is intended to be used according
     to the following algorithm. This is a logical description, and
     can be optimized for production in many ways. In this
     algorithm, the languageMatching data is interpreted as an
     ordered list.</p>
     <p>Distances between a given pair of subtags can be larger or smaller than the typical distances. For example, the distance between en and en-GB can be greater than that between en-GB and en-IE. In some cases, language and/or script differences can be as small as the typical region difference (for example, sr-Latn vs. sr-Cyrl).</p>
     <p>The distances resulting from the table are not linear, but are rather chosen to produce expected results. So a distance of 10 is not necessarily twice as &quot;bad&quot; as a distance of 5. Implementations may want to have a mode where script distances should swamp language distances. The tables are built such that this can be accomplished by multiplying the language distance by 0.25.</p>
     <p>The language matching algorithm takes a list of a user’s
     desired languages, and a list of the application’s supported
     languages.</p>
     <ul>
       <li>Set the best weighted distance BWD to ∞</li>
       <li>Set the best desired language BD to null</li>
       <li>Set the best supported language BS to null</li>
       <li>For each desired language D
         <ul>
 			<li>Compute a demotion value F, based on the position in
           the list.
             <ul>
               <li>This demotion value is up to the implementation,
               but is typically a positive value that increases
               according to how far D is from the start of the
               desired language list.</li>
             </ul>
           </li>
           <li>For each supported language S
             <ul>
               <li>Find the matching distance MD as described
               below.</li>
               <li>Compute the weighted distance WD as F + MD</li>
               <li>If WD &lt; BWD
                 <ul>
                   <li>BWD = WD</li>
                   <li>BD = D</li>
                   <li>BS = S</li>
                 </ul>
               </li>
             </ul>
           </li>
         </ul>
       </li>
       <li>If the BWD is less than a threshold, return &lt;BD, BS&gt;
         <ul>
         <li>The threshold is implementation-defined, typically
           set to greater than a default region difference, and less
         than a default script difference.</li>
         </ul>
       </li>
       <li>Otherwise BD = the default supported language (like
       English); return &lt;BD, null&gt;</li>
     </ul>
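     <p>The selection loop above can be sketched as follows. This is
     a minimal Python sketch under stated assumptions: the matching
     distance is passed in as a function, <code>toy_distance</code>
     is a hypothetical stand-in with made-up values (4 for a
     region-sized difference, 80 otherwise), and the linear demotion
     <code>F = position × demotion_step</code> is only one possible
     choice, since the demotion value is left to the
     implementation.</p>

```python
import math


def best_match(desired, supported, distance, demotion_step, threshold, default):
    """Return (best desired, best supported), or (default, None) when no
    pair has a weighted distance below the threshold."""
    bwd, bd, bs = math.inf, None, None
    for position, d in enumerate(desired):
        f = position * demotion_step  # demotion value F grows down the list
        for s in supported:
            wd = f + distance(d, s)   # weighted distance WD = F + MD
            if wd < bwd:
                bwd, bd, bs = wd, d, s
    if bwd < threshold:
        return bd, bs
    return default, None


def toy_distance(desired, supported):
    """Hypothetical stand-in for the matching distance MD."""
    if desired == supported:
        return 0
    if desired.split("-")[0] == supported.split("-")[0]:
        return 4   # same base language: assumed region-sized difference
    return 80      # assumed distance between unrelated languages
```

     <p>With desired languages [de-AT, fr] and supported languages
     [de, fr, ja], a demotion step of 5 selects the pair (de-AT, de),
     while a step of 3 lets the second-choice language fr win,
     because 3 + 0 is smaller than the region-sized distance of
     4.</p>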
     <p>To find the matching distance MD between any two languages,
     perform the following steps.</p>
     <ol>
       <li>Maximize each language using Section 4.3 <a href=
       "#Likely_Subtags">Likely Subtags</a>.
         <ul>
           <li>und is a special case: see below.</li>
         </ul>
       </li>
       <li>Set the match-distance MD to 0</li>
       <li>For each subtag in {language, script, region}
 <ol>
         <li>If respective subtags in each language tag are
           identical, remove the subtag from each (logically) and
         continue.</li>
         <li>Traverse the languageMatching data until a match is
           found.
           <ul>
             <li>* matches any field.</li>
             <li>If the oneway flag is false, then the match is
             symmetric; otherwise only match one direction.</li>
             <li>For region matching, use the mechanisms in <strong>Section 4.4.1 <a href=
               "#EnhancedLanguageMatching">Enhanced Language
             Matching</a></strong>.</li>
           </ul>
         </li>
 		  <li>Add the <strong>distance</strong> attribute value  to MD.
 		    <ul>
 		      <li>This used to be a <strong>percent</strong> attribute value, which was 100 - the distance attribute value.</li>
 	        </ul>
 		  </li>
 	    <li>Remove the subtag from each (logically)</li>
         </ol>
       </li>
       <li>Return MD</li>
     </ol>
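     <p>The distance computation can be sketched as follows. This is
     a minimal Python sketch under several assumptions: the inputs
     are already maximized (language, script, region) tuples; subtags
     are compared from the region end, truncating as in the nn-DE /
     nb-FR example later in this section; the <code>RULES</code> list
     is a hypothetical in-memory form whose distances are derived
     from the percent values shown in this section as 100 − percent;
     and the region mechanisms of Section 4.4.1 are omitted.</p>

```python
# Ordered rules: (desired pattern, supported pattern, distance, oneway).
RULES = [
    ("nn", "nb", 4, False),
    ("es-*-ES", "es-*-*", 7, False),
    ("*-*-*", "*-*-*", 4, False),   # default region difference
    ("*-*", "*-*", 80, False),      # default script difference
    ("*", "*", 99, False),          # default language difference
]


def field_match(pattern, subtags):
    """True if the pattern has the same number of fields as the
    (possibly truncated) tag and each field is '*' or equal."""
    return len(pattern) == len(subtags) and all(
        p == "*" or p == t for p, t in zip(pattern, subtags)
    )


def rule_distance(desired, supported):
    """Traverse the ordered rules until one matches."""
    for d_pat, s_pat, dist, oneway in RULES:
        d_pat, s_pat = d_pat.split("-"), s_pat.split("-")
        if field_match(d_pat, desired) and field_match(s_pat, supported):
            return dist
        # A rule that is not one-way also matches in the other direction.
        if not oneway and field_match(d_pat, supported) and field_match(s_pat, desired):
            return dist
    raise LookupError("rule list must end with wildcard defaults")


def matching_distance(desired, supported):
    """Sum the distances for each differing subtag level."""
    d, s = list(desired), list(supported)
    md = 0
    while d:
        if d[-1] != s[-1]:        # identical subtags contribute nothing
            md += rule_distance(d, s)
        d.pop()                   # remove the subtag from each (logically)
        s.pop()
    return md
```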
     <p>It is typically useful to set the demotion value F between
     successive elements of the desired languages list to be
     slightly greater than the default region difference. That
     avoids the following problem:<br></p>
     <p><em>Supported languages:</em> "de, fr, ja"<br></p>
     <p><em>User's desired languages:</em> "de-AT, fr"</p>
     <p>This user would expect to get "de", not "fr". In practice,
     when a user selects a list of preferred languages, they don't
     include all the regional variants ahead of their second base
     language. Yet while the user's list of desired languages doesn't
     really tell us the priority ranking among their languages,
     normally the fall-off between the user's languages is
     substantially greater than that between regional variants.
     Unless F is greater than the distance between de-AT and de-DE,
     the user’s second-choice language would be returned.</p>
     <p>The base language subtag "und" is a special case. Suppose we
     have the following situation:</p>
     <ul>
       <li>desired languages: {und, it}</li>
       <li>supported languages: {en, it}</li>
       <li>resulting language: en<br></li>
     </ul>
     <p>Part of the reason is that 'und' has a special function in
     BCP 47: it stands in for 'no supplied base language', so
     maximizing it produces a concrete language (here one that
     matches en) that the user never asked for. To prevent this from
     happening, if the desired base language is und, the language
     matcher should not apply likely subtags to it.&nbsp;</p>
     <p>Example:</p>
     <p>Suppose that nn-DE and nb-FR are being compared. They are
     first maximized to nn-Latn-DE and nb-Latn-FR, respectively. The
     list is searched. The first match is with "*-*-*", for a match
     of 96% (a distance of 4). The languages are truncated to nn-Latn
     and nb-Latn, whose scripts are identical, and then to nn and nb.
     The first match is also for a value of 96% (a distance of 4), so
     the result is 92% (a total distance of 8).</p>
     <p>Note that language matching is orthogonal to how closely
     two languages are related linguistically. For example, Breton
     is more closely related to Welsh than to French, but French is
     the better match (because it is more likely that a Breton
     reader will understand French than Welsh). This also
     illustrates that the matches are often asymmetric: it is not
     likely that a French reader will understand Breton.</p>
     <p>The "*" acts as a wild card, as shown in the following
     example:</p>
     <p class="example">&lt;languageMatch desired="es-*-ES"
     supported="es-*-ES" percent="100"/&gt;<br>
     &lt;!-- Latin American Spanishes are closer to each other.
     Approximate by having es-ES be further from everything
     else.--&gt;</p>
     <p>&nbsp;</p>
     <p class="example">&lt;languageMatch desired="es-*-ES"
     supported="es-*-*" percent="93"/&gt;</p>
     <p class="example"><br>
     &lt;languageMatch desired="*" supported="*"
     percent="1"/&gt;<br>
     &lt;!-- [Default value - must be at end!] Normally there is no
     comprehension of different languages.--&gt;</p>
     <p class="example"><br>
     &lt;languageMatch desired="*-*" supported="*-*"
     percent="20"/&gt;<br>
     &lt;!-- [Default value - must be at end!] Normally there is
     little comprehension of different scripts.--&gt;</p>
     <p class="example"><br>
     &lt;languageMatch desired="*-*-*" supported="*-*-*"
     percent="96"/&gt;<br>
     &lt;!-- [Default value - must be at end!] Normally there are
     small differences across regions.--&gt;</p>
     <p>When the language+region is not matched, and there is
     otherwise no reason to pick among the supported regions for
     that language, then some measure of geographic "closeness" can
     be used. The results may be more understandable by users.
     Looking for en-SK, for example, should fall back to something
     within Europe (e.g., en-GB) in preference to something far away
     and unrelated (e.g., en-SG). Such a closeness metric does not need
     to be exact; a small amount of data can be used to give an
     approximate distance between any two regions. However, any such
     data must be used carefully; although Hong Kong is closer to
     India than to the UK, it is unlikely that en-IN would be a
     better match to en-HK than en-GB would.</p>
     <h4><a name="EnhancedLanguageMatching" href=
     "#EnhancedLanguageMatching" id="EnhancedLanguageMatching">4.4.1
     Enhanced Language Matching</a></h4>
     <p>The enhanced format for language matching adds structure to
     enable better matching of languages. It is distinguished by
     having a suffix "_new" on the type, as in the example below.
     The extended structure allows matching to take into account
     broad similarities that would give better results. For example,
     for English the regions that are or inherit from US
     (AS|GU|MH|MP|PR|UM|VI|US) form a “cluster”. The regions in that
     cluster should be closer to one another than to any region
     outside it, and a region outside the cluster should be closer to
     another region outside the cluster than to one inside it. This
     issue arises with “world languages” such as English, Spanish,
     Portuguese, and Arabic.</p>
     <p><em>Example:</em></p>
     <pre>
     &lt;languageMatches type="written_new"&gt;<br>  &lt;paradigmLocales locales="en en-GB es es-419 pt-BR pt-PT"/&gt;<br> &lt;matchVariable id="$enUS" value="AS+GU+MH+MP+PR+UM+US+VI"/&gt;<br>       &lt;matchVariable id="$cnsar" value="HK+MO"/&gt;<br>        &lt;matchVariable id="$americas" value="019"/&gt;<br>       &lt;matchVariable id="$maghreb" value="MA+DZ+TN+LY+MR+EH"/&gt;<br>  &lt;languageMatch desired="no" supported="nb" distance="1"/&gt;&lt;!-- no ⇒ nb --&gt;<br>…
         &lt;languageMatch desired="ar_*_$maghreb" supported="ar_*_$maghreb" distance="4"/&gt;
                 &lt;!-- ar; *; $maghreb ⇒ ar; *; $maghreb --&gt;
         &lt;languageMatch desired="ar_*_$!maghreb"    supported="ar_*_$!maghreb"    distance="4"/&gt;
                 &lt;!-- ar; *; $!maghreb ⇒ ar; *; $!maghreb --&gt;<br>…</pre>
     <p>The <strong>matchVariable</strong> allows a rule to
     match multiple regions, as illustrated by
     <strong>$maghreb</strong>. The syntax is simple: it allows for
     + for <em>union</em> and - for <em>set difference</em>, but no
     precedence. So A+B-A+D is interpreted as (((A+B)-A)+D), not as
     (A+B)-(A+D). The variable <strong>id</strong> has a value of
     the form [$][a-zA-Z0-9]+. If $X is defined, then $!X
     automatically means all those regions that are not in $X.</p>
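     <p>The left-to-right evaluation described above can be sketched
     as follows. This is a minimal illustration, not part of the
     specification: the function names <code>eval_region_set</code>
     and <code>complement</code> are hypothetical, and each operand
     is treated as a single region code (macroregions are not
     expanded here).</p>

```python
import re

def eval_region_set(expr):
    """Evaluate 'A+B-C' strictly left to right, with no precedence."""
    tokens = re.findall(r"[+-]|[^+-]+", expr.replace(" ", ""))
    result = set()
    op = "+"  # the first operand is added to the empty set
    for tok in tokens:
        if tok in "+-":
            op = tok          # remember the operator for the next operand
        elif op == "+":
            result |= {tok}   # union
        else:
            result -= {tok}   # set difference
    return result

def complement(region_set, universe):
    """$!X: every region of the universe that is not in $X."""
    return set(universe) - set(region_set)
```

     <p>With these definitions, A+B-A+D evaluates to the set {B, D},
     which is (((A+B)-A)+D); the alternative reading (A+B)-(A+D)
     would give {B} instead.</p>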
     <p dir="ltr">When the set is interpreted, macroregions
     are (logically) transformed into a list of their contents, so
     “053+GB” → “AU+GB+NF+NZ”. This is done recursively, so 009 →
     “053+054+057+061+QO” → “AU+NF+NZ+FJ+NC+PG+SB+VU...”. Note that
     we use 019 for all of the Americas in the variables above,
     because en-US should be in the same cluster as es-419 and its
     contents.</p>
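     <p>The recursive expansion can be sketched as below. The
     containment table is a partial, illustrative copy of the values
     quoted in the paragraph above (real data comes from the
     territory containment information in the CLDR supplemental
     data), and the function names are hypothetical.</p>

```python
# Partial, illustrative containment table based on the example above.
# Codes that are missing from the table are treated as plain regions.
CONTAINS = {
    "009": ["053", "054", "057", "061", "QO"],
    "053": ["AU", "NF", "NZ"],
    "054": ["FJ", "NC", "PG", "SB", "VU"],
}

def expand(code):
    """Recursively replace a macroregion by the regions it contains."""
    if code not in CONTAINS:
        return {code}
    leaves = set()
    for child in CONTAINS[code]:
        leaves |= expand(child)
    return leaves

def expand_union(expr):
    """Expand every operand of a '+'-separated union such as '053+GB'."""
    regions = set()
    for operand in expr.split("+"):
        regions |= expand(operand)
    return regions
```

     <p>Here <code>expand_union("053+GB")</code> yields the set
     {AU, GB, NF, NZ}. Because the table is incomplete, 057, 061 and
     QO are left unexpanded in this sketch.</p>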
     <p>In the rules, the percent value (100..0) is replaced by a
     <strong>distance</strong> value, which is the inverse
     (0..100).</p>
     <p dir="ltr">These new variables and rules divide up the world
     into clusters, where items in the same clusters (for specific
     languages) get the normal regional difference, and items in
     different clusters get different weights.</p><br>
     <p dir="ltr">Each cluster can have one or more associated
     <strong>paradigmLocales</strong>. These are locales that are
     preferred within a cluster. So when matching desired=[en-SA]
     against [en-GU en en-IN en-GB], the value en-GB is returned.
     Both of {en-GU en} are in a different cluster. While {en-IN
     en-GB} are in the same cluster, and the same distance from
     en-SA, the preference is given to en-GB because it is in the
     paradigm locales. It would be possible to express this in
     rules, but using this mechanism handles these very common cases
     without bulking up the tables.<br></p>
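     <p>The tie-breaking role of the paradigm locales might be
     sketched as follows. The distance values are hypothetical and
     chosen only so that the two clusters differ; the function name
     <code>best_match</code> is likewise an assumption of this
     sketch, not an API defined by this specification.</p>

```python
# Taken from the paradigmLocales element in the example above.
PARADIGM_LOCALES = {"en", "en-GB", "es", "es-419", "pt-BR", "pt-PT"}

def best_match(distances):
    """Pick the supported locale with the smallest distance.

    `distances` maps each supported locale to its distance from the
    desired locale. Ties are broken in favor of paradigm locales.
    """
    def rank(locale):
        # False sorts before True, so paradigm locales win ties.
        return (distances[locale], locale not in PARADIGM_LOCALES)
    return min(distances, key=rank)

# Hypothetical distances from desired=en-SA; only their ordering matters.
example = {"en-GU": 5, "en": 5, "en-IN": 4, "en-GB": 4}
```

     <p>For the hypothetical distances above,
     <code>best_match(example)</code> returns en-GB: en-IN is equally
     close, but en-GB is a paradigm locale.</p>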
     <p dir="ltr">The <strong>paradigmLocales</strong> also allow
     matching to macroregions. For example, desired=[es-419] should
     match to {es-MX} more closely than to {es}, and vice versa:
     {es-MX} should match more closely to {es-419} than to {es}. But
     es-MX should match more closely to es-419 than to any of the
     other es-419 sublocales. In general, in the absence of other
     distance data, there is a ‘paradigm’ in each cluster that the
     others should match more closely to: en(-US), en-GB, es(-ES),
     es-419, ru(-RU)...</p>
     <h2><a name="XML_Format" href="#XML_Format" id="XML_Format">5
     XML Format</a></h2>
     <p>There are two kinds of data that can be expressed in LDML:
     language-dependent data and supplementary data. In either case,
     data can be split across multiple files, which can be in
     multiple directory trees.</p>
     <p>For example, the language-dependent data for Japanese in
     CLDR is present in the following files:</p>
     <ul>
       <li>common/collation/ja.xml</li>
       <li>common/main/ja.xml</li>
       <li>common/rbnf/ja.xml</li>
       <li>common/segmentations/ja.xml</li>
     </ul>
     <p>Data for cased languages such as French are in files
     like:</p>
     <ul>
       <li>common/casing/fr.xml</li>
     </ul>
     <p>The status of the data is the same, whether or not data is
     split. That is, for the purpose of validation and lookup, all
     of the data for the above ja.xml files is treated as if it were
     in a single file. These files have the &lt;ldml&gt; root
     element and use ldml.dtd. The file name must match the identity
     element. For example, the &lt;ldml&gt; file pa_Arab_PK.xml must
     contain the following elements:</p>
     <pre>
                         <strong>&lt;ldml&gt;</strong><br>       &lt;identity&gt;<br>            …<br>           <strong>&lt;language type="pa"/&gt;<br>               &lt;script type="Arab"/&gt;<br>               &lt;territory type="PK"/&gt;</strong><br>     &lt;/identity&gt;
 …</pre>
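     <p>A consistency check between a file name and its identity
     element could look like the sketch below. It assumes that the
     file name is formed by joining the language, script, territory
     and (if present) variant types with underscores; the helper
     names are hypothetical.</p>

```python
import xml.etree.ElementTree as ET

def locale_from_identity(ldml_root):
    """Join the language, script, territory and variant types with '_'."""
    identity = ldml_root.find("identity")
    parts = []
    for tag in ("language", "script", "territory", "variant"):
        element = identity.find(tag)
        if element is not None:
            parts.append(element.get("type"))
    return "_".join(parts)

def file_name_matches(file_name, ldml_root):
    """True if a name such as 'pa_Arab_PK.xml' agrees with the identity."""
    return file_name == locale_from_identity(ldml_root) + ".xml"

# Build the identity block from the example above.
root = ET.Element("ldml")
identity = ET.SubElement(root, "identity")
ET.SubElement(identity, "language", type="pa")
ET.SubElement(identity, "script", type="Arab")
ET.SubElement(identity, "territory", type="PK")
```

     <p>With this tree, <code>file_name_matches("pa_Arab_PK.xml",
     root)</code> holds, while a file named pa_PK.xml would be
     rejected.</p>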
     <p>Supplemental data can have different root elements,
     currently: ldmlBCP47, supplementalData, keyboard, and platform.
     Keyboard and platform files are considered distinct. The
     ldmlBCP47 files and supplementalData files that have the same
     root are all logically part of the same file; they are simply
     split into separate files for convenience. Implementations may
     split the files in different ways, also for their convenience.
     The files in /properties are also supplemental data files, but
     are structured like UCD properties.</p>
     <p>For example, supplemental data relating to Japan or the
     Japanese writing are in:</p>
     <ul>
       <li>common/supplemental/ (in many files, such as
       supplementalData.xml)</li>
       <li>common/transforms/Hiragana-Katakana.xml</li>
       <li>common/transforms/Hiragana-Latin.xml</li>
       <li>common/properties/scriptMetadata.txt</li>
       <li>common/bcp47/calendar.xml</li>
       <li>uca/allkeys_CLDR.txt (sorting)</li>
       <li>/keyboards/chromeos/ja-t-k0-chromeos.xml</li>
       <li>...</li>
     </ul>
     <p>Like the &lt;ldml&gt; files, the keyboard file names must
     match internal data: in particular, the locale attribute on the
     keyboard element must have a value that corresponds to the file
     name, such as &lt;keyboard locale="af-t-k0-android"&gt; for the
     file af-t-k0-android.xml.</p>
     <p>The following sections describe the structure of the XML
     format for language-dependent data. The more precise syntax is
     in the ldml.dtd file<i>; however, the DTD does not describe all
     the constraints on the structure.</i></p>
     <p>To start with, the root element is &lt;ldml&gt;, with the
     following DTD entry:</p>
     <p class='dtd'>&lt;!ELEMENT ldml
     (identity,(alias|(fallback*,localeDisplayNames?,layout?,contextTransforms?,characters?,<br>

     delimiters?,measurement?,dates?,numbers?,units?,listPatterns?,collations?,posix?,<br>

     segmentations?,rbnf?,annotations?,metadata?,references?,special*)))&gt;</p>
     <p>The XML structure is stable over releases. Elements and
     attributes may be deprecated: they are retained in the DTD but
     their usage is strongly discouraged. In most cases, an
     alternate structure is provided for expressing the information.
     There is only one exception: newer DTDs cannot be used with
     version 1.1 files, without some modification.</p>
     <p>In general, all translatable text in this format is in
     element contents, while attributes are reserved for types and
     non-translated information (such as numbers or dates). The
     reason that attributes are not used for translatable text is
     that spaces are not preserved, and we cannot predict where
     spaces may be significant in translated material.</p>
     <p>There are two kinds of elements in LDML: <i>rule</i>
     elements and <i>structure</i> elements. For structure elements,
     there are restrictions to allow for effective inheritance and
     processing:</p>
     <ol>
       <li>There is no "mixed" content: if an element has textual
       content, then it cannot contain any elements.</li>
       <li>The [<a href="#XPath">XPath</a>] leading to the content
       is unique; no two different pieces of textual content have
       the same [<a href="#XPath">XPath</a>].</li>
     </ol>
     <p>Rule elements do not have this restriction, but also do not
     inherit, except as an entire block. The rule elements are
     listed in serialElements in the supplemental metadata. See also
     <i><a href="#Inheritance_and_Validity">Section 4.2 Inheritance
     and Validity</a></i>. For more technical details, see <a href=
     "http://cldr.unicode.org/development/updating-dtds">Updating-DTDs</a>.</p>
     <p>Note that the data in examples given below is purely
     illustrative, and does not match any particular language. For a
     more detailed example of this format, see [<a href=
     "#LDML">Example</a>]. There is also a DTD for this format, but
     <i>remember that the DTD alone is not sufficient to understand
     the semantics, the constraints, or the
     interrelationships between the different elements and
     attributes</i>. You may wish to have copies of each of these to
     hand as you proceed through the rest of this document.</p>
     <p>In particular, all elements allow for draft versions to
     coexist in the file at the same time. Thus most elements are
     marked in the DTD as allowing multiple instances. However,
     unless an element is listed as a serialElement, or has a
     distinguishing attribute, it can only occur once as a
     subelement of a given element. Thus, for example, the following
     is illegal even though allowed by the DTD:</p>
     <p>&lt;languages&gt;<br>
     &nbsp; &lt;language type="aa"&gt;...&lt;/language&gt;<br>
     &nbsp; &lt;language type="aa"&gt;..&lt;/language&gt;</p>
     <p>There must be only one instance of these per parent, unless
     there are other distinguishing attributes (such as an alt
     attribute).</p>
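     <p>Because the DTD cannot express this constraint, a separate
     check is needed. The following sketch flags repeated children
     under one parent. It assumes, for illustration only, that
     <code>type</code> and <code>alt</code> are the distinguishing
     attributes, and it does not exempt serial elements; the function
     name is hypothetical.</p>

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed subset of distinguishing attributes, for illustration only.
DISTINGUISHING = ("type", "alt")

def duplicate_children(parent):
    """Return keys (tag, type, alt) that occur more than once under parent."""
    keys = Counter(
        (child.tag,) + tuple(child.get(name) for name in DISTINGUISHING)
        for child in parent
    )
    return [key for key, count in keys.items() if count > 1]

# The illegal example above, plus one legal alt variant.
languages = ET.Element("languages")
ET.SubElement(languages, "language", type="aa").text = "..."
ET.SubElement(languages, "language", type="aa").text = ".."
ET.SubElement(languages, "language", type="aa", alt="variant").text = "..."
```

     <p>Only the two plain <code>type="aa"</code> entries are
     reported; the entry with <code>alt="variant"</code> has a
     different key and is therefore allowed.</p>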
     <p>In general, LDML data should be in NFC format. However,
     certain elements may need to contain characters that are not in
     NFC, including exemplars, transforms, segmentations, and
     p/s/t/i/pc/sc/tc/ic rules in collation. These elements must not
     be normalized (either to NFC or NFD), or their meaning may be
     changed. Thus LDML documents must not be normalized as a whole.
     To prevent problems with normalization, no element value can
     start with a combining slash (U+0338 COMBINING LONG SOLIDUS
     OVERLAY).</p>
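     <p>A per-value check along these lines is sketched below. The
     flag <code>may_be_unnormalized</code> is a hypothetical way of
     marking values from elements such as exemplars or collation
     rules, which are allowed to contain non-NFC text.</p>

```python
import unicodedata

COMBINING_SLASH = "\u0338"  # COMBINING LONG SOLIDUS OVERLAY

def check_value(text, may_be_unnormalized=False):
    """Return a list of problems found in one element value."""
    problems = []
    if text.startswith(COMBINING_SLASH):
        problems.append("starts with U+0338")
    if not may_be_unnormalized and unicodedata.normalize("NFC", text) != text:
        problems.append("not in NFC")
    return problems
```

     <p>Note that the check compares against the NFC form but never
     rewrites the value, in keeping with the rule that documents must
     not be normalized as a whole.</p>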
     <p>Lists, such as <span class=
     "attribute">singleCountries</span>, are space-delimited: their
     items are separated by one or more XML whitespace
     characters. Such lists include:</p>
     <ul>
       <li>singleCountries</li>
       <li>preferenceOrdering</li>
       <li>references</li>
     </ul>
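     <p>Splitting such a list can be sketched as follows. The
     function name is hypothetical; the only assumption is that XML
     whitespace means space, tab, carriage return and line feed.</p>

```python
import re

XML_WHITESPACE = " \t\r\n"

def split_list(value):
    """Split a space-delimited attribute value on runs of XML whitespace."""
    stripped = value.strip(XML_WHITESPACE)
    return re.split("[ \t\r\n]+", stripped) if stripped else []
```

     <p>An explicit character class is used rather than a generic
     whitespace split, so that other Unicode space characters are not
     treated as separators.</p>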
     <h3><a name="Common_Elements" href="#Common_Elements" id=
     "Common_Elements">5.1 Common Elements</a></h3>
     <p>At any level in any element, two special elements are
     allowed.</p>
     <h4><a name="special" href="#special" id="special">5.1.1
     Element special</a></h4>
     <p>This element is designed to allow for arbitrary additional
     annotation and data that is product-specific. It has one
     required attribute <span class="attribute">xmlns</span>, which
     specifies the XML <a href=
     "https://www.w3.org/TR/REC-xml-names/">namespace</a> of the
     special data. For example, the following uses the version 1.0
     POSIX special element.</p>
     <pre>&lt;!DOCTYPE ldml SYSTEM "<span style=
     "color: blue">https://unicode.org/cldr/dtd/1.0/ldml.dtd</span>" [
     &lt;!ENTITY % posix SYSTEM "<span style=
 "color: blue">https://unicode.org/cldr/dtd/1.0/ldmlPOSIX.dtd</span>"&gt;
 <span style="color: blue">%posix;</span>
 ]&gt;
 &lt;ldml&gt;
 ...
 &lt;special xmlns:posix="<span style=
 "color: blue">https://www.opengroup.org/regproducts/xu.htm</span>"&gt;
         <span style=
 "color: green">&lt;!-- old abbreviations for pre-GUI days --&gt;</span>
         &lt;posix:messages&gt;
             &lt;posix:yesstr&gt;<span style=
 "color: blue">Yes</span>&lt;/posix:yesstr&gt;
             &lt;posix:nostr&gt;<span style=
 "color: blue">No</span>&lt;/posix:nostr&gt;
             &lt;posix:yesexpr&gt;<span style=
 "color: blue">^[Yy].*</span>&lt;/posix:yesexpr&gt;
             &lt;posix:noexpr&gt;<span style=
 "color: blue">^[Nn].*</span>&lt;/posix:noexpr&gt;
         &lt;/posix:messages&gt;
     &lt;/special&gt;
 &lt;/ldml&gt;
 </pre>
     <h5><a name="Sample_Special_Elements" href=
     "#Sample_Special_Elements" id="Sample_Special_Elements">5.1.1.1
     Sample Special Elements</a></h5>
     <p>The elements in this section are <i><b>not</b></i> part of
     the Locale Data Markup Language 1.0 specification. Instead,
     they are special elements used for application-specific data to
     be stored in the Common Locale Repository. They may change or
     be removed in future versions of this document, and are present
     here more as examples of how to extend the format. (Some of
     these items may move into a future version of the Locale Data
     Markup Language specification.)</p>
     <ul>
       <li><a href=
       "https://unicode.org/cldr/dtd/1.1/ldmlICU.dtd">https://unicode.org/cldr/dtd/1.1/ldmlICU.dtd</a></li>
       <li><a href=
       "https://unicode.org/cldr/dtd/1.1/ldmlOpenOffice.dtd">https://unicode.org/cldr/dtd/1.1/ldmlOpenOffice.dtd</a></li>
     </ul>
     <p>The above examples are old versions: consult the
     documentation for the specific application to see which should
     be used.</p>
     <p>These DTDs use namespaces and the special element. To
     include one or more, use the following pattern to import the
     special DTDs that are used in the file:</p>
     <pre>&lt;?xml version="<span style=
     "color: blue">1.0</span>" encoding="<span style=
     "color: blue">UTF-8</span>" ?&gt;
 &lt;!DOCTYPE ldml SYSTEM "<span style=
 "color: blue">https://unicode.org/cldr/dtd/1.1/ldml.dtd</span>" [
     &lt;!ENTITY % <span style=
 "color: blue">icu</span> SYSTEM "<span style=
 "color: blue">https://unicode.org/cldr/dtd/1.1/ldmlICU.dtd</span>"&gt;
     &lt;!ENTITY % <span style=
 "color: blue">openOffice</span> SYSTEM "<span style=
 "color: blue">https://unicode.org/cldr/dtd/1.1/ldmlOpenOffice.dtd</span>"&gt;
 <span style="color: blue">%icu;
 %openOffice;
 </span>]&gt;</pre>
     <p>Thus to include just the ICU DTD, one uses:</p>
     <pre>&lt;?xml version="<span style=
     "color: blue">1.0</span>" encoding="<span style=
     "color: blue">UTF-8</span>" ?&gt;
 &lt;!DOCTYPE ldml SYSTEM "<span style=
 "color: blue">https://unicode.org/cldr/dtd/1.1/ldml.dtd</span>" [
     &lt;!ENTITY % icu SYSTEM "<span style=
 "color: blue">https://unicode.org/cldr/dtd/1.1/ldmlICU.dtd</span>"&gt;
 <span style="color: blue">%icu;
 </span>]&gt;</pre>
     <blockquote>
       <p><b>Note:</b> A previous version of this document contained
       a special element for <a href=
       "http://www.open-std.org/jtc1/sc22/wg20/docs/n897-14652w25.pdf">
       ISO TR 14652</a> compatibility data. That element has been
       withdrawn, pending further investigation, since 14652 is a
       Type 1 TR: "when the required support cannot be obtained for
       the publication of an International Standard, despite
       repeated effort". See the ballot comments on <a href=
       "http://www.open-std.org/jtc1/sc22/wg20/docs/n948-J1N6769-14652.pdf">
       14652 Comments</a> for details on the 14652 defects. For
       example, most of these patterns make little provision for
       substantial changes in format when elements are empty, so are
       not particularly useful in practice. Compare, for example,
       the mail-merge capabilities of production software such as
       Microsoft Word or OpenOffice.</p>
       <p><b>Note:</b> While the CLDR specification guarantees
       backwards compatibility, the definition of specials is up to
       other organizations. Any assurance of backwards compatibility
       is up to those organizations.</p>
     </blockquote>
     <p>A number of the elements above can have extra information
     for <a name="OpenOffice" href="#OpenOffice" id=
     "OpenOffice">openoffice.org</a>, such as the following
     example:</p>
     <pre>    &lt;special xmlns:openOffice="<span style=
     "color: blue">https://www.openoffice.org</span>"&gt;
         &lt;openOffice:search&gt;
             &lt;openOffice:searchOptions&gt;
                 &lt;openOffice:transliterationModules&gt;<span style="color: blue">IGNORE_CASE</span>&lt;/openOffice:transliterationModules&gt;
             &lt;/openOffice:searchOptions&gt;
         &lt;/openOffice:search&gt;
     &lt;/special&gt;
 </pre>
     <h4><a name="Alias_Elements" href="#Alias_Elements" id=
     "Alias_Elements">5.1.2 Element alias</a></h4>
     <p class="dtd">&lt;!ELEMENT alias (special*) &gt;<br>
     &lt;!ATTLIST alias source NMTOKEN #REQUIRED &gt;<br>
     &lt;!ATTLIST alias path CDATA #IMPLIED&gt;</p>
     <p>The contents of any element in root can be replaced by an
     alias, which points to the path where the data can be
     found.</p>
     <p>Aliases will only ever appear in root with the form
     //ldml/.../alias[@source="locale"][@path="..."].</p>
     <p>Consider the following example in root:</p>
     <pre>
       &lt;calendar type="gregorian"&gt;<br> &lt;months&gt;<br>      &lt;default choice="format"/&gt;<br>      &lt;monthContext type="format"&gt;<br>            &lt;default choice="wide"/&gt;<br>            &lt;monthWidth type="abbreviated"&gt;<br>             <strong>&lt;alias source="locale" path="../monthWidth[@type='wide']"/&gt;</strong><br>                      &lt;/monthWidth&gt;</pre>
     <p>If the locale "de_DE" is being accessed for a month name for
     format/abbreviated, then a resource bundle at "de_DE" will be
     searched for a resource element at that path. If not found
     there, then the resource bundle at "de" will be searched, and
     so on. When the alias is found in root, then the search is
     restarted, but searching for format/<strong>wide</strong>
     element instead of format/abbreviated.</p>
     <p>If the <b>path</b> attribute is present, then its value is
     an [<a href="#XPath">XPath</a>] that points to a different node
     in the tree. For example:</p>
     <pre>
     &lt;alias source="locale" path="../monthWidth[@type='wide']"/&gt;</pre>
     <p>The default value if the path is not present is the same
     position in the tree. All of the attributes in the [<a href=
     "#XPath">XPath</a>] must be <i>distinguishing</i> attributes. For
     more details, see <a href="#Inheritance_and_Validity">Section
     4.2 Inheritance and Validity</a>.</p>
     <p>There is a special value for the source attribute, the
     constant <b>source="locale"</b>. This special value is
     equivalent to the locale being resolved. For example, consider
     the following example, where locale data for 'de' is being
     resolved:</p>
     <div align="center">
       <center>
         <table border="1" cellpadding="0" cellspacing="1">
           <caption>
             <a name="Inheritance_with_source_locale_" href=
             "#Inheritance_with_source_locale_" id=
             "Inheritance_with_source_locale_">Inheritance with
             source="locale"</a>
           </caption>
           <tr>
             <th>Root</th>
             <th>de</th>
             <th bgcolor="#C0C0C0">Resolved</th>
           </tr>
           <tr>
             <td><code>&lt;x&gt;<br>
             &nbsp; &lt;a&gt;1&lt;/a&gt;<br>
             &nbsp; &lt;b&gt;2&lt;/b&gt;<br>
             &nbsp; &lt;c&gt;3&lt;/c&gt;<br>
             <br>
             &lt;/x&gt;</code></td>
             <td><code>&lt;x&gt;<br>
             &nbsp;&lt;a&gt;11&lt;/a&gt;<br>
             &nbsp;&lt;b&gt;12&lt;/b&gt;<br>
             <br>
             &nbsp;&lt;d&gt;14&lt;/d&gt;<br>
             &lt;/x&gt;</code></td>
             <td bgcolor="#C0C0C0"><code>&lt;x&gt;<br>
             &nbsp;&lt;a&gt;11&lt;/a&gt;<br>
             &nbsp;&lt;b&gt;12&lt;/b&gt;<br>
             &nbsp;<span style=
             "background-color: #FFFF00"><span class=
             "inherited"><span style=
             "font-weight: 400;">&lt;c&gt;3&lt;/c&gt;</span></span></span><br>

             &nbsp;&lt;d&gt;14&lt;/d&gt;<br>
             &lt;/x&gt;</code></td>
           </tr>
           <tr>
             <td><code>&lt;y&gt;<br>
             &nbsp;&lt;alias source="locale" path="../x"/&gt;<br>
             &lt;/y&gt;</code></td>
             <td><code>&lt;y&gt;<br>
             <br>
             &nbsp;&lt;b&gt;22&lt;/b&gt;<br>
             <br>
             <br>
             &nbsp;&lt;e&gt;25&lt;/e&gt;<br>
             &lt;/y&gt;</code></td>
             <td bgcolor="#C0C0C0"><code>&lt;y&gt;<br>
             &nbsp;<span style=
             "background-color: #FFFF00"><span class=
             "inherited"><span style=
             "font-weight: 400;">&lt;a&gt;11&lt;/a&gt;</span></span></span><br>

             &nbsp;&lt;b&gt;22&lt;/b&gt;<br>
             &nbsp;<span style=
             "background-color: #FFFF00"><span class=
             "inherited"><span style=
             "font-weight: 400;">&lt;c&gt;3&lt;/c&gt;</span></span></span><br>

             &nbsp;<span style=
             "background-color: #FFFF00"><span class=
             "inherited"><span style=
             "font-weight: 400;">&lt;d&gt;14&lt;/d&gt;</span></span></span><br>

             &nbsp;&lt;e&gt;25&lt;/e&gt;<br>
             &lt;/y&gt;</code></td>
           </tr>
         </table>
       </center>
     </div>
     <p>The first row shows the inheritance within the &lt;x&gt;
     element, whereby &lt;c&gt; is inherited from root. The second
     row shows the inheritance within the &lt;y&gt; element, whereby
     &lt;a&gt;, &lt;c&gt;, and &lt;d&gt; are inherited through the
     alias in root. The alias in root is logically
     replaced not by the elements in root itself, but by elements in
     the 'target' locale.</p>
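     <p>The resolution shown in the table can be reproduced with a
     small sketch. The data layout (flattened paths such as
     <code>x/a</code>) and the names <code>DATA</code>,
     <code>ROOT_ALIASES</code> and <code>lookup</code> are
     assumptions made for illustration; real implementations work on
     full XPaths.</p>

```python
PARENT = {"de": "root", "root": None}

# Flattened version of the table above: locale -> {path: value}.
DATA = {
    "root": {"x/a": "1", "x/b": "2", "x/c": "3"},
    "de": {"x/a": "11", "x/b": "12", "x/d": "14", "y/b": "22", "y/e": "25"},
}
# Aliases found in root: element -> target element, with source="locale".
ROOT_ALIASES = {"y": "x"}

def lookup(locale, path):
    """Resolve path for locale, restarting from locale when an alias is hit."""
    current = locale
    while current is not None:
        if path in DATA[current]:
            return DATA[current][path]
        current = PARENT[current]
    # Nothing found up to root: follow an alias in root, if there is one,
    # and restart the search from the locale being resolved.
    prefix, _, leaf = path.partition("/")
    if prefix in ROOT_ALIASES:
        return lookup(locale, ROOT_ALIASES[prefix] + "/" + leaf)
    return None
```

     <p>Looking up y/a through y/e for de gives 11, 22, 3, 14 and 25,
     matching the Resolved column: the restart from de, rather than
     from root, is what makes &lt;a&gt; resolve to 11 instead of
     1.</p>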
     <p>For more details on data resolution, see <a href=
     "#Inheritance_and_Validity">Section 4.2 Inheritance and
     Validity</a>.</p>
     <p>Aliases must be resolved recursively. An alias may point to
     another path that results in another alias being found, and so
     on. For example, looking up Thai buddhist abbreviated months
     for the locale <strong>xx-YY</strong> may result in the
     following chain of aliases being followed:</p>
     <blockquote>
       <p>
       ../../calendar[@type="buddhist"]/months/monthContext[@type="format"]/monthWidth[@type="abbreviated"]</p>
       <p>xx-YY → xx → root // finds alias that changes path to:</p>
       <p>
       ../../calendar[@type="gregorian"]/months/monthContext[@type="format"]/monthWidth[@type="abbreviated"]</p>
       <p>xx-YY → xx → root // finds alias that changes path to:</p>
       <p>
       ../../calendar[@type="gregorian"]/months/monthContext[@type="format"]/monthWidth[@type="wide"]</p>
       <p>xx-YY → xx // finds value here</p>
     </blockquote>
     <p>It is an error to have a circular chain of aliases. That is,
     a collection of LDML XML documents must not have situations
     where a sequence of alias lookups (including inheritance and
     lateral inheritance) can be followed indefinitely without
     terminating.</p>
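     <p>A simple way to detect such a circular chain is to record the
     paths already visited while following aliases. The sketch below
     ignores locale inheritance and uses hypothetical, shortened path
     keys; it is meant only to illustrate the check.</p>

```python
def find_alias_cycle(aliases, start):
    """Follow alias targets from start; return the cycle if one is found."""
    seen = []
    path = start
    while path in aliases:
        if path in seen:
            return seen[seen.index(path):]
        seen.append(path)
        path = aliases[path]
    return None

# Hypothetical alias tables keyed by calendar/width, for illustration.
ok = {
    "buddhist/abbreviated": "gregorian/abbreviated",
    "gregorian/abbreviated": "gregorian/wide",
}
bad = {**ok, "gregorian/wide": "buddhist/abbreviated"}
```

     <p>The first table terminates at gregorian/wide, so no cycle is
     reported. In the second table the chain returns to its starting
     point, and the function returns the three paths that form the
     loop.</p>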
     <h4><a name="Element_displayName" href="#Element_displayName"
     id="Element_displayName">5.1.3 Element displayName</a></h4>
     <p>Many elements can have a display name. This is a translated
     name that can be presented to users when discussing the
     particular service. For example, a number format, used to
     format numbers using the conventions of that locale, can have a
     translated name for presentation in GUIs.</p>
     <pre>  &lt;numberFormat&gt;
     &lt;displayName&gt;<span style=
 "color: blue">Prozentformat</span>&lt;/displayName&gt;
 ...
   &lt;/numberFormat&gt;</pre>
     <p>Where present, the display names must be unique; that is,
     two distinct codes must not have the same display name.&nbsp;
     (There is one exception to this: in time zones, where parsing
     results would give the same GMT offset, the standard and
     daylight display names can be the same across different time
     zone IDs.) Any translations should follow customary practice
     for the locale in question. For more information, see [<a href=
     "#DataFormats">Data Formats</a>].</p>
     <h4><a name="Escaping_Characters" href="#Escaping_Characters"
     id="Escaping_Characters">5.1.4 Escaping Characters</a></h4>
     <p>Unfortunately, XML does not have the capability to contain
     all Unicode code points. Due to this, in certain instances
     extra syntax is required to represent those code points that
     cannot be otherwise represented in element content. The
     escaping syntax is only defined on a few types of elements,
     such as in collation or exemplar sets, and uses the appropriate
     syntax for that type.</p>
     <p>The element &lt;cp&gt;, which was formerly used for this
     purpose, has been deprecated.</p>
     <h3><a name="Common_Attributes" href="#Common_Attributes" id=
     "Common_Attributes">5.2 Common Attributes</a></h3>
     <h4><a name="Attribute_type" href="#Attribute_type" id=
     "Attribute_type">5.2.1 Attribute type</a></h4>
     <p>The attribute <i>type</i> is also used to indicate an
     alternate resource that can be selected with a matching
     type=option in the locale id modifiers, or be referenced by a
     default element. For example:</p>
     <pre>&lt;ldml&gt;
   ...
   &lt;currencies&gt;
     &lt;currency&gt;<span style=
 "color: blue">...</span>&lt;/currency&gt;
     &lt;currency type="<span style=
 "color: blue">preEuro</span>"&gt;<span style=
 "color: blue">...</span>&lt;/currency&gt;
   &lt;/currencies&gt;
 &lt;/ldml&gt;</pre>
     <h4><a name="Attribute_draft" href="#Attribute_draft" id=
     "Attribute_draft">5.2.2 Attribute draft</a></h4>
     <p>If this attribute is present, it indicates the status of all
     the data in this element and any subelements (unless they have
     a contrary <i>draft</i> value), as per the following:</p>
     <ul>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">
       <i>approved:</i> fully approved by the technical committee
       (equals the CLDR 1.3 value of <i>false</i>, or an absent
       <i>draft</i> attribute). This does not mean that the data is
       guaranteed to be error-free—this is the best judgment of the
       committee.</li>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">
       <i>contributed</i>: partially approved by the technical
       committee.</li>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">
       <i>provisional</i>: partially confirmed. Implementations may
       choose to accept the provisional data, especially if there is
       no translated alternative.</li>
       <li style="margin-top: 0.5em; margin-bottom: 0.5em">
       <i>unconfirmed</i>: no confirmation available.</li>
     </ul>
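     <p>The way a draft value applies to subelements unless they
     carry their own value can be sketched as follows. The function
     name is hypothetical, and the sample tree places
     <code>draft</code> on a non-leaf element purely to show the
     propagation.</p>

```python
import xml.etree.ElementTree as ET

def effective_draft(element, inherited="approved"):
    """Yield (element, status); a local draft value overrides the parent's."""
    # An absent draft attribute at the top is equivalent to "approved".
    status = element.get("draft", inherited)
    yield element, status
    for child in element:
        yield from effective_draft(child, status)

months = ET.Element("months", draft="provisional")
ET.SubElement(months, "month", type="1").text = "a"
ET.SubElement(months, "month", type="2", draft="unconfirmed").text = "b"
statuses = {el.get("type"): status for el, status in effective_draft(months)}
```

     <p>In this sample, month 1 takes the status provisional from its
     parent, while month 2 keeps its own value, unconfirmed.</p>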
     <p>For more information on precisely how these values are
     computed for any given release, see
 	<a href=
     "http://cldr.unicode.org/index/process#TOC-Data--Submission-and-Vetting">
     Data Submission and Vetting Process</a> on the CLDR
     website.</p>
     <p>The draft attribute should only occur on "leaf" elements,
     and is deprecated elsewhere. For a more formal description of
     how elements are inherited, and what their draft status is, see
     <i><a href="#Inheritance_and_Validity">Section 4.2 Inheritance
     and Validity</a></i>.</p>
     <h4><a name="alt_attribute" href="#alt_attribute" id=
     "alt_attribute">5.2.3 Attribute alt</a></h4>
     <p>This attribute labels an alternative value for an element.
     The value is a <i>descriptor</i> that indicates what kind of
     alternative it is, and takes one of the following forms:</p>
     <ul>
       <li><i>variantname</i> meaning that the value is a variant of
       the normal value, and may be used in its place in certain
       circumstances. If a variant value is absent for a particular
       locale, the normal value is used. The variant mechanism
       should only be used when such a fallback is acceptable.</li>
       <li><span style="color: blue">proposed</span>, optionally
       followed by a number, indicating that the value is a proposed
       replacement for an existing value.</li>
       <li><i>variantname</i><span style=
       "color: blue">-proposed</span>, optionally followed by a
       number, indicating that the value is a proposed replacement
       variant value.</li>
     </ul>
     <p>"<span style="color: blue">proposed</span>" should only be
     present if the draft status is not "approved". It indicates
     that the data is proposed replacement data that has been added
     provisionally until the differences between it and the other
     data can be vetted. For example, suppose that the translation
     for September for some language is "Settembru", and a bug
     report is filed stating that it should be "Settembro". The new data
     can be entered in, but marked as <i>alt="proposed"</i> until it
     is vetted.</p>
     <pre>...
 &lt;month type="9"&gt;Settembru&lt;/month&gt;
 &lt;month type="9" draft="unconfirmed" alt="proposed"&gt;Settembro&lt;/month&gt;
 &lt;month type="10"&gt;...</pre>
     <p>Now assume another bug report comes in, saying that the
     correct form is actually "Settembre". Another alternative can
     be added:</p>
     <pre>...
 &lt;month type="9" draft="unconfirmed" alt="proposed2"&gt;Settembre&lt;/month&gt;
 ...</pre>
     <p>The values for <i>variantname</i> at this time include
     "<span style="color: blue">variant</span>", "<span style=
     "color: blue">list</span>", "<span style=
     "color: blue">email</span>", "<span style=
     "color: blue">www</span>", "<span class=
     "attributeValue">short</span>", and "<span style=
     "color: blue">secondary</span>".</p>
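     <p>The three forms of the <i>alt</i> value described above can
     be separated mechanically. The following is a minimal sketch in
     Python, not part of any CLDR tooling: the function name
     <code>parse_alt</code> and its return shape are assumptions made
     for illustration, and it assumes that the optional number directly
     follows "proposed", as in the "proposed2" example.</p>

```python
import re

# Matches a trailing "proposed" with an optional number, e.g. "proposed2".
_PROPOSED = re.compile(r"proposed([0-9]*)$")


def parse_alt(value):
    """Split an alt value into (variantname, is_proposed, number).

    Hypothetical helper for illustration only.
    """
    match = _PROPOSED.search(value)
    if match is None:
        # Plain variantname such as "variant" or "short".
        return (value, False, None)
    number = int(match.group(1)) if match.group(1) else None
    prefix = value[:match.start()]
    if prefix == "":
        # "proposed" or "proposedN": a proposed replacement value.
        return (None, True, number)
    if prefix.endswith("-") and len(prefix) > 1:
        # "variantname-proposedN": a proposed replacement variant value.
        return (prefix[:-1], True, number)
    raise ValueError("unrecognized alt value: " + value)
```

     <p>For instance, <code>parse_alt("short-proposed3")</code> would
     yield the variant name "short", a proposed flag, and the number
     3.</p>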
     <p>For a more complete description of how draft applies to
     data, see <i><a href="#Inheritance_and_Validity">Section 4.2
     Inheritance and Validity</a></i>.</p>
     <p class="element2">Attribute <a name="references_attribute"
     href="#references_attribute" id=
     "references_attribute">references</a></p>
     <p>The value of this attribute is a token representing a
     reference for the information in the element, including
     standards that it may conform to. The token is defined in a
     &lt;references&gt; element. (In older versions of CLDR, the
     value of the attribute was freeform text. That format is
     deprecated.)</p>
     <p><i>Example:</i></p>
     <p class="example">&lt;territory type="UM"
     references="R222"&gt;USAs yttre öar&lt;/territory&gt;</p>
     <p>The reference element may be inherited. Thus, for example,
     R222 may be used in sv_SE.xml even though it is not defined
     there, if it is defined in sv.xml.</p>
     <p>&lt;... allow="verbatim" ...&gt; (deprecated)</p>
     <p>This attribute was originally intended for use in marking
     display names whose capitalization differed from what was
     indicated by the now-deprecated &lt;inText&gt; element
     (perhaps, for example, because the names included a proper
     noun). It was never supported in the dtd and is not needed for
     use with the new &lt;contextTransforms&gt; element.</p>
     <h3><a name="Common_Structures" href="#Common_Structures" id=
     "Common_Structures">5.3 Common Structures</a></h3>
     <h4><a name="Date_Ranges" href="#Date_Ranges" id=
     "Date_Ranges">5.3.1 Date and Date Ranges</a></h4>
     <p>When attributes specify date ranges, this is usually done
     with the attributes <i>from</i> and <i>to</i>. The <i>from</i> attribute
     specifies the starting point, and the <i>to</i> attribute
     specifies the end point. The deprecated <i>time</i> attribute
     was formerly used to specify time with the deprecated
     weekEndStart and weekEndEnd elements, which were themselves
     inherently <i>from</i> or <i>to</i>.</p>
     <p>The data format is a restricted ISO 8601 format, restricted
     to the fields <i>year, month, day, hour, minute,</i> and
     <i>second</i> in that order, with "-" used as a separator
     between date fields, a space used as the separator between the
     date and the time fields, and ":" used as a separator between
     the time fields. If the minute or minute and second are absent,
     they are interpreted as zero. If the hour is also missing, then
     it is interpreted based on whether the attribute is <i>from</i>
     or <i>to</i>.</p>
     <ul>
       <li>
         <p class="note"><i>from</i> defaults to "00:00:00"
         (midnight at the start of the day).</p>
       </li>
       <li>
         <p class="note"><i>to</i> defaults to "24:00:00" (midnight
         at the end of the day).</p>
       </li>
     </ul>
     <p class="note">That is, Friday at 24:00:00 is the same time as
     Saturday at 00:00:00. Thus when the hour is missing, the
     <i>from and to</i> are interpreted inclusively: the range
     includes all of the day mentioned.</p>
     <p class="note">For example, the following are equivalent:</p>
     <table style="margin-top: 0.5em; margin-bottom: 0.5em" id=
     "table25">
       <tr>
         <td>&lt;usesMetazone from="1991-10-27" to="2006-04-02"
         .../&gt;</td>
       </tr>
       <tr>
         <td>&lt;usesMetazone from="1991-10-27 00:00:00"
         to="2006-04-02 24:00:00" .../&gt;</td>
       </tr>
       <tr>
         <td>&lt;usesMetazone from="1991-10-<font color=
         "#FF0000"><b>26 24</b></font>:00:00"
         to="2006-04-<font color="#FF0000"><b>03
         00</b></font>:00:00" .../&gt;</td>
       </tr>
     </table>
     <p>If the <i>from</i> attribute is missing, the range is assumed
     to start as far back in time as there is data for; if the
     <i>to</i> attribute is missing, the range extends onwards with no
     known end point.</p>
     <p>The dates and times are specified in local time, unless
     otherwise noted. (In particular, the metazone values are in UTC,
     also known as GMT.)</p>
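     <p>The defaulting rules for <i>from</i> and <i>to</i> can be
     illustrated with a short Python sketch. The function name
     <code>parse_bound</code> is hypothetical, and the sketch assumes
     that the year, month, and day are always present; it does not
     handle time zones.</p>

```python
from datetime import datetime, timedelta


def parse_bound(text, kind):
    """Parse a from/to attribute value; kind is "from" or "to".

    Hypothetical helper: assumes a full "YYYY-MM-DD" date, optionally
    followed by a space and "HH", "HH:MM" or "HH:MM:SS".
    """
    date_part, _, time_part = text.partition(" ")
    year, month, day = (int(field) for field in date_part.split("-"))
    midnight = datetime(year, month, day)
    if time_part == "":
        # Missing hour: "from" means 00:00:00, "to" means 24:00:00.
        return midnight if kind == "from" else midnight + timedelta(days=1)
    fields = [int(field) for field in time_part.split(":")]
    fields += [0] * (3 - len(fields))  # absent minute/second are zero
    hour, minute, second = fields
    # Adding a timedelta lets hour 24 roll over to the next day.
    return midnight + timedelta(hours=hour, minutes=minute, seconds=second)
```

     <p>Under these assumptions, the three equivalent
     <code>usesMetazone</code> forms in the table above all resolve to
     the same pair of instants.</p>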
     <h4><a name="Text_Directionality" href="#Text_Directionality"
     id="Text_Directionality">5.3.2 Text Directionality</a></h4>
     <p>The content of certain elements, such as date or number
     formats, may consist of several sub-elements with an inherent
     order (for example, the year, month, and day for dates). In
     some cases, the order of these sub-elements may be changed
     depending on the bidirectional context in which the element is
     embedded.</p>
     <p>For example, short date formats in languages such as Arabic
     may contain neutral or weak characters at the beginning or end
     of the element content. In such a case, the overall order of
     the sub-elements may change depending on the surrounding
     text.</p>
     <p>Element content whose display may be affected in this way
     should include an explicit direction mark, such as U+200E
     LEFT-TO-RIGHT MARK or U+200F RIGHT-TO-LEFT MARK, at the
     beginning or end of the element content, or both.</p>
     <h4><a name="Unicode_Sets" href="#Unicode_Sets" id=
     "Unicode_Sets">5.3.3 Unicode Sets</a></h4>
     <p>Some attribute values or element contents use
     <em>UnicodeSet</em> notation. A UnicodeSet represents a finite
     set of Unicode code points and strings, and is defined by lists
     of code points and strings, Unicode property sets, and set
     operators, all bounded by square brackets. In this context, a
     code point means a string consisting of exactly one code
     point.</p>
     <p>A UnicodeSet implements the semantics in <i>UTS #18: Unicode
     Regular Expressions</i> [<a href=
     "https://www.unicode.org/reports/tr41/#UTS18">UTS18</a>] Levels
     1 &amp; 2 that are relevant to determining sets of characters.
     Note however that it may deviate from the syntax provided in
     [<a href=
     "https://www.unicode.org/reports/tr41/#UTS18">UTS18</a>], which
     is illustrative rather than a requirement. There is one
     exception to the supported semantics, Section <a href=
     "https://unicode.org/reports/tr18/#RL2.6">RL2.6</a>
     <em>Wildcards in Property Values</em>. That feature can be
     supported in clients such as ICU by implementing a “hook” as is
     done in the <a href=
     "https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7Bname%3D%2FAPPLE%2F%7D">
     online UnicodeSet utilities</a>.</p>
     <p>A UnicodeSet may be cited in specifications outside of the
     domain of LDML. In such a case, the specification may specify a
     subset of the syntax provided here.</p>
     <p>The following provides EBNF syntax for a UnicodeSet:</p>
     <div align='center'>
       <table class='simple'>
         <tr>
           <th>Symbol</th>
           <th>Expression</th>
           <th>Examples</th>
         </tr>
         <tr>
           <th>root</th>
           <td><code>= prop<br>
           | '[-]'<br>
           | '[' [\-\^]? s seq+ ']'</code></td>
           <td>\p{x=y},<br>
           [abc]</td>
         </tr>
         <tr>
           <th>seq</th>
           <td><code>= root (s [\&amp;\-] s root)* s<br>
           | range s</code></td>
           <td>[abc]-[cde], a<br></td>
         </tr>
         <tr>
           <th>range</th>
           <td><code>= char ('-' char)?<br>
           | '{' (s char)+ s '}'</code></td>
           <td>a, a-c, {abc}</td>
         </tr>
         <tr>
           <th>prop</th>
           <td><code>= '\' [pP] '{' propName ([≠=] s value1+)?
           '}'<br>
           | '[:' '^'? propName ([≠=] s value2+)? ':]'</code></td>
           <td>\p{x=y}, [:x=y:]<br></td>
         </tr>
         <tr>
           <th>propName</th>
           <td><code>= s [A-Za-z0-9] [A-Za-z0-9_\x20]* s</code></td>
           <td>General_Category,<br>
           General Category</td>
         </tr>
         <tr>
           <th>value1</th>
           <td><code>= [^\}]<br>
           | '\' quoted</code></td>
           <td>Lm,<br>
           \n,<br>
           \}</td>
         </tr>
         <tr>
           <th>value2</th>
           <td><code>= [^:]<br>
           | '\' quoted</code></td>
           <td>Lm,<br>
           \n,<br>
           \:</td>
         </tr>
         <tr>
           <th>char</th>
           <td><code>= [^\&amp; \- \[ \[ \] \\ \} \{ [:Pat_WS:]]<br>
           | '\' quoted</code></td>
           <td>a, b, c, \n</td>
         </tr>
         <tr>
           <th>quoted</th>
           <td><code>= 'u' (hex{4} | bracketedHex)<br>
           | 'x' (hex{2} | bracketedHex)<br>
           | 'U00' ('0' hex{5} | '10' hex{4})<br>
           | 'N{' charName '}'<br>
           | [[\u0000-\U0010FFFF]-[uxUN]]</code></td>
           <td><em><strong>error</strong> if lengths not exact</em></td>
         </tr>
         <tr>
           <th>charName</th>
           <td><code>= s [A-Za-z0-9] [-A-Za-z0-9_\x20]* s</code></td>
           <td>TIBETAN LETTER -A</td>
         </tr>
         <tr>
           <th>bracketedHex</th>
           <td><code>= '{' s hexCodePoint (s hexCodePoint)* s
           '}'</code></td>
           <td>{61 2019 62}</td>
         </tr>
         <tr>
           <th>hexCodePoint</th>
           <td><code>= hex{1,5} | '10' hex{4}</code></td>
           <td>&nbsp;</td>
         </tr>
         <tr>
           <th>hex</th>
           <td><code>= [0-9A-Fa-f]</code></td>
           <td>&nbsp;</td>
         </tr>
         <tr>
           <th>s</th>
           <td><code>= [:Pattern_White_Space:]*</code></td>
           <td>optional whitespace</td>
         </tr>
       </table>
     </div>
     <p>Some constraints on UnicodeSet syntax are not captured by
     this EBNF. Notably, property names and values are restricted to
     those supported by the implementation, and have additional constraints imposed by
     [<a href="https://unicode.org/reports/tr41/#UAX44">UAX44</a>]. In addition, quoted
     values that resolve to more than one code point are disallowed in ranges of the form
     <code>char '-' char</code>.</p>
     <p>The syntax characters are listed in the table below:</p>
     <table>
       <tbody>
         <tr>
           <th>Char</th>
           <th>Hex</th>
           <th>Name</th>
           <th>Usage</th>
         </tr>
         <tr>
           <td>$</td>
           <td>U+0024</td>
           <td>DOLLAR SIGN</td>
           <td>Equivalent of \uFFFF (This is for implementations
           that return \uFFFF when accessing before the first or
           after the last character)</td>
         </tr>
         <tr>
           <td>&amp;</td>
           <td>U+0026</td>
           <td>AMPERSAND</td>
           <td>Intersecting UnicodeSets</td>
         </tr>
         <tr>
           <td>-</td>
           <td>U+002D</td>
           <td>HYPHEN-MINUS</td>
           <td>Ranges of characters; also set difference.</td>
         </tr>
         <tr>
           <td>:</td>
           <td>U+003A</td>
           <td>COLON</td>
           <td>POSIX-style property syntax</td>
         </tr>
         <tr>
           <td>[</td>
           <td>U+005B</td>
           <td>LEFT SQUARE BRACKET</td>
           <td>Grouping; POSIX property syntax</td>
         </tr>
         <tr>
           <td>]</td>
           <td>U+005D</td>
           <td>RIGHT SQUARE BRACKET</td>
           <td>Grouping; POSIX property syntax</td>
         </tr>
         <tr>
           <td>\</td>
           <td>U+005C</td>
           <td>REVERSE SOLIDUS</td>
           <td>Escaping</td>
         </tr>
         <tr>
           <td>^</td>
           <td>U+005E</td>
           <td>CIRCUMFLEX ACCENT</td>
           <td>POSIX negation syntax</td>
         </tr>
         <tr>
           <td>{</td>
           <td>U+007B</td>
           <td>LEFT CURLY BRACKET</td>
           <td>Strings in set; Perl property syntax</td>
         </tr>
         <tr>
           <td>}</td>
           <td>U+007D</td>
           <td>RIGHT CURLY BRACKET</td>
           <td>Strings in set; Perl property syntax</td>
         </tr>
         <tr>
           <td>&nbsp;</td>
           <td>U+0020 U+0009..U+000D U+0085<br>
           U+200E U+200F<br>
           U+2028 U+2029</td>
           <td>ASCII whitespace,<br>
           LRM, RLM,<br>
           LINE/PARAGRAPH SEPARATOR</td>
           <td>Ignored except when escaped</td>
         </tr>
       </tbody>
     </table><br>
     <h5><a href="#Lists_of_Code_Points" name="Lists_of_Code_Points"
     id="Lists_of_Code_Points">5.3.3.1 Lists of Code Points</a></h5>
     <p>Lists are a sequence of strings that may include ranges,
     which are indicated by a '-' between two code points, as in
     "a-z". The sequence <em>start-end</em> specifies the range of
     all code points from the start to end, inclusive, in Unicode
     order. For example, <b>[a c d-f m]</b> is equivalent to <b>[a c
     d e f m]</b>. Whitespace can be freely used for clarity, as
     <b>[a c d-f m]</b> means the same as <b>[acd-fm]</b>.</p>
     <p>A string with multiple code points is represented in a list
     by being surrounded by curly braces, such as in <strong>[a-z
     {ch}]</strong>. It can be used with the range notation, as
     described in <em>Section <a href="#String_Range">5.3.4 String
     Range</a></em>. There is an additional restriction on string
     ranges in a UnicodeSet: the number of code points in the first
     string of the range must be identical to the number in the
     second. Thus [{ab}-{c}] and [{ab}-c] are invalid.</p>
     <p>In UnicodeSets, there are two ways to quote syntax code
     points:</p>
     <p><a name="Backslash_Escapes" id=
     "Backslash_Escapes"></a>Outside of single quotes, certain
     backslashed code point sequences can be used to quote code
     points:</p>
     <table class='simple'>
       <tr>
         <td>\x{h...h}<br>
         \u{h...h}</td>
         <td>list of 1-6 hex digits ([0-9A-Fa-f]), separated by
         spaces</td>
       </tr>
       <tr>
         <td>\xhh</td>
         <td>2 hex digits</td>
       </tr>
       <tr>
         <td>\uhhhh</td>
         <td>Exactly 4 hex digits</td>
       </tr>
       <tr>
         <td>\Uhhhhhhhh</td>
         <td>Exactly 8 hex digits</td>
       </tr>
       <tr>
         <td>\a</td>
         <td>U+0007 (BEL / ALERT)</td>
       </tr>
       <tr>
         <td>\b</td>
         <td>U+0008 (BACKSPACE)</td>
       </tr>
       <tr>
         <td>\t</td>
         <td>U+0009 (TAB / CHARACTER TABULATION)</td>
       </tr>
       <tr>
         <td>\n</td>
         <td>U+000A (LINE FEED)</td>
       </tr>
       <tr>
         <td>\v</td>
         <td>U+000B (LINE TABULATION)</td>
       </tr>
       <tr>
         <td>\f</td>
         <td>U+000C (FORM FEED)</td>
       </tr>
       <tr>
         <td>\r</td>
         <td>U+000D (CARRIAGE RETURN)</td>
       </tr>
       <tr>
         <td>\\</td>
         <td>U+005C (BACKSLASH / REVERSE SOLIDUS)</td>
       </tr>
       <tr>
         <td>\N{name}</td>
         <td>The Unicode code point named "name".</td>
       </tr>
       <tr>
         <td>\p{…},\P{…}</td>
         <td>Unicode property (see below)</td>
       </tr>
     </table><br>
     <p>Anything else following a backslash is mapped to itself,
     except the property syntax described below, or in an
     environment where it is defined to have some special
     meaning.</p>
     <p>Any code point formed as the result of a backslash escape
     loses any special meaning and is treated as a literal. In
     particular, note that \x, \u and \U escapes create literal code
     points. (In contrast, Java treats Unicode escapes as just a way
     to represent arbitrary code points in an ASCII source file, and
     any resulting code points are <i><b>not</b></i> tagged as
     literals.)</p>
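     <p>The escape table above can be approximated with a small
     Python sketch. The names <code>unescape</code> and
     <code>_replace</code> are hypothetical. The sketch has three
     limitations: it does not handle the <code>\p</code> and
     <code>\P</code> property syntax, it does not report malformed
     escapes (for example, <code>\x</code> followed by a single hex
     digit), and it returns a plain string, so it does not record that
     the resulting code points are literals, which a real UnicodeSet
     parser would need to do.</p>

```python
import re
import unicodedata

# Single-letter escapes listed in the table above.
_SIMPLE = {"a": "\a", "b": "\b", "t": "\t", "n": "\n",
           "v": "\v", "f": "\f", "r": "\r"}

_ESCAPE = re.compile(
    r"\\(?:"
    r"[xu]\{([0-9A-Fa-f ]+)\}"   # group 1: \x{h...h} or \u{h...h}
    r"|x([0-9A-Fa-f]{2})"        # group 2: \xhh
    r"|u([0-9A-Fa-f]{4})"        # group 3: \uhhhh
    r"|U([0-9A-Fa-f]{8})"        # group 4: \Uhhhhhhhh
    r"|N\{([^}]+)\}"             # group 5: \N{name}
    r"|(.)"                      # group 6: anything else maps to itself
    r")",
    re.DOTALL,
)


def _replace(match):
    bracketed, x2, u4, u8, name, other = match.groups()
    if bracketed is not None:
        # Space-separated list of hex code points.
        return "".join(chr(int(h, 16)) for h in bracketed.split())
    for digits in (x2, u4, u8):
        if digits is not None:
            return chr(int(digits, 16))
    if name is not None:
        return unicodedata.lookup(name.strip())
    return _SIMPLE.get(other, other)


def unescape(text):
    """Resolve backslash escapes in text (illustrative sketch only)."""
    return _ESCAPE.sub(_replace, text)
```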
     <p>Unicode property sets are defined as described
     in <i>UTS #18: Unicode Regular Expressions</i> [<a href=
     "https://www.unicode.org/reports/tr41/#UTS18">UTS18</a>], Level
     1 and RL2.5, including the syntax where given. For an example
     of a concrete implementation of this, see [<a href=
     "#ICUUnicodeSet">ICUUnicodeSet</a>].</p>
     <h5><a href="#Unicode_Properties" name="Unicode_Properties" id=
     "Unicode_Properties">5.3.3.2 Unicode Properties</a></h5>
     <p>Briefly, Unicode property sets are specified by any Unicode
     property and a value of that property, such as
     <b>[:General_Category=Letter:]</b> for Unicode letters or
     <b>\p{uppercase}</b> for the set of upper case letters in
     Unicode. The property names are defined by the
     PropertyAliases.txt file and the property values by the
     PropertyValueAliases.txt file. For more information, see
     [<a href="https://unicode.org/reports/tr41/#UAX44">UAX44</a>].
     The syntax for specifying the property sets is an extension of
     either POSIX or Perl syntax, by the addition of
     "=&lt;value&gt;". For example, you can match letters by using
     the POSIX-style syntax:</p>
     <p><b>[:General_Category=Letter:]</b></p>
     <p>or by using the Perl-style syntax</p>
     <p><b>\p{General_Category=Letter}</b>.</p>
     <p>Property names and values are case-insensitive, and
     whitespace, "-", and "_" are ignored. The property name can be
     omitted for the <strong>General_Category</strong> and
     <strong>Script</strong> properties, but is required for other
     properties. If the property value is omitted, it is assumed to
     represent a boolean property with the value "true". Thus
     <b>[:Letter:]</b> is equivalent to
     <b>[:General_Category=Letter:]</b>, and <b>[:Wh-ite-s
     pa_ce:]</b> is equivalent to <b>[:Whitespace=true:]</b>.</p>
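     <p>The loose matching of property names and values described
     above amounts to comparing normalized keys. A minimal Python
     sketch follows; the function name <code>loose_key</code> is
     hypothetical, and the sketch implements only the rule stated in
     this section, not any further constraints from [UAX44].</p>

```python
def loose_key(name):
    """Case-fold name and drop whitespace, '-' and '_'.

    Hypothetical helper: two property names or values are treated as
    equal when their keys are equal.
    """
    return "".join(
        ch for ch in name.casefold()
        if not (ch.isspace() or ch in "-_")
    )
```

     <p>With this key, "Wh-ite-s pa_ce" and "Whitespace" compare as
     equal, as do "General_Category" and "general category".</p>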
     <p>The table below shows the two kinds of syntax: POSIX and
     Perl style. Also, the table shows the "Negative" version, which
     is a property that excludes all code points of a given kind.
     For example, <b>[:^Letter:]</b> matches all code points that
     are not <b>[:Letter:]</b>.</p>
     <table>
       <tr>
         <th>&nbsp;</th>
         <th>Positive</th>
         <th>Negative</th>
       </tr>
       <tr>
         <td>POSIX-style Syntax</td>
         <td>[:type=value:]</td>
         <td>[:^type=value:]</td>
       </tr>
       <tr>
         <td>Perl-style Syntax</td>
         <td>\p{type=value}</td>
         <td>\P{type=value}</td>
       </tr>
     </table>
     <h5><a href="#Boolean_Operations" name="Boolean_Operations" id=
     "Boolean_Operations">5.3.3.3 Boolean Operations</a></h5>
     <p>The low-level lists or properties then can be freely
     combined with the normal set operations (union, inverse,
     difference, and intersection):</p>
     <ul>
       <li>To union two sets, simply concatenate them. For example,
       <b>[[:letter:] [:number:]]</b></li>
       <li>To intersect two sets, use the '&amp;' operator. For
       example, <b>[[:letter:] &amp; [a-z]]</b></li>
       <li>To take the set-difference of two sets, use the '-'
       operator. For example, <b>[[:letter:] - [a-z]]</b></li>
       <li>To invert a set, place a '^' immediately after the
       opening '['. For example, <b>[^a-z]</b>. In any other
       location, the '^' does not have a special meaning. The
       inversion [^X] is equivalent to [[\x{0}-\x{10FFFF}]-[X]].
       Thus multi-code point strings are discarded.</li>
       <li>Symmetric difference (~) is not supported.</li>
     </ul>
     <p>The binary operators '&amp;', '-', and the implicit union
     have equal precedence and bind left-to-right. Thus
     <b>[[:letter:]-[a-z]-[\u0100-\u01FF]]</b> is equal to
     <b>[[[:letter:]-[a-z]]-[\u0100-\u01FF]]</b>. Another example is
     the set <b>[[ace][bdf] - [abc][def]]</b>, which is not the
     empty set, but instead equal to <b>[[[[ace] [bdf]] - [abc]]
     [def]]</b>, which equals <b>[[[abcdef] - [abc]] [def]]</b>,
     which equals <b>[[def] [def]]</b>, which equals
     <b>[def]</b>.</p>
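     <p>The equal-precedence, left-to-right evaluation can be checked
     with ordinary Python sets. In this sketch the function name
     <code>evaluate</code> and the flat list representation (sets
     interleaved with the operator strings "&amp;" and "-") are
     assumptions made for illustration; no UnicodeSet parsing is
     performed.</p>

```python
def evaluate(items):
    """Fold a flat list of sets and operator strings left to right.

    Adjacent sets with no operator between them are unioned, matching
    the implicit union described above. Illustrative sketch only.
    """
    result = set()
    operator = "union"  # implicit operator before the first set
    for item in items:
        if isinstance(item, str):
            operator = item  # "&" or "-" applies to the next set
        elif operator == "&":
            result = result.intersection(item)
            operator = "union"
        elif operator == "-":
            result = result.difference(item)
            operator = "union"
        else:
            result = result.union(item)
    return result


# [[ace][bdf] - [abc][def]] from the text above.
example = evaluate([set("ace"), set("bdf"), "-", set("abc"), set("def")])
```

     <p>Here <code>example</code> contains exactly the code points d,
     e, and f, in agreement with the derivation above.</p>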
     <p><strong>One caution:</strong> the '&amp;' and '-' operators
     operate between sets. That is, they must be immediately
     preceded and immediately followed by a set. For example, the
     pattern <b>[[:Lu:]-A]</b> is illegal, since it is interpreted
     as the set <b>[:Lu:]</b> followed by the incomplete range
     <b>-A</b>. To specify the set of upper case letters except for
     'A', enclose the 'A' in brackets: <b>[[:Lu:]-[A]]</b>.</p>
     <h5><a href="#UnicodeSet_Examples" name="UnicodeSet_Examples"
     id="UnicodeSet_Examples">5.3.3.4 UnicodeSet Examples</a></h5>
     <p>The following table summarizes the syntax that can be
     used.</p>
     <table style="margin-top: 0.5em; margin-bottom: 0.5em" id=
     "table18">
       <tr>
         <th>Example</th>
         <th>Description</th>
       </tr>
       <tr>
         <td nowrap>[a]</td>
         <td>The set containing 'a' alone</td>
       </tr>
       <tr>
         <td nowrap>[a-z]</td>
         <td>The set containing 'a' through 'z' and all letters in
         between, in Unicode order.<br>
         Thus it is the same as [\u0061-\u007A].</td>
       </tr>
       <tr>
         <td nowrap>[^a-z]</td>
         <td>The set containing all code points but 'a' through
         'z'.<br>
         Thus it is the same as [\u0000-\u0060
         \u007B-\x{10FFFF}].</td>
       </tr>
       <tr>
         <td nowrap>[[pat1][pat2]]</td>
         <td>The union of sets specified by pat1 and pat2</td>
       </tr>
       <tr>
         <td nowrap>[[pat1]&amp;[pat2]]</td>
         <td>The intersection of sets specified by pat1 and
         pat2</td>
       </tr>
       <tr>
         <td nowrap>[[pat1]-[pat2]]</td>
         <td>The asymmetric difference of sets specified by pat1 and
         pat2</td>
       </tr>
       <tr>
         <td nowrap>[a {ab} {ac}]</td>
         <td>The code point 'a' and the multi-code point strings
         "ab" and "ac"</td>
       </tr>
       <tr>
         <td nowrap>[x\u{61 2019 62}y]</td>
         <td>Equivalent to [x\u0061\u2019\u0062y] (= [xa’by])</td>
       </tr>
       <tr>
         <td nowrap>[{ax}-{bz}]</td>
         <td>The set containing [{ax} {ay} {az} {bx} {by} {bz}],
         using the range syntax to get all the strings from {ax} to
         {bz} as described in <em>Section <a href=
         "#String_Range">5.3.4 String Range</a></em>.</td>
       </tr>
       <tr>
         <td nowrap>[:Lu:]</td>
         <td>The set of code points with a given property value, as
         defined by PropertyValueAliases.txt. In this case, these
         are the Unicode upper case letters. The long form for this
         is <b>[:General_Category=Uppercase_Letter:]</b>.</td>
       </tr>
       <tr>
         <td nowrap>[:L:]</td>
         <td>The set of code points belonging to all Unicode
         categories starting with 'L', that is,
         <b>[[:Lu:][:Ll:][:Lt:][:Lm:][:Lo:]]</b>. The long form for
         this is <b>[:General_Category=Letter:]</b>.</td>
       </tr>
     </table><br>
     <h4><a name="String_Range" href="#String_Range" id=
     "String_Range">5.3.4 String Range</a></h4>
     <p>A String Range is a compact format for specifying a list of
     strings.</p>
     <p><strong>Syntax:<br></strong></p>
     <blockquote>
       <p>X <em>sep</em> Y<br></p>
     </blockquote>
     <p>The separator and the format of strings X, Y may vary
     depending on the domain. For example,</p>
     <ul>
       <li>for the validity files the separator is ~,</li>
       <li>for UnicodeSet the separator is -, and any
       multi-codepoint string is enclosed in {…}.</li>
     </ul>
     <p><strong>Validity:&nbsp;<br></strong></p>
     <blockquote>
       <p>A string range X <em>sep</em> Y is valid iff len(X) ≥
       len(Y) &gt; 0, where len(X) is the length of X in code
       points.</p>
       <p><em>There may be additional, domain-specific requirements
       for validity of the expansion of the string range.</em></p>
     </blockquote>
     <p><strong>Interpretation:<br></strong></p>
     <ol>
       <li>Break X into P and S, where len(S) = len(Y)
         <ul>
           <li>Note that P will be an empty string if the lengths of
           X and Y are equal.</li>
         </ul>
       </li>
       <li>Form the combinations of all
       P+(s₀..y₀)+(s₁..y₁)+...(sₙ..yₙ)
         <ul>
           <li>s₀ is the first code point in S, etc.</li>
         </ul>
       </li>
     </ol>
     <p><strong>Examples:</strong></p>
     <table>
       <tbody>
         <tr>
           <td>ab-ad</td>
           <td>→</td>
           <td>ab ac ad</td>
         </tr>
         <tr>
           <td>ab-d</td>
           <td>→</td>
           <td>ab ac ad</td>
         </tr>
         <tr>
           <td>ab-cd</td>
           <td>→</td>
           <td>ab ac ad bb bc bd cb cc cd</td>
         </tr>
         <tr>
           <td>👦🏻-👦🏿</td>
           <td>→</td>
           <td>👦🏻 👦🏼 👦🏽 👦🏾 👦🏿</td>
         </tr>
         <tr>
           <td>👦🏻-🏿</td>
           <td>→</td>
           <td>👦🏻 👦🏼 👦🏽 👦🏾 👦🏿</td>
         </tr>
       </tbody>
     </table><br>
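     <p>The interpretation steps can be sketched in Python, whose
     strings are indexed by code point. The function name
     <code>expand_string_range</code> is hypothetical, the separator is
     assumed to have been removed by the caller, and no domain-specific
     validity checks are applied to the expansion.</p>

```python
from itertools import product


def expand_string_range(x, y):
    """Expand the string range X sep Y into a list of strings.

    Illustrative sketch: x and y are the two sides of the range.
    """
    if not (len(x) >= len(y) > 0):
        raise ValueError("require len(X) >= len(Y) > 0")
    # Step 1: break X into prefix P and suffix S, with len(S) == len(Y).
    split = len(x) - len(y)
    prefix, suffix = x[:split], x[split:]
    # Step 2: one inclusive code point range per position of S and Y.
    columns = [
        [chr(cp) for cp in range(ord(start), ord(end) + 1)]
        for start, end in zip(suffix, y)
    ]
    return [prefix + "".join(combo) for combo in product(*columns)]
```

     <p>For example, <code>expand_string_range("ab", "cd")</code>
     produces the nine strings shown for <i>ab-cd</i> in the table
     above.</p>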
     <h3><a name="Identity_Elements" href="#Identity_Elements" id=
     "Identity_Elements">5.4 Identity Elements</a></h3>
     <p class="dtd">&lt;!ELEMENT identity (alias | (version,
     generation?, language, script?, territory?, variant?, special*)
     ) &gt;</p>
     <p>The identity element contains information identifying the
     target locale for this data, and general information about the
     version of this data.</p>
     <p class="element2">&lt;version number="<u>$</u>Revision: 1.227
     <u>$</u>"&gt;</p>
     <p>The version element provides, in an attribute, the version
     of this file.&nbsp; The contents of the element can contain
     textual notes about the changes between this version and the
     last. For example:</p>
     <blockquote>
       <pre>&lt;version number="<span style=
       "color: blue">1.1</span>"&gt;<span style=
       "color: blue">Various notes and changes in version 1.1</span>&lt;/version&gt;</pre>
       <p>This is not to be confused with the version attribute on
       the ldml element, which tracks the dtd version.</p>
     </blockquote>
     <p class="element2">&lt;generation date="<u>$</u>Date:
     2007/07/17 23:41:16 <u>$</u>" /&gt;</p>
     <p>The generation element is now deprecated. It was used to
     contain the last modified date for the data. This could be in
     two formats: ISO 8601 format, or CVS format (illustrated by the
     example above).</p>
     <p class="element2">&lt;language type="<span style=
     "color: blue">en</span>"/&gt;</p>
     <p>The language code is the primary part of the specification
     of the locale id, with values as described above.</p>
     <p class="element2">&lt;script type="<span style=
     "color: blue">Latn</span>" /&gt;</p>
     <p>The script code may be used in the identification of written
     languages, with values described above.</p>
     <p class="element2">&lt;territory type="<span style=
     "color: blue">US</span>"/&gt;</p>
     <p>The territory code is a common part of the specification of
     the locale id, with values as described above.</p>
     <p class="element2">&lt;variant type="<span class=
     "attributeValue">NYNORSK</span>"/&gt;</p>
     <p>The variant code is the tertiary part of the specification
     of the locale id, with values as described above.</p>
     <p>When combined according to the rules described in
     <i><a href="#Unicode_Language_and_Locale_Identifiers">Section
     3, Unicode Language and Locale Identifiers</a></i>, the
     language element, along with any of the optional script,
     territory, and variant elements, must identify a known, stable
     locale identifier. Otherwise, it is an error.</p>
     <h3><a name="Valid_Attribute_Values" href=
     "#Valid_Attribute_Values" id="Valid_Attribute_Values">5.5 Valid
     Attribute Values</a></h3>
 	      <p>The <a href="#DTD_Annotations">DTD Annotations</a> in Section 5.7 are used to determine whether elements, attributes, or attribute values are valid (or deprecated).</p>

     <h3><a name="Canonical_Form" href="#Canonical_Form" id=
     "Canonical_Form">5.6 Canonical Form</a></h3>
     <p>The following are restrictions on the format of LDML files
     to allow for easier parsing and comparison of files.</p>
     <p>Peer elements have consistent order. That is, if the DTD or
     this specification requires the following order in an element
     <strong>foo</strong>:</p>
     <pre>&lt;foo&gt;
   &lt;pattern&gt;
   &lt;somethingElse&gt;
 &lt;/foo&gt;</pre>
     <p>It can never require the reverse order in a different
     element <strong>bar</strong>.</p>
     <pre>&lt;bar&gt;
   &lt;somethingElse&gt;
   &lt;pattern&gt;
 &lt;/bar&gt;</pre>
     <p>Note that there was one case that had to be corrected in
     order to make this true. For that reason, pattern occurs twice
     under currency:</p>
     <pre class="dtd">
     &lt;!ELEMENT currency (alias | (pattern*, displayName?, symbol?, pattern*,
 decimal?, group?, special*)) &gt;</pre>
     <p><a href="https://www.w3.org/TR/REC-xml/">XML</a> files can
     have a wide variation in textual form, while representing
     precisely the same data. Putting the LDML files in the
     repository into a canonical form allows us to use the
     simple diff tools used widely (and in CVS) to detect
     differences when vetting changes, without those tools being
     confused. This is not a requirement on other uses of LDML; it is
     simply a way to manage repository data more easily.</p>
     <h4><a name="Content" href="#Content" id="Content">5.6.1
     Content</a></h4>
     <ol>
       <li>All start elements are on their own line, indented by
       <i>depth</i> tabs.</li>
       <li>All end elements (except for leaf nodes) are on their own
       line, indented by <i>depth</i> tabs.</li>
       <li>Any leaf node with empty content is in the form
       &lt;foo/&gt;.</li>
       <li>There are no blank lines except within comments or
       content.</li>
       <li>Spaces are used within a start element. There are no
       extra spaces within elements.
         <ul>
           <li><code>&lt;version number="1.2"/&gt;</code>, not
           <code>&lt;version&nbsp; number = "1.2" /&gt;</code></li>
           <li><code>&lt;/identity&gt;</code>, not
           <code>&lt;/identity &gt;</code></li>
         </ul>
       </li>
       <li>All attribute values use double quote ("), not single
       (').</li>
       <li>There are no CDATA sections, and no escapes except those
       absolutely required.
         <ul>
           <li>no &amp;apos; since it is not necessary</li>
           <li>no '&amp;#x61;', it would be just 'a'</li>
         </ul>
       </li>
       <li>All attributes with defaulted values are suppressed.</li>
       <li>The draft and alt="proposed.*" attributes are only on
       leaf elements.</li>
       <li>The tzid are canonicalized in the following way:
         <ol>
           <li type="a">All tzids as of CLDR 1.1 (2004.06.08) in
           zone.tab are canonical.</li>
           <li>After that point, the first time a tzid is
           introduced, that is the canonical form.</li>
         </ol>
         <p>That is, new IDs are added, but existing ones keep the
         original form. The <i>TZ</i> timezone database keeps a set
         of equivalences in the "backward" file. These are used to
         map other tzids to the canonical form. For example, when
         <code>America/Argentina/Catamarca</code> was introduced as
         the new name for the previous
         <code>America/Catamarca</code>, a link was added in the
         backward file.</p>
         <p><code>Link America/Argentina/Catamarca
         America/Catamarca</code></p>
       </li>
     </ol>
     <p><i>Example:</i></p>
     <pre>&lt;ldml draft="unconfirmed"&gt;
         &lt;identity&gt;
                 &lt;version number="1.2"/&gt;
                 &lt;language type="en"/&gt;
                 &lt;territory type="AS"/&gt;
         &lt;/identity&gt;
         &lt;numbers&gt;
                 &lt;currencyFormats&gt;
                         &lt;currencyFormatLength&gt;
                                 &lt;currencyFormat&gt;
                                         &lt;pattern&gt;¤#,##0.00;(¤#,##0.00)&lt;/pattern&gt;
                                 &lt;/currencyFormat&gt;
                         &lt;/currencyFormatLength&gt;
                 &lt;/currencyFormats&gt;
         &lt;/numbers&gt;
 &lt;/ldml&gt;</pre>
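     <p>The tzid canonicalization rule in the list above can be
     illustrated with a minimal Python sketch. It assumes that the set
     of canonical tzids is already known (the caller supplies it) and
     that equivalences are read from <code>Link</code> lines in the
     format shown above. The function names are illustrative only and
     are not part of CLDR or of the TZ database tooling; only a single
     link hop is followed.</p>

```python
def parse_links(backward_text):
    """Return (target, link_name) pairs from 'Link TARGET LINK-NAME' lines."""
    pairs = []
    for line in backward_text.splitlines():
        fields = line.split("#", 1)[0].split()   # drop comments, split on whitespace
        if len(fields) >= 3 and fields[0] == "Link":
            pairs.append((fields[1], fields[2]))
    return pairs


def canonical_tzid(tzid, canonical_ids, links):
    """Map tzid to the canonical member of its equivalence pair, or None."""
    if tzid in canonical_ids:
        return tzid
    for target, link_name in links:
        # A link is treated as a plain equivalence; whichever side is
        # already canonical (first introduced) wins.
        if tzid == target and link_name in canonical_ids:
            return link_name
        if tzid == link_name and target in canonical_ids:
            return target
    return None
```

     <p>With <code>America/Catamarca</code> in the canonical set and
     the link line above as input, the newer name
     <code>America/Argentina/Catamarca</code> maps back to
     <code>America/Catamarca</code>.</p>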
     <h4><a name="Ordering" href="#Ordering" id="Ordering">5.6.2
     Ordering</a></h4>
     <p>An element is ordered first by the element name, and then if
     the element names are identical, by the sorted set of
     attribute-value pairs. For the latter, compare the first pair
     in each (in sorted order by attribute pair). If not identical,
     go to the second pair, and so on.</p>
     <p>Elements and attributes are ordered according to their order
     in the respective DTDs. Attribute value comparison is a bit
     more complicated, and may depend on the attribute and type.
     This is currently done with specific ordering tables.</p>
     <p>Any future additions to the DTD must be structured so as to
     allow compatibility with this ordering. See also <a href=
     "#Valid_Attribute_Values">Section 5.5 Valid Attribute
     Values.</a></p>
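     <p>The ordering rule can be sketched as a sort key in Python. The
     element and attribute orders below are hypothetical stand-ins for
     the order of declaration in a DTD, and attribute values are
     compared as plain strings, which is a simplification: as noted
     above, the real comparison may depend on the attribute and uses
     specific ordering tables.</p>

```python
# Hypothetical DTD declaration orders, for illustration only.
ELEMENT_ORDER = ["identity", "language", "territory", "numbers"]
ATTRIBUTE_ORDER = ["type", "alt", "draft"]


def sort_key(element):
    """element is a (name, attributes) pair; attributes is a dict."""
    name, attributes = element
    # Sort the attribute-value pairs by the attribute's DTD order.
    pairs = sorted(attributes.items(), key=lambda kv: ATTRIBUTE_ORDER.index(kv[0]))
    # Compare by element name first, then pair by pair.
    return (
        ELEMENT_ORDER.index(name),
        [(ATTRIBUTE_ORDER.index(attr), value) for attr, value in pairs],
    )


def order_elements(elements):
    return sorted(elements, key=sort_key)
```

     <p>Because Python compares lists item by item, two elements with
     the same name are distinguished by their first differing
     attribute-value pair, which mirrors the pairwise comparison
     described above.</p>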
     <h4><a name="Comments" href="#Comments" id="Comments">5.6.3
     Comments</a></h4>
     <ol>
       <li>Comments are of the form &lt;!-- <i>stuff</i>
       --&gt;.</li>
       <li>They are logically attached to a node. There are 4 kinds:
         <ol>
           <li>Inline comments always appear after a leaf node, at
           the end of the same line. These are a single line.</li>
           <li>Preblock comments always precede the attachment node,
           and are indented on the same level.</li>
           <li>Postblock comments always follow the attachment node,
           and are indented on the same level.</li>
           <li>The final comment appears after &lt;/ldml&gt;.</li>
         </ol>
       </li>
       <li>Multiline comments (except the final comment) have each
       line after the first indented to one deeper level.</li>
     </ol>
     <p><b>Examples:</b></p>
     <pre>&lt;eraAbbr&gt;
         &lt;era type="0"&gt;BC&lt;/era&gt; &lt;!-- might add alternate BDE in the future --&gt;
 ...
 &lt;timeZoneNames&gt;
         &lt;!-- Note: zones that do not use daylight time need further work --&gt;
         &lt;zone type="America/Los_Angeles"&gt;
         ...
         &lt;!-- Note: the following is known to be sparse,
                 and needs to be improved in the future --&gt;
         &lt;zone type="Asia/Jerusalem"&gt;</pre>
     <h3><a name="DTD_Annotations" href="#DTD_Annotations" id=
     "DTD_Annotations">5.7 DTD Annotations</a></h3>
     <p>The information in a standard DTD is insufficient for use in
     CLDR. To make up for that, DTD annotations are added. These are
     of the form<br>
     &lt;!--@...--&gt;<br>
     and are included below the !ELEMENT or !ATTLIST line that they
     apply to. The current annotations are:</p>
     <table>
       <tr>
         <th>Type</th>
         <th>Description</th>
       </tr>
       <tr>
         <td>&lt;!--@VALUE--&gt;</td>
         <td>The attribute is not distinguishing, and is treated
         like an element value</td>
       </tr>
       <tr>
         <td>&lt;!--@METADATA--&gt;</td>
         <td>The attribute is a “comment” on the data, like the
         draft status. It is not typically used in
         implementations.</td>
       </tr>
       <tr>
         <td>&lt;!--@ORDERED--&gt;</td>
         <td>The element's children are ordered, and do not
         inherit.</td>
       </tr>
       <tr>
         <td>&lt;!--@DEPRECATED--&gt;</td>
         <td>The element or attribute is deprecated, and should not
         be used.</td>
       </tr>
       <tr>
         <td>&lt;!--@DEPRECATED: attribute-value1,
         attribute-value2--&gt;</td>
         <td>The attribute values are deprecated, and should not be
         used. Spaces between tokens are not significant.</td>
       </tr>
       <tr>
         <td>&lt;!--@MATCH:{attribute value constraint}--&gt;</td>
         <td>Requires the attribute value to match the constraint.</td>
       </tr>
     </table>
     <p>There is additional information in the
     attributeValueValidity.xml file that is used internally for
     testing. For example, the following line indicates that the
     'type' attribute of the 'currency' element in the ldml dtd must
     have values from the bcp47 'cu' type.</p>
     <p class='example'>&lt;attributeValues dtds='ldml'
     elements='currency'
     attributes='type'&gt;$_bcp47_cu&lt;/attributeValues&gt;</p>
     <p>The element values may be literals, regular expressions, or
     variables (some of which are set programmatically according to
     other CLDR data, such as the above). However, the information at
     this point does not cover all attribute values, is used only
     for testing, and should not be used in implementations since
     the structure may change without notice.</p>
     <h4>5.7.1 <a href="#match_expressions" name="match_expressions">Attribute Value Constraints</a></h4>
     <p>The following are constraints on the attribute values. Note: in future versions, the format may change, and/or the constraints may be tightened.</p>
     <table class='simple'>
       <tbody>
         <tr>
           <th>Constraint</th>
           <th colspan="2">Comments</th>
         </tr>
         <tr>
           <td>any</td>
           <td colspan="2">any string value</td>
         </tr>
         <tr>
           <td>any/TODO</td>
           <td colspan="2">placeholder for future constraints</td>
         </tr>
         <tr>
           <td>bcp47/anykey</td>
           <td colspan="2">any bcp47 key or tkey</td>
         </tr>
         <tr>
           <td>bcp47/anyvalue</td>
           <td colspan="2">any bcp47 value (type) or tvalue</td>
         </tr>
         <tr>
           <td>literal/{literal values}</td>
           <td colspan="2">comma separated</td>
         </tr>
         <tr>
           <td>regex/{regex expression}</td>
           <td colspan="2">valid regex expression</td>
         </tr>
         <tr>
           <td>bcp47/{key or tkey}</td>
           <td colspan="2">matches possible values for that key or tkey</td>
         </tr>
         <tr>
           <td>metazone</td>
           <td colspan="2">valid metazone</td>
         </tr>
         <tr>
           <td>range/{start_number}~{end_number}</td>
           <td colspan="2">number between (inclusive) start and end</td>
         </tr>
         <tr>
           <td>time/{time or date or date-time pattern}</td>
           <td colspan="2">eg HH:mm</td>
         </tr>
         <tr>
           <td>unicodeset/{unicodeset pattern}</td>
           <td colspan="2">valid unicodeset</td>
         </tr>
         <tr>
           <td rowspan="4">validity/{field}</td>
           <td colspan="2">currency, language, locale, region, script, subdivision, short-unit, unit, variant</td>
         </tr>
         <tr>
           <td colspan="2">The field can be qualified by particular enums, such as:</td>
         </tr>
         <tr>
           <td>validity/unit/regular deprecated</td>
           <td>matches only <em>deprecated</em> and <em>regular</em></td>
         </tr>
         <tr>
           <td>validity/unit/!deprecated</td>
           <td>matches all but <em>deprecated</em></td>
         </tr>
         <tr>
           <td>version</td>
           <td colspan="2">version with 1 to 4 numeric fields, such as 35.3.9</td>
         </tr>
         <tr>
           <td>set/{match}</td>
           <td colspan="2">set of elements that match {match}</td>
         </tr>
         <tr>
           <td>or/{match1}XX{match2}…</td>
           <td colspan="2">matches at least one of {match1}, etc</td>
         </tr>
       </tbody>
     </table><br>
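     <p>A few of the constraints above can be illustrated with a
     minimal Python sketch. It covers only <code>any</code>,
     <code>literal</code>, <code>regex</code>, <code>range</code>, and
     <code>version</code>. The function name is hypothetical, and
     several details are assumptions rather than statements of CLDR
     behavior: that a regex must match the whole value, that spaces
     around comma-separated literals are ignored, and that range
     bounds are plain numbers.</p>

```python
import re


def matches(constraint, value):
    """Check value against a small subset of the constraint forms."""
    kind, _, arg = constraint.partition("/")
    if kind == "any":
        return True
    if kind == "version":
        # 1 to 4 numeric fields separated by '.', for example 35.3.9
        return re.fullmatch(r"\d+(\.\d+){0,3}", value) is not None
    if kind == "literal":
        return value in [item.strip() for item in arg.split(",")]
    if kind == "regex":
        return re.fullmatch(arg, value) is not None
    if kind == "range":
        start, _, end = arg.partition("~")
        try:
            number = float(value)
        except ValueError:
            return False
        return float(start) <= number <= float(end)   # inclusive bounds
    raise ValueError("constraint not covered by this sketch: " + constraint)
```

     <p>The remaining constraint types (for example
     <code>bcp47</code>, <code>metazone</code>, or
     <code>validity</code>) depend on other CLDR data and are
     deliberately left out.</p>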
     <h2><a name="Property_Data" href="#Property_Data" id=
     "Property_Data">6 Property Data</a></h2>
     <p>Some data in CLDR does not use an XML format, but rather a
     semicolon-delimited format derived from that of the Unicode
     Character Database. That is because the data is more likely to
     be parsed by implementations that already parse UCD data. Those
     files are present in the common/properties directory.</p>
     <p>Each file has a header that explains the format and usage of
     the data.</p>
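     <p>As a rough illustration, a semicolon-delimited file in the
     general style of the Unicode Character Database can be read with
     a few lines of Python. This sketch assumes only the common UCD
     conventions (fields separated by semicolons, and
     <code>#</code> starting a comment); the exact fields of each CLDR
     file are described in that file's header and are not modeled
     here. A literal <code>#</code> inside a field value would be
     misread as a comment by this simplified parser.</p>

```python
def parse_property_lines(text):
    """Return a list of rows, each a list of stripped field strings."""
    rows = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()   # remove comments and padding
        if not data:
            continue                            # skip blank and comment-only lines
        rows.append([field.strip() for field in data.split(";")])
    return rows
```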
     <h3><a name="Script_Metadata" href="#Script_Metadata" id=
     "Script_Metadata">6.1 Script Metadata</a></h3>
     <p><code>scriptMetadata.txt</code></p>
     <p>This file provides general information about scripts that
     may be useful to implementations processing text. The
     information is the best currently available, and may change
     between versions of CLDR. The format is similar to that of a
     Unicode Character Database property file, and is documented in
     the header of the data file.</p>
     <h3><a name="Extended_Pictographic" href=
     "#Extended_Pictographic" id="Extended_Pictographic">6.2
     Extended Pictographic</a></h3>
     <p><code>ExtendedPictographic.txt</code></p>
     <p>This file was used to define the ExtendedPictographic data
     used for “future-proofing” emoji behavior, especially in
     segmentation. As of Emoji version 11.0, the set of
     Extended_Pictographic is incorporated into the emoji data files
     found at <a href=
     "https://unicode.org/Public/emoji/">unicode.org/Public/emoji/</a>.</p>
     <h3><a name="Labels.txt" href="#Labels.txt" id="Labels.txt">6.3
     Labels.txt</a></h3>
     <p><code>labels.txt</code></p>
     <p>This file provides general information about associations of
     labels to characters that may be useful to implementations of
     character-picking applications. The information is the best
     currently available, and may change between versions of CLDR.
     The format is similar to that of a Unicode Character Database
     property file, and is documented in the header of the data file.</p>
     <p>Initially, the contents are focused on emoji, but may be
     expanded in the future to other types of characters. Note that
     a character may have multiple labels.</p>
     <h3><a name="Segmentation_Tests" href="#Segmentation_Tests">6.4
       Segmentation Tests</a></h3>
     <p>CLDR provides a tailoring to the <a href="https://unicode.org/reports/tr29/">Grapheme Cluster Break (gcb)</a> algorithm to avoid splitting Indic aksaras. The corresponding test files for that are located in common/properties/segments/, along with a readme.txt that provides more details. There are also specific test files for the supported Indic scripts in the unittest directory.</p>
     <h2><a name="Format_Parse_Issues" href="#Format_Parse_Issues"
     id="Format_Parse_Issues">7 Issues in Formatting and
     Parsing</a></h2>
     <h3><a name="Lenient_Parsing" href="#Lenient_Parsing" id=
     "Lenient_Parsing">7.1 Lenient Parsing</a></h3>
     <h4><a name="Motivation" href="#Motivation" id=
     "Motivation">7.1.1 Motivation</a></h4>
     <p>User input is frequently messy. Attempting to parse it by
     matching it exactly against a pattern is likely to be
     unsuccessful, even when the meaning of the input is clear to a
     human being. For example, for a date pattern of "MM/dd/yy", the
     input "June 1, 2006" will fail.</p>
     <p>The goal of lenient parsing is to accept user input whenever
     it is possible to decipher what the user intended. Doing so
     requires using patterns as data to guide the parsing process,
     rather than an exact template that must be matched. This
     informative section suggests some heuristics that may be useful
     for lenient parsing of dates, times, and numbers.</p>
     <h4><a name="Loose_Matching" href="#Loose_Matching" id=
     "Loose_Matching">7.1.2 Loose Matching</a></h4>
     <p>Loose matching ignores attributes of the strings being
     compared that are not important to matching. It involves the
     following steps:</p>
     <ul>
       <li>Remove "." from currency symbols and other fields used
       for matching, and also from the input string unless:
         <ul>
           <li>"." is in the decimal set, and</li>
           <li>its position in the input string is immediately
           before a decimal digit</li>
         </ul>
       </li>
       <li>Ignore all format characters: in particular, ignore any
       RLM, LRM or ALM used to control BIDI formatting.</li>
       <li>Ignore all characters in [:Zs:] unless they occur between
       letters. (In the heuristics below, even those between letters
       are ignored except to delimit fields.)</li>
       <li>Map all characters in [:Dash:] to U+002D
       HYPHEN-MINUS</li>
       <li>Use the data in the &lt;character-fallback&gt; element to
       map equivalent characters (for example, curly to straight
       apostrophes). Other apostrophe-like characters should also be
       treated as equivalent, especially if the character actually
       used in a format may be unavailable on some keyboards. For
       example:
         <ul>
           <li>U+02BB MODIFIER LETTER TURNED COMMA (ʻ) might be
           typed instead as U+2018 LEFT SINGLE QUOTATION MARK
           (‘).</li>
           <li>U+02BC MODIFIER LETTER APOSTROPHE (ʼ) might be typed
           instead as U+2019 RIGHT SINGLE QUOTATION MARK (’), U+0027
           APOSTROPHE, etc.</li>
           <li>U+05F3 HEBREW PUNCTUATION GERESH (‎׳) might be typed
           instead as U+0027 APOSTROPHE.</li>
         </ul>
       </li>
       <li>Apply mappings particular to the domain (i.e., for dates
       or for numbers, discussed in more detail below)</li>
       <li>Apply case folding (possibly including language-specific
       mappings such as Turkish i)</li>
       <li>Normalize to NFKC; thus <i>no-break space</i> will map to
       <i>space</i>; half-width <i>katakana</i> will map to
       full-width.</li>
     </ul>
     <p>Loose matching involves (logically) applying the above
     transform to both the input text and to each of the field
     elements used in matching, before applying the specific
     heuristics below. For example, if the input number text is " -
     NA f. 1,000.00", then it is mapped to "-naf1,000.00" before
     processing. The currency signs are also transformed, so "NA f."
     is converted to "naf" for purposes of matching. As with other
     Unicode algorithms, this is a logical statement of the process;
     actual implementations can optimize, such as by applying the
     transform incrementally during matching.</p>
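     <p>The transform can be approximated in Python as follows. This
     is a simplified sketch, not a complete implementation: it removes
     all [:Zs:] characters (as in the heuristics, where spaces only
     delimit fields), approximates [:Dash:] by the general category Pd
     plus U+2212 MINUS SIGN, and omits the
     &lt;character-fallback&gt; data, the domain-specific mappings,
     and language-specific case folding. The function name and the
     <code>decimal_set</code> parameter are hypothetical.</p>

```python
import unicodedata


def loose_transform(text, decimal_set="."):
    """Apply a simplified loose-matching transform to text."""
    out = []
    for i, ch in enumerate(text):
        following = text[i + 1] if i + 1 < len(text) else ""
        if ch == ".":
            # Keep "." only when it is in the decimal set and sits
            # immediately before a decimal digit.
            if not ("." in decimal_set and following.isdecimal()):
                continue
        category = unicodedata.category(ch)
        if category in ("Cf", "Zs"):
            continue                      # format characters and spaces
        if category == "Pd" or ch == "\u2212":
            ch = "-"                      # map dashes to U+002D HYPHEN-MINUS
        out.append(ch)
    folded = "".join(out).casefold()      # default (not language-specific) folding
    return unicodedata.normalize("NFKC", folded)
```

     <p>Applied to the example above, the input " - NA f. 1,000.00"
     yields "-naf1,000.00".</p>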
     <h3><a name="Invalid_Patterns" href="#Invalid_Patterns" id=
     "Invalid_Patterns">7.2 Handling Invalid Patterns</a></h3>
     <p>Processes sometimes encounter invalid number or date
     patterns, such as a number pattern with “¤¤¤¤¤” (valid pattern
     character but invalid length in current CLDR), a date pattern
     with “nn” (invalid pattern character in current CLDR), or a
     date pattern with “MMMMMM” (invalid length in current CLDR).
     The recommended behavior for handling such an invalid pattern
     field is:</p>
     <ul>
       <li>For a field using a currently-invalid length for a valid
       pattern character:
         <ul>
           <li>In <strong>formatting,</strong> emit U+FFFD
           REPLACEMENT CHARACTER for the invalid field.</li>
           <li>In <strong>parsing,</strong> the field may be parsed
           as if it had a valid length.</li>
         </ul>
       </li>
       <li>For a pattern that contains a currently-invalid pattern
       character (applies only to date patterns, for which A-Za-z
       are reserved as pattern characters but not all defined as
       valid):
         <ul>
           <li>Produce an error (set an error code or throw an
           exception) when an attempt is made to create a formatter
           with such a pattern or to apply such a pattern to an
           existing formatter.</li>
         </ul>
       </li>
     </ul>
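     <p>The recommended behavior for date patterns can be sketched in
     Python. The table of pattern characters and maximum lengths is a
     hypothetical subset chosen for illustration, the
     <code>format_field</code> callback stands in for real field
     formatting, and quoted literal text in patterns is not handled.</p>

```python
import re

# Hypothetical subset of date pattern characters and their maximum valid
# lengths (None means any length is accepted).
MAX_LENGTH = {"y": None, "M": 5, "d": 2, "H": 2, "m": 2, "s": 2}
FIELD = re.compile(r"([A-Za-z])\1*")   # a run of one repeated ASCII letter


def create_formatter(pattern):
    """Validate a pattern; raise ValueError on an invalid pattern character."""
    for match in FIELD.finditer(pattern):
        if match.group(1) not in MAX_LENGTH:
            raise ValueError("invalid pattern character: " + match.group(1))
    return pattern


def format_with(pattern, format_field):
    """Format each field via format_field, emitting U+FFFD for any field
    whose length is currently invalid."""
    def replace(match):
        field = match.group(0)
        limit = MAX_LENGTH[field[0]]
        if limit is not None and len(field) > limit:
            return "\ufffd"
        return format_field(field)
    return FIELD.sub(replace, create_formatter(pattern))
```

     <p>In this sketch a pattern containing "nn" is rejected when the
     formatter is created, while "MMMMMM" is accepted but formatted as
     U+FFFD.</p>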
     <h2><a name="Deprecated_Structure" href="#Deprecated_Structure"
     id="Deprecated_Structure">Annex A Deprecated Structure</a></h2>
     <p>The <a href="#DTD_Annotations">DTD Annotations</a> in Section 5.7 are used to determine whether elements, attributes, or attribute values are deprecated.</p>
     <p>Deprecated items remain valid LDML, but they are strongly
     discouraged and are no longer used in CLDR.</p>
     <p>The remainder of this section describes selected cases of
     deprecated structure that were present in previous versions of
     CLDR.</p>
     <h3><a name="Fallback_Elements" href="#Fallback_Elements" id=
     "Fallback_Elements">A.1 Element fallback</a></h3>
     <p class="dtd">&lt;!ELEMENT fallback (#PCDATA) &gt;</p>
     <p>The fallback element is deprecated. Implementations should
     use instead the information in <em><a href=
     "#LanguageMatching">Section 4.4 Language Matching</a></em> for
     doing language fallback.</p>
     <h3><a name="BCP47_Keyword_Mapping" href=
     "#BCP47_Keyword_Mapping" id="BCP47_Keyword_Mapping">A.2 BCP 47
     Keyword Mapping</a></h3>
     <p><b>Note:</b> <i>This structure is deprecated and replaced
     with <a href="#Unicode_Locale_Extension_Data_Files">Section
     3.6.4 U Extension Data Files</a>.</i></p>
     <p class="dtd">&lt;!ELEMENT bcp47KeywordMappings ( mapKeys?,
     mapTypes* ) &gt;<br>
     &lt;!ELEMENT mapKeys ( keyMap* ) &gt;<br>
     &lt;!ELEMENT keyMap EMPTY &gt;<br>
     &lt;!ATTLIST keyMap type NMTOKEN #REQUIRED &gt;<br>
     &lt;!ATTLIST keyMap bcp47 NMTOKEN #REQUIRED &gt;<br>
     &lt;!ELEMENT mapTypes ( typeMap* ) &gt;<br>
     &lt;!ATTLIST mapTypes type NMTOKEN #REQUIRED &gt;<br>
     &lt;!ELEMENT typeMap EMPTY &gt;<br>
     &lt;!ATTLIST typeMap type CDATA #REQUIRED &gt;<br>
     &lt;!ATTLIST typeMap bcp47 NMTOKEN #REQUIRED &gt;<br></p>
     <p>This section defines mappings between old Unicode locale
     identifier key/type values and their BCP 47 'u' extension
     subtag representations. The 'u' extension syntax described in
     <a href="#u_Extension">Section 3.6 Unicode BCP 47 U
     Extension</a> restricts a key to two ASCII alphanumerics and a
     type to three to eight ASCII alphanumerics. A key or a type
     which does not meet that syntax requirement is converted
     according to the mapping data defined by the mapKeys or
     mapTypes elements. For example, a keyword "collation=phonebook"
     is converted to BCP 47 'u' extension subtags "co-phonebk" by
     the mapping data below:</p>
     <pre>    &lt;mapKeys&gt;
         ...
         &lt;keyMap type="collation" bcp47="co"/&gt;
         ...
     &lt;/mapKeys&gt;
     &lt;mapTypes type="collation"&gt;
         ...
         &lt;typeMap type="phonebook" bcp47="phonebk"/&gt;
         ...
     &lt;/mapTypes&gt;
         </pre>
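     <p>The conversion in this example can be sketched in Python. The
     two dictionaries hold only the entries shown in the mapping data
     above; the function name is hypothetical, and passing unmapped
     keys and types through unchanged is an assumption of this
     sketch.</p>

```python
# Entries taken from the mapKeys and mapTypes example above.
KEY_MAP = {"collation": "co"}
TYPE_MAP = {"collation": {"phonebook": "phonebk"}}


def keyword_to_bcp47(keyword):
    """Convert an old-style 'key=type' keyword to 'u' extension subtags."""
    key, _, value = keyword.partition("=")
    bcp47_key = KEY_MAP.get(key, key)
    # Type mappings are looked up per old key, as in mapTypes type="...".
    bcp47_type = TYPE_MAP.get(key, {}).get(value, value)
    return bcp47_key + "-" + bcp47_type
```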
     <h3><a name="Choice_Patterns" href="#Choice_Patterns" id=
     "Choice_Patterns">A.3 Choice Patterns</a></h3>
     <p><b>Note:</b> <i>This structure is deprecated and replaced
     with count attributes.</i></p>
     <p>A choice pattern is a string that chooses among a number of
     strings, based on numeric value. It has the following form:</p>
     <p>&lt;choice_pattern&gt; = &lt;choice&gt; ( '|' &lt;choice&gt;
     )*<br>
     &lt;choice&gt; =
     &lt;number&gt;&lt;relation&gt;&lt;string&gt;<br>
     &lt;number&gt; = ('+' | '-')? ('∞' | [0-9]+ ('.' [0-9]+)?)<br>
     &lt;relation&gt; = '&lt;' | '≤'</p>
     <p>The interpretation of a choice pattern is that given a
     number N, the pattern is scanned from right to left, for each
     choice evaluating &lt;number&gt; &lt;relation&gt; N. The first
     choice that matches results in the corresponding string. If no
     match is found, then the first string is used. For example:</p>
     <table border="1" cellpadding="0" cellspacing="0">
       <tr>
         <td width="33%">Pattern</td>
         <td width="33%">N</td>
         <td width="34%">Result</td>
       </tr>
       <tr>
         <td width="33%" rowspan="4">0≤Rf|1≤Ru|1&lt;Re</td>
         <td width="33%">-∞, -3, -1,
         -0.000001</td>
         <td width="34%">Rf (defaulted to first string)</td>
       </tr>
       <tr>
         <td width="33%">0, 0.01, 0.9999</td>
         <td width="34%">Rf</td>
       </tr>
       <tr>
         <td width="33%">1</td>
         <td width="34%">Ru</td>
       </tr>
       <tr>
         <td width="33%">1.00001, 5, 99, ∞</td>
         <td width="34%">Re</td>
       </tr>
     </table>
     <p>Quoting is done using ' characters, as in date or number
     formats.</p>
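     <p>The right-to-left evaluation can be sketched in Python as
     follows. The function names are hypothetical, and quoting with '
     characters is not handled in this simplified version.</p>

```python
import math


def parse_number(text):
    """Parse an optionally signed number, where the digits may be '∞'."""
    sign = -1.0 if text.startswith("-") else 1.0
    digits = text.lstrip("+-")
    return sign * (math.inf if digits == "∞" else float(digits))


def select_choice(pattern, n):
    """Return the string selected by a choice pattern for the number n."""
    choices = []
    for choice in pattern.split("|"):
        # The relation is the first '<' or '≤' character in the choice.
        index = min(i for i in (choice.find("<"), choice.find("≤")) if i >= 0)
        choices.append((parse_number(choice[:index]), choice[index], choice[index + 1:]))
    for number, relation, string in reversed(choices):   # scan right to left
        if number < n or (relation == "≤" and number == n):
            return string
    return choices[0][2]   # no match: default to the first string
```

     <p>For the pattern in the table above,
     <code>select_choice("0≤Rf|1≤Ru|1&lt;Re", 1)</code> returns "Ru",
     and any negative N falls through to the first string, "Rf".</p>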
     <h3><a name="Element_default" href="#Element_default" id=
     "Element_default">A.4 Element default</a></h3>
     <p><b>Note:</b> <i>This structure is deprecated.</i> Use
     replacement structure instead, for example:</p>
     <ul>
       <li>For &lt;collations&gt;, now use the
       &lt;defaultCollation&gt; element.</li>
       <li>For &lt;calendars&gt;, the default calendar type for a
       locale is now specified by <i><a href=
       "tr35-dates.html#Calendar_Preference_Data">Calendar
       Preference Data</a></i>.</li>
     </ul>
     <p>In some cases, a number of elements are present. The default
     element can be used to indicate which of them is the default,
     in the absence of other information. The value of the choice
     attribute must match the value of the type attribute of the
     selected item.</p>
     <pre>&lt;timeFormats&gt;
   &lt;default choice="<span style="color: red">medium</span>" /&gt;
   &lt;timeFormatLength type="<span style=
 "color: blue">full</span>"&gt;
     &lt;timeFormat type="<span style=
 "color: blue">standard</span>"&gt;
       &lt;pattern type="<span style=
 "color: blue">standard</span>"&gt;<span style=
 "color: blue">h:mm:ss a z</span>&lt;/pattern&gt;
     &lt;/timeFormat&gt;
   &lt;/timeFormatLength&gt;
   &lt;timeFormatLength type="<span style=
 "color: blue">long</span>"&gt;
     &lt;timeFormat type="<span style=
 "color: blue">standard</span>"&gt;
       &lt;pattern type="<span style=
 "color: blue">standard</span>"&gt;<span style=
 "color: blue">h:mm:ss a z</span>&lt;/pattern&gt;
     &lt;/timeFormat&gt;
   &lt;/timeFormatLength&gt;
   &lt;timeFormatLength type="<span style=
 "color: red">medium</span>"&gt;
     &lt;timeFormat type="<span style=
 "color: blue">standard</span>"&gt;
       &lt;pattern type="<span style=
 "color: blue">standard</span>"&gt;<span style=
 "color: blue">h:mm:ss a</span>&lt;/pattern&gt;
     &lt;/timeFormat&gt;
   &lt;/timeFormatLength&gt;
 ...</pre>
     <p>Like all other elements, the &lt;default&gt; element is
     inherited. Thus, it can also refer to inherited resources. For
     example, suppose that the above resources are present in fr,
     and that in fr_BE we have the following:</p>
     <pre>&lt;timeFormats&gt;
   &lt;default choice="<span style="color: red">long</span>"/&gt;
 &lt;/timeFormats&gt;</pre>
     <p>In that case, the default time format for fr_BE would be the
     inherited "long" resource from fr. Now suppose that we had in
     fr_CA:</p>
     <pre>  &lt;timeFormatLength type="<span style=
     "color: red">medium</span>"&gt;
     &lt;timeFormat type="<span style=
 "color: blue">standard</span>"&gt;
       &lt;pattern type="<span style=
 "color: blue">standard</span>"&gt;<span style=
 "color: blue">...</span>&lt;/pattern&gt;
     &lt;/timeFormat&gt;
   &lt;/timeFormatLength&gt;
     </pre>
     <p>In this case, the &lt;default&gt; is inherited from fr, and
     has the value "medium". It thus refers to this new "medium"
     pattern in this resource bundle.</p>
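     <p>The interaction between the inherited &lt;default&gt; element
     and inherited resources can be modeled with a small Python
     sketch. The nested dictionaries are a hypothetical in-memory
     stand-in for the fr, fr_BE, and fr_CA data described above (the
     fr_CA pattern is left elided, as in the example), and the lookup
     walks a simple parent chain.</p>

```python
PARENT = {"fr_BE": "fr", "fr_CA": "fr", "fr": "root"}

DATA = {
    "fr": {
        "default": "medium",
        "timeFormatLength": {
            "full": "h:mm:ss a z",
            "long": "h:mm:ss a z",
            "medium": "h:mm:ss a",
        },
    },
    "fr_BE": {"default": "long"},
    "fr_CA": {"timeFormatLength": {"medium": "..."}},
}


def lookup(locale, *path):
    """Return the first value found for path, walking up the parent chain."""
    while locale is not None:
        node = DATA.get(locale, {})
        for step in path:
            node = node.get(step) if isinstance(node, dict) else None
            if node is None:
                break
        if node is not None:
            return node
        locale = PARENT.get(locale)
    return None


def default_time_pattern(locale):
    # The choice and the pattern it names are each resolved by inheritance.
    choice = lookup(locale, "default")
    return lookup(locale, "timeFormatLength", choice)
```

     <p>Here fr_BE resolves to the "long" pattern inherited from fr,
     while fr_CA inherits the "medium" choice from fr but resolves it
     to its own "medium" pattern.</p>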
     <h3><a name="Deprecated_Common_Attributes" href=
     "#Deprecated_Common_Attributes" id=
     "Deprecated_Common_Attributes">A.5 Deprecated Common
     Attributes</a></h3>
     <h4><a name="Attribute_standard" href="#Attribute_standard" id=
     "Attribute_standard">A.5.1 Attribute standard</a></h4>
     <p class="element2"><b>Note:</b> This attribute is deprecated.
     Instead, use a reference element with the attribute
     standard="true".</p>
     <p>The value of this attribute is a list of strings
     representing standards: international, national, organization,
     or vendor standards. The presence of this attribute indicates
     that the data in this element is compliant with the indicated
     standards. Where possible, for uniqueness, the string should be
     a URL that represents that standard. The strings are separated
     by commas; leading or trailing spaces on each string are not
     significant. Examples:</p>
     <p><code>&lt;collation standard="<span style="color: blue">MSA
     200:2002</span>"&gt;<br>
     ...<br>
     &lt;dateFormatStyle
     standard="https://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=26780&amp;amp;ICS1=1&amp;amp;ICS2=140&amp;amp;ICS3=30"&gt;</code></p>
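     <p>Splitting the attribute value into individual standards
     follows directly from the description above. A minimal Python
     sketch, with a hypothetical function name, might look like this;
     note that a URL that itself contains a comma would be split
     incorrectly.</p>

```python
def parse_standards(value):
    """Split a 'standard' attribute value into its list of standards."""
    # Strings are separated by commas; leading and trailing spaces on
    # each string are not significant.
    return [item.strip() for item in value.split(",") if item.strip()]
```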
     <h4><a name="Attribute_draft_nonLeaf" href=
     "#Attribute_draft_nonLeaf" id="Attribute_draft_nonLeaf">A.5.2
     Attribute draft in non-leaf elements</a></h4>
     <p>The draft attribute is deprecated except in leaf elements
     (elements that do not have any subelements).</p>
     <h3><a name="Element_base" href="#Element_base" id=
     "Element_base">A.6 Element base</a></h3>
     <p><b>Note:</b> <i>This element is deprecated.</i> Use the
     collation &lt;import&gt; element instead.</p>
     <p>The optional base element <code>&lt;base&gt;<span style=
     "color: blue">...</span>&lt;/base&gt;</code> contains an
     alias element that points to another data source that defines a
     <i>base</i> collation. If present, it indicates that the
     settings and rules in the collation are modifications applied
     on <i>top of the</i> respective elements in the base collation.
     That is, any successive settings, where present, override what
     is in the base as described in <a href=
     "tr35-collation.html#Setting_Options">Setting Options</a>. Any
     successive rules are concatenated to the end of the rules in
     the base. The result of multiple rules applying to the same
     characters is covered in <a href=
     "tr35-collation.html#Orderings">Orderings</a>.</p>
     <h3><a name="Element_rules" href="#Element_rules" id=
     "Element_rules">A.7 Element rules</a></h3>
     <p><b>Note:</b> <i>The XML collation syntax is deprecated; this
     includes the &lt;rules&gt; element and its subelements, except
     that the &lt;import&gt; element has been moved up to be a
     subelement of &lt;collation&gt;.</i> Use the basic collation
     syntax with the <a href="tr35-collation.html#Rules">&lt;cr&gt;
     element</a> instead.</p>
     <p class="dtd">&lt;!ELEMENT rules (alias | ( ( reset | import
     ), ( reset | import | p | pc | s | sc | t | tc | i | ic | x)*
     )) &gt;</p>
     <h3><a name="Deprecated_subelements_of_dates" href=
     "#Deprecated_subelements_of_dates" id=
     "Deprecated_subelements_of_dates">A.8 Deprecated subelements of
     &lt;dates&gt;</a></h3>
     <ul>
       <li>&lt;localizedPatternChars&gt;</li>
       <li>&lt;dateRangePattern&gt;, replaced by
       &lt;intervalFormats&gt;.</li>
     </ul>
     <h3><a name="Deprecated_subelements_of_calendars" href=
     "#Deprecated_subelements_of_calendars" id=
     "Deprecated_subelements_of_calendars">A.9 Deprecated
     subelements of &lt;calendars&gt;</a></h3>
     <ul>
       <li>&lt;monthNames&gt; and &lt;monthAbbr&gt;; month name
       forms are specified in the &lt;months&gt; element. The older
       monthNames, monthAbbr are equivalent to: using the months
       element with the context type="<span style=
       "color: blue">format</span>" and the width type="<span style=
       "color: blue">wide</span>" (for ...Names) and
       type="<span style="color: blue">narrow</span>" (for ...Abbr),
       respectively.</li>
       <li>&lt;dayNames&gt; and &lt;dayAbbr&gt;; weekday name forms
       are specified in the &lt;days&gt; element. The older
       dayNames, dayAbbr are equivalent to: using the days element
       with the context type="<span style=
       "color: blue">format</span>" and the width type="<span style=
       "color: blue">wide</span>" (for ...Names) and
       type="<span style="color: blue">narrow</span>" (for ...Abbr),
       respectively.</li>
       <li><a name="week" href="#week" id="week">&lt;week&gt;</a> is
       deprecated in the main LDML files, because the data is more
       appropriately organized as connected to territories, not to
       linguistic data. Use the supplemental &lt;weekData&gt;
       element instead.</li>
       <li>&lt;am&gt; and &lt;pm&gt;; these are now included as part
       of the &lt;dayPeriods&gt; element</li>
       <li>&lt;fields&gt; is deprecated as a subelement of
       &lt;calendars&gt;; instead, a &lt;fields&gt; element should be
       located just under a &lt;dates&gt; element. See <a href=
       "tr35-dates.html#Calendar_Fields">Calendar Fields</a>.</li>
     </ul>
     <h3><a name="Deprecated_subelements_of_timeZoneNames" href=
     "#Deprecated_subelements_of_timeZoneNames" id=
     "Deprecated_subelements_of_timeZoneNames">A.10 Deprecated
     subelements of &lt;timeZoneNames&gt;</a></h3>
     <ul>
       <li>&lt;hoursFormat&gt; e.g. "{0}/{1}" for "-0800/-0700"</li>
       <li><a name="fallbackRegionFormat" href=
       "#fallbackRegionFormat" id=
       "fallbackRegionFormat">&lt;fallbackRegionFormat&gt;</a>
       (deprecated), e.g. "{0}&nbsp;Time ({1})" for "United States
       Time (New York)"</li>
       <li>&lt;abbreviationFallback&gt;</li>
       <li>&lt;preferenceOrdering&gt;, a preference ordering among
       modern zones; use metazones instead.</li>
       <li>&lt;singleCountries&gt;, use <a href=
       "tr35-dates.html#Primary_Zones">Primary Zones</a></li>
     </ul>
     <h3><a name="Deprecated_subelements_of_zone_metazone" href=
     "#Deprecated_subelements_of_zone_metazone" id=
     "Deprecated_subelements_of_zone_metazone">A.11 Deprecated
     subelements of &lt;zone&gt; and &lt;metazone&gt;</a></h3>
     <ul>
       <li>&lt;commonlyUsed&gt;, formerly used to indicate whether a
       zone was commonly used in the locale.</li>
     </ul>
     <h3><a name=
     "Renamed_attribute_values_for_contextTransformUsage" href=
     "#Renamed_attribute_values_for_contextTransformUsage" id=
     "Renamed_attribute_values_for_contextTransformUsage">A.12
     Renamed attribute values for &lt;contextTransformUsage&gt;
     element</a></h3>
     <p>The &lt;contextTransformUsage&gt; element was introduced in
     CLDR 21. The values for its <em>type</em> attribute are
     documented in <a href=
     "tr35-general.html#contextTransformUsage_type_attribute_values">
     &lt;contextTransformUsage&gt; type attribute values</a>. In
     CLDR 25, some of these values were renamed from their previous
     values for improved clarity:</p>
     <ul>
       <li>"type" was renamed to "keyValue"</li>
       <li>"displayName" was renamed to "currencyName"</li>
       <li>"displayName-count" was renamed to
       "currencyName-count"</li>
       <li>"tense" was renamed to "relative"</li>
     </ul>
     <h3><a name="Deprecated_subelements_of_segmentations" href=
     "#Deprecated_subelements_of_segmentations" id=
     "Deprecated_subelements_of_segmentations">A.13 Deprecated
     subelements of &lt;segmentations&gt;</a></h3>
     <ul>
       <li>&lt;exceptions&gt; and &lt;exception&gt; were deprecated
       and replaced with &lt;suppressions&gt; and
       &lt;suppression&gt;.</li>
     </ul>
     <h3><a name="Element_cp" href="#Element_cp" id=
     "Element_cp">A.14 Element cp</a></h3>
     <p>The cp element was used to escape characters that cannot be
     represented in XML, even with NCRs. These escapes were only
     allowed in certain elements, according to the DTD.</p>
     <p>However, this mechanism was clumsy and has been replaced by
     specialized syntax.</p>
     <table>
       <tr>
         <th>Code Point</th>
         <th>XML Example</th>
       </tr>
       <tr>
         <td><code>U+0000</code></td>
         <td><code>&lt;cp hex="0"&gt;</code></td>
       </tr>
     </table>
     <h3><a name="validSubLocales" href="#validSubLocales" id=
     "validSubLocales">A.15 Attribute validSubLocales</a></h3>
     <p>The attribute <i>validSubLocales</i> allowed sublocales in a
     given tree to be treated as though a file for them were present
     when there was not one. It only affected locales that inherit
     from the current file and for which no file exists.</p>
     <p><b>Example 1.</b> Suppose that in a particular LDML tree
     there are no region locales for German: there is a de.xml file,
     but no de_AT.xml, de_CH.xml, or de_DE.xml file. Then no
     elements are valid for any of those region locales. To mark one
     of those locales as having valid elements, we introduce an
     empty file, such as the following.</p>
     <p><code>&lt;ldml version="1.1"&gt;<br>
     &nbsp;&lt;identity&gt;<br>
     &nbsp; &lt;version number="1.1" /&gt;<br>
     &nbsp; &lt;language type="de" /&gt;<br>
     &nbsp; &lt;territory type="AT" /&gt;<br>
     &nbsp;&lt;/identity&gt;<br>
     &lt;/ldml&gt;</code></p>
     <p>With the <i>validSubLocales</i> attribute, instead of adding
     the empty files de_AT.xml, de_CH.xml, and de_DE.xml, we could
     list in the parent de file the child locales that should behave
     as if files were present.</p>
     <p><code>&lt;ldml version="1.1" validSubLocales="de_AT de_CH
     de_DE"&gt;<br>
     &nbsp;&lt;identity&gt;<br>
     &nbsp; &lt;version number="1.1" /&gt;<br>
     &nbsp; &lt;language type="de" /&gt;<br>
     &nbsp;&lt;/identity&gt;<br>
     ...<br>
     &lt;/ldml&gt;</code></p>
     <p>Now that the <i>validSubLocales</i> attribute has been
     deprecated, the recommended approach is to add empty files to
     specify which sublocales are valid. This convention is used
     throughout CLDR.</p>
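     <p>The two approaches can be compared with a small sketch. It is
     illustrative only and rests on simplifying assumptions that are
     not in the specification: the function name
     <code>is_locale_present</code> is hypothetical, a parent is
     found by dropping the last underscore-separated subtag, and only
     the direct parent's <i>validSubLocales</i> list is
     consulted.</p>
<pre>
```python
def is_locale_present(locale, files, valid_sublocales):
    """Report whether a locale is treated as if a file exists for it.

    files: set of locale IDs that have an actual file in the tree.
    valid_sublocales: dict mapping a parent locale ID to the list of
        IDs from that parent's validSubLocales attribute.
    """
    # A real file, even an empty one, always makes the locale present.
    if locale in files:
        return True
    # Assumption: the parent is found by truncating the last subtag,
    # and a locale with no subtag to drop inherits from "root".
    if "_" in locale:
        parent = locale.rsplit("_", 1)[0]
    else:
        parent = "root"
    # The deprecated attribute only matters when the file is missing.
    return locale in valid_sublocales.get(parent, ())
```
</pre>
     <p>With files for root and de only, and a
     <i>validSubLocales</i> list of de_AT, de_CH, and de_DE on de,
     this sketch treats de_AT as present but not de_LU. Adding an
     empty de_AT file gives the same result for de_AT without the
     attribute.</p>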
     <h3><a name="postCodeElements" href="#postCodeElements" id=
     "postCodeElements">A.16 Elements postalCodeData,
     postCodeRegex</a></h3>
     <p>The postal code validation data has been deprecated. Please
     see other services that are kept up to date, such as:</p>
     <ul>
       <li><a href=
       "https://i18napis.appspot.com/address/data/US">https://i18napis.appspot.com/address/data/US</a></li>
       <li><a href=
       "https://i18napis.appspot.com/address/data/CH">https://i18napis.appspot.com/address/data/CH</a></li>
       <li>...</li>
     </ul>
     <p>See <a href="tr35-info.html#Postal_Code_Validation">Postal
     Code Validation</a>.</p>
     <h3><a name="telephoneCodeData" href="#telephoneCodeData" id=
     "telephoneCodeData">A.17 Element telephoneCodeData</a></h3>
     <p>The element &lt;telephoneCodeData&gt; and its subelements
     have been deprecated and the data removed.</p>
     <hr>
     <h2><a name="Links_to_Other_Parts" href="#Links_to_Other_Parts"
     id="Links_to_Other_Parts">Annex B Links to Other Parts</a></h2>
     <p>The LDML specification is split into several <a href=
     "#Parts">parts</a> by topic, with one HTML document per part.
     The following tables provide redirects for links to specific
     topics. Please update your links and bookmarks.</p>
     <p>Part 1 Links: Core (this document): No redirects needed.</p>
     <table cellspacing="0" cellpadding="2" border="1" width="100%">
       <caption>
         <a href="#Part_2_Links" name="Part_2_Links" id=
         "Part_2_Links">Part 2 Links</a>: <a href=
         "tr35-general.html">General</a> (display names &amp;
         transforms, etc.)
       </caption>
       <tr>
         <th>Old section</th>
         <th>Section in new part</th>
       </tr>
       <tr>
         <td>5.4 <a name="Display_Name_Elements" href=
         "#Display_Name_Elements" id="Display_Name_Elements">Display
         Name Elements</a></td>
         <td>1 <a href=
         "tr35-general.html#Display_Name_Elements">Display Name
         Elements</a></td>
       </tr>
       <tr>
         <td>5.5 <a name="Layout_Elements" href="#Layout_Elements"
         id="Layout_Elements">Layout Elements</a></td>
         <td>2 <a href="tr35-general.html#Layout_Elements">Layout
         Elements</a></td>
       </tr>
       <tr>
         <td>5.6 <a name="Character_Elements" href=
         "#Character_Elements" id="Character_Elements">Character
         Elements</a></td>
         <td>3 <a href=
         "tr35-general.html#Character_Elements">Character
         Elements</a></td>
       </tr>
       <tr>
         <td>5.6.1 <a name="ExemplarSyntax" href="#ExemplarSyntax"
         id="ExemplarSyntax">Exemplar Syntax</a></td>
         <td>3.1 <a href="tr35-general.html#ExemplarSyntax">Exemplar
         Syntax</a></td>
       </tr>
       <tr>
         <td>5.6.2 Restrictions</td>
         <td>3.1 <a href="tr35-general.html#ExemplarSyntax">Exemplar
         Syntax</a></td>
       </tr>
       <tr>
         <td>5.6.3 Mapping</td>
         <td>3.2 <a href=
         "tr35-general.html#Character_Mapping">Mapping</a></td>
       </tr>
       <tr>
         <td>5.6.4 <a name="IndexLabels" href="#IndexLabels" id=
         "IndexLabels">Index Labels</a></td>
         <td>3.3 <a href="tr35-general.html#IndexLabels">Index
         Labels</a></td>
       </tr>
       <tr>
         <td>5.6.5 Ellipsis</td>
         <td>3.4 <a href=
         "tr35-general.html#Ellipsis">Ellipsis</a></td>
       </tr>
       <tr>
         <td>5.6.6 More Information</td>
         <td>3.5 <a href=
         "tr35-general.html#Character_More_Info">More
         Information</a></td>
       </tr>
       <tr>
         <td>5.7 <a name="Delimiter_Elements" href=
         "#Delimiter_Elements" id="Delimiter_Elements">Delimiter
         Elements</a></td>
         <td>4 <a href=
         "tr35-general.html#Delimiter_Elements">Delimiter
         Elements</a></td>
       </tr>
       <tr>
         <td>C.6 <a name="Measurement_System_Data" href=
         "#Measurement_System_Data" id=
         "Measurement_System_Data">Measurement System Data</a></td>
         <td>5 <a href=
         "tr35-general.html#Measurement_System_Data">Measurement
         System Data</a></td>
       </tr>
       <tr>
         <td>5.8 <a name="Measurement_Elements" href=
         "#Measurement_Elements" id=
         "Measurement_Elements">Measurement Elements
         (deprecated)</a></td>
         <td>5.1 <a href=
         "tr35-general.html#Measurement_Elements">Measurement
         Elements (deprecated)</a></td>
       </tr>
       <tr>
         <td>5.11 <a name="Unit_Elements" href="#Unit_Elements" id=
         "Unit_Elements">Unit Elements</a></td>
         <td>6 <a href="tr35-general.html#Unit_Elements">Unit
         Elements</a></td>
       </tr>
       <tr>
         <td>5.12 <a name="POSIX_Elements" href="#POSIX_Elements"
         id="POSIX_Elements">POSIX Elements</a></td>
         <td>7 <a href="tr35-general.html#POSIX_Elements">POSIX
         Elements</a></td>
       </tr>
       <tr>
         <td>5.13 <a name="Reference_Elements" href=
         "#Reference_Elements" id="Reference_Elements">Reference
         Element</a></td>
         <td>8 <a href=
         "tr35-general.html#Reference_Elements">Reference
         Element</a></td>
       </tr>
       <tr>
         <td>5.15 <a name="Segmentations" href="#Segmentations" id=
         "Segmentations">Segmentations</a></td>
         <td>9 <a href=
         "tr35-general.html#Segmentations">Segmentations</a></td>
       </tr>
       <tr>
         <td>5.15.1 <a name="Segmentation_Inheritance" href=
         "#Segmentation_Inheritance" id=
         "Segmentation_Inheritance">Segmentation
         Inheritance</a></td>
         <td>9.1 <a href=
         "tr35-general.html#Segmentation_Inheritance">Segmentation
         Inheritance</a></td>
       </tr>
       <tr>
         <td>5.16 <a name="Transforms" href="#Transforms" id=
         "Transforms">Transforms</a></td>
         <td>10 <a href=
         "tr35-general.html#Transforms">Transforms</a></td>
       </tr>
       <tr>
         <td>N <a name="Transform_Rules" href="#Transform_Rules" id=
         "Transform_Rules">Transform Rules</a></td>
         <td>10.3 <a href=
         "tr35-general.html#Transform_Rules_Syntax">Transform Rules
         Syntax</a></td>
       </tr>
       <tr>
         <td>5.18 <a name="ListPatterns" href="#ListPatterns" id=
         "ListPatterns">List Patterns</a></td>
         <td>11 <a href="tr35-general.html#ListPatterns">List
         Patterns</a></td>
       </tr>
       <tr>
         <td>C.20 <a name="List_Gender" href="#List_Gender" id=
         "List_Gender">Gender of Lists</a></td>
         <td>11.1 <a href="tr35-general.html#List_Gender">Gender of
         Lists</a></td>
       </tr>
       <tr>
         <td>5.19 <a name="Context_Transform_Elements" href=
         "#Context_Transform_Elements" id=
         "Context_Transform_Elements">ContextTransform
         Elements</a></td>
         <td>12 <a href=
         "tr35-general.html#Context_Transform_Elements">ContextTransform
         Elements</a></td>
       </tr>
     </table>
     <table cellspacing="0" cellpadding="2" border="1" width="100%">
       <caption>
         <a href="#Part_3_Links" name="Part_3_Links" id=
         "Part_3_Links">Part 3 Links</a>: <a href=
         "tr35-numbers.html">Numbers</a> (number &amp; currency
         formatting)
       </caption>
       <tr>
         <th>Old section</th>
         <th>Section in new part</th>
       </tr>
       <tr>
         <td>C.13 <a name="Numbering_Systems" href=
         "#Numbering_Systems" id="Numbering_Systems">Numbering
         Systems</a></td>
         <td>1 <a href=
         "tr35-numbers.html#Numbering_Systems">Numbering
         Systems</a></td>
       </tr>
       <tr>
         <td>5.10 <a name="Number_Elements" href="#Number_Elements"
         id="Number_Elements">Number Elements</a></td>
         <td>2 <a href="tr35-numbers.html#Number_Elements">Number
         Elements</a></td>
       </tr>
       <tr>
         <td>5.10.1 <a name="Number_Symbols" href="#Number_Symbols"
         id="Number_Symbols">Number Symbols</a></td>
         <td>2.3 <a href="tr35-numbers.html#Number_Symbols">Number
         Symbols</a></td>
       </tr>
       <tr>
         <td>G <a name="Number_Format_Patterns" href=
         "#Number_Format_Patterns" id=
         "Number_Format_Patterns">Number Format Patterns</a></td>
         <td>3 <a href=
         "tr35-numbers.html#Number_Format_Patterns">Number Format
         Patterns</a></td>
       </tr>
       <tr>
         <td>5.10.2 <a name="Currencies" href="#Currencies" id=
         "Currencies">Currencies</a></td>
         <td>4 <a href=
         "tr35-numbers.html#Currencies">Currencies</a></td>
       </tr>
       <tr>
         <td>C.1 <a name="Supplemental_Currency_Data" href=
         "#Supplemental_Currency_Data" id=
         "Supplemental_Currency_Data">Supplemental Currency
         Data</a></td>
         <td>4.1 <a href=
         "tr35-numbers.html#Supplemental_Currency_Data">Supplemental
         Currency Data</a></td>
       </tr>
       <tr>
         <td>C.11 <a name="Language_Plural_Rules" href=
         "#Language_Plural_Rules" id=
         "Language_Plural_Rules">Language Plural Rules</a></td>
         <td>5 <a href=
         "tr35-numbers.html#Language_Plural_Rules">Language Plural
         Rules</a></td>
       </tr>
       <tr>
         <td>5.17 <a name="Rule-Based_Number_Formatting" href=
         "#Rule-Based_Number_Formatting" id=
         "Rule-Based_Number_Formatting">Rule-Based Number
         Formatting</a></td>
         <td>6 <a href=
         "tr35-numbers.html#Rule-Based_Number_Formatting">Rule-Based
         Number Formatting</a></td>
       </tr>
     </table>
     <table cellspacing="0" cellpadding="2" border="1" width="100%">
       <caption>
         <a href="#Part_4_Links" name="Part_4_Links" id=
         "Part_4_Links">Part 4 Links</a>: <a href=
         "tr35-dates.html">Dates</a> (date, time, time zone
         formatting)
       </caption>
       <tr>
         <th>Old section</th>
         <th>Section in new part</th>
       </tr>
       <tr>
         <td><a name="Date_Elements" href="#Date_Elements" id=
         "Date_Elements">5.9 Date Elements</a></td>
         <td>1 <a href=
         "tr35-dates.html#Overview_Dates_Element_Supplemental">Overview:
         Dates Element, Supplemental Date and Calendar
         Information</a></td>
       </tr>
       <tr>
         <td><a name="Calendar_Elements" href="#Calendar_Elements"
         id="Calendar_Elements">5.9.1 Calendar Elements</a></td>
         <td>2 <a href="tr35-dates.html#Calendar_Elements">Calendar
         Elements</a></td>
       </tr>
       <tr>
         <td><a name="months_days_quarters_eras" href=
         "#months_days_quarters_eras" id=
         "months_days_quarters_eras">Elements months, days,
         quarters, eras</a></td>
         <td>2.1 <a href=
         "tr35-dates.html#months_days_quarters_eras">Elements
         months, days, quarters, eras</a></td>
       </tr>
       <tr>
         <td><a name="monthPatterns_cyclicNameSets" href=
         "#monthPatterns_cyclicNameSets" id=
         "monthPatterns_cyclicNameSets">Elements monthPatterns,
         cyclicNameSets</a></td>
         <td>2.2 <a href=
         "tr35-dates.html#monthPatterns_cyclicNameSets">Elements
         monthPatterns, cyclicNameSets</a></td>
       </tr>
       <tr>
         <td><a name="dayPeriods" href="#dayPeriods" id=
         "dayPeriods">Element dayPeriods</a></td>
         <td>2.3 <a href="tr35-dates.html#dayPeriods">Element
         dayPeriods</a></td>
       </tr>
       <tr>
         <td><a name="dateFormats" href="#dateFormats" id=
         "dateFormats">Element dateFormats</a></td>
         <td>2.4 <a href="tr35-dates.html#dateFormats">Element
         dateFormats</a></td>
       </tr>
       <tr>
         <td><a name="timeFormats" href="#timeFormats" id=
         "timeFormats">Element timeFormats</a></td>
         <td>2.5 <a href="tr35-dates.html#timeFormats">Element
         timeFormats</a></td>
       </tr>
       <tr>
         <td><a name="dateTimeFormats" href="#dateTimeFormats" id=
         "dateTimeFormats">Element dateTimeFormats</a></td>
         <td>2.6 <a href="tr35-dates.html#dateTimeFormats">Element
         dateTimeFormats</a></td>
       </tr>
       <tr>
         <td><a name="Calendar_Fields" href="#Calendar_Fields" id=
         "Calendar_Fields">5.9.2 Calendar Fields</a></td>
         <td>3 <a href="tr35-dates.html#Calendar_Fields">Calendar
         Fields</a></td>
       </tr>
       <tr>
         <td>5.9.3 <a name="Timezone_Names" href="#Timezone_Names"
         id="Timezone_Names">Time Zone Names</a></td>
         <td>5 <a href="tr35-dates.html#Time_Zone_Names">Time Zone
         Names</a></td>
       </tr>
       <tr>
         <td><a name="Supplemental_Calendar_Data" href=
         "#Supplemental_Calendar_Data" id=
         "Supplemental_Calendar_Data">C.5 Supplemental Calendar
         Data</a></td>
         <td>4 <a href=
         "tr35-dates.html#Supplemental_Calendar_Data">Supplemental
         Calendar Data</a></td>
       </tr>
       <tr>
         <td><a name="Supplemental_Timezone_Data" href=
         "#Supplemental_Timezone_Data" id=
         "Supplemental_Timezone_Data">C.7 Supplemental Time Zone
         Data</a></td>
         <td>6 <a href=
         "tr35-dates.html#Supplemental_Time_Zone_Data">Supplemental
         Time Zone Data</a></td>
       </tr>
       <tr>
         <td><a name="Calendar_Preference_Data" href=
         "#Calendar_Preference_Data" id=
         "Calendar_Preference_Data">C.15 Calendar Preference
         Data</a></td>
         <td>4.2 <a href=
         "tr35-dates.html#Calendar_Preference_Data">Calendar
         Preference Data</a></td>
       </tr>
       <tr>
         <td><a name="DayPeriodRules" href="#DayPeriodRules" id=
         "DayPeriodRules">C.17 DayPeriod Rules</a></td>
         <td>4.5 <a href="tr35-dates.html#Day_Period_Rules">Day
         Period Rules</a></td>
       </tr>
       <tr>
         <td><a name="Date_Format_Patterns" href=
         "#Date_Format_Patterns" id="Date_Format_Patterns">Appendix
         F: Date Format Patterns</a></td>
         <td>8 <a href="tr35-dates.html#Date_Format_Patterns">Date
         Format Patterns</a></td>
       </tr>
       <tr>
         <td><a name="Date_Field_Symbol_Table" href=
         "#Date_Field_Symbol_Table" id=
         "Date_Field_Symbol_Table">Date Field Symbol Table</a></td>
         <td><a href="tr35-dates.html#Date_Field_Symbol_Table">Date
         Field Symbol Table</a></td>
       </tr>
       <tr>
         <td><a name="Localized_Pattern_Characters" href=
         "#Localized_Pattern_Characters" id=
         "Localized_Pattern_Characters">F.1 Localized Pattern
         Characters (deprecated)</a></td>
         <td>8.1 <a href=
         "tr35-dates.html#Localized_Pattern_Characters">Localized
         Pattern Characters (deprecated)</a></td>
       </tr>
       <tr>
         <td><a name="Time_Zone_Fallback" href="#Time_Zone_Fallback"
         id="Time_Zone_Fallback">Appendix J: Time Zone Display
         Names</a></td>
         <td>7 <a href="tr35-dates.html#Using_Time_Zone_Names">Using
         Time Zone Names</a></td>
       </tr>
       <tr>
         <td><a name="fallbackFormat" href="#fallbackFormat" id=
         "fallbackFormat"><b>fallbackFormat</b>:</a></td>
         <td><a href=
         "tr35-dates.html#fallbackFormat"><b>fallbackFormat</b>:</a></td>
       </tr>
       <tr>
         <td>O.4 Parsing Dates and Times</td>
         <td>9 <a href="tr35-dates.html#Parsing_Dates_Times">Parsing
         Dates and Times</a></td>
       </tr>
     </table>
     <table cellspacing="0" cellpadding="2" border="1" width="100%">
       <caption>
         <a href="#Part_5_Links" name="Part_5_Links" id=
         "Part_5_Links">Part 5 Links</a>: <a href=
         "tr35-collation.html">Collation</a> (sorting, searching,
         grouping)
       </caption>
       <tr>
         <th>Old section</th>
         <th>Section in new part</th>
       </tr>
       <tr>
         <td>5.14 <a name="Collation_Elements" href=
         "#Collation_Elements" id="Collation_Elements">Collation
         Elements</a></td>
         <td>3 <a href=
         "tr35-collation.html#Collation_Tailorings">Collation
         Tailorings</a></td>
       </tr>
       <tr>
         <td>5.14.1 <a name="Collation_Version" href=
         "#Collation_Version" id=
         "Collation_Version">Version</a></td>
         <td>3.1 <a href=
         "tr35-collation.html#Collation_Version">Version</a></td>
       </tr>
       <tr>
         <td>5.14.2 <a name="Collation_Element" href=
         "#Collation_Element" id="Collation_Element">Collation
         Element</a></td>
         <td>3.2 <a href=
         "tr35-collation.html#Collation_Element">Collation
         Element</a></td>
       </tr>
       <tr>
         <td>5.14.3 <a name="Setting_Options" href=
         "#Setting_Options" id="Setting_Options">Setting
         Options</a></td>
         <td>3.3 <a href=
         "tr35-collation.html#Setting_Options">Setting
         Options</a></td>
       </tr>
       <tr>
         <td>Table <a name="Collation_Settings" href=
         "#Collation_Settings" id="Collation_Settings">Collation
         Settings</a></td>
         <td>Table <a href=
         "tr35-collation.html#Collation_Settings">Collation
         Settings</a></td>
       </tr>
       <tr>
         <td>5.14.4 <a name="Rules" href="#Rules" id=
         "Rules">Collation Rule Syntax</a></td>
         <td>3.4 <a href="tr35-collation.html#Rules">Collation Rule
         Syntax</a></td>
       </tr>
       <tr>
         <td>5.14.5 <a name="Orderings" href="#Orderings" id=
         "Orderings">Orderings</a></td>
         <td>3.5 <a href=
         "tr35-collation.html#Orderings">Orderings</a></td>
       </tr>
       <tr>
         <td>5.14.6 <a name="Contractions" href="#Contractions" id=
         "Contractions">Contractions</a></td>
         <td>3.6 <a href=
         "tr35-collation.html#Contractions">Contractions</a></td>
       </tr>
       <tr>
         <td>5.14.7 <a name="Expansions" href="#Expansions" id=
         "Expansions">Expansions</a></td>
         <td>3.7 <a href=
         "tr35-collation.html#Expansions">Expansions</a></td>
       </tr>
       <tr>
         <td>5.14.8 <a name="Context_Before" href="#Context_Before"
         id="Context_Before">Context Before</a></td>
         <td>3.8 <a href=
         "tr35-collation.html#Context_Before">Context
         Before</a></td>
       </tr>
       <tr>
         <td>5.14.9 <a name="Placing_Characters_Before_Others" href=
         "#Placing_Characters_Before_Others" id=
         "Placing_Characters_Before_Others">Placing Characters
         Before Others</a></td>
         <td>3.9 <a href=
         "tr35-collation.html#Placing_Characters_Before_Others">Placing
         Characters Before Others</a></td>
       </tr>
       <tr>
         <td>5.14.10 <a name="Logical_Reset_Positions" href=
         "#Logical_Reset_Positions" id=
         "Logical_Reset_Positions">Logical Reset Positions</a></td>
         <td>3.10 <a href=
         "tr35-collation.html#Logical_Reset_Positions">Logical Reset
         Positions</a></td>
       </tr>
       <tr>
         <td>5.14.11 <a name="Special_Purpose_Commands" href=
         "#Special_Purpose_Commands" id=
         "Special_Purpose_Commands">Special-Purpose
         Commands</a></td>
         <td>3.11 <a href=
         "tr35-collation.html#Special_Purpose_Commands">Special-Purpose
         Commands</a></td>
       </tr>
       <tr>
         <td>5.14.12 <a name="Script_Reordering" href=
         "#Script_Reordering" id="Script_Reordering">Collation
         Reordering</a></td>
         <td>3.12 <a href=
         "tr35-collation.html#Script_Reordering">Collation
         Reordering</a></td>
       </tr>
       <tr>
         <td>5.14.13 <a name="Case_Parameters" href=
         "#Case_Parameters" id="Case_Parameters">Case
         Parameters</a></td>
         <td>3.13 <a href="tr35-collation.html#Case_Parameters">Case
         Parameters</a></td>
       </tr>
       <tr>
         <td>Definition: <a name="UncasedExceptions" href=
         "#UncasedExceptions" id=
         "UncasedExceptions">UncasedExceptions</a></td>
         <td>removed: see 3.13 <a href=
         "tr35-collation.html#Case_Parameters">Case
         Parameters</a></td>
       </tr>
       <tr>
         <td>Definition: <a name="LowerExceptions" href=
         "#LowerExceptions" id=
         "LowerExceptions">LowerExceptions</a></td>
         <td>removed: see 3.13 <a href=
         "tr35-collation.html#Case_Parameters">Case
         Parameters</a></td>
       </tr>
       <tr>
         <td>Definition: <a name="UpperExceptions" href=
         "#UpperExceptions" id=
         "UpperExceptions">UpperExceptions</a></td>
         <td>removed: see 3.13 <a href=
         "tr35-collation.html#Case_Parameters">Case
         Parameters</a></td>
       </tr>
       <tr>
         <td>5.14.14 <a name="Visibility" href="#Visibility" id=
         "Visibility">Visibility</a></td>
         <td>3.14 <a href=
         "tr35-collation.html#Visibility">Visibility</a></td>
       </tr>
     </table>
     <table cellspacing="0" cellpadding="2" border="1" width="100%">
       <caption>
         <a href="#Part_6_Links" name="Part_6_Links" id=
         "Part_6_Links">Part 6 Links</a>: <a href=
         "tr35-info.html">Supplemental</a> (supplemental data)
       </caption>
       <tr>
         <th>Old section</th>
         <th>Section in new part</th>
       </tr>
       <tr>
         <td>C <a name="Supplemental_Data" href="#Supplemental_Data"
         id="Supplemental_Data">Supplemental Data</a></td>
         <td>Introduction <a href=
         "tr35-info.html#Supplemental_Data">Supplemental
         Data</a></td>
       </tr>
       <tr>
         <td>C.2 <a name="Supplemental_Territory_Containment" href=
         "#Supplemental_Territory_Containment" id=
         "Supplemental_Territory_Containment">Supplemental Territory
         Containment</a></td>
         <td>1.1 <a href=
         "tr35-info.html#Supplemental_Territory_Containment">Supplemental
         Territory Containment</a></td>
       </tr>
       <tr>
         <td>C.4 <a name="Supplemental_Territory_Information" href=
         "#Supplemental_Territory_Information" id=
         "Supplemental_Territory_Information">Supplemental Territory
         Information</a></td>
         <td>1.2 <a href=
         "tr35-info.html#Supplemental_Territory_Information">Supplemental
         Territory Information</a></td>
       </tr>
       <tr>
         <td>C.3 <a name="Supplemental_Language_Data" href=
         "#Supplemental_Language_Data" id=
         "Supplemental_Language_Data">Supplemental Language
         Data</a></td>
         <td>2 <a href=
         "tr35-info.html#Supplemental_Language_Data">Supplemental
         Language Data</a></td>
       </tr>
       <tr>
         <td>C.9 <a name="Supplemental_Code_Mapping" href=
         "#Supplemental_Code_Mapping" id=
         "Supplemental_Code_Mapping">Supplemental Code
         Mapping</a></td>
         <td>4 <a href=
         "tr35-info.html#Supplemental_Code_Mapping">Supplemental
         Code Mapping</a></td>
       </tr>
       <tr>
         <td>C.12 <a name="Telephone_Code_Data" href=
         "#Telephone_Code_Data" id="Telephone_Code_Data">Telephone
         Code Data</a></td>
         <td>5 <a href=
         "tr35-info.html#Telephone_Code_Data">Telephone Code
         Data</a></td>
       </tr>
       <tr>
         <td>C.14 <a name="Postal_Code_Validation" href=
         "#Postal_Code_Validation" id=
         "Postal_Code_Validation">Postal Code Validation</a></td>
         <td>6 <a href=
         "tr35-info.html#Postal_Code_Validation">Postal Code
         Validation</a></td>
       </tr>
       <tr>
         <td>C.8 <a name="Supplemental_Character_Fallback_Data"
         href="#Supplemental_Character_Fallback_Data" id=
         "Supplemental_Character_Fallback_Data">Supplemental
         Character Fallback Data</a></td>
         <td>7 <a href=
         "tr35-info.html#Supplemental_Character_Fallback_Data">Supplemental
         Character Fallback Data</a></td>
       </tr>
       <tr>
         <td>M <a name="Coverage_Levels" href="#Coverage_Levels" id=
         "Coverage_Levels">Coverage Levels</a></td>
         <td>8 <a href="tr35-info.html#Coverage_Levels">Coverage
         Levels</a></td>
       </tr>
       <tr>
         <td>5.20 <a name="Metadata_Elements" href=
         "tr35-info.html#Metadata_Elements" id=
         "Metadata_Elements">Metadata Elements</a></td>
         <td>10 <a href="tr35-info.html#Metadata_Elements">Locale
         Metadata Element</a></td>
       </tr>
       <tr>
         <td>P <a name="Appendix_Supplemental_Metadata" href=
         "tr35-info.html#Appendix_Supplemental_Metadata" id=
         "Appendix_Supplemental_Metadata">Supplemental
         Metadata</a><br>
         P.1 <a name="Supplemental_Alias_Information" href=
         "tr35-info.html#Supplemental_Alias_Information" id=
         "Supplemental_Alias_Information">Supplemental Alias
         Information</a><br>
         P.2 <a name="Supplemental_Deprecated_Information" href=
         "tr35-info.html#Supplemental_Deprecated_Information" id=
         "Supplemental_Deprecated_Information">Supplemental
         Deprecated Information</a><br>
         P.3 <a name="Default_Content" href=
         "tr35-info.html#Default_Content" id=
         "Default_Content">Default Content</a></td>
         <td>9 <a href=
         "tr35-info.html#Appendix_Supplemental_Metadata">Supplemental
         Metadata</a><br>
         9.1 <a href=
         "tr35-info.html#Supplemental_Alias_Information">Supplemental
         Alias Information</a><br>
         9.2 <a href=
         "tr35-info.html#Supplemental_Deprecated_Information">Supplemental
         Deprecated Information</a><br>
         9.3 <a href="tr35-info.html#Default_Content">Default
         Content</a></td>
       </tr>
     </table>
     <table cellspacing="0" cellpadding="2" border="1" width="100%">
       <caption>
         <a href="#Part_7_Links" name="Part_7_Links" id=
         "Part_7_Links">Part 7 Links</a>: <a href=
         "tr35-keyboards.html">Keyboards</a> (keyboard mappings)
       </caption>
       <tr>
         <th>Old section</th>
         <th>Section in new part</th>
       </tr>
       <tr>
         <td>S <a name="Keyboards" href="#Keyboards" id=
         "Keyboards">Keyboards</a></td>
         <td>1 <a href=
         "tr35-keyboards.html#Keyboards">Keyboards</a></td>
       </tr>
       <tr>
         <td>S <a name="Goals_and_Nongoals" href=
         "#Goals_and_Nongoals" id="Goals_and_Nongoals">Goals and
         Nongoals</a></td>
         <td><a href="tr35-keyboards.html#Goals_and_Nongoals">Goals
         and Nongoals</a></td>
       </tr>
       <tr>
         <td>S <a name="File_and_Dir_Structure" href=
         "#File_and_Dir_Structure" id="File_and_Dir_Structure">File
         and Directory Structure</a></td>
         <td><a href=
         "tr35-keyboards.html#File_and_Dir_Structure">File and
         Directory Structure</a></td>
       </tr>
       <tr>
         <td>S <a name="Element_Heirarchy_Layout_File" href=
         "#Element_Heirarchy_Layout_File" id=
         "Element_Heirarchy_Layout_File">Element Hierarchy - Layout
         File</a></td>
         <td><a href=
         "tr35-keyboards.html#Element_Heirarchy_Layout_File">Element
         Hierarchy - Layout File</a></td>
       </tr>
       <tr>
         <td>S <a name="Element_Heirarchy_Platform_File" href=
         "#Element_Heirarchy_Platform_File" id=
         "Element_Heirarchy_Platform_File">Element Hierarchy -
         Platform File</a></td>
         <td><a href=
         "tr35-keyboards.html#Element_Heirarchy_Platform_File">Element
         Hierarchy - Platform File</a></td>
       </tr>
       <tr>
         <td>S <a name="Invariants" href="#Invariants" id=
         "Invariants">Invariants</a></td>
         <td><a href=
         "tr35-keyboards.html#Invariants">Invariants</a></td>
       </tr>
       <tr>
         <td>S <a name="Data_Sources" href="#Data_Sources" id=
         "Data_Sources">Data Sources</a></td>
         <td><a href="tr35-keyboards.html#Data_Sources">Data
         Sources</a></td>
       </tr>
       <tr>
         <td>S <a name="Keyboard_IDs" href="#Keyboard_IDs" id=
         "Keyboard_IDs">Keyboard IDs</a></td>
         <td><a href="tr35-keyboards.html#Keyboard_IDs">Keyboard
         IDs</a></td>
       </tr>
       <tr>
         <td>S <a name="Platform_Behaviors_in_Edge_Cases" href=
         "#Platform_Behaviors_in_Edge_Cases" id=
         "Platform_Behaviors_in_Edge_Cases">Platform Behaviors in
         Edge Cases</a></td>
         <td><a href=
         "tr35-keyboards.html#Platform_Behaviors_in_Edge_Cases">Platform
         Behaviors in Edge Cases</a></td>
       </tr>
       <tr>
         <td>S <a name="Element_Keyboard" href="#Element_Keyboard"
         id="Element_Keyboard">Element: keyboard</a></td>
         <td><a href="tr35-keyboards.html#Element_Keyboard">Element:
         keyboard</a></td>
       </tr>
       <tr>
         <td>S <a name="Element_version" href="#Element_version" id=
         "Element_version">Element: version</a></td>
         <td><a href="tr35-keyboards.html#Element_version">Element:
         version</a></td>
       </tr>
       <tr>
         <td>S <a name="Element_generation" href=
         "#Element_generation" id="Element_generation">Element:
         generation</a></td>
         <td><a href=
         "tr35-keyboards.html#Element_generation">Element:
         generation</a></td>
       </tr>
       <tr>
         <td>S <a name="Element_names" href="#Element_names" id=
         "Element_names">Element: names</a></td>
         <td><a href="tr35-keyboards.html#Element_names">Element:
         names</a></td>
       </tr>
       <tr>
         <td>S <a name="Element_name" href="#Element_name" id=
         "Element_name">Element: name</a></td>
         <td><a href="tr35-keyboards.html#Element_name">Element:
         name</a></td>
       </tr>
       <tr>
         <td>S <a name="Element_settings" href="#Element_settings"
         id="Element_settings">Element: settings</a></td>
         <td><a href="tr35-keyboards.html#Element_settings">Element:
         settings</a></td>
       </tr>
       <tr>
         <td>S <a name="Element_keyMap" href="#Element_keyMap" id=
         "Element_keyMap">Element: keyMap</a></td>
         <td><a href="tr35-keyboards.html#Element_keyMap">Element:
         keyMap</a></td>
       </tr>
       <tr>
         <td>S <a name="Element_map" href="#Element_map" id=
         "Element_map">Element: map</a></td>
         <td><a href="tr35-keyboards.html#Element_map">Element:
         map</a></td>
       </tr>
       <tr>
         <td>S <a name="Element_transforms" href=
         "#Element_transforms" id="Element_transforms">Element:
         transforms</a></td>
         <td><a href=
         "tr35-keyboards.html#Element_transforms">Element:
         transforms</a></td>
       </tr>
       <tr>
         <td>S <a name="Element_transform" href="#Element_transform"
         id="Element_transform">Element: transform</a></td>
         <td><a href=
         "tr35-keyboards.html#Element_transform">Element:
         transform</a></td>
       </tr>
       <tr>
         <td>S <a name="Element_platform" href="#Element_platform"
         id="Element_platform">Element: platform</a></td>
         <td><a href="tr35-keyboards.html#Element_platform">Element:
         platform</a></td>
       </tr>
       <tr>
         <td>S <a name="Element_hardwareMap" href=
         "#Element_hardwareMap" id="Element_hardwareMap">Element:
         hardwareMap</a></td>
         <td><a href=
         "tr35-keyboards.html#Element_hardwareMap">Element:
         hardwareMap</a></td>
       </tr>
       <tr>
         <td>S <a name="Principles_for_Keyboard_Ids" href=
         "#Principles_for_Keyboard_Ids" id=
         "Principles_for_Keyboard_Ids">Principles for Keyboard
         Ids</a></td>
         <td><a href=
         "tr35-keyboards.html#Principles_for_Keyboard_Ids">Principles
         for Keyboard Ids</a></td>
       </tr>
     </table>
     <hr>

 	  <h2><a href="#LocaleId_Canonicalization" name="LocaleId_Canonicalization">Annex C. LocaleId Canonicalization</a></h2>
 	  <p>&nbsp;</p>
 		  <p>The languageAlias, scriptAlias, territoryAlias, and variantAlias elements are used as rules to transform an input <em>source localeId</em>. The first step is to transform the <em>languageId</em> portion of the localeId. <br>
       </p>
 		  <blockquote>Note: in the following discussion, the separator '-' is used. That is also used in examples of XML alias data, even though for compatibility reasons that alias data actually uses '_' as a separator. The processing can also be applied to syntax while maintaining the separator '_', <em>mutatis mutandis</em>. CLDR also uses &ldquo;territory&rdquo; and &ldquo;region&rdquo; interchangeably.</blockquote>
 	  <h3 >Definitions</h3>
 	  <h4 >1. Multimap interpretation</h4>
 		  <p>Interpret each languageId as a multimap from a <em>fieldId</em> (language, script, region, variants) to a <strong>set</strong> of field values.</p>
 	  <p><em>Examples:</em></p>
 		  <table class='simple'>
 		    <tbody>
 		      <tr>
 		        <td colspan="1" rowspan="2"><p> </p>
 		          <p><strong>Source</strong></p></td>
 		        <td colspan="4" rowspan="1"><p><strong>Fields</strong></p></td>
 	          </tr>
 		      <tr>
 		        <td colspan="1" rowspan="1"><p><strong>Language</strong></p></td>
 		        <td colspan="1" rowspan="1"><p><strong>Script</strong></p></td>
 		        <td colspan="1" rowspan="1"><p><strong>Region</strong></p></td>
 		        <td colspan="1" rowspan="1"><p><strong>Variants</strong></p></td>
 	          </tr>
 		      <tr>
 		        <td colspan="1" rowspan="1"><p>en-GB</p></td>
 		        <td colspan="1" rowspan="1"><p>{en}</p></td>
 		        <td colspan="1" rowspan="1"><p>{}</p></td>
 		        <td colspan="1" rowspan="1"><p>{GB}</p></td>
 		        <td colspan="1" rowspan="1"><p>{}</p></td>
 	          </tr>
 		      <tr>
 		        <td colspan="1" rowspan="1"><p>und-GB</p></td>
 		        <td colspan="1" rowspan="1"><p>{}</p></td>
 		        <td colspan="1" rowspan="1"><p>{}</p></td>
 		        <td colspan="1" rowspan="1"><p>{GB}</p></td>
 		        <td colspan="1" rowspan="1"><p>{}</p></td>
 	          </tr>
 		      <tr>
 		        <td colspan="1" rowspan="1"><p>ja-Latn-YU-hepburn-heploc</p></td>
 		        <td colspan="1" rowspan="1"><p>{ja}</p></td>
 		        <td colspan="1" rowspan="1"><p>{Latn}</p></td>
 		        <td colspan="1" rowspan="1"><p>{YU}</p></td>
 		        <td colspan="1" rowspan="1"><p>{hepburn, heploc}</p></td>
 	          </tr>
 	        </tbody>
       </table>
 		  <p> </p>
 		  <ul>
 		    <li>This can be represented as an abbreviated format: {L={ja}, S={Latn}, R={YU}, V={hepburn, heploc}}, skipping empty sets.</li>
 		    <li>&ldquo;und&rdquo; is a special language code that is treated as an empty set.</li>
 		    <li>Of course, only the Variants can contain more than one item: the others are either empty or contain exactly 1 item.</li>
       </ul>
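<p>The multimap interpretation can be sketched in Python as follows. This is an illustrative sketch, not part of the specification: the function name <code>to_multimap</code> and the field keys L, S, R, V are assumptions, and the input is assumed to be a well-formed languageId with no extlang subtags or extensions.</p>

```python
def to_multimap(language_id):
    """Split a languageId into {field: set of values}; 'und' becomes the empty set."""
    fields = {"L": set(), "S": set(), "R": set(), "V": set()}
    subtags = language_id.replace("_", "-").split("-")
    # The first subtag is the language; "und" is treated as an empty set.
    if subtags[0].lower() != "und":
        fields["L"].add(subtags[0].lower())
    for subtag in subtags[1:]:
        if len(subtag) == 4 and subtag.isalpha():
            fields["S"].add(subtag.title())      # script, e.g. Latn
        elif (len(subtag) == 2 and subtag.isalpha()) or (len(subtag) == 3 and subtag.isdigit()):
            fields["R"].add(subtag.upper())      # region, e.g. GB or 419
        else:
            fields["V"].add(subtag.lower())      # anything else is taken as a variant
    return fields


def abbreviated(fields):
    """Abbreviated format: the same multimap with empty sets skipped."""
    return {name: values for name, values in fields.items() if values}
```

<p>With this sketch, <code>abbreviated(to_multimap("ja-Latn-YU-hepburn-heploc"))</code> gives the L, S, R and V sets shown in the last row of the table above.</p>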
 	  <h4 >2. Alias elements</h4>
 		  <p>For the languageAlias elements, the <em>type</em> and <em>replacements</em> are languageIds.</p>
 		  <p>For the scriptAlias, territoryAlias (territory is also called region), and variantAlias elements, the type and replacements are interpreted as languageIds <em>after</em> prefixing with &ldquo;und-&rdquo;. Thus</p>
 		  <code>&lt;territoryAlias type="AN" replacement="CW SX BQ" reason="deprecated"/&gt;</code>
 		  <p>is interpreted as:</p>
 		  <code>&lt;territoryAlias type="und-AN" replacement="und-CW und-SX und-BQ" reason="deprecated"/&gt;</code>
 		  <p>Note that for the case of territoryAlias, there may be multiple replacement values separated by spaces in the text (such as replacement="und-CW und-SX und-BQ"); other rules only ever have a single replacement value.</p>
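<p>As a rough sketch of this interpretation step (the function name <code>normalize_alias</code> and its argument layout are assumptions made for illustration, not CLDR API), the prefixing and the splitting of multiple replacement values could look like this:</p>

```python
def normalize_alias(element, type_value, replacement):
    """Return (type, [replacements]) as languageIds for one alias element.

    element is the XML element name, e.g. "languageAlias" or "territoryAlias".
    """
    # languageAlias values are already languageIds; the others get "und-" prefixed.
    prefix = "" if element == "languageAlias" else "und-"
    new_type = prefix + type_value.replace("_", "-")
    # Only territoryAlias is expected to carry several space-separated values,
    # but splitting is harmless for the single-valued elements.
    new_replacements = [prefix + value.replace("_", "-") for value in replacement.split()]
    return new_type, new_replacements
```

<p>Applied to the territoryAlias example above, this yields the type <code>und-AN</code> and the replacement list <code>und-CW</code>, <code>und-SX</code>, <code>und-BQ</code>.</p>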
 		  <p> </p>
 	  <h4 >3. Matches</h4>
 		  <p>A rule matches a source if and only if, for every field, the <em>source</em> field ⊇ the <em>type</em> field.</p>
 		  <blockquote>
 		  <p><em>Examples:</em></p>
 		  <p>source=&ldquo;ja-heploc-hepburn&rdquo; and type=&ldquo;und-hepburn&rdquo;</p>
 		  <table class='simple'>
 		    <tbody>
 		      <tr>
 		        <td colspan="1" rowspan="1"><p>{ja} ⊇ {} </p></td>
 		        <td colspan="1" rowspan="1"><p>success, und = {}</p></td>
 	          </tr>
 		      <tr>
 		        <td colspan="1" rowspan="1"><p>{hepburn, heploc} ⊇ {hepburn}</p></td>
 		        <td colspan="1" rowspan="1"><p><strong>success</strong></p></td>
 	          </tr>
 	        </tbody>
 		    </table>
 		  <p>so the rule matches the source. (Note that the order of variants is immaterial to matching.)</p>
 		  <p>&nbsp;</p>
 		  <p> </p>
 		  <p>source=&ldquo;ja-hepburn&rdquo; and type=&ldquo;und-hepburn-heploc&rdquo;</p>
 		  <table class='simple'>
 		    <tbody>
 		      <tr>
 		        <td colspan="1" rowspan="1"><p>{ja} ⊇ {} </p></td>
 		        <td colspan="1" rowspan="1"><p>success, und = {}</p></td>
 	          </tr>
 		      <tr>
 		        <td colspan="1" rowspan="1"><p>{hepburn} ⊉ {hepburn, heploc}</p></td>
 		        <td colspan="1" rowspan="1"><p><strong>failure</strong></p></td>
 	          </tr>
 	        </tbody>
 		    </table>
 		  <p>so the rule does not match the source.</p></blockquote>
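<p>The match test is a per-field superset check. A minimal sketch, assuming the source and the rule type are already held as dictionaries from the (hypothetical) field keys L, S, R, V to sets of values, with missing keys meaning the empty set:</p>

```python
FIELDS = ("L", "S", "R", "V")


def matches(source, rule_type):
    """True if, for every field, the source set is a superset of the type set."""
    return all(
        set(source.get(field, ())).issuperset(rule_type.get(field, ()))
        for field in FIELDS
    )
```

<p>Because sets are unordered, the first example above matches regardless of the order in which the variants were written, and the second fails because <code>heploc</code> is missing from the source.</p>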
 	  <h4 >4. Replacement</h4>
 	  <p>A matching rule can be used to transform the source fields as follows:</p>
 		  <ul>
 		    <li>if type.field ≠ {}
 		      <ul>
 		        <li>source.field = (source.field - type.field) ∪ replacement.field</li>
 	          </ul>
 		    </li>

 		    <li>else if source.field = {} and replacement.field ≠ {}
 		      <ul>
 		        <li>source.field = replacement.field</li>
 	          </ul>
 		    </li>
       </ul>
 		  <p><em>Example:</em></p>
 		  <blockquote><p>source=&ldquo;ja-Latn-fonipa-hepburn-heploc&rdquo;</p>
 		  <p>rule=&ldquo;&lt;languageAlias type="und-hepburn-heploc" replacement="und-alalc97"&gt;&rdquo;</p>
 		  <p>&nbsp;</p>
 		  <p>result=&ldquo;ja-Latn-alalc97-fonipa&rdquo; <em>// note that the CLDR canonical order of variants is alphabetical</em></p></blockquote>
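<p>The two replacement cases can be sketched as below. The helper names <code>apply_rule</code> and <code>to_language_id</code> and the field keys L, S, R, V are assumptions for illustration; the rule is assumed to have already been found to match, and the Territory Exception is not handled here.</p>

```python
FIELDS = ("L", "S", "R", "V")


def apply_rule(source, rule_type, replacement):
    """Apply one matching rule to the source multimap and return a new multimap."""
    result = {}
    for field in FIELDS:
        src = set(source.get(field, ()))
        typ = set(rule_type.get(field, ()))
        rep = set(replacement.get(field, ()))
        if typ:
            # Remove the matched values, then add the replacement values.
            src = (src - typ) | rep
        elif not src and rep:
            # Empty source field: take over the replacement values.
            src = rep
        result[field] = src
    return result


def to_language_id(fields):
    """Serialize a multimap, with variants in alphabetical order."""
    parts = sorted(fields["L"]) or ["und"]
    parts += sorted(fields["S"]) + sorted(fields["R"]) + sorted(fields["V"])
    return "-".join(parts)
```

<p>Running the example above through this sketch removes <code>hepburn</code> and <code>heploc</code>, adds <code>alalc97</code>, and serializes to <code>ja-Latn-alalc97-fonipa</code>.</p>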
 	  <h5 >Territory Exception</h5>
 		  <p>If the field is territory and replacement.field has more than one value, look up the most likely territory for the base language code (and script, if there is one), using the likely subtags data. If that likely territory is in the list of replacements, use it. Otherwise, use the first territory in the list.</p>
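<p>The selection rule itself is small. In the sketch below, the function name <code>choose_territory</code> is an assumption, and the likely territory is passed in by the caller because the likely subtags lookup is outside the scope of this annex:</p>

```python
def choose_territory(replacements, likely_territory):
    """Pick one region from a multi-valued territoryAlias replacement list.

    replacements     -- region codes in the order given in the alias data
    likely_territory -- result of a likely-subtags lookup for the source's
                        language (and script), or None if there is none
    """
    if likely_territory in replacements:
        return likely_territory
    # Fall back to the first territory listed in the alias data.
    return replacements[0]
```

<p>For the replacement list <code>["CW", "SX", "BQ"]</code> from the territoryAlias shown earlier, a likely territory of <code>SX</code> would be kept, while any likely territory not in the list falls back to <code>CW</code>. Which territory is actually likely for a given language comes from the likely subtags data, not from this sketch.</p>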
 	  <h4>5. Canonicalizing Syntax</h4>
 		<p>To canonicalize the syntax of <em>source</em>: </p>
 		<ul>
 		  <li>Initial Script Subtag
 		    <ul>
 		      <li>If the first subtag has 4 letters, prepend the source with &quot;und-&quot;</li>
 		      <li>Note: These are only for specialized use.</li>
 	        </ul>
 	      </li>
 		  <li>Casing
 		    <ul>
 		      <li>Put any script subtag into title case (e.g., Hant)</li>
 		      <li>Put any region subtag into uppercase (e.g., DE)</li>
 		      <li>Put all other subtags into lowercase (e.g., en, fonipa)</li>
 	        </ul>
 	      </li>
 		  <li>Order
 		    <ul>
 		      <li>Put any variants into alphabetical order (e.g., en-fonipa-scouse, not en-scouse-fonipa)</li>
 		      <li>Put any extensions into alphabetical order by their singleton (e.g., en-t-xxx-u-yyy, not en-u-yyy-t-xxx)</li>
 		      <li>Put all attributes into alphabetical order.</li>
 		      <li>Put all &lt;keywords, tfields&gt; pairs into alphabetical order of their keys, within their respective extensions.</li>
 		      <li>Remove any type or tfield value of "true"</li>
 	        </ul>
 		  </li>
 		  <li>Separator
 		    <ul>
 		      <li>Replace '_' by '-' </li>
 	        </ul>
 		  </li>
 	  </ul>
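<p>A partial sketch of these syntax steps follows. It covers only the languageId portion (separator, initial script subtag, casing, and variant order) and deliberately leaves out extensions, attributes, and keyword handling; the function name <code>canonicalize_syntax</code> is an assumption, and the input is assumed to be well-formed with no extlang subtags.</p>

```python
def canonicalize_syntax(source):
    """Canonicalize separator, casing and variant order of a languageId."""
    subtags = source.replace("_", "-").split("-")
    # Initial script subtag: a 4-letter first subtag gets "und" prepended.
    if len(subtags[0]) == 4 and subtags[0].isalpha():
        subtags.insert(0, "und")
    head = [subtags[0].lower()]
    variants = []
    for subtag in subtags[1:]:
        if len(subtag) == 4 and subtag.isalpha():
            head.append(subtag.title())       # script in title case
        elif len(subtag) == 2 and subtag.isalpha():
            head.append(subtag.upper())       # alphabetic region in uppercase
        elif len(subtag) == 3 and subtag.isdigit():
            head.append(subtag)               # numeric region has no case
        else:
            variants.append(subtag.lower())   # everything else in lowercase
    # Variants go last, in alphabetical order.
    return "-".join(head + sorted(variants))
```

<p>For instance, <code>canonicalize_syntax("EN_scouse_FONIPA")</code> returns <code>en-fonipa-scouse</code>, and <code>canonicalize_syntax("latn-DE")</code> returns <code>und-Latn-DE</code>.</p>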
 	  <h3 >Preprocessing</h3>
 	  <p>The data from supplementalMetadata is (logically) preprocessed as follows.</p>
 		  <ol start="1">
 		    <li>Load the rules from supplementalMetadata.xml, replacing '_' by '-', and adding &ldquo;und-&rdquo; as described in <em>Definition 2. Alias Elements</em>.</li>
 		    <li>Capture all languageAlias rules where the <em>type</em> is an invalid languageId into a set of <strong>BCP47 LegacyRules</strong>. Example:
 		      <ol>
 		        <li>&lt;languageAlias type="i-mingo" replacement="see-x-i-mingo" reason="legacy"/&gt;</li>
 	          </ol>
 		    </li>
 		    <li>Discard all rules where the <em>type</em> is an invalid languageId. Examples are
 <ol>
           <li>&lt;languageAlias type="i-mingo" replacement="see-x-i-mingo" reason="legacy"/&gt;</li>
 		        <li>&lt;territoryAlias type="und-AAA" replacement="und-AA" reason="overlong"/&gt;</li>
 	          </ol>
 	        </li>
 		    <li>Change the <em>type</em> and <em>replacement</em> values in the remaining rules into multimap rules, as per <em>Definition 1. Multimap Interpretation</em>.
 		      <ol>
 		        <li>Note that the &ldquo;und&rdquo; value disappears.</li>
 	          </ol>
 		    </li>

 		    <li>Order the set of rules by
 		      <ol>
 		        <li>the size of the union of all field value sets, with largest size first</li>
 		        <li>and then alphabetically by field.</li>
 	          </ol>
 	        </li>

 		    <li>The result is the set of <strong>Alias Rules</strong></li>
       </ol>
 		  <p> </p>
 	  <p>So using the examples above, we get the following order:</p>
 		  <table class='simple'>
 		    <tbody>
 		      <tr>
 		        <td colspan="1" rowspan="1"><p><strong>languageId</strong></p></td>
 		        <td colspan="1" rowspan="1"><p><strong>size of union</strong></p></td>
 		        <td colspan="1" rowspan="1"><p><strong>Alpha</strong></p></td>
 	          </tr>
 		      <tr>
 		        <td colspan="1" rowspan="1"><p>{V={hepburn, heploc}}</p></td>
 		        <td colspan="1" rowspan="1"><p>2</p></td>
 		        <td colspan="1" rowspan="1"><p>n/a</p></td>
 	          </tr>
 		      <tr>
 		        <td colspan="1" rowspan="1"><p>{L={en}, R={GB}}</p></td>
 		        <td colspan="1" rowspan="1"><p>2</p></td>
 		        <td colspan="1" rowspan="2"><p>en &lt; fr</p></td>
 	          </tr>
 		      <tr>
 		        <td colspan="1" rowspan="1"><p>{L={fr}, R={CA}}</p></td>
 		        <td colspan="1" rowspan="1"><p>2</p></td>
 	          </tr>
 		      <tr>
 		        <td colspan="1" rowspan="1"><p>{R={CA}}</p></td>
 		        <td colspan="1" rowspan="1"><p>1</p></td>
 		        <td colspan="1" rowspan="1"><p>n/a</p></td>
 	          </tr>
 	        </tbody>
       </table>
 		  <p> </p>
 		  <blockquote><strong>Note: </strong>The secondary sort order in Preprocessing step 5.2 only ensures deterministic results when two rules &ldquo;of the same length&rdquo; could apply.</blockquote>
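<p>One way to realize the ordering in step 5 is with a composite sort key. The sketch below is one possible reading of &ldquo;alphabetically by field&rdquo; (comparing the L, S, R, V values in that order), chosen because it reproduces the order in the table above; the names <code>sort_key</code> and <code>order_rules</code> are assumptions.</p>

```python
FIELDS = ("L", "S", "R", "V")


def sort_key(rule_type):
    """Sort key: larger union of field values first, then alphabetical by field."""
    union = set().union(*(rule_type.get(field, ()) for field in FIELDS))
    alpha = tuple(" ".join(sorted(rule_type.get(field, ()))) for field in FIELDS)
    # Negating the size puts the largest rules first under an ascending sort.
    return (-len(union), alpha)


def order_rules(rule_types):
    """Return the rule types in Alias Rules order."""
    return sorted(rule_types, key=sort_key)
```

<p>Sorting the four rule types from the table with <code>order_rules</code> gives the hepburn/heploc rule first, then en-GB, then fr-CA, then the region-only CA rule.</p>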
 	  <h3 >Processing LanguageIds</h3>
 	  <p>To canonicalize a given <em>source</em>:</p>
 		  <ol start="1">
 		    <li>Canonicalize the syntax of <em>source</em> as per <em>Definition 5. Canonicalizing Syntax</em>.</li>
             <li>Where the <em>source</em> could be an arbitrary BCP 47 language tag, first process as follows:
 <ol>
           <li>If the source is identical to one of the types in the BCP47 LegacyRules, replace the entire source by the replacement value.</li>
 		        <li>Else if there is an extlang subtag, then apply Step 3 of <a href="https://tools.ietf.org/html/bcp47#section-4.5">https://tools.ietf.org/html/bcp47#section-4.5</a> to remove the extlang subtag (possibly adjusting the language subtag).
 		          <ol>
 		            <li>Don&rsquo;t apply any of the other canonicalization steps in that section, however.</li>
 	              </ol>
 	            </li>
 		        <li>Else if the first subtag is "x", prefix by "und-".</li>
 		        <li><strong>Note: </strong>there are currently no valid 4-letter primary language subtags. While it is extremely unlikely that BCP47 would ever register them, if so then <i>languageAlias</i> mappings will be supplied for them, mapping to defined CLDR language subtags (from the idStatus=&quot;reserved&quot; set).</li>
 	          </ol>
 		    </li>
 		    <li>Find the first matching rule in <strong>Alias Rules</strong> (from <strong>Preprocessing</strong>)
 <ol>
           <li>If there are none, return <em>source</em></li>
 	          </ol>
 	        </li>
 		    <li>Transform <em>source</em> according to that rule</li>
 		    <li>Loop: go back to Step 3.</li>
       </ol>
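<p>Steps 3 to 5 amount to applying the first matching rule repeatedly until no rule matches. The following sketch shows that loop under several assumptions: the source and rules are already multimaps keyed by the hypothetical field names L, S, R, V, each rule is a (type, replacement) pair, Steps 1 and 2 and the Territory Exception are omitted, and the <code>max_steps</code> guard is an addition that is not part of the algorithm.</p>

```python
FIELDS = ("L", "S", "R", "V")


def matches(source, rule_type):
    """Per-field superset test from Definition 3."""
    return all(source[f].issuperset(rule_type.get(f, ())) for f in FIELDS)


def transform(source, rule_type, replacement):
    """Per-field replacement from Definition 4."""
    out = {}
    for f in FIELDS:
        typ, rep = set(rule_type.get(f, ())), set(replacement.get(f, ()))
        if typ:
            out[f] = (source[f] - typ) | rep
        elif not source[f] and rep:
            out[f] = rep
        else:
            out[f] = set(source[f])
    return out


def canonicalize(source, alias_rules, max_steps=100):
    """Apply the first matching rule until none matches, then return the result."""
    current = {f: set(source.get(f, ())) for f in FIELDS}
    for _ in range(max_steps):  # safety guard added for this sketch only
        rule = next((r for r in alias_rules if matches(current, r[0])), None)
        if rule is None:
            return current
        current = transform(current, rule[0], rule[1])
    raise RuntimeError("alias rules did not converge")
```

<p>With the single hepburn/heploc rule from Definition 4, the loop applies the rule once and then stops, because the transformed source no longer matches it.</p>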
 	  <h3 >Processing LocaleIds</h3>
 	  <p>The canonicalization of localeIds is done by first canonicalizing the languageId portion, then handling extensions in the following way:</p>
 		  <ol start="1">
 		    <li>Replace any <em>tlang</em> languageId value by its canonicalization.</li>
 		    <li>Use the bcp47 data to replace keys, types, tfields, and tvalues by their canonical forms. See <strong>Section 3.6.4 U Extension Data Files</strong> and <strong>Section 3.7.1 T Extension Data Files</strong>. The matches are in the alias attribute value, while the canonical replacement is in the name attribute value. For example:
 		      <ol>
 		        <li>Because of the following bcp47 data:<br>
 		          <code>&lt;key name="ms"…&gt;…&lt;type name="uksystem" … alias="imperial" … /&gt;…&lt;/key&gt;</code></li>
 		        <li>We get the following transformation:<br>
 		          <code>en-u-ms-imperial ⇒ en-u-ms-uksystem</code></li>
 	          </ol>
 		    </li>

 		    <li>If there is an 'sd' or 'rg' key, replace any subdivision alias in its value in the same way, using subdivisionAlias data.</li>
       </ol>
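<p>Step 2 can be illustrated for the simplest case, a type alias inside the <code>u</code> extension. In this sketch the dictionary <code>TYPE_ALIASES</code> is a hypothetical in-memory form of the bcp47 data holding only the <code>ms</code> example above, and the function name <code>canonicalize_u_types</code> is an assumption. Key aliases, the <code>t</code> extension, multi-subtag types, and private-use sequences are not handled.</p>

```python
# Hypothetical in-memory form of the bcp47 data: {key: {alias: canonical name}}.
TYPE_ALIASES = {"ms": {"imperial": "uksystem"}}


def canonicalize_u_types(locale_id, type_aliases=TYPE_ALIASES):
    """Replace aliased type subtags in the u extension by their canonical names."""
    subtags = locale_id.split("-")
    if "u" not in subtags:
        return locale_id
    key = None
    for i in range(subtags.index("u") + 1, len(subtags)):
        subtag = subtags[i]
        if len(subtag) == 1:
            break                 # next singleton: the u extension has ended
        if len(subtag) == 2:
            key = subtag          # u-extension keys are two characters long
        elif key is not None:
            # Look the type subtag up under the current key; keep it if unknown.
            subtags[i] = type_aliases.get(key, {}).get(subtag, subtag)
    return "-".join(subtags)
```

<p>With the example data, <code>canonicalize_u_types("en-u-ms-imperial")</code> returns <code>en-u-ms-uksystem</code>, matching the transformation shown above.</p>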
 	  <h3 >Optimizations</h3>
 		  <p>The above algorithm is a logical statement of the process, but would obviously not be directly suited to production code. Production-level code can use many optimizations for efficiency while achieving the same result. For example, the Alias Rules can be further preprocessed to avoid indefinite looping, instead doing a rule lookup once per subtag. As another example, the small number of <strong>Territory Exceptions</strong> can be preprocessed to avoid the likely subtags processing.</p>
 	    <p>&nbsp;</p>

 	  <hr>
     <h2><a name="References" href="#References" id=
     "References">References</a></h2>
     <table cellpadding="4" cellspacing="0" class="noborder" border=
     "0">
       <tr>
         <th class="noborder" width="148">Ancillary Information</th>
         <td class="noborder" width="730"><i>To properly localize,
         parse, and format data requires ancillary information,
         which is not expressed in Locale Data Markup Language. Some
         of the formats for values used in Locale Data Markup
         Language are constructed according to external
         specifications. The sources for this data and/or formats
         include the following:<br>
         &nbsp;</i></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="Bugs" href=
         "#Bugs" id="Bugs">Bugs</a>]</td>
         <td class="noborder" width="730">CLDR Bug Reporting
         form<br>
         <a href=
         "http://cldr.unicode.org/index/bug-reports">http://cldr.unicode.org/index/bug-reports</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="Charts" href=
         "#Charts" id="Charts">Charts</a>]</td>
         <td class="noborder" width="730">The online code charts can
         be found at <a href=
         "https://unicode.org/charts/">https://unicode.org/charts/</a>
         An index to character names with links to the corresponding
         chart is found at <a href=
         "https://unicode.org/charts/charindex.html">https://unicode.org/charts/charindex.html</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="DUCET" href=
         "#DUCET" id="DUCET">DUCET</a>]</td>
         <td class="noborder" width="730">The Default Unicode
         Collation Element Table (DUCET)<br>
         For the base-level collation, of which all the collation
         tables in this document are tailorings.<br>
         <a href=
         "https://unicode.org/reports/tr10/#Default_Unicode_Collation_Element_Table">
         https://unicode.org/reports/tr10/#Default_Unicode_Collation_Element_Table</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="FAQ" href="#FAQ"
         id="FAQ">FAQ</a>]</td>
         <td class="noborder" valign="top" width="730">Unicode
         Frequently Asked Questions<br>
         <a href=
         "https://unicode.org/faq/">https://unicode.org/faq/<br></a>
         <i>For answers to common questions on technical
         issues.</i></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="FCD" href="#FCD"
         id="FCD">FCD</a>]</td>
         <td class="noborder" width="730">As defined in UTN #5
         Canonical Equivalences in Applications<br>
         <a href=
         "https://unicode.org/notes/tn5/">https://unicode.org/notes/tn5/</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="Glossary" href=
         "#Glossary" id="Glossary">Glossary</a>]</td>
         <td class="noborder" width="730">Unicode Glossary<a href=
         "https://unicode.org/glossary/"><br>
         https://unicode.org/glossary/<br></a> <i>For explanations of
         terminology used in this and other documents.</i></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="JavaChoice"
         href="#JavaChoice" id="JavaChoice">JavaChoice</a>]</td>
         <td class="noborder" width="730">Java ChoiceFormat<br>
         <a href=
         "https://docs.oracle.com/javase/7/docs/api/java/text/ChoiceFormat.html">
         https://docs.oracle.com/javase/7/docs/api/java/text/ChoiceFormat.html</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="Olson" href=
         "#Olson" id="Olson">Olson</a>]</td>
         <td class="noborder" width="730">The <i>TZ</i>ID Database
         (aka Olson timezone database)<br>
         Time zone and daylight savings information.<br>
         <a href=
         "https://www.iana.org/time-zones">https://www.iana.org/time-zones</a><br>

         For archived data, see&nbsp;<br>
         <a href=
         "ftp://ftp.iana.org/tz/releases/">ftp://ftp.iana.org/tz/releases/</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="Reports" href=
         "#Reports" id="Reports">Reports</a>]</td>
         <td class="noborder" width="730">Unicode Technical
         Reports<br>
         <a href=
         "https://unicode.org/reports/">https://unicode.org/reports/<br>
         </a> <i>For information on the status and development
         process for technical reports, and for a list of technical
         reports.</i></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="Unicode" href=
         "#Unicode" id="Unicode">Unicode</a>]</td>
         <td class="noborder" width="730">The Unicode Consortium, <i>The Unicode Standard, Version 13.0.0</i><br>
         (Mountain View, CA: The Unicode Consortium, 2020. ISBN 978-1-936213-26-9)<br>
         <a href="https://www.unicode.org/versions/Unicode13.0.0/">https://www.unicode.org/versions/Unicode13.0.0/</a>
       </td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="Versions" href=
         "#Versions" id="Versions">Versions</a>]</td>
         <td class="noborder" width="730">Versions of the Unicode
         Standard<br>
         <a href=
         "https://www.unicode.org/versions/">https://www.unicode.org/versions/</a><br>

         <i>For information on version numbering, and citing and
         referencing the Unicode Standard, the Unicode Character
         Database, and Unicode Technical Reports.</i></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="XPath" href=
         "#XPath" id="XPath">XPath</a>]</td>
         <td class="noborder" width="730"><a href=
         "https://www.w3.org/TR/xpath/">https://www.w3.org/TR/xpath/</a></td>
       </tr>
       <tr>
         <th class="noborder" width="148">Other Standards</th>
         <td class="noborder" width="730"><i>Various standards
         define codes that are used as keys or values in Locale Data
         Markup Language. These include:</i></td>
       </tr>
       <tr>
         <td class="noborder">[<a name="BCP47" href="#BCP47" id=
         "BCP47">BCP47</a>]</td>
         <td class="noborder">
           <a href=
           "https://www.rfc-editor.org/rfc/bcp/bcp47.txt">https://www.rfc-editor.org/rfc/bcp/bcp47.txt</a>
           <p>The Registry<br>
           <a href=
           "https://www.iana.org/assignments/language-subtag-registry">
           https://www.iana.org/assignments/language-subtag-registry</a></p>
         </td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="ISO639" href=
         "#ISO639" id="ISO639">ISO639</a>]</td>
         <td class="noborder" width="730">ISO Language Codes<br>
         <a href=
         "https://www.loc.gov/standards/iso639-2/">https://www.loc.gov/standards/iso639-2/</a><br>

         Actual List<br>
         <a href=
         "https://www.loc.gov/standards/iso639-2/langcodes.html">https://www.loc.gov/standards/iso639-2/langcodes.html</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="ISO1000" href=
         "#ISO1000" id="ISO1000">ISO1000</a>]</td>
         <td class="noborder" width="730">ISO 1000: SI units and
         recommendations for the use of their multiples and of
         certain other units, International Organization for
         Standardization, 1992.<br>
         <a href=
         "https://www.iso.org/iso/catalogue_detail?csnumber=5448">https://www.iso.org/iso/catalogue_detail?csnumber=5448</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="ISO3166" href=
         "#ISO3166" id="ISO3166">ISO3166</a>]</td>
         <td class="noborder" width="730">ISO Region Codes<br>
         <a href=
         "https://www.iso.org/iso/country_codes">https://www.iso.org/iso/country_codes</a><br>

         Actual List<br>
         <a href=
         "https://www.iso.org/obp/ui/#search">https://www.iso.org/obp/ui/#search</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="ISO4217" href=
         "#ISO4217" id="ISO4217">ISO4217</a>]</td>
         <td class="noborder" width="730">
           ISO Currency Codes<br>
           <a href=
           "https://www.iso.org/iso/home/standards/currency_codes.htm">
           https://www.iso.org/iso/home/standards/currency_codes.htm</a>
           <p><i>(Note that as of this point, there are significant
           problems with this list. The supplemental data file
           contains the best compendium of currency information
           available.)</i></p>
         </td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="ISO8601" href=
         "#ISO8601" id="ISO8601">ISO8601</a>]</td>
         <td class="noborder" width="730">ISO Date and Time
         Format<br>
         <a href=
         "https://www.iso.org/iso/iso8601">https://www.iso.org/iso/iso8601</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="ISO15924" href=
         "#ISO15924" id="ISO15924">ISO15924</a>]</td>
         <td class="noborder" width="730">ISO Script Codes<br>
         <a href=
         "https://www.unicode.org/iso15924/index.html">https://www.unicode.org/iso15924/index.html</a><br>

         Actual List<br>
         <a href=
         "https://www.unicode.org/iso15924/codelists.html">https://www.unicode.org/iso15924/codelists.html</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="LOCODE" href=
         "#LOCODE" id="LOCODE">LOCODE</a>]</td>
         <td class="noborder" width="730">United Nations Code for
         Trade and Transport Locations, commonly known as
         "UN/LOCODE"<br>
         <a href=
         "https://www.unece.org/cefact/locode/welcome.html">https://www.unece.org/cefact/locode/welcome.html</a><br>

         Download at:&nbsp;<a href=
         "https://www.unece.org/cefact/codesfortrade/codes_index.htm">&nbsp;https://www.unece.org/cefact/codesfortrade/codes_index.htm</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="RFC6067" href=
         "#RFC6067" id="RFC6067">RFC6067</a>]</td>
         <td class="noborder" width="730">BCP 47 Extension U<br>
         <a href=
         "https://www.ietf.org/rfc/rfc6067.txt">https://www.ietf.org/rfc/rfc6067.txt</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="RFC6497" href=
         "#RFC6497" id="RFC6497">RFC6497</a>]</td>
         <td class="noborder" width="730">BCP 47 Extension T -
         Transformed Content<br>
         <a href=
         "https://www.ietf.org/rfc/rfc6497.txt">https://www.ietf.org/rfc/rfc6497.txt</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="UNM49" href=
         "#UNM49" id="UNM49">UNM49</a>]</td>
         <td class="noborder" width="730">
           UN M.49: UN Statistics Division
           <p>Country or area &amp; region codes<br>
           <a href=
           "https://unstats.un.org/unsd/methods/m49/m49.htm">https://unstats.un.org/unsd/methods/m49/m49.htm</a></p>
           <p>Composition of macro geographical (continental)
           regions, geographical sub-regions, and selected economic
           and other groupings<br>
           <a href=
           "https://unstats.un.org/unsd/methods/m49/m49regin.htm">https://unstats.un.org/unsd/methods/m49/m49regin.htm</a></p>
         </td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="XMLSchema" href=
         "#XMLSchema" id="XMLSchema">XML Schema</a>]</td>
         <td class="noborder" width="730">W3C XML Schema<br>
         <a href=
         "https://www.w3.org/XML/Schema">https://www.w3.org/XML/Schema</a></td>
       </tr>
       <tr>
         <th class="noborder" width="148">General</th>
         <td class="noborder" width="730"><i>The following are
         general references from the text:</i></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="ByType" href=
         "#ByType" id="ByType">ByType</a>]</td>
         <td class="noborder" width="730">CLDR Comparison Charts<br>
         <a href=
         "https://www.unicode.org/cldr/comparison_charts.html">https://www.unicode.org/cldr/comparison_charts.html</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="Calendars" href=
         "#Calendars" id="Calendars">Calendars</a>]</td>
         <td class="noborder" width="730">Calendrical Calculations:
         The Millennium Edition by Edward M. Reingold, Nachum
         Dershowitz; Cambridge University Press; Book and CD-ROM
         edition (July 1, 2001); ISBN: 0521777526. Note that the
         algorithms given in this book are copyrighted.</td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="Comparisons"
         href="#Comparisons" id="Comparisons">Comparisons</a>]</td>
         <td class="noborder" width="730">Comparisons between locale
         data from different sources<br>
         <a href=
         "https://unicode-org.github.io/cldr-staging/charts/38/supplemental/dtd_deltas.html">https://unicode-org.github.io/cldr-staging/charts/38/supplemental/dtd_deltas.html</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="CurrencyInfo"
         href="#CurrencyInfo" id=
         "CurrencyInfo">CurrencyInfo</a>]</td>
         <td class="noborder" width="730">UNECE Currency Data<br>
         <a href=
         "https://www.currency-iso.org/en/home/tables.html">https://www.currency-iso.org/en/home/tables.html</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="DataFormats"
         href="#DataFormats" id="DataFormats">DataFormats</a>]</td>
         <td class="noborder" width="730">CLDR Translation
         Guidelines<br>
         <a href=
         "http://cldr.unicode.org/translation">http://cldr.unicode.org/translation</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="LDML" href=
         "#LDML" id="LDML">Example</a>]</td>
         <td class="noborder" width="730">A sample in Locale Data
         Markup Language<br>
         <a href=
         "https://unicode.org/cldr/dtd/1.1/ldml-example.xml">https://unicode.org/cldr/dtd/1.1/ldml-example.xml</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="ICUCollation"
         href="#ICUCollation" id=
         "ICUCollation">ICUCollation</a>]</td>
         <td class="noborder" width="730">ICU rule syntax<br>
         <a href=
         "https://unicode-org.github.io/icu/userguide/collation/customization/">
         https://unicode-org.github.io/icu/userguide/collation/customization/</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="ICUTransforms"
         href="#ICUTransforms" id=
         "ICUTransforms">ICUTransforms</a>]</td>
         <td class="noborder" width="730">Transforms<br>
         <a href=
         "https://unicode-org.github.io/icu/userguide/transforms/">
         https://unicode-org.github.io/icu/userguide/transforms/</a><br>

         Transforms Demo<br>
         <a href=
         "http://demo.icu-project.org/icu-bin/translit/">http://demo.icu-project.org/icu-bin/translit/</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="ICUUnicodeSet"
         href="#ICUUnicodeSet" id=
         "ICUUnicodeSet">ICUUnicodeSet</a>]</td>
         <td class="noborder" width="730">ICU UnicodeSet<br>
         <a href=
         "https://unicode-org.github.io/icu/userguide/strings/unicodeset.html">https://unicode-org.github.io/icu/userguide/strings/unicodeset.html</a><br>
         API<br>
         <a href=
         "https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/text/UnicodeSet.html">
         https://unicode-org.github.io/icu-docs/apidoc/released/icu4j/com/ibm/icu/text/UnicodeSet.html</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="ITUE164" href=
         "#ITUE164" id="ITUE164">ITUE164</a>]</td>
         <td class="noborder" width="730">International
         Telecommunication Union: List Of ITU Recommendation E.164
         Assigned Country Codes<br>
         available at <a href=
         "https://www.itu.int/opb/publications.aspx?parent=T-SP&amp;view=T-SP2">
         https://www.itu.int/opb/publications.aspx?parent=T-SP&amp;view=T-SP2</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="LocaleExplorer"
         href="#LocaleExplorer" id=
         "LocaleExplorer">LocaleExplorer</a>]</td>
         <td class="noborder" width="730">ICU Locale Explorer<br>
         <a href=
         "http://demo.icu-project.org/icu-bin/locexp">http://demo.icu-project.org/icu-bin/locexp</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="localeProject"
         href="#localeProject" id=
         "localeProject">LocaleProject</a>]</td>
         <td class="noborder" width="730">Common Locale Data
         Repository Project<br>
         <a href=
         "https://unicode.org/cldr/">https://unicode.org/cldr/</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="NamingGuideline"
         href="#NamingGuideline" id=
         "NamingGuideline">NamingGuideline</a>]</td>
         <td class="noborder" width="730">OpenI18N Locale Naming
         Guideline<br>
         formerly at
         https://www.openi18n.org/docs/text/LocNameGuide-V10.txt</td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="RBNF" href=
         "#RBNF" id="RBNF">RBNF</a>]</td>
         <td class="noborder" width="730">Rule-Based Number
         Format<br>
         <a href=
         "https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/classicu_1_1RuleBasedNumberFormat.html">
         https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/classicu_1_1RuleBasedNumberFormat.html</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="RBBI" href=
         "#RBBI" id="RBBI">RBBI</a>]</td>
         <td class="noborder" width="730">Rule-Based Break
         Iterator<br>
         <a href=
         "https://unicode-org.github.io/icu/userguide/boundaryanalysis">
         https://unicode-org.github.io/icu/userguide/boundaryanalysis</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="UCAChart" href=
         "#UCAChart" id="UCAChart">UCAChart</a>]</td>
         <td class="noborder" width="730">Collation Chart<br>
         <a href=
         "https://unicode.org/charts/collation/">https://unicode.org/charts/collation/</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="UTCInfo" href=
         "#UTCInfo" id="UTCInfo">UTCInfo</a>]</td>
         <td class="noborder" width="730">NIST Time and Frequency
         Division Home Page<br>
         <a href="https://tf.nist.gov/">https://tf.nist.gov/</a><br>
         U.S. Naval Observatory: What is Universal Time?<br>
         <a href=
         "https://www.usno.navy.mil/USNO/time/master-clock/systems-of-time">https://www.usno.navy.mil/USNO/time/master-clock/systems-of-time</a></td>
       </tr>
       <tr>
         <td class="noborder" width="148">[<a name="WindowsCulture"
         href="#WindowsCulture" id=
         "WindowsCulture">WindowsCulture</a>]</td>
         <td class="noborder" width="730">Windows Culture Info
         (with mappings from [<a href=
         "#BCP47">BCP47</a>]-style codes to LCIDs)<br>
         <a href=
         "https://msdn.microsoft.com/en-us/library/system.globalization.cultureinfo(vs.71).aspx">
         https://msdn.microsoft.com/en-us/library/system.globalization.cultureinfo(vs.71).aspx</a></td>
       </tr>
     </table>
     <h2><a name="Acknowledgments" href="#Acknowledgments" id=
     "Acknowledgments">Acknowledgments</a></h2>
     <p>Special thanks to the following people for their continuing
     overall contributions to the CLDR project, and for their
     specific contributions in the following areas. These
     descriptions only touch on the many contributions that they
     have made.</p>
     <ul>
       <li>Mark Davis for creating the initial version of LDML, for
       adding to and maintaining this specification, and for his
       work on the LDML code and tests, much of the supplemental
       data and overall structure, and transforms and
       keyboards.</li>
       <li>John Emmons for the POSIX conversion tool and
       metazones.</li>
       <li>Deborah Goldsmith for her contributions to LDML
       architecture and this specification.</li>
       <li>Chris Hansten for coordinating and managing data
       submissions and vetting.</li>
       <li>Erkki Kolehmainen and his team for their work on
       Finnish.</li>
       <li>Steven R. Loomis for development of the survey tool and
       database management.</li>
       <li>Peter Nugent for his contributions to the POSIX tool and
       from Open Office, and for coordinating and managing data
       submissions and vetting.</li>
       <li>George Rhoten for his work on currencies.</li>
       <li>Roozbeh Pournader (روزبه پورنادر) for his work on South
       Asian countries.</li>
       <li>Ram Viswanadha (రఘురామ్ విశ్వనాధ) for all of his work on
       LDML code and data integration, and for coordinating and
       managing data submissions and vetting.</li>
       <li>Vladimir Weinstein (Владимир Вајнштајн) for his work on
       collation.</li>
       <li>Yoshito Umaoka (馬岡 由人) for his work on the timezone
       architecture.</li>
       <li>Rick McGowan for his work gathering language, script and
       region data.</li>
       <li>Xiaomei Ji (吉晓梅) for her work on time intervals and
       plural formatting.</li>
       <li>David Bertoni for his contributions to the conversion
       tools.</li>
       <li>Mike Tardif for reviewing this specification and for
       coordinating and vetting data submissions.</li>
       <li>Peter Edberg for work on this specification,
       monthPatterns, cyclicNameSets, contextTransforms and other
       items.</li>
       <li>Raymond Wainman and Cibu Johny for their work on
       keyboards.</li>
       <li>Jennifer Chye for her contributions to the conversion
       tools.</li>
       <li>Markus Scherer for a major rewrite of Part 5, Collation.</li>
       <li><a href="https://www.sffc.xyz/">Shane Carr</a> for his work on numbers and measurement units.</li>
       <li>Robin Leroy for his work on compact plurals: Part 3, Section 5, <a href="tr35-numbers.html#Language_Plural_Rules">Language Plural
       Rules</a>.</li>
     </ul>
     <p>Other contributors to CLDR are listed on the <a href=
     "https://www.unicode.org/cldr/">CLDR Project Page</a>.</p>


     <h2><a name="Modifications" href="#Modifications" id=
     "Modifications">Modifications</a></h2>

     <p><b>Revision 61</b></p>
 	<ul>
 	  <li><b>Reissued</b> for CLDR 38.</li>

 	  <li><strong>Part 1: <a href="tr35.html#Contents">Core</a> (languages, locales, basic structure)</strong>
         <ul>
           <li><strong>Section 3.2.1 <a href="#Canonical_Unicode_Locale_Identifiers">Canonical Unicode Locale Identifiers</a>:</strong> replaced the text with a reference to <strong>Annex C. <a href="#LocaleId_Canonicalization">LocaleId Canonicalization</a></strong>.</li>
           <li><strong>Section 3.3.1 <a href=
     "#BCP_47_Language_Tag_Conversion">BCP 47 Language Tag
     Conversion</a>:</strong> replaced the text with a reference to <strong>Annex C. <a href="#LocaleId_Canonicalization">LocaleId Canonicalization</a></strong>.</li>
           <li><strong>Section 3.6.1 <a href="#Key_And_Type_Definitions_" >Key And Type Definitions</a></strong>:
           added new key “dx”, for <a href="#UnicodeDictionaryBreakExclusionIdentifier" >Unicode Dictionary Break Exclusion Identifier</a>.</li>
           <li><strong>Section 3.6.4 <a href="#Unicode_Locale_Extension_Data_Files" >U Extension Data Files</a></strong>:
           added description of <a href="#SCRIPT_CODE" >SCRIPT_CODE</a> value for key “dx”.</li>
           <li><strong>Section 4.1.2 <a href="#Lateral_Inheritance">Lateral Inheritance</a>:</strong> specified lateral inheritance in more detail, added case and gender.</li>
           <li><strong>Annex C. <a href="#LocaleId_Canonicalization" >LocaleId Canonicalization</a></strong>
             <ul>
               <li>Added new Annex, replacing text in <strong>Section 3.2.1 <a href="#Canonical_Unicode_Locale_Identifiers">Canonical Unicode Locale Identifiers</a></strong> and <strong>Section 3.3.1 <a href=
     "#BCP_47_Language_Tag_Conversion">BCP 47 Language Tag
     Conversion</a></strong>.</li>
               <li>Cleans up ambiguities in the previous specification of canonicalization. (This was done in concert with fixes to the alias data to work better with the specification.)</li>
             </ul>
           </li>
         </ul>
 	  </li>
 	  <li><strong>Part 2: <a href="tr35-general.html#Contents">General</a> (display names &amp; transforms, etc.)</strong>
         <ul>
           <li><strong>Section 6 <a href="tr35-general.html#Unit_Elements">Unit Elements</a></strong>
 		    <ul>
 		      <li>Added new element compoundUnitPattern1</li>
 		      <li>Added case attribute to compoundUnitPattern</li>
 		      <li>Provided full description of compound unit components</li>
 		    </ul>
           </li>

           <li><strong>Section 14.2 <a href="tr35-general.html#Character_Labels">Annotations Character Labels</a></strong>
 		    <ul>
 		      <li>Added new characterLabelPattern type attribute values subscript and superscript.</li>
 		    </ul>
           </li>

           <li><strong>Section 16 <a href="tr35-general.html#Grammatical_Derivations">Grammatical Derivations</a></strong> — new</li>
         </ul>
 	  </li>
 	  <li><strong>Part 3: <a href="tr35-numbers.html#Contents">Numbers</a> (number &amp; currency formatting)</strong>
         <ul>
 	      <li><strong>Section 2.3 <a href="tr35-numbers.html#Number_Symbols">Number Symbols</a>:</strong>
 	        added approximatelySign.</li>
 	      <li><strong>Section 2.6 <a href="tr35-numbers.html#Minimal_Pairs">Minimal Pairs</a>:</strong> added case and
 	        gender minimal pairs. Removed the alt/draft ATTLIST declarations, since those are documented elsewhere and
 	        only obscure the text.</li>
 	      <li><strong>Section 5 <a href="tr35-numbers.html#Language_Plural_Rules">Language Plural Rules</a>:</strong>
 	        added the 'e' operand for use in certain compact number formatting.</li>
         </ul>
 	  </li>
       <li><strong>Part 6: <a href="tr35-info.html#Contents">Supplemental</a> (supplemental data)</strong>
 	    <ul>
 	      <li><strong>Section 14 <a href="tr35-info.html#Unit_Preferences">Unit Preferences</a></strong>: defined the
 	        userPreferences skeleton more precisely.</li>
         </ul>
 	  </li>
       <li><strong>Throughout:</strong> where possible, used “legacy” (for language tag or unit) instead of “grandfathered”.</li>
  </ul>


 	      <p>&nbsp;</p>

        <p>Modifications in previous versions are listed in those
     respective versions. Click on <strong>Previous Version</strong>
     in the header until you get to the desired version.</p>
     <hr>
     <p class="copyright">Copyright © 2001–2020 Unicode, Inc. All
     Rights Reserved. The Unicode Consortium makes no expressed or
     implied warranty of any kind, and assumes no liability for
     errors or omissions. No liability is assumed for incidental and
     consequential damages in connection with or arising out of the
     use of the information or programs contained or accompanying
     this technical report. The Unicode <a href=
     "https://unicode.org/copyright.html">Terms of Use</a> apply.</p>
     <p class="copyright">Unicode and the Unicode logo are
     trademarks of Unicode, Inc., and are registered in some
     jurisdictions.</p>
   </div>
 </body>
 </html>