| <?xml version="1.0" encoding="UTF-8" standalone="no"?> |
| <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> |
| <html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Arbitrary Character Types</title><meta name="generator" content="DocBook XSL Stylesheets V1.73.2" /><meta name="keywords" content=" ISO C++ , library " /><link rel="start" href="../spine.html" title="The GNU C++ Library Documentation" /><link rel="up" href="bk01pt05ch13.html" title="Chapter 13. String Classes" /><link rel="prev" href="bk01pt05ch13s02.html" title="Case Sensitivity" /><link rel="next" href="bk01pt05ch13s04.html" title="Tokenizing" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Arbitrary Character Types</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="bk01pt05ch13s02.html">Prev</a> </td><th width="60%" align="center">Chapter 13. String Classes</th><td width="20%" align="right"> <a accesskey="n" href="bk01pt05ch13s04.html">Next</a></td></tr></table><hr /></div><div class="sect1" lang="en" xml:lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a id="strings.string.character_types"></a>Arbitrary Character Types</h2></div></div></div><p> |
| </p><p><code class="code">std::basic_string</code> is tantalizingly general, in that |
| it is parameterized on the type of the characters it holds. |
| In theory, you could whip up a Unicode character class and instantiate |
| <code class="code">std::basic_string&lt;my_unicode_char&gt;</code>, or, assuming |
| that integers are wider than characters on your platform, maybe just |
| declare variables of type <code class="code">std::basic_string&lt;int&gt;</code>. |
| </p><p>That's the theory. Remember, however, that <code class="code">basic_string</code> has additional |
| template parameters, which take default arguments based on the character |
| type (called <code class="code">CharT</code> here): |
| </p><pre class="programlisting"> |
| template &lt;typename CharT, |
|           typename Traits = char_traits&lt;CharT&gt;, |
|           typename Alloc = allocator&lt;CharT&gt; &gt; |
| class basic_string { .... };</pre><p>Now, <code class="code">allocator&lt;CharT&gt;</code> will probably Do The Right |
| Thing by default, unless you need to implement your own allocator |
| for your characters. |
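| </p><p>As a minimal sketch of what those default arguments mean in practice, the following |
| checks that writing the defaults out by hand names the same type as <code class="code">std::string</code>. |
| The names <code class="code">explicit_string</code> and <code class="code">defaults_are_visible</code> are made up for this |
| illustration, and the code assumes a C++11 or later compiler for <code class="code">static_assert</code>. |
| </p> |

```cpp
#include <memory>
#include <string>
#include <type_traits>

// Spelling out the defaults by hand names exactly the same type as std::string.
// (explicit_string is a hypothetical name used only in this sketch.)
typedef std::basic_string<char,
                          std::char_traits<char>,
                          std::allocator<char> > explicit_string;

static_assert(std::is_same<std::string, explicit_string>::value,
              "std::string uses the default Traits and Alloc");

// The same holds for the wide-character string.
static_assert(std::is_same<std::wstring,
                           std::basic_string<wchar_t> >::value,
              "std::wstring is basic_string<wchar_t> with defaults");

// Hypothetical helper: the member typedefs expose the chosen parameters.
bool defaults_are_visible()
{
    return std::is_same<std::string::traits_type,
                        std::char_traits<char> >::value
        && std::is_same<std::string::allocator_type,
                        std::allocator<char> >::value;
}
```

| <p>If the translation unit compiles, both <code class="code">static_assert</code> checks have passed; the helper simply |
| reports the same facts at run time through the <code class="code">traits_type</code> and <code class="code">allocator_type</code> typedefs. |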
| </p><p>But <code class="code">char_traits</code> takes more work. The char_traits |
| template is <span class="emphasis"><em>declared</em></span> but not <span class="emphasis"><em>defined</em></span>. |
| That means there is only |
| </p><pre class="programlisting"> |
| template &lt;typename CharT&gt; |
|   struct char_traits |
|   { |
|     static void foo (type1 x, type2 y); |
|     ... |
|   };</pre><p>and functions such as <code class="code">char_traits&lt;CharT&gt;::foo()</code> are not |
| actually defined anywhere for the general case. The C++ standard |
| permits this, because no single definition could be correct for every |
| possible <code class="code">CharT</code>. |
| </p><p>The C++ standard also requires that <code class="code">char_traits</code> be specialized for |
| <code class="code">char</code> and <code class="code">wchar_t</code>, and it |
| is these template specializations that permit entities like |
| <code class="code">basic_string&lt;char,char_traits&lt;char&gt;&gt;</code> to work. |
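| </p><p>The required specializations can also be called directly, which is a quick way to see |
| what <code class="code">basic_string</code> relies on. The sketch below assumes nothing beyond the standard |
| <code class="code">char_traits&lt;char&gt;</code> members <code class="code">length</code>, <code class="code">eq</code> and <code class="code">compare</code>; the wrapper |
| function names are hypothetical. |
| </p> |

```cpp
#include <cstddef>
#include <string>

// The char specialization supplies the operations basic_string<char> uses.
typedef std::char_traits<char> traits;

// Hypothetical wrapper: counts characters up to the terminating '\0'.
std::size_t hello_length()
{
    return traits::length("hello");
}

// Hypothetical wrapper: character equality as the traits class defines it.
bool same_char(char a, char b)
{
    return traits::eq(a, b);
}

// Hypothetical wrapper: compares the first n characters of a and b,
// returning a negative, zero, or positive value.
int compare_prefix(const char* a, const char* b, std::size_t n)
{
    return traits::compare(a, b, n);
}
```

| <p>Nothing here is specific to GCC; any conforming library provides these members for |
| <code class="code">char</code> and <code class="code">wchar_t</code>. |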
| </p><p>If you want to use character types other than <code class="code">char</code> and <code class="code">wchar_t</code>, |
| such as <code class="code">unsigned char</code> and <code class="code">int</code>, you will |
| need suitable specializations for them. For a time, earlier |
| versions of GCC shipped a mostly-correct generic implementation that |
| let programmers be lazy, but it broke in many situations, so it |
| was removed. GCC 3.4 introduced a new implementation that mostly |
| works and can be specialized even for <code class="code">int</code> and other |
| built-in types. |
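| </p><p>To give a feel for how much a traits class has to provide, here is a rough sketch for |
| <code class="code">unsigned char</code>. Note that it does not specialize <code class="code">std::char_traits</code> itself; |
| instead it passes a hand-written traits class as the second template argument, which is a different |
| but related technique. The names <code class="code">uchar_traits</code>, <code class="code">ustring</code> and |
| <code class="code">make_bytes</code> are hypothetical, the choice of <code class="code">int</code> for <code class="code">int_type</code> is an |
| assumption, and C++11 or later is assumed. This is an illustration, not the GCC implementation. |
| </p> |

```cpp
#include <cstddef>
#include <cstdio>     // EOF
#include <cstring>    // memcmp, memchr, memmove, memcpy, memset
#include <cwchar>     // std::mbstate_t
#include <ios>        // std::streamoff, std::streampos
#include <string>

// Hypothetical traits class for unsigned char; not part of any library.
struct uchar_traits
{
    typedef unsigned char  char_type;
    typedef int            int_type;   // assumption: wide enough for EOF
    typedef std::streamoff off_type;
    typedef std::streampos pos_type;
    typedef std::mbstate_t state_type;

    static void assign(char_type& dst, const char_type& src) { dst = src; }
    static bool eq(char_type a, char_type b) { return a == b; }
    static bool lt(char_type a, char_type b) { return a < b; }

    // memcmp compares bytes as unsigned char, which matches lt() above.
    static int compare(const char_type* a, const char_type* b, std::size_t n)
    { return n == 0 ? 0 : std::memcmp(a, b, n); }

    static std::size_t length(const char_type* s)
    {
        std::size_t n = 0;
        while (s[n] != 0) ++n;
        return n;
    }

    static const char_type* find(const char_type* s, std::size_t n,
                                 const char_type& c)
    {
        if (n == 0) return nullptr;
        return static_cast<const char_type*>(std::memchr(s, c, n));
    }

    static char_type* move(char_type* dst, const char_type* src, std::size_t n)
    { if (n != 0) std::memmove(dst, src, n); return dst; }

    static char_type* copy(char_type* dst, const char_type* src, std::size_t n)
    { if (n != 0) std::memcpy(dst, src, n); return dst; }

    static char_type* assign(char_type* dst, std::size_t n, char_type c)
    { if (n != 0) std::memset(dst, c, n); return dst; }

    static int_type  to_int_type(char_type c) { return c; }
    static char_type to_char_type(int_type i) { return static_cast<char_type>(i); }
    static bool      eq_int_type(int_type a, int_type b) { return a == b; }
    static int_type  eof() { return EOF; }
    static int_type  not_eof(int_type i) { return i == eof() ? 0 : i; }
};

typedef std::basic_string<unsigned char, uchar_traits> ustring;

// Build a string from raw bytes, including a value above 127.
ustring make_bytes()
{
    const unsigned char raw[] = { 0x48, 0x69, 0xFF, 0x00 };
    ustring s(raw);      // uses uchar_traits::length to find the terminator
    s.push_back(0x21);
    return s;
}
```

| <p>Even this small class covers only the string operations; stream and locale support would |
| need considerably more, which is one reason arbitrary character types remain awkward. |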
| </p><p>If you want to use your own special character class, then you have |
| <a class="ulink" href="http://gcc.gnu.org/ml/libstdc++/2002-08/msg00163.html" target="_top">a lot |
| of work to do</a>, especially if you wish to use i18n features |
| (facets require traits information but don't have a traits argument). |
| </p><p>Another example of how to specialize <code class="code">char_traits</code> was given <a class="ulink" href="http://gcc.gnu.org/ml/libstdc++/2002-08/msg00260.html" target="_top">on the |
| mailing list</a> and was later added to the library as the file |
| <code class="code">include/ext/pod_char_traits.h</code>. We agree |
| that the way it's used with <code class="code">basic_string</code> (scroll down to <code class="code">main()</code>) |
| doesn't look nice, but that's because <a class="ulink" href="http://gcc.gnu.org/ml/libstdc++/2002-08/msg00236.html" target="_top">the |
| nice-looking first attempt</a> turned out to <a class="ulink" href="http://gcc.gnu.org/ml/libstdc++/2002-08/msg00242.html" target="_top">not |
| be conforming C++</a>, due to the rule that <code class="code">CharT</code> must be a POD. |
| (See how tricky this is?) |
| </p></div></body></html> |