<HTML><HEAD><META content="text/html; charset=utf8" http-equiv="Content-Type"><SCRIPT language="JavaScript" src="resources/script.js" type="text/javascript"></SCRIPT><TITLE>Programming/Parsing FAQs</TITLE></HEAD><BODY alink="#ff0000" bgcolor="#ffffff" leftmargin="4" link="#0000ff" marginheight="4" marginwidth="4" text="#000000" topmargin="4" vlink="#0000aa"><TABLE border="0" cellpadding="0" cellspacing="0" width="620"><TR><TD align="left" height="60" rowspan="3" valign="top" width="135"><IMG border="0" height="60" hspace="0" src="resources/logo.gif" vspace="0" width="135"></TD><TD align="left" colspan="4" height="5" valign="top" width="456"><IMG border="0" height="5" hspace="0" src="resources/line.gif" vspace="0" width="456"></TD><TD align="left" height="60" rowspan="3" valign="top" width="29"><IMG border="0" height="60" hspace="0" src="resources/right.gif" vspace="0" width="29"></TD></TR><TR><TD align="left" bgcolor="#0086b2" colspan="4" height="35" valign="top" width="456"><IMG alt="" border="0" height="35" hspace="0" src="graphics/faq-parse-header.jpg" vspace="0" width="456"></TD></TR><TR><TD align="left" height="20" valign="top" width="168"><IMG border="0" height="20" hspace="0" src="resources/bottom.gif" vspace="0" width="168"></TD><TD align="left" height="20" valign="top" width="96"><A href="http://xml.apache.org/" onMouseOut="rolloverOff('xml');" onMouseOver="rolloverOn('xml');" target="new"><IMG alt="http://xml.apache.org/" border="0" height="20" hspace="0" name="xml" onLoad="rolloverLoad('xml','resources/button-xml-hi.gif','resources/button-xml-lo.gif');" src="resources/button-xml-lo.gif" vspace="0" width="96"></A></TD><TD align="left" height="20" valign="top" width="96"><A href="http://www.apache.org/" onMouseOut="rolloverOff('asf');" onMouseOver="rolloverOn('asf');" target="new"><IMG alt="http://www.apache.org/" border="0" height="20" hspace="0" name="asf" onLoad="rolloverLoad('asf','resources/button-asf-hi.gif','resources/button-asf-lo.gif');" 
src="resources/button-asf-lo.gif" vspace="0" width="96"></A></TD><TD align="left" height="20" valign="top" width="96"><A href="http://www.w3.org/" onMouseOut="rolloverOff('w3c');" onMouseOver="rolloverOn('w3c');" target="new"><IMG alt="http://www.w3.org/" border="0" height="20" hspace="0" name="w3c" onLoad="rolloverLoad('w3c','resources/button-w3c-hi.gif','resources/button-w3c-lo.gif');" src="resources/button-w3c-lo.gif" vspace="0" width="96"></A></TD></TR></TABLE><TABLE border="0" cellpadding="0" cellspacing="0" width="620"><TR><TD align="left" valign="top" width="120"><IMG border="0" height="14" hspace="0" src="resources/join.gif" vspace="0" width="120"><BR> | |
<A href="../index.html" onMouseOut="rolloverOff('side-ext-2');" onMouseOver="rolloverOn('side-ext-2');"><IMG alt="Home" border="0" height="12" hspace="0" name="side-ext-2" onLoad="rolloverLoad('side-ext-2','graphics/ext-2-label-2.jpg','graphics/ext-2-label-3.jpg');" src="graphics/ext-2-label-3.jpg" vspace="0" width="120"></A><BR> | |
<IMG border="0" height="6" hspace="0" src="resources/separator.gif" vspace="0" width="120"><BR> | |
<A href="index.html" onMouseOut="rolloverOff('side-index');" onMouseOver="rolloverOn('side-index');"><IMG alt="Readme" border="0" height="12" hspace="0" name="side-index" onLoad="rolloverLoad('side-index','graphics/index-label-2.jpg','graphics/index-label-3.jpg');" src="graphics/index-label-3.jpg" vspace="0" width="120"></A><BR> | |
<A href="charter.html" onMouseOut="rolloverOff('side-charter');" onMouseOver="rolloverOn('side-charter');"><IMG alt="Charter" border="0" height="12" hspace="0" name="side-charter" onLoad="rolloverLoad('side-charter','graphics/charter-label-2.jpg','graphics/charter-label-3.jpg');" src="graphics/charter-label-3.jpg" vspace="0" width="120"></A><BR> | |
<A href="releases.html" onMouseOut="rolloverOff('side-releases');" onMouseOver="rolloverOn('side-releases');"><IMG alt="Release Info" border="0" height="12" hspace="0" name="side-releases" onLoad="rolloverLoad('side-releases','graphics/releases-label-2.jpg','graphics/releases-label-3.jpg');" src="graphics/releases-label-3.jpg" vspace="0" width="120"></A><BR> | |
<IMG border="0" height="6" hspace="0" src="resources/separator.gif" vspace="0" width="120"><BR> | |
<A href="install.html" onMouseOut="rolloverOff('side-install');" onMouseOver="rolloverOn('side-install');"><IMG alt="Installation" border="0" height="12" hspace="0" name="side-install" onLoad="rolloverLoad('side-install','graphics/install-label-2.jpg','graphics/install-label-3.jpg');" src="graphics/install-label-3.jpg" vspace="0" width="120"></A><BR> | |
<A href="http://xerces.apache.org/xerces-c/download.cgi" onMouseOut="rolloverOff('side-ext-20');" onMouseOver="rolloverOn('side-ext-20');"><IMG alt="Download" border="0" height="12" hspace="0" name="side-ext-20" onLoad="rolloverLoad('side-ext-20','graphics/ext-20-label-2.jpg','graphics/ext-20-label-3.jpg');" src="graphics/ext-20-label-3.jpg" vspace="0" width="120"></A><BR> | |
<A href="build.html" onMouseOut="rolloverOff('side-build');" onMouseOver="rolloverOn('side-build');"><IMG alt="Build Instructions" border="0" height="12" hspace="0" name="side-build" onLoad="rolloverLoad('side-build','graphics/build-label-2.jpg','graphics/build-label-3.jpg');" src="graphics/build-label-3.jpg" vspace="0" width="120"></A><BR> | |
<IMG border="0" height="6" hspace="0" src="resources/separator.gif" vspace="0" width="120"><BR> | |
<A href="program.html" onMouseOut="rolloverOff('side-program');" onMouseOver="rolloverOn('side-program');"><IMG alt="Programming" border="0" height="12" hspace="0" name="side-program" onLoad="rolloverLoad('side-program','graphics/program-label-2.jpg','graphics/program-label-3.jpg');" src="graphics/program-label-3.jpg" vspace="0" width="120"></A><BR> | |
<A href="samples.html" onMouseOut="rolloverOff('side-samples');" onMouseOver="rolloverOn('side-samples');"><IMG alt="Samples" border="0" height="12" hspace="0" name="side-samples" onLoad="rolloverLoad('side-samples','graphics/samples-label-2.jpg','graphics/samples-label-3.jpg');" src="graphics/samples-label-3.jpg" vspace="0" width="120"></A><BR> | |
<A href="faqs.html" onMouseOut="rolloverOff('side-faqs');" onMouseOver="rolloverOn('side-faqs');"><IMG alt="FAQs" border="0" height="12" hspace="0" name="side-faqs" onLoad="rolloverLoad('side-faqs','graphics/faqs-label-2.jpg','graphics/faqs-label-3.jpg');" src="graphics/faqs-label-3.jpg" vspace="0" width="120"></A><BR> | |
<IMG border="0" height="6" hspace="0" src="resources/separator.gif" vspace="0" width="120"><BR> | |
<A href="api.html" onMouseOut="rolloverOff('side-api');" onMouseOver="rolloverOn('side-api');"><IMG alt="API Reference" border="0" height="12" hspace="0" name="side-api" onLoad="rolloverLoad('side-api','graphics/api-label-2.jpg','graphics/api-label-3.jpg');" src="graphics/api-label-3.jpg" vspace="0" width="120"></A><BR> | |
<A href="ApacheDOMC++Binding.html" onMouseOut="rolloverOff('side-ext-88');" onMouseOver="rolloverOn('side-ext-88');"><IMG alt="DOM C++ Binding" border="0" height="12" hspace="0" name="side-ext-88" onLoad="rolloverLoad('side-ext-88','graphics/ext-88-label-2.jpg','graphics/ext-88-label-3.jpg');" src="graphics/ext-88-label-3.jpg" vspace="0" width="120"></A><BR> | |
<A href="migrate.html" onMouseOut="rolloverOff('side-migrate');" onMouseOver="rolloverOn('side-migrate');"><IMG alt="Migration Guide" border="0" height="12" hspace="0" name="side-migrate" onLoad="rolloverLoad('side-migrate','graphics/migrate-label-2.jpg','graphics/migrate-label-3.jpg');" src="graphics/migrate-label-3.jpg" vspace="0" width="120"></A><BR> | |
<IMG border="0" height="6" hspace="0" src="resources/separator.gif" vspace="0" width="120"><BR> | |
<A href="feedback.html" onMouseOut="rolloverOff('side-feedback');" onMouseOver="rolloverOn('side-feedback');"><IMG alt="Feedback" border="0" height="12" hspace="0" name="side-feedback" onLoad="rolloverLoad('side-feedback','graphics/feedback-label-2.jpg','graphics/feedback-label-3.jpg');" src="graphics/feedback-label-3.jpg" vspace="0" width="120"></A><BR> | |
<A href="bug-report.html" onMouseOut="rolloverOff('side-bug-report');" onMouseOver="rolloverOn('side-bug-report');"><IMG alt="Bug-Reporting" border="0" height="12" hspace="0" name="side-bug-report" onLoad="rolloverLoad('side-bug-report','graphics/bug-report-label-2.jpg','graphics/bug-report-label-3.jpg');" src="graphics/bug-report-label-3.jpg" vspace="0" width="120"></A><BR> | |
<A href="mailing-lists.html" onMouseOut="rolloverOff('side-mailing-lists');" onMouseOver="rolloverOn('side-mailing-lists');"><IMG alt="Mailing Lists" border="0" height="12" hspace="0" name="side-mailing-lists" onLoad="rolloverLoad('side-mailing-lists','graphics/mailing-lists-label-2.jpg','graphics/mailing-lists-label-3.jpg');" src="graphics/mailing-lists-label-3.jpg" vspace="0" width="120"></A><BR> | |
<IMG border="0" height="6" hspace="0" src="resources/separator.gif" vspace="0" width="120"><BR> | |
<A href="source-repository.html" onMouseOut="rolloverOff('side-source-repository');" onMouseOver="rolloverOn('side-source-repository');"><IMG alt="Source Repository" border="0" height="12" hspace="0" name="side-source-repository" onLoad="rolloverLoad('side-source-repository','graphics/source-repository-label-2.jpg','graphics/source-repository-label-3.jpg');" src="graphics/source-repository-label-3.jpg" vspace="0" width="120"></A><BR> | |
<A href="applications.html" onMouseOut="rolloverOff('side-applications');" onMouseOver="rolloverOn('side-applications');"><IMG alt="Applications" border="0" height="12" hspace="0" name="side-applications" onLoad="rolloverLoad('side-applications','graphics/applications-label-2.jpg','graphics/applications-label-3.jpg');" src="graphics/applications-label-3.jpg" vspace="0" width="120"></A><BR> | |
<IMG border="0" height="14" hspace="0" src="resources/close.gif" vspace="0" width="120"><BR></TD><TD align="left" valign="top" width="500"><TABLE border="0" cellpadding="3" cellspacing="0"><TR><TD><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Questions</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"><UL><LI><A href="#faq-1">Does Xerces-C++ support Schema?</A></LI><LI><A href="#faq-2">Why Xerces-C++ does not support this particular Schema feature?</A></LI><LI><A href="#faq-3">Why does my application crash when instantiating the 
parser?</A></LI><LI><A href="#faq-4">Is it OK to call the XMLPlatformUtils::Initialize/Terminate pair of routines multiple times in one program?</A></LI><LI><A href="#faq-5">Why does my application crash or hang if XMLPlatformUtils::Initialize()/Terminate() pair is called more than once?</A></LI><LI><A href="#faq-6">Why does my application crash after calling XMLPlatformUtils::Terminate()?</A></LI><LI><A href="#faq-7">I'm suddenly getting segfaults with Xerces-C 2.3.0; why might this be?</A></LI><LI><A href="#faq-8">Is Xerces-C++ thread-safe?</A></LI><LI><A href="#faq-9">I am seeing memory leaks in Xerces-C++. Are they real?</A></LI><LI><A href="#faq-10">I find memory leaks in Xerces-C++. How do I eliminate it?</A></LI><LI><A href="#faq-11">Can Xerces-C++ create an XML skeleton based on a DTD</A></LI><LI><A href="#faq-12">Can I use Xerces-C++ to perform write validation</A></LI><LI><A href="#faq-13">Can I validate the data contained in a DOM tree?</A></LI><LI><A href="#faq-14">How to write out a DOM tree into a string or an XML file?</A></LI><LI><A href="#faq-15">Why does DOMNode::cloneNode() not clone the pointer assigned to a DOMNode via DOMNode::setUserData()?</A></LI><LI><A href="#faq-16">How are entity reference nodes handled in DOM?</A></LI><LI><A href="#faq-17">What kinds of URLs are currently supported in Xerces-C++?</A></LI><LI><A href="#faq-18">How can I add support for URLs with HTTP/FTP protocols?</A></LI><LI><A href="#faq-19">Can I use Xerces-C++ to parse HTML?</A></LI><LI><A href="#faq-20">I keep getting an error: "invalid UTF-8 character". 
What's wrong?</A></LI><LI><A href="#faq-21">What encodings are supported by Xerces-C / XML4C?</A></LI><LI><A href="#faq-22">What character encoding should I use when creating XML documents?</A></LI><LI><A href="#faq-23">Is EBCDIC supported?</A></LI><LI><A href="#faq-24">Why does deleting a transcoded string result in assertion on windows?</A></LI><LI><A href="#faq-25">How do I transcode to/from something besides the local code page?</A></LI><LI><A href="#faq-26">Why does setProperty not work?</A></LI><LI><A href="#faq-27">Why does getProperty not work?</A></LI><LI><A href="#faq-28">Why does the parser still try to locate the DTD even validation is turned off and how to ignore external DTD reference?</A></LI><LI><A href="#faq-29">Why do I get segmentation fault when running on Redhat Linux?</A></LI><LI><A href="#faq-30">Why does the XML data generated by the DOMWriter does not match my original XML input?</A></LI><LI><A href="#faq-31">Why does my application crash when deleting the parser after releasing a document?</A></LI><LI><A href="#faq-32">Why do we have two versions of some XMLString methods (one with memory manager and one without)?</A></LI></UL></FONT></TD></TR></TABLE><BR><BR><A name="faq-1"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" 
face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B> Does Xerces-C++ support Schema?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>Yes. Xerces-C++ 2.8.0 contains an implementation
of the W3C XML Schema Language, a recommendation of the World Wide Web Consortium
available in three parts:
<A href="http://www.w3.org/TR/xmlschema-0/">XML Schema: Primer</A>,
<A href="http://www.w3.org/TR/xmlschema-1/">XML Schema: Structures</A>, and
<A href="http://www.w3.org/TR/xmlschema-2/">XML Schema: Datatypes</A>.
We consider this implementation complete. See
<A href="schema.html#limitation">the Schema page</A> for limitations.</P>
</FONT></TD></TR></TABLE><BR><A name="faq-2"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B> Why Xerces-C++ does not support this particular Schema feature?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>Xerces-C++ 2.8.0 contains an implementation
of the W3C XML Schema Language, a recommendation of the World Wide Web Consortium
available in three parts:
<A href="http://www.w3.org/TR/xmlschema-0/">XML Schema: Primer</A>,
<A href="http://www.w3.org/TR/xmlschema-1/">XML Schema: Structures</A>, and
<A href="http://www.w3.org/TR/xmlschema-2/">XML Schema: Datatypes</A>.
We consider this implementation complete. See
<A href="schema.html#limitation">the Schema page</A> for limitations.</P>
<P>If you find a Schema feature that is specified in the W3C XML Schema Language
Recommendation but does not work with Xerces-C++ 2.8.0, please
submit a bug report as described on the
<A href="bug-report.html">Bug Reporting</A> page.
</P>
</FONT></TD></TR></TABLE><BR><A name="faq-3"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Why does my application crash when instantiating the parser?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>To work with the Xerces-C++ parser, you must first
initialize the XML subsystem. The most common mistake is forgetting this
initialization. Before you make any calls to Xerces-C++ APIs, you must
call XMLPlatformUtils::Initialize():</P>
<DIV align="left"><TABLE border="0" cellpadding="0" cellspacing="4" width="464"><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#ffffff" width="462"><FONT size="-1"><PRE> | |
try {
    XMLPlatformUtils::Initialize();
}
catch (const XMLException& toCatch) {
    // Do your failure processing here
}</PRE></FONT></TD><TD bgcolor="#0086b2" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></DIV> | |
<P>This initializes the Xerces-C++ system and sets its internal
variables. Note that you must include the <CODE><FONT face="courier, monospaced">xercesc/util/PlatformUtils.hpp</FONT></CODE> file for this to work.</P>
</FONT></TD></TR></TABLE><BR><A name="faq-4"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Is it OK to call the XMLPlatformUtils::Initialize/Terminate pair of routines multiple times in one program?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>Yes. Since Xerces-C++ 1.5.2, the
XMLPlatformUtils::Initialize()/Terminate() pair of routines
may be called multiple times in one process.
</P>
<P>However, the application must guarantee that only one thread is inside either
XMLPlatformUtils::Initialize() or XMLPlatformUtils::Terminate() at any
one time.</P>
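<P>One way to meet this requirement is to route every start-up and shutdown call through a single mutex.
The sketch below is an illustration under stated assumptions, not part of Xerces-C++: the names
libraryInitialize, libraryTerminate, lifecycleMutex, serializedInitialize and serializedTerminate are hypothetical,
and the two stand-in functions only count calls where a real application would call
XMLPlatformUtils::Initialize() and XMLPlatformUtils::Terminate().</P>

```cpp
#include <mutex>

// Hypothetical stand-ins for the real start-up and shutdown calls.
// In an application, these bodies would call XMLPlatformUtils::Initialize()
// and XMLPlatformUtils::Terminate(); here they only count calls.
static int g_liveInitCalls = 0;
inline void libraryInitialize() { ++g_liveInitCalls; }
inline void libraryTerminate()  { --g_liveInitCalls; }

// A single process-wide mutex guards both entry points, so no two threads
// can be inside either function at the same time.
inline std::mutex& lifecycleMutex() {
    static std::mutex m;
    return m;
}

inline void serializedInitialize() {
    std::lock_guard<std::mutex> lock(lifecycleMutex());
    libraryInitialize();
}

inline void serializedTerminate() {
    std::lock_guard<std::mutex> lock(lifecycleMutex());
    libraryTerminate();
}
```

<P>Every thread would then call serializedInitialize() and serializedTerminate() instead of the raw routines.
Using the same mutex for both functions matters, because the requirement covers a thread in one routine while
another thread is in the other.</P>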
<P>If you call XMLPlatformUtils::Initialize() a number of times and then call
XMLPlatformUtils::Terminate() the same number of times, only the first XMLPlatformUtils::Initialize()
performs the initialization, and only the last XMLPlatformUtils::Terminate() cleans up
the memory. The other calls are ignored.
</P>
<P>To ensure that all the memory held by the parser is freed, the number of XMLPlatformUtils::Terminate() calls
should match the number of XMLPlatformUtils::Initialize() calls.
</P>
<P>
Consider the following code snippets (for simplicity, the
sample code is not wrapped in try/catch clauses):
</P>
<DIV align="left"><TABLE border="0" cellpadding="0" cellspacing="4" width="464"><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#ffffff" width="462"><FONT size="-1"><PRE> | |
// The XMLPlatformUtils::Initialize/Terminate calls are paired.
{
    // Initialize the parser
    XMLPlatformUtils::Initialize();
    SAXParser* parser = new SAXParser;
    parser->parse(xmlFile);
    delete parser;
    // Free all memory that was being held by the parser
    XMLPlatformUtils::Terminate();

    // Initialize the parser again
    XMLPlatformUtils::Initialize();
    parser = new SAXParser;
    parser->parse(xmlFile);
    delete parser;
    // Free all memory that was being held by the parser
    XMLPlatformUtils::Terminate();
}
</PRE></FONT></TD><TD bgcolor="#0086b2" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></DIV> | |
<DIV align="left"><TABLE border="0" cellpadding="0" cellspacing="4" width="464"><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#ffffff" width="462"><FONT size="-1"><PRE> | |
// Calls XMLPlatformUtils::Initialize() three times,
// then calls XMLPlatformUtils::Terminate() four times.
{
    // Initialize the parser
    XMLPlatformUtils::Initialize();
    // The next two calls are no-ops
    XMLPlatformUtils::Initialize();
    XMLPlatformUtils::Initialize();
    SAXParser* parser = new SAXParser;
    parser->parse(xmlFile);
    delete parser;
    // The first two XMLPlatformUtils::Terminate() calls are no-ops
    XMLPlatformUtils::Terminate();
    XMLPlatformUtils::Terminate();
    // This third XMLPlatformUtils::Terminate() call frees all memory
    // that was being held by the parser
    XMLPlatformUtils::Terminate();
    // This extra fourth XMLPlatformUtils::Terminate() call is a no-op.
    // However, calling XMLPlatformUtils::Terminate() without a matching
    // XMLPlatformUtils::Initialize() is dangerous and should be avoided.
    XMLPlatformUtils::Terminate();
}
</PRE></FONT></TD><TD bgcolor="#0086b2" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></DIV> | |
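<P>The counting behavior shown in the two snippets above can be modeled with a small stand-in class.
This is a sketch of the described semantics only, not the Xerces-C++ implementation: CountedSubsystem and its
members are hypothetical names, and each function returns true only when that particular call did real work.</P>

```cpp
// Hypothetical stand-in for a library with counted start-up and shutdown.
// It only models the behavior described above; it is not Xerces-C++ code.
class CountedSubsystem {
public:
    // Returns true only when this call performed the real initialization.
    static bool initialize() {
        if (++initCount_ > 1) {
            return false;        // already up: this call is a no-op
        }
        ready_ = true;           // first call: real set-up would happen here
        return true;
    }

    // Returns true only when this call performed the real clean-up.
    static bool terminate() {
        if (initCount_ == 0) {
            return false;        // unmatched call: ignored in this model
        }
        if (--initCount_ > 0) {
            return false;        // earlier initialize calls still outstanding
        }
        ready_ = false;          // last matching call: real clean-up happens here
        return true;
    }

    static bool ready() { return ready_; }

private:
    static int  initCount_;
    static bool ready_;
};

int  CountedSubsystem::initCount_ = 0;
bool CountedSubsystem::ready_     = false;
```

<P>Replaying the second snippet against this model, three initialize() calls followed by four terminate() calls
return true, false, false and then false, false, true, false. The model treats the unmatched fourth call as harmless,
but as noted above, an unmatched XMLPlatformUtils::Terminate() should still be avoided in real code.</P>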
</FONT></TD></TR></TABLE><BR><A name="faq-5"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Why does my application crash or hang if XMLPlatformUtils::Initialize()/Terminate() pair is called more than once?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>Please make sure you are using Xerces-C++ 1.5.2 or later.
</P>
<P>Earlier versions of Xerces-C++ either do not allow the XMLPlatformUtils::Initialize()/Terminate()
pair to be called more than once, or have problems when it is.
</P>
</FONT></TD></TR></TABLE><BR><A name="faq-6"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Why does my application crash after calling XMLPlatformUtils::Terminate()?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>Please make sure that XMLPlatformUtils::Terminate() is the last Xerces-C++ function called
in your program. No Xerces-C++ destructor, whether explicit or implicit (for example, for local objects
that are destroyed when they go out of scope), should run after XMLPlatformUtils::Terminate().
</P>
<P>
For example, consider the following code snippet, which is incorrect
(for simplicity, the sample code is not wrapped in a try/catch block):
</P>
<DIV align="left"><TABLE border="0" cellpadding="0" cellspacing="4" width="464"><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#ffffff" width="462"><FONT size="-1"><PRE> | |
1: {
2:    XMLPlatformUtils::Initialize();
3:    DOMString c("hello");
4:    XMLPlatformUtils::Terminate();
5: }
</PRE></FONT></TD><TD bgcolor="#0086b2" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></DIV> | |
<P>The DOMString object "c" is destroyed when it goes out of scope at the closing
brace on line 5. As a result, the DOMString destructor runs after
XMLPlatformUtils::Terminate(), which is wrong. The correct code is:
</P>
<DIV align="left"><TABLE border="0" cellpadding="0" cellspacing="4" width="464"><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#ffffff" width="462"><FONT size="-1"><PRE> | |
1: {
2:    XMLPlatformUtils::Initialize();
2a:    {
3:       DOMString c("hello");
3a:    }
4:    XMLPlatformUtils::Terminate();
5: }
</PRE></FONT></TD><TD bgcolor="#0086b2" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></DIV> | |
<P>The extra pair of braces (lines 2a and 3a) ensures that all implicit destructors are called
before Xerces-C++ is terminated.</P>
<P>In addition, the application must guarantee that only one thread is inside either
XMLPlatformUtils::Initialize() or XMLPlatformUtils::Terminate() at any
one time.
</P>
</FONT></TD></TR></TABLE><BR><A name="faq-7"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>I'm suddenly getting segfaults with Xerces-C 2.3.0; | |
why might this be?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>The introduction of pluggable memory management into
Xerces-C, one of the main features of 2.3.0, means that
application writers have to be more conscious about
destructors being invoked implicitly after a call to
XMLPlatformUtils::Terminate(). For example, the
following code is guaranteed to produce a segmentation
fault under Xerces-C 2.3.0, while it happened to work
under previous versions (in fact, this was how our
SAXPrint sample was formerly written;
try-catch blocks removed for brevity):
</P>
<DIV align="left"><TABLE border="0" cellpadding="0" cellspacing="4" width="464"><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#ffffff" width="462"><FONT size="-1"><PRE> | |
void myParsingFunction()
{
    XMLPlatformUtils::Initialize();
    SAXParser parser;
    // ... various parser method calls ...
    XMLPlatformUtils::Terminate();
} // seg fault here!
</PRE></FONT></TD><TD bgcolor="#0086b2" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></DIV> | |
<P>The reason this will produce a segmentation fault is
that any dynamic memory the SAXParser (or any other of
Xerces's parsers) needs to allocate is now allocated
by default by a static object owned by XMLPlatformUtils.
When the XMLPlatformUtils::Terminate() call is made, this
object is destroyed--and, consequently, so are all the
objects that it directly created. This includes all the
objects dynamically allocated by the SAXParser. When the
parser object goes out of scope, its destructor is
invoked, and this attempts to destroy all the objects
that it created--which have of course just been destroyed
by the static MemoryManager in XMLPlatformUtils.
</P>
<P>
To avoid this, one must either explicitly scope the
parser object inside calls to
XMLPlatformUtils::Initialize() and
XMLPlatformUtils::Terminate(), or dynamically allocate
the parser object and destroy it explicitly before the
call to XMLPlatformUtils::Terminate() is made.
</P>
<P>Another way of producing segmentation faults--that again,
unfortunately, was employed by some of our
samples--is to have calls to XMLPlatformUtils::Terminate()
in a catch block that catches any of Xerces's exceptions.
Since the destructor of the exception will implicitly be
invoked upon exit from the catch block, and since some of
the exceptions' destructors call on Xerces's
default memory manager to destroy dynamically-allocated
objects, their destruction will provoke a segmentation
fault even if a return statement is placed in the catch
block, since the default memory manager will no longer exist.
This practice is now avoided in all our samples.
</P>
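<P>The two rules above can be combined in one function. The following is a minimal sketch, not the
code of any shipped sample: the helper name parseFile and its return codes are hypothetical, and
HandlerBase is used only as a do-nothing handler. The parser lives in an inner scope, the catch
blocks only record a status, and XMLPlatformUtils::Terminate() is called after both have ended.</P>

```cpp
#include <xercesc/util/PlatformUtils.hpp>
#include <xercesc/util/XMLException.hpp>
#include <xercesc/parsers/SAXParser.hpp>
#include <xercesc/sax/HandlerBase.hpp>
#include <xercesc/sax/SAXException.hpp>

XERCES_CPP_NAMESPACE_USE

// parseFile is a hypothetical helper name. Returns 0 on success.
int parseFile(const char* xmlFile)
{
    try {
        XMLPlatformUtils::Initialize();
    } catch (const XMLException&) {
        return 1;  // Initialize() failed, so there is nothing to terminate
    }

    int status = 0;
    {
        // Inner scope: the parser, the handler, and any exception caught
        // below are all destroyed before Terminate() is reached.
        SAXParser parser;
        HandlerBase handler;
        parser.setDocumentHandler(&handler);
        parser.setErrorHandler(&handler);
        try {
            parser.parse(xmlFile);
        } catch (const XMLException&) {
            status = 2;  // only record the failure; no Terminate() here
        } catch (const SAXException&) {
            status = 3;
        }
    }

    XMLPlatformUtils::Terminate();  // last Xerces-C++ call in this function
    return status;
}
```

<P>Allocating the parser with new and deleting it before the Terminate() call would work equally well;
the inner scope is used here only because it cannot be forgotten on an early exit from the block.</P>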
</FONT></TD></TR></TABLE><BR><A name="faq-8"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Is Xerces-C++ thread-safe?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>This is not a question that has a simple yes/no answer. Here are the
rules for using Xerces-C++ in a multi-threaded environment:</P>
<P>Within an address space, an instance of the parser may be used without
restriction from a single thread, or an instance of the parser can be accessed
from multiple threads, provided the application guarantees that only one thread
has entered a method of the parser at any one time.</P>
<P>When two or more parser instances exist in a process, the instances can
be used concurrently, without external synchronization. That is, in an
application containing two parsers and two threads, one parser can be running
within the first thread concurrently with the second parser running within the
second thread.</P>
<P>The same rules apply to Xerces-C++ DOM documents. Multiple document
instances may be concurrently accessed from different threads, but any given
document instance can only be accessed by one thread at a time.</P>
<P>DOMStrings allow multiple concurrent readers. All DOMString const
methods are thread safe, and can be concurrently entered by multiple threads.
Non-const DOMString methods, such as <CODE><FONT face="courier, monospaced">appendData()</FONT></CODE>, are not thread safe and the application must guarantee that no other
methods (including const methods) are executed concurrently with them.</P>
<P>The application also needs to guarantee that only one thread has entered either the
method XMLPlatformUtils::Initialize() or the method XMLPlatformUtils::Terminate() at any
one time.</P>
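<P>One way to meet the last rule is to route every Initialize()/Terminate() call through a single
mutex. The sketch below is an illustration under stated assumptions, not part of Xerces-C++: it
assumes a C++11 compiler, the class name PlatformSession is hypothetical, and FakePlatform is a
stand-in that only counts calls so that the example compiles without the library. In real code the
template argument would be XMLPlatformUtils.</P>

```cpp
#include <mutex>

// Stand-in for XMLPlatformUtils so the sketch compiles without Xerces-C++.
// It only counts calls; it is NOT part of the library.
struct FakePlatform {
    static int initCalls;
    static int termCalls;
    static void Initialize() { ++initCalls; }
    static void Terminate()  { ++termCalls; }
};
int FakePlatform::initCalls = 0;
int FakePlatform::termCalls = 0;

// RAII guard: serializes Initialize()/Terminate() with one mutex and pairs
// every Initialize() with exactly one Terminate().
template <typename Platform>
class PlatformSession {
public:
    PlatformSession() {
        std::lock_guard<std::mutex> lock(guardMutex());
        Platform::Initialize();
    }
    ~PlatformSession() {
        std::lock_guard<std::mutex> lock(guardMutex());
        Platform::Terminate();
    }
    PlatformSession(const PlatformSession&) = delete;
    PlatformSession& operator=(const PlatformSession&) = delete;

private:
    // One mutex shared by all sessions of the same Platform type.
    static std::mutex& guardMutex() {
        static std::mutex m;
        return m;
    }
};
```

<P>Because local objects are destroyed in reverse order of construction, declaring the session object
before any parser in the same scope also ensures that the parser is destroyed before Terminate() runs.</P>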
</FONT></TD></TR></TABLE><BR><A name="faq-9"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>I am seeing memory leaks in Xerces-C++. Are they real?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>The Xerces-C++ library allocates and caches some commonly reused
items. The storage for these may be reported as memory leaks by some heap
analysis tools; to avoid the problem, call the function <CODE><FONT face="courier, monospaced">XMLPlatformUtils::Terminate()</FONT></CODE> before your application exits. This will free all memory that was being
held by the library.</P>
<P>For most applications, the use of <CODE><FONT face="courier, monospaced">Terminate()</FONT></CODE> is optional. The system will recover all memory when the application
process shuts down. The exception to this is the use of Xerces-C++ from DLLs
that will be repeatedly loaded and unloaded from within the same process. To
avoid memory leaks with this kind of use, <CODE><FONT face="courier, monospaced">Terminate()</FONT></CODE> must be called before unloading the Xerces-C++ library.</P>
<P>To ensure that all the memory held by the parser is freed, the number of XMLPlatformUtils::Terminate() calls
should match the number of XMLPlatformUtils::Initialize() calls.
</P>
<P>If you are using XML4C, where ICU is used, you may call the ICU function u_cleanup() to clean up
ICU static data. Please see the <A href="http://icu-project.org/">ICU documentation</A>
for details.
</P>
</FONT></TD></TR></TABLE><BR><A name="faq-10"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>I find memory leaks in Xerces-C++. How do I eliminate it?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>The "leaks" reported by leak detectors or heap-analysis
tools aren't really leaks in most applications, in that the memory usage does
not grow over time as the XML parser is used and re-used.</P>
<P>What you are seeing as leaks is actually lazily evaluated data
allocated into static variables. This data gets released when the application
ends. You can call <CODE><FONT face="courier, monospaced">XMLPlatformUtils::Terminate()</FONT></CODE> to release all the lazily allocated variables before you exit your
program.</P>
<P>To ensure that all the memory held by the parser is freed, the number of XMLPlatformUtils::Terminate() calls
should match the number of XMLPlatformUtils::Initialize() calls.
</P>
<P>If you are using XML4C, where ICU is used, you may call the ICU function u_cleanup() to clean up
ICU static data. Please see the <A href="http://icu-project.org/">ICU documentation</A>
for details.
</P>
</FONT></TD></TR></TABLE><BR><A name="faq-11"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Is there a function that I have totally missed that creates | |
an XML file from a DTD, (obviously with the values missing, a skeleton, as it | |
were)?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>No. This is not supported.</P>
</FONT></TD></TR></TABLE><BR><A name="faq-12"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Can I use Xerces-C++ to perform "write validation" (which is having an | |
appropriate Grammar and being able to add elements to the DOM whilst validating | |
against the grammar)?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>No. This is not supported.</P>
<P>The best you can do for now is to create the DOM document, write it back
as XML and re-parse it.</P>
</FONT></TD></TR></TABLE><BR><A name="faq-13"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Is there a facility in Xerces-C++ to validate the data contained in a | |
DOM tree? That is, without saving and re-parsing the source document?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>No. The best option for now is to generate XML source from the DOM and feed that back
into the parser.</P>
</FONT></TD></TR></TABLE><BR><A name="faq-14"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>How to write out a DOM tree into a string or an XML file?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>Please make sure you are using Xerces-C++ 2.8.0 or later.</P>
<P>You can use DOMWriter::writeToString or DOMWriter::writeNode to serialize a DOM tree.
Please refer to the DOMPrint sample or the API documentation for more details on
DOMWriter.</P>
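<P>As a rough illustration, the sketch below serializes a node both to a file and to a string. It
assumes a release whose API still provides DOMWriter, and it follows the general pattern of the
DOMPrint sample rather than reproducing it; the helper name serializeNode is hypothetical, and
XMLPlatformUtils::Initialize() is assumed to have been called already.</P>

```cpp
#include <xercesc/dom/DOMImplementation.hpp>
#include <xercesc/dom/DOMImplementationLS.hpp>
#include <xercesc/dom/DOMImplementationRegistry.hpp>
#include <xercesc/dom/DOMNode.hpp>
#include <xercesc/dom/DOMWriter.hpp>
#include <xercesc/framework/LocalFileFormatTarget.hpp>
#include <xercesc/util/XMLString.hpp>

XERCES_CPP_NAMESPACE_USE

// Hypothetical helper: writes `node` to the file `path` and also returns the
// same XML as a string. The caller must free the returned buffer with
// XMLString::release(). Returns 0 if no suitable implementation is found.
XMLCh* serializeNode(const DOMNode& node, const char* path)
{
    XMLCh features[8];
    XMLString::transcode("LS", features, 7);  // request the Load/Save feature

    DOMImplementation* impl =
        DOMImplementationRegistry::getDOMImplementation(features);
    if (!impl)
        return 0;

    DOMWriter* writer = ((DOMImplementationLS*)impl)->createDOMWriter();

    // Serialize to a file ...
    LocalFileFormatTarget target(path);
    writer->writeNode(&target, node);

    // ... and to an in-memory string of XMLCh characters.
    XMLCh* xml = writer->writeToString(node);

    writer->release();
    return xml;
}
```

<P>Both the file target and the writer are released before the function returns, so the only
Xerces-C++ resource left for the caller to manage is the returned string.</P>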
</FONT></TD></TR></TABLE><BR><A name="faq-15"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Why does DOMNode::cloneNode() not clone the pointer assigned to a DOMNode via DOMNode::setUserData()?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>Xerces-C++ supports the DOMNode::userData specified
in <A href="http://www.w3.org/TR/2003/WD-DOM-Level-3-Core-20030226/DOM3-Core.html#core-ID-3A0ED0A4">
the DOM level 3 Node interface</A>. As
is made clear in the description of the behaviour of
<CODE><FONT face="courier, monospaced">cloneNode()</FONT></CODE>, userData that has been set on the
Node is not cloned. Thus, if the userData is to be copied
to the new Node, this copy must be effected manually.
Note further that the operation of <CODE><FONT face="courier, monospaced">importNode()</FONT></CODE>
is specified similarly.
</P>
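<P>A manual copy can be as small as the following sketch. The helper name cloneWithUserData is
hypothetical, and the example makes two simplifying assumptions: only the user data stored under
one known key on the cloned root is carried over, and the pointer itself is shared rather than the
object it points to being duplicated.</P>

```cpp
#include <xercesc/dom/DOMNode.hpp>

XERCES_CPP_NAMESPACE_USE

// Hypothetical helper: deep-clones `source` and copies the user data stored
// under `key` onto the clone. Only the pointer is copied, and descendants of
// the clone are left without user data.
DOMNode* cloneWithUserData(const DOMNode& source, const XMLCh* key)
{
    DOMNode* copy = source.cloneNode(true);
    copy->setUserData(key, source.getUserData(key), 0);  // 0: no handler
    return copy;
}
```

<P>If the original and the clone must not share the same object, the application has to duplicate
that object itself before calling setUserData() on the clone.</P>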
</FONT></TD></TR></TABLE><BR><A name="faq-16"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>How are entity reference nodes handled in DOM?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>If you are using the native DOM classes, the function <CODE><FONT face="courier, monospaced">setCreateEntityReferenceNodes</FONT></CODE>
controls how entities appear in the DOM tree. When
setCreateEntityReferenceNodes is set to true (the default), an occurrence of an
entity reference in the XML document will be represented by a subtree with an
EntityReference node at the root whose children represent the entity expansion.
The entity expansion will be a DOM tree representing the structure of the entity
expansion, not a text node containing the entity expansion as text.</P>
<P>If setCreateEntityReferenceNodes is false, an entity reference in the XML
document is represented by only the nodes that represent the entity expansion.
The DOM tree will not contain any EntityReference nodes.</P>
</FONT></TD></TR></TABLE><BR><A name="faq-17"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>What kinds of URLs are currently supported in Xerces-C++?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>The <CODE><FONT face="courier, monospaced">XMLURL</FONT></CODE> class provides limited URL support. It understands the <CODE><FONT face="courier, monospaced">file://, http://</FONT></CODE>, and <CODE><FONT face="courier, monospaced">ftp://</FONT></CODE> URL types, and is capable of parsing them into their constituent
components and normalizing them. It also supports the commonly required action
of combining a base and a relative URL into a single URL. In other words, it
performs the limited set of functions required by an XML parser.</P>
<P>Another thing that URLs are commonly used for is creating an input stream that
provides access to the entity referenced. The parser, as shipped, only supports | |
this functionality on URLs in the form <CODE><FONT face="courier, monospaced">file:///</FONT></CODE> and <CODE><FONT face="courier, monospaced">file://localhost/</FONT></CODE>, i.e. only when the URL refers to a local file.</P> | |
<P>You may enable support for HTTP and FTP URLs by implementing and | |
installing a NetAccessor object. When a NetAccessor object is installed, the | |
URL class will use it to create input streams for the remote entities referred | |
to by such URLs.</P> | |
</FONT></TD></TR></TABLE><BR><A name="faq-18"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>How can I add support for URLs with HTTP/FTP protocols?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>Support for the http: protocol is now included by default on all | |
platforms.</P> | |
<P>To address the need to make remote connections to resources specified | |
using additional protocols, ftp for example, Xerces-C++ provides the <CODE><FONT face="courier, monospaced">NetAccessor</FONT></CODE> interface. The header file is <CODE><FONT face="courier, monospaced">src/xercesc/util/XMLNetAccessor.hpp</FONT></CODE>. This interface allows you to plug in your own implementation of URL | |
networking code into the Xerces-C++ parser.</P> | |
</FONT></TD></TR></TABLE><BR><A name="faq-19"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Can I use Xerces-C++ to parse HTML?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>Yes, but only if the HTML follows the rules given in the | |
<A href="http://www.w3.org/TR/REC-xml">XML specification</A>. Most HTML, | |
however, does not follow the XML rules, and will generate XML well-formedness | |
errors.</P> | |
</FONT></TD></TR></TABLE><BR><A name="faq-20"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>I keep getting an error: "invalid UTF-8 character". What's wrong?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>Most commonly, the XML <CODE><FONT face="courier, monospaced">encoding =</FONT></CODE> declaration is either incorrect or missing. Without a declaration, XML
defaults to the UTF-8 character encoding, which is not compatible with the
default text file encoding on most systems.</P>
<P>The XML declaration should look something like this:</P> | |
<P><CODE><FONT face="courier, monospaced"><?xml version="1.0" encoding="iso-8859-1"?></FONT></CODE></P> | |
<P>Make sure to specify the encoding that is actually used by the file. The
encoding for "plain" text files depends both on the operating system and the | |
locale (country and language) in use.</P> | |
<P>Another common source of problems is that some characters are not | |
allowed in XML documents, according to the XML spec. Typical disallowed | |
characters are control characters, even if you escape them using the Character | |
Reference form. See the <A href="http://www.w3.org/TR/REC-xml#charsets">XML | |
spec</A>, sections 2.2 and 4.1 for details. If the parser is generating an <CODE><FONT face="courier, monospaced">Invalid character (Unicode: 0x???)</FONT></CODE> error, it is very likely that there's a character in there that you | |
can't see. You can generally use a UNIX command like "od -hc" to find it.</P> | |
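<P>Where a tool like od is not at hand, the same check can be sketched in a few lines of C++. This is a minimal illustration, not part of Xerces-C++: the function name findDisallowedXmlByte is hypothetical, and the scan assumes an ASCII-compatible encoding such as UTF-8 or ISO-8859-1. It reports only the control characters below 0x20 that XML 1.0 never permits (everything except tab, line feed and carriage return).</P>

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Xerces-C++ API).
// Returns the offset of the first byte that XML 1.0 never allows:
// a control character below 0x20 other than tab (0x09), LF (0x0A) or CR (0x0D).
// Returns std::string::npos if no such byte is present.
// Assumes an ASCII-compatible encoding such as UTF-8 or ISO-8859-1.
std::size_t findDisallowedXmlByte(const std::string& data)
{
    for (std::size_t i = 0; i < data.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(data[i]);
        if (c < 0x20 && c != 0x09 && c != 0x0A && c != 0x0D) {
            return i;
        }
    }
    return std::string::npos;
}
```

<P>Read the file into a std::string in binary mode and pass it to the function; the returned offset tells you where to look in a hex dump.</P>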
</FONT></TD></TR></TABLE><BR><A name="faq-21"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>What encodings are supported by Xerces-C / XML4C?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>Xerces-C has intrinsic support for ASCII, UTF-8, UTF-16 (Big/Little
Endian), UCS4 (Big/Little Endian), the EBCDIC code pages IBM037, IBM1047 and IBM1140,
ISO-8859-1 (aka Latin1) and Windows-1252. This means that it can
parse input XML files in any of these encodings.</P>
<P>XML4C -- the version of Xerces-C available from IBM -- combines Xerces-C | |
and <A href="http://icu-project.org/"> | |
International Components for Unicode (ICU)</A> and | |
extends the encoding support to over 100 different encodings that are allowed | |
by ICU. In particular, all the encodings registered with the | |
<A href="http://www.iana.org/assignments/character-sets"> | |
Internet Assigned Numbers Authority (IANA) </A> are supported in XML4C.</P> | |
<P>Some implementations or ports of Xerces-C provide support for | |
additional encodings. The exact set will depend on the supplier of the parser | |
and on the character set transcoding services in use.</P> | |
</FONT></TD></TR></TABLE><BR><A name="faq-22"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>What character encoding should I use when creating XML documents?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>The best choice in most cases is either utf-8 or utf-16. Advantages of | |
these encodings include:</P> | |
<UL> | |
<LI>The best portability. These encodings are more widely supported by | |
XML processors than any others, meaning that your documents will have the best | |
possible chance of being read correctly, no matter where they end up.</LI> | |
<LI>Full international character support. Both utf-8 and utf-16 cover the | |
full Unicode character set, which includes all of the characters from all major | |
national, international and industry character sets.</LI> | |
<LI>Efficient. utf-8 has the smaller storage requirements for documents | |
that are primarily composed of characters from the Latin alphabet. utf-16 is | |
more efficient for encoding Asian languages. But both encodings cover all | |
languages without loss.</LI> | |
</UL> | |
<P>The only drawback of utf-8 or utf-16 is that they are not the native
text file format for most systems, meaning that common text file editors and
viewers cannot be used directly.</P>
<P>A second choice of encoding would be any of the others listed in the | |
table above. This works best when the xml encoding is the same as the default | |
system encoding on the machine where the XML document is being prepared, | |
because the document will then display correctly as a plain text file. For UNIX | |
systems in countries speaking Western European languages, the encoding will | |
usually be iso-8859-1.</P> | |
<P>The versions of Xerces distributed by IBM, both C and Java (known | |
respectively as XML4C and XML4J), include all of the encodings listed in the | |
above table, on all platforms.</P> | |
<P>A word of caution for Windows users: The default character set on | |
Windows systems is windows-1252, not iso-8859-1. While Xerces-C++ does | |
recognize this Windows encoding, it is a poor choice for portable XML data | |
because it is not widely recognized by other XML processing tools. If you are | |
using a Windows-based editing tool to generate XML, check which character set | |
it generates, and make sure that the resulting XML specifies the correct name | |
in the <CODE><FONT face="courier, monospaced">encoding="..."</FONT></CODE> declaration.</P> | |
</FONT></TD></TR></TABLE><BR><A name="faq-23"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Is EBCDIC supported?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>Yes, Xerces-C++ supports EBCDIC with the ibm1140, ibm037 and ibm1047 encodings. | |
When creating EBCDIC encoded XML data, the preferred encoding is ibm1140. The ibm037 encoding, | |
and its alternate name, ebcdic-cp-us, is almost the same as ibm1140, but | |
it lacks the Euro symbol.</P> | |
<P>These three encodings, ibm1140, ibm037 and ibm1047, are available on both | |
Xerces-C and IBM XML4C, on all platforms.</P> | |
<P>On IBM System 390, XML4C also supports three alternative forms, | |
ibm037-s390, ibm1140-s390, and ibm1047-s390. These are similar to the base ibm037, ibm1140, and ibm1047 | |
encodings, but with alternate mappings of the EBCDIC new-line character, which | |
allows them to appear as normal text files on System 390. These encodings are | |
not supported on other platforms, and should not be used for portable data.</P> | |
<P>XML4C on System 390 and AS/400 also provides additional EBCDIC | |
encodings, including those for the character sets of different countries. The | |
exact set supported will be platform dependent, and these encodings are not | |
recommended for portable XML data.</P> | |
</FONT></TD></TR></TABLE><BR><A name="faq-24"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Why does deleting a transcoded string result in assertion on windows?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>Both your application program and the Xerces-C++ DLL must use the same *DLL* version of the | |
runtime library. If either statically links to the runtime library, the | |
problem will still occur.</P> | |
<P>For example, for a Win32/VC6 build, the runtime library build setting MUST | |
be "Multithreaded DLL" for release builds and "Debug Multithreaded DLL" for | |
debug builds.</P> | |
<P>Similarly, for a Win32/BCB6 build, the application needs to switch to the Multithreaded
runtime to avoid this kind of memory access violation.</P>
<P>To avoid the problem altogether, instead of calling operator delete[] directly, use the
provided function XMLString::release to delete any string that was allocated by the parser.
This ensures that the string is allocated and deleted by the same DLL, which should resolve
the assertion.</P>
</FONT></TD></TR></TABLE><BR><A name="faq-25"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>How do I transcode to/from something besides the local code page?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>XMLString::transcode() will transcode from XMLCh to the local code page, and | |
other APIs which take a char* assume that the source text is in the local | |
code page. If this is not true, you must transcode the text yourself. You | |
can do this using local transcoding support on your OS, such as Iconv on | |
Unix or IBM's ICU package. However, if your transcoding needs are simple, | |
you can achieve better portability by using the Xerces-C++ parser's
transcoder wrappers. You get a transcoder like this:
</P> | |
<UL> | |
<LI> | |
Call XMLPlatformUtils::fgTransService->makeNewTranscoderFor() and provide
the name of the encoding you wish to create a transcoder for. This will
return a transcoder to you, which you own and must delete when you are
through with it.
NOTE: You must provide the maximum block size that you will pass to the transcoder
at one time, and you must pass blocks of no more than this many characters when
you do your transcoding. The reason is that this is really an
internal API, used by the parser itself to do transcoding. The parser
always transcodes in known block sizes, which allows a transcoder to
be much more efficient for internal use, since it knows the maximum size it will
ever have to deal with and can set itself up for that internally. In
general, you should stick to block sizes in the 4K to 64K range.
</LI> | |
<LI> | |
The returned transcoder is something derived from XMLTranscoder, so they | |
are all returned to you via that interface. | |
</LI> | |
<LI> | |
This object is really just a wrapper around the underlying transcoding | |
system actually in use by your version of Xerces, and does whatever is | |
necessary to handle differences between the XMLCh representation and the | |
representation used by that underlying transcoding system. | |
</LI> | |
<LI> | |
The transcoder object has two primary APIs, transcodeFrom() and | |
transcodeTo(). These transcode between the XMLCh format and the encoding you | |
indicated. | |
</LI> | |
<LI> | |
These APIs will transcode as much of the source data as will fit into the | |
outgoing buffer you provide. They will tell you how much of the source they | |
ate and how much of the target they filled. You can use this information to | |
continue the process until all source is consumed. | |
</LI> | |
<LI> | |
char* data is always dealt with in terms of bytes, and XMLCh data is | |
always dealt with in terms of characters. Don't mix up which you are dealing | |
with or you will not get the correct results, since many encodings don't | |
have a one to one relationship of characters to bytes. | |
</LI> | |
<LI> | |
When transcoding from XMLCh to the target encoding, the transcodeTo() | |
method provides an 'unrepresentable flag' parameter, which tells the | |
transcoder how to deal with an XMLCh code point that cannot be converted | |
legally to the target encoding, which can easily happen since XMLCh is | |
Unicode and can represent thousands of code points. The options are to use a | |
default replacement character (which the underlying transcoding service will | |
choose, and which is guaranteed to be legal for the target encoding), or to | |
throw an exception. | |
</LI> | |
</UL> | |
<P>Here is an example:</P> | |
<DIV align="left"><TABLE border="0" cellpadding="0" cellspacing="4" width="464"><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#ffffff" width="462"><FONT size="-1"><PRE> | |
// Create an XMLTranscoder that can transcode between Unicode (XMLCh) and Big5.
// ASSUMPTION: the underlying transcoding service supports the Big5 encoding.
// failReason is an XMLTransService::Codes; 16*1024 is the maximum block size.
XMLTranscoder* t =
    XMLPlatformUtils::fgTransService->makeNewTranscoderFor("Big5", failReason, 16*1024, XMLPlatformUtils::fgMemoryManager);
// The source string is in Unicode; transcode it to Big5.
t->transcodeTo(source_unicode, length, result_Big5, length, charsEaten, XMLTranscoder::UnRep_Throw);
// The source string is in Big5; transcode it to Unicode.
t->transcodeFrom(source_Big5, length, result_unicode, length, bytesEaten, (unsigned char*)charSz);
// You own the transcoder, so delete it when you are through with it.
delete t;
</PRE></FONT></TD><TD bgcolor="#0086b2" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></DIV> | |
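<P>The difference between byte counts and character counts matters when sizing the buffers passed to these calls. The following stand-alone sketch uses no Xerces-C++ APIs; the helper name utf16UnitsForUtf8 is hypothetical, and it assumes that the input is well-formed UTF-8 and that an XMLCh holds one UTF-16 code unit. It shows that the same text can occupy a different number of bytes than 16-bit units.</P>

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Xerces-C++ API).
// Counts how many 16-bit UTF-16 units a UTF-8 byte sequence occupies
// after conversion. Assumes the input is well-formed UTF-8.
std::size_t utf16UnitsForUtf8(const std::string& utf8)
{
    std::size_t units = 0;
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(utf8[i]);
        if ((b & 0xC0) == 0x80) {
            continue;  // continuation byte: belongs to the preceding lead byte
        }
        // A 4-byte sequence (lead byte 0xF0 or above) needs a surrogate pair.
        units += (b >= 0xF0) ? 2 : 1;
    }
    return units;
}
```

<P>For plain ASCII text the two counts agree, but a three-byte UTF-8 sequence becomes a single 16-bit unit, so a buffer sized from the wrong count will be too small or wastefully large.</P>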
</FONT></TD></TR></TABLE><BR><A name="faq-26"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Why does setProperty not work?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>The functions <CODE><FONT face="courier, monospaced">SAX2XMLReader::setProperty(const XMLCh* const name, void* value)</FONT></CODE>
and <CODE><FONT face="courier, monospaced">DOMBuilder::setProperty(const XMLCh* const name, void* value)</FONT></CODE>
take a void pointer for the property value. The application is required to initialize this void pointer
to the correct type. See the <A href="program-sax2.html#SAX2Properties">SAX2 Programming Guide</A>
and the <A href="program-dom.html#DOMBuilderProperties">DOM Programming Guide</A>
to learn exactly what type of property value each property expects.
Passing a void pointer that was initialized with the wrong type will lead to unexpected results.
</P>
</FONT></TD></TR></TABLE><BR><A name="faq-27"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Why does getProperty not work?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>The functions <CODE><FONT face="courier, monospaced">void* SAX2XMLReader::getProperty(const XMLCh* const name)</FONT></CODE>
and <CODE><FONT face="courier, monospaced">void* DOMBuilder::getProperty(const XMLCh* const name)</FONT></CODE>
return a void pointer for the property value. See the
<A href="program-sax2.html#SAX2Properties">SAX2 Programming Guide</A> and the
<A href="program-dom.html#DOMBuilderProperties">DOM Programming Guide</A> to learn
exactly what type of object each property returns.
</P>
<P>The parser owns the returned pointer. The memory it points to
is freed when the parser is deleted.
To keep the returned information accessible after the parser
is deleted, callers must copy it and store the copy
somewhere else; otherwise you may get unexpected results. Since the returned
pointer is a generic void pointer, see the
<A href="program-sax2.html#SAX2Properties">SAX2 Programming Guide</A> and the
<A href="program-dom.html#DOMBuilderProperties">DOM Programming Guide</A> to learn
exactly what type of value each property returns, so that you can copy it correctly.
</P>
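<P>As a rough illustration of this ownership rule, the sketch below uses a hypothetical
<CODE><FONT face="courier, monospaced">ToyParser</FONT></CODE> class. It is an assumption made for
this example and is not part of the Xerces-C++ API; the property name "scanner-name" and the
<CODE><FONT face="courier, monospaced">std::string</FONT></CODE> value type are likewise invented.
The point is only the pattern: cast the void pointer to the documented type, copy the value while
the parser is alive, and use the copy after the parser is deleted.
</P>

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a parser; it is NOT the Xerces-C++ API.
// It owns a heap-allocated property value and frees it in its destructor,
// which mirrors the ownership rule described above.
class ToyParser {
public:
    ToyParser() : fScannerName(new std::string("ToyScanner")) {}
    ~ToyParser() { delete fScannerName; }

    ToyParser(const ToyParser&) = delete;
    ToyParser& operator=(const ToyParser&) = delete;

    // Returns a type-erased pointer to parser-owned memory,
    // or a null pointer for a property name it does not know.
    void* getProperty(const std::string& name) const {
        return name == "scanner-name" ? fScannerName : nullptr;
    }

private:
    std::string* fScannerName;
};

// Casts the void pointer to its documented type and copies the value,
// so the result stays valid after the parser is deleted.
std::string copyScannerName(const ToyParser& parser) {
    const void* raw = parser.getProperty("scanner-name");
    assert(raw != nullptr);  // the toy parser always defines this property
    return *static_cast<const std::string*>(raw);
}

// Copies the property, deletes the parser, then returns the copy.
std::string scannerNameAfterDelete() {
    ToyParser* parser = new ToyParser;
    std::string copy = copyScannerName(*parser);  // copy while the parser is alive
    delete parser;                                // the parser-owned value is freed here
    return copy;                                  // the copy remains safe to use
}
```

<P>With the real parser classes, the cast target and the way to copy the value depend on the
property, so check the programming guides linked above before adapting this pattern.
</P>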
</FONT></TD></TR></TABLE><BR><A name="faq-28"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Why does the parser still try to locate the DTD even if validation is turned off,
and how do I ignore an external DTD reference?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10">&nbsp;</TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif">
<P>When a DTD is referenced, the parser tries to read it, because a DTD
provides much more than validation information. It defines entities and
notations, external unparsed entities, default attributes, character
entities, and so on. The parser therefore always tries to read the DTD if one is present, even if
validation is turned off.
</P>
<P>To ignore the DTD with Xerces-C++ 2.8.0 or later, you can call
<CODE><FONT face="courier, monospaced">setLoadExternalDTD(false)</FONT></CODE> (or
<CODE><FONT face="courier, monospaced">setFeature(XMLUni::fgXercesLoadExternalDTD, false)</FONT></CODE>)
to disable the loading of the external DTD. The parser will then ignore
any external DTD completely, provided that the validationScheme is set to Val_Never.
</P>
<P>Note: This flag is ignored if the validationScheme is set to Val_Always or Val_Auto.
</P>
<P>In earlier versions of Xerces-C++, the
only way to ignore the DTD is to install an EntityResolver
(see the Redirect sample for an example of how this is done) and reset the
DTD file to "".
</P>
</FONT></TD></TR></TABLE><BR><A name="faq-29"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Why do I get a segmentation fault when running on Redhat Linux?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10">&nbsp;</TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif">
<P>Redhat Linux 7.x had some problems with C++ exception handling across shared
libraries. More details can be found
<A href="http://rhn.redhat.com/errata/RHBA-2002-055.html">here</A>.
Try upgrading the gcc on your Redhat Linux system to the latest patch level to see if that helps.
</P>
</FONT></TD></TR></TABLE><BR><A name="faq-30"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Why does the XML data generated by the DOMWriter not match my original XML input?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10">&nbsp;</TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif">
<P>If you parse an XML document using XercesDOMParser or DOMBuilder and pass the resulting DOMNode
to DOMWriter for serialization, the output may not be exactly the same
as the original XML data. The parser may have normalized the data, converted end-of-line characters,
or expanded entity references as per the XML 1.0 spec, 4.4 XML Processor Treatment of
Entities and References. DOMWriter does not know what the original
string was; all it sees is the processed DOMNode generated by the parser.
Because DOMWriter is supposed to generate output that can be parsed again if sent
back to the parser, it does not print the DOMNode node value as is, and
may "touch up" the output data to keep it parsable.</P>
<P>See <A href="program-dom.html#DOMWriterEntityRef">How does DOMWriter handle built-in entity
Reference in node value?</A> for more on how DOMWriter touches up entity references.
</P>
</FONT></TD></TR></TABLE><BR><A name="faq-31"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Why does my application crash when deleting the parser after releasing a document?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>In most cases, the parser deletes its documents when the parser itself is deleted. However, if an application
needs to release a document, it must adopt the document before releasing it, so that the parser
knows that ownership of this particular document has been transferred to the application and will not
try to delete it when the parser is deleted.
</P>
<DIV align="left"><TABLE border="0" cellpadding="0" cellspacing="4" width="464"><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#ffffff" width="462"><FONT size="-1"><PRE>
XercesDOMParser *parser = new XercesDOMParser;
...
try
{
    parser->parse(gXmlFile);
}
catch (...)
{
    ...
}
DOMNode *doc = parser->getDocument();
...
// Take ownership first, so the parser will not delete the document again.
parser->adoptDocument();
doc->release();
...
delete parser;
</PRE></FONT></TD><TD bgcolor="#0086b2" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" height="1" width="462"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="462"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></DIV> | |
<P>An alternative to releasing a document yourself is to call the parser's resetDocumentPool(), which releases
all the documents the parser has parsed.
</P>
</FONT></TD></TR></TABLE><BR><A name="faq-32"><!--anchor--></A><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="ffffff" colspan="2" width="494"><TABLE border="0" cellpadding="0" cellspacing="0" width="494"><TR><TD bgcolor="#039acc" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#039acc" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#039acc" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#0086b2" width="492"><FONT color="#ffffff" face="arial,helvetica,sanserif" size="+1"><IMG border="0" height="2" hspace="0" src="resources/void.gif" vspace="0" width="2"><B>Why do we have two versions of some XMLString methods (one with memory manager and one without)?</B></FONT></TD><TD bgcolor="#017299" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR><TR><TD bgcolor="#0086b2" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD><TD bgcolor="#017299" height="1" width="492"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="492"></TD><TD bgcolor="#017299" height="1" width="1"><IMG border="0" height="1" hspace="0" src="resources/void.gif" vspace="0" width="1"></TD></TR></TABLE></TD></TR><TR><TD width="10"> </TD><TD width="484"><FONT color="#000000" face="arial,helvetica,sanserif"> | |
<P>When the configurable memory manager was introduced, we did not want to break existing user code by
changing the signatures of the existing methods (for example, transcode and replicate). We also
did not want to provide a default memory
manager, as it would introduce a side effect: users could experience some strange core dumps.
These occur when the scope of the allocated string extends beyond that of
XMLPlatformUtils::Terminate (that is, a string is allocated using the default memory manager,
which is deleted when XMLPlatformUtils::Terminate is called, but the allocated string is
deleted later). We plan to deprecate the methods without a memory manager in a later release.
</P>
</FONT></TD></TR></TABLE><BR></TD></TR></TABLE></TD></TR></TABLE><BR><TABLE border="0" cellpadding="0" cellspacing="0" width="620"><TR><TD bgcolor="#0086b2"><IMG height="1" src="images/dot.gif" width="1"></TD></TR><TR><TD align="center"><FONT color="#0086b2" size="-1"><I> | |
      Copyright © 1999-2007 The Apache Software Foundation.
      All Rights Reserved.
    </I></FONT></TD></TR></TABLE></BODY></HTML>