blob: a4d3462d34c8987401fb59e5f6f5960e82510532 [file] [log] [blame]
<?xml version="1.0" standalone="no"?>
<!--
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
-->
<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd">
<s1 title="Schema">
<s2 title="Introduction">
<p>This package contains an implementation of the W3C XML Schema
Language, a recommendation of the Worldwide Web Consortium
available in three parts:
<jump href="http://www.w3.org/TR/xmlschema-0/">XML Schema: Primer</jump> and
<jump href="http://www.w3.org/TR/xmlschema-1/">XML Schema: Structures</jump> and
<jump href="http://www.w3.org/TR/xmlschema-2/">XML Schema: Datatypes</jump>.
We consider this implementation complete except for the limitations cited below.
</p>
&schema-feedback-info;
</s2>
<anchor name="limitation"/>
<s2 title="Limitations">
<ul>
<li>Due to the way in which the parser constructs content
models for elements with complex content, specifying large
values for the <code>minOccurs</code> or <code>maxOccurs</code>
attributes may cause a stack overflow or very poor performance
in the parser. Large values for <code>minOccurs</code> should be
avoided, and <code>unbounded</code> should be used instead of
a large value for <code>maxOccurs</code>.</li>
<li>The parser treats local elements in the same scope with the
same name and namespace as one element declaration and does not
differentiate between them.</li>
</ul>
</s2>
<anchor name="interpretation"/>
<s2 title="Interpretation of Areas that are Unclear or Implementation-Dependent">
<s3 title="keyref">
<p>
We have interpreted the specs as requiring &lt;keyref&gt; Identity Constraints to refer to
&lt;key&gt; or &lt;unique&gt; identity constraints within the scope of the elements to which
the &lt;keyref&gt; is attached. This interpretation is at variance with the Schema Primer, which
contains an example with a &lt;keyref&gt; declared on an element used inside the element of its
corresponding &lt;key&gt;.
</p>
</s3>
<s3 title="out-of-bound float values">
<p>
For float data, the specs does not explicitly specify how to deal with
out-of-bound data. Xerces converts these values as below (the values
depend on the system specific values of FLT_MAX and FLT_MIN):
</p>
<table>
<tr>
<td>Values in range</td>
<td>Values converted</td>
</tr>
<tr>
<td>less than -FLT_MAX (approx -3.402823466e+38) </td>
<td>-INF</td>
</tr>
<tr>
<td>greater than -FLT_MIN (approx -1.175494351e-38) and less than -0 </td>
<td>-0</td>
</tr>
<tr>
<td>greater than +0 and less than +FL_MIN (approx +1.175494351e-38) </td>
<td>+0</td>
</tr>
<tr>
<td>greater than +FLT_MAX (approx 3.402823466e+38) </td>
<td>+INF</td>
</tr>
</table>
<p>
The effect of this conversion would invalidate an instance data, for example,
"1.1e-39", of a data type derived from float, with minExclusive value '+0',
since "1.1e-39" is converted to "+0", which is the same as the minExclusive.
</p>
</s3>
<s3 title="out-of-bound double values">
<p>
Similarly, Xerces converts double values as below (the values
depend on the system specific values of DBL_MAX and DBL_MIN):
</p>
<table>
<tr>
<td>Values in range</td>
<td>Values converted</td>
</tr>
<tr>
<td>less than -DBL_MAX (approx -1.7976931348623158e+308) </td>
<td>-INF</td>
</tr>
<tr>
<td>greater than -DBL_MIN (approx -2.2250738585072014e-308) and less than -0 </td>
<td>-0</td>
</tr>
<tr>
<td>greater than +0 and less than +DBL_MIN (approx +2.2250738585072014e-308) </td>
<td>+0</td>
</tr>
<tr>
<td>greater than +DBL_MAX (approx +1.7976931348623158e+308) </td>
<td>+INF</td>
</tr>
</table>
</s3>
</s2>
<anchor name="usage"/>
<s2 title="Usage">
<p>Here is an example how to turn on schema processing in DOMParser
(default is off). Note that you must also turn on namespace support
(default is off) for schema processing.
</p>
<source>// Instantiate the DOM parser.
DOMParser parser;
parser.setDoNamespaces(true);
parser.setDoSchema(true);
parser.parse(xmlFile);
</source>
<p>Usage in SAXParser is similar, please refer to the
sample program 'samples/SAXCount/SAXCount.cpp' for further reference.
</p>
<p>Here is an example how to turn on schema processing in SAX2XMLReader
(default is on). Note that namespace must be on (default is on) as well.
</p>
<source>SAX2XMLReader* parser = XMLReaderFactory::createXMLReader();
parser->setFeature(XMLUni::fgSAX2CoreNameSpaces, true);
parser->setFeature(XMLUni::fgXercesSchema, true);
parser->parse(xmlFile);
</source>
<p>Review the sample file, 'samples/data/personal-schema.xml' and
'samples/data/personal.xsd' for an example of an XML Schema grammar.
</p>
</s2>
<anchor name="associate"/>
<s2 title="Associating Schema Grammar with instance document">
<p>Schema grammars can be associated with instance documents in two ways.
</p>
<s3 title="Specifying Schema Grammar through method calls:">
<p>An application developer may associate schemas with instance documents through
methods <code>setExternalSchemaLocation</code> if they use namespaces, and
<code>setExternalNoNamespaceSchemaLocation</code> otherwise.
(For SAX2XMLReader, use the properties:
"http://apache.org/xml/properties/schema/external-schemaLocation" and
"http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation")
</p>
<p>Here is an example with no target namespace:
</p>
<source>
// Instantiate the DOM parser.
DOMParser parser;
parser.setDoNamespaces(true);
parser.setDoSchema(true);
parser.setExternalNoNamespaceSchemaLocation("personal.xsd");
parser.parse("test.xml");
// Instantiate the SAX2 XMLReader.
SAX2XMLReader* parser = XMLReaderFactory::createXMLReader();
XMLCh* propertyValue = XMLString::transcode("personal.xsd");
ArrayJanitor&lt;XMLCh&gt; janValue(propertyValue);
parser->setProperty(
XMLUni::fgXercesSchemaExternalNoNameSpaceSchemaLocation,
propertyValue);
parser.parse("test.xml");
</source>
<p>Here is an example with a target namespace. Note that it is an error to specify a
different namespace in <code>setExternalSchemaLocation</code> than the target
namespace defined in the Schema.
</p>
<source>
// Instantiate the DOM parser.
DOMParser parser;
parser.setDoNamespaces(true);
parser.setDoSchema(true);
parser.setExternalSchemaLocation("http://my.com personal.xsd http://my2.com test2.xsd");
parser.parse("test.xml");
// Instantiate the SAX2 XMLReader.
SAX2XMLReader* parser = XMLReaderFactory::createXMLReader();
XMLCh* propertyValue = XMLString::transcode("http://my.com personal.xsd http://my2.com test2.xsd");
ArrayJanitor&lt;XMLCh&gt; janValue(propertyValue);
parser->setProperty(
XMLCh XMLUni::fgXercesSchemaExternalSchemaLocation,
propertyValue);
parser.parse("test.xml");
</source>
</s3>
<s3 title="Specifying Schema Grammar through attributes in the instance document:">
<p>If schema grammar was not specified externally through methods,
then each instance document that uses XML Schema grammars must specify the location of
the grammars it uses by using an xsi:schemaLocation attribute if they use
namespaces, and xsi:noNamespaceSchemaLocation attribute otherwise.
</p>
<p>Here is an example with no target namespace:
</p>
<source>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;personnel xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation='personal.xsd'&gt;
...
&lt;/personnel&gt;
</source>
<p>Here is an example with a target namespace. Note that it is an error to specify a
different namespace in xsi:schemaLocation attribute than the target namespace
defined in the Schema.
</p>
<source>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;personnel xmlns="http://my.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://my.com personal.xsd http://my2.com test2.xsd"&gt;
...
&lt;/personnel&gt;
</source>
</s3>
</s2>
</s1>