<?xml version="1.0" encoding="utf-8"?>
<!-- uncomment to enable validation of the users' manual
<!DOCTYPE article [
<!ELEMENT xi:include EMPTY>
<!ATTLIST xi:include
  xmlns:xi CDATA #FIXED "http://www.w3.org/2001/XInclude"
  href     CDATA #REQUIRED
>
<!ENTITY % local.divcomponent.mix "|xi:include">
<!ENTITY % DocBookDTD PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
           "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
%DocBookDTD;
]>
--><article xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>4Suite Core: Open-source Library for XML Processing</title>

  <subtitle>Users' Manual</subtitle>

  <articleinfo>
    <abstract>
      <para>This document describes how to perform a set of XML manipulation
      tasks with the 4Suite XML processing library. These tasks include
      parsing XML using either DOM-like or SAX-like models, querying XML or
      XML models using XPath, using XSLT, using XUpdate, and validating
      documents with RELAX NG.</para>
    </abstract>

    <revhistory>
      <revision>
        <revnumber>0.8</revnumber>

        <date>2006-08-15</date>

        <author>
          <firstname>Mike</firstname>

          <othername role="middlename">J.</othername>

          <surname>Brown</surname>

          <email>mike at skew.org</email>
        </author>

        <revdescription>
          <para>Added sections on XInclude and XPointer processing, and
          clarified which DOM interfaces are Domlette-specific.</para>
        </revdescription>
      </revision>

      <revision>
        <revnumber>0.7</revnumber>

        <date>2005-11-17</date>

        <author>
          <firstname>John</firstname>

          <othername role="middlename">L.</othername>

          <surname>Clark</surname>

          <email>jlc6@po.cwru.edu</email>
        </author>

        <revdescription>
          <para>Expanded the Domlette section on validation and added a
          Saxlette section on validation.</para>
        </revdescription>
      </revision>

      <revision>
        <revnumber>0.6</revnumber>

        <date>2005-11-10</date>

        <author>
          <firstname>John</firstname>

          <othername role="middlename">L.</othername>

          <surname>Clark</surname>

          <email>jlc6@po.cwru.edu</email>
        </author>

        <revdescription>
          <para>Factored a major example out of the XSLT section into a new
          section for comprehensive examples. Made the other examples in the
          XSLT section more robust.</para>
        </revdescription>
      </revision>

      <revision>
        <revnumber>0.5</revnumber>

        <date>2005-10-21</date>

        <author>
          <firstname>Uche</firstname>

          <othername role="middlename">G.</othername>

          <surname>Ogbuji</surname>

          <email>uche.ogbuji@fourthought.com</email>
        </author>

        <revdescription>
          <para>Prep for 4Suite XML 1.0b2 release. Update Saxlette section,
          add Ft.Xml.Xslt.Transform* to XSLT section, other minor
          edits.</para>
        </revdescription>
      </revision>

      <revision>
        <revnumber>0.4</revnumber>

        <date>2005-09-29</date>

        <author>
          <firstname>John</firstname>

          <othername role="middlename">L.</othername>

          <surname>Clark</surname>

          <email>jlc6@po.cwru.edu</email>
        </author>

        <revdescription>
          <para>Domlette section edits and updates; split the document into
          sections using XInclude.</para>
        </revdescription>
      </revision>

      <revision>
        <revnumber>0.3</revnumber>

        <date>2005-09-14</date>

        <author>
          <firstname>Luis</firstname>

          <othername role="middlename">Miguel</othername>

          <surname>Morillas</surname>

          <email>morillas@gmail.com</email>
        </author>

        <revdescription>
          <para>MarkupWriter section updates.</para>
        </revdescription>
      </revision>

      <revision>
        <revnumber>0.2</revnumber>

        <date>2005-09-11</date>

        <author>
          <firstname>Dave</firstname>

          <surname>Pawson</surname>

          <email>dave.pawson@gmail.com</email>
        </author>

        <revdescription>
          <para>XPath and XSLT section updates.</para>
        </revdescription>
      </revision>

      <revision>
        <revnumber>0.1</revnumber>

        <date>2005-09-08</date>

        <author>
          <firstname>Uche</firstname>

          <surname>Ogbuji</surname>

          <email>uche.ogbuji@fourthought.com</email>
        </author>

        <revdescription>
          <para>Initial draft.</para>
        </revdescription>
      </revision>
    </revhistory>
  </articleinfo>

  <section xml:base="file:///Users/uche/share/4Suite/Documentation/CoreManualSections/Introduction.xml">
  <title>Introduction</title>

  <para>4Suite allows users to take advantage of standard XML technologies
  rapidly and to develop and integrate Web-based applications. It also puts
  practical technologies for knowledge management projects in the hands of
  developers. It is implemented in Python with C extensions.</para>

  <para>At the core of 4Suite is a library of integrated tools (including
  convenient command-line tools) for XML processing, implementing open
  technologies such as DOM, SAX, XSLT, XInclude, XPointer, XLink, XPath,
  XUpdate, RELAX NG, and XML/SGML Catalogs.</para>

  <para>With 4Suite, you can:</para>

  <itemizedlist>
    <listitem>
      <para><link linkend="domlette">Parse a document into an efficient
      DOM-like structure (Domlette)</link></para>
    </listitem>

    <listitem>
      <para><link linkend="saxlette">Parse a document in event mode based on
      SAX 2 (Saxlette)</link></para>
    </listitem>

    <listitem>
      <para><link linkend="xpath_engine">Run XPath queries over a parsed
      document</link></para>
    </listitem>

    <listitem>
      <para><link linkend="xslt_engine">Apply XSLT to a document, whether or
      not it has been separately parsed</link></para>
    </listitem>

    <listitem>
      <para><link linkend="xupdate">Update a document using an XUpdate
      script</link></para>
    </listitem>

    <listitem>
      <para><link linkend="RELAXNG">Validate a document using RELAX
      NG</link></para>
    </listitem>
  </itemizedlist>

  <para>And much more. These tasks are covered in this manual.</para>
</section>

  <section xml:base="file:///Users/uche/share/4Suite/Documentation/CoreManualSections/Installation.xml">
  <title>Installation</title>

  <para>Please see the <ulink url="http://4suite.org/docs/howto/UNIX.xml">UNIX</ulink> or <ulink url="http://4suite.org/docs/howto/Windows.xml">Windows</ulink> install
  documents.  Remember that if you are using Cygwin on Windows, you should follow the UNIX instructions.</para>
</section>

  <section xml:base="file:///Users/uche/share/4Suite/Documentation/CoreManualSections/Domlette.xml" id="domlette">
  <title>DOM-like XML processing</title>

  <para>Domlette is 4Suite's lightweight DOM implementation. It is optimized
  for XPath operations, speed, and relatively low memory overhead. The
  Domlette API is accessible through <systemitem class="library">Ft.Xml.Domlette</systemitem>. This section describes how to
  parse, manipulate, and then serialize XML documents using this API.</para>

  <para>Below, we briefly summarize the parts of the API that cover the
  basic life cycle of Domlette objects: parsing, manipulation, and
  serialization.</para>

  <variablelist>
    <varlistentry>
      <term>Parsing XML documents</term>

      <listitem>
        <para>The <systemitem class="library">Ft.Xml</systemitem> module
        contains the function <methodname>Parse</methodname> that gets the
        job done quickly. See <xref linkend="quick_reader_access"/> for
        details. For somewhat more advanced parsing, you will need a
        combination of the reader instances in the
        <systemitem class="library">Ft.Xml.Domlette</systemitem> module and
        <methodname>Ft.Xml.CreateInputSource</methodname> for constructing
        <classname>InputSource</classname> instances. In rare cases you
        might need lower-level APIs in the
        <systemitem class="library">Ft.Xml.InputSource</systemitem> module.
        Read <xref linkend="full_domlette_reader"/> if
        <methodname>Ft.Xml.Parse</methodname> isn't enough.</para>
      </listitem>
    </varlistentry>

    <varlistentry>
      <term>Modifying and interacting with XML documents</term>

      <listitem>
        <para>The Domlette API for interacting with XML documents—accessible
        as methods of the various Domlette objects—is similar to <ulink url="http://www.w3.org/TR/DOM-Level-2-Core">the DOM Level 2
        specification</ulink>. See <xref linkend="domlette_API"/> for more
        information.</para>
      </listitem>
    </varlistentry>

    <varlistentry>
      <term>Serializing XML documents</term>

      <listitem>
        <para>The <systemitem class="library">Ft.Xml.Domlette</systemitem>
        module provides two functions, <methodname>Print</methodname> and
        <methodname>PrettyPrint</methodname>, for writing your XML documents.
        The <methodname>Print</methodname> function writes the XML document
        precisely as given in the model. On the other hand, the
        <methodname>PrettyPrint</methodname> function adds whitespace nodes to
        your document to try to indent the resulting output nicely. See <xref linkend="domlette_serializing"/> for details.</para>
      </listitem>
    </varlistentry>
  </variablelist>

  <section>
    <title>Parsing XML documents</title>

    <para>We begin our discussion of the Domlette API by describing how to
    obtain a model of your XML documents to manipulate further. Because XML
    documents offer such rich functionality and exist in such varied
    environments, there can be a surprising amount of work that you must do to
    simply load your XML documents. We begin by providing a short-cut for easy
    access. We will then dive into the full suite of document loading
    utilities.</para>

    <section id="quick_reader_access">
      <title>Quick access to the Domlette reader API</title>

      <para>For basic document manipulations or to get started quickly, the
      <systemitem class="library">Ft.Xml</systemitem> module offers a quick
      way to parse XML documents and directly obtain access to the Domlette
      interface to those documents. Within this module the function of
      interest is <methodname>Parse</methodname>.</para>

      <warning>
        <para>This function will get you started quickly because it
        specifically chooses some default values for some of the more advanced
        parsing features. If you are passing in a string or stream, and the
        material in <xref linkend="base_URIs"/>
        applies to your parsing situation, then you will want to use the
        full-featured API. In brief, if your XML document references external
        resources, you should not use this convenience function. See <xref linkend="full_domlette_reader"/> instead.</para>
      </warning>

      <para><methodname>Parse</methodname> returns a Domlette
      <classname>Document</classname> node, the root of the tree built from
      the source passed as its argument.</para>

      <variablelist>
        <varlistentry>
          <term><methodsynopsis>
              <methodname>Parse</methodname>

              <methodparam>
                <parameter>source</parameter>
              </methodparam>
            </methodsynopsis></term>

          <listitem>
            <para>The <methodname>Parse</methodname> function takes a single
            argument, which is a byte string (not unicode object), file-like
            object (stream), file path or URI.</para>
          </listitem>
        </varlistentry>
      </variablelist>

      <programlisting>XML = """
&lt;ham&gt;
&lt;eggs n='1'/&gt;
This is the string content with &lt;em&gt;emphasized text&lt;/em&gt; text
&lt;/ham&gt;"""

from Ft.Xml import Parse

doc = Parse(XML)
# If the above XML document were located in the file
# "target.xml", we could have used `Parse("target.xml")`.
print doc.xpath('string(ham//em[1])')</programlisting>
    </section>

    <section id="full_domlette_reader">
      <title>The full Domlette reader API</title>

      <para>You create Domlette instances by parsing XML documents with the
      reader system. For general use, the <systemitem class="library">Ft.Xml.Domlette</systemitem> package contains instances
      of the different reader classes that can be used directly after you
      import them. These instances include
      <constant>NonvalidatingReader</constant> and
      <constant>ValidatingReader</constant>, which provide non-validating
      parsing and validating parsing services, respectively. The validation in
      this case refers to DTD validation. For RELAX NG validation, see <xref linkend="RELAXNG"/>. All the reader classes (and, hence, their bundled
      instances) are described in later sections. After you have obtained one
      of these reader instances, you feed your XML document entity's byte
      stream to the reader. We summarize the available reader methods
      below.</para>

      <variablelist>
        <varlistentry>
          <term><methodsynopsis>
              <methodname>parseUri</methodname>

              <methodparam>
                <parameter>uri</parameter>
              </methodparam>
            </methodsynopsis></term>

          <listitem>
            <para>The <methodname>parseUri</methodname> method takes a single
            argument; this <parameter>uri</parameter> argument is the absolute
            URI of the document entity to parse. The URI will be dereferenced
            by the default resolver.</para>
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><methodsynopsis>
              <methodname>parseString</methodname>

              <methodparam>
                <parameter>st</parameter>
              </methodparam>

              <methodparam>
                <parameter>uri</parameter>
              </methodparam>
            </methodsynopsis></term>

          <listitem>
            <para>The <methodname>parseString</methodname> method takes two
            arguments; <parameter>st</parameter> is the XML document entity in
            the form of an encoded Python string (<emphasis role="bold">not a
            Unicode string</emphasis>). See the next section for details on
            the <parameter>uri</parameter> argument.</para>
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><methodsynopsis>
              <methodname>parseStream</methodname>

              <methodparam>
                <parameter>stream</parameter>
              </methodparam>

              <methodparam>
                <parameter>uri</parameter>
              </methodparam>
            </methodsynopsis></term>

          <listitem>
            <para>The <methodname>parseStream</methodname> method takes two
            arguments; <parameter>stream</parameter> is a Python file-like
            object that can supply the document entity's bytes via
            <methodname>read</methodname>() calls. See the next section for
            details on the <parameter>uri</parameter> argument.</para>
          </listitem>
        </varlistentry>

        <varlistentry>
          <term><methodsynopsis>
              <methodname>parse</methodname>

              <methodparam>
                <parameter>inputSource</parameter>
              </methodparam>
            </methodsynopsis></term>

          <listitem>
            <para>The <methodname>parse</methodname> method takes a single
            argument; <parameter>inputSource</parameter> is an
            <classname>Ft.Xml.InputSource.InputSource</classname> object,
            described in <xref linkend="InputSources"/>.</para>
          </listitem>
        </varlistentry>
      </variablelist>

      <para>The next two sections cover some of the issues that you should
      understand before using these functions. Then we start seeing some
      examples in <xref linkend="NonvalidatingReader"/>.</para>
    </section>

    <section id="base_URIs">
      <title>The importance of base URIs</title>

      <para>In the first 3 methods listed in the previous section, the
      <parameter>uri</parameter> argument is the URI of the document entity
      that you are feeding to the parser. It is a very important—but often
      overlooked—concept in document processing.</para>

      <para>The URI gives the document entity a unique identifier that can
      be used to refer to the document as a whole. Also, each Domlette node
      derived from a particular entity inherits that entity's URI as the
      node's <varname>baseURI</varname> property, unless an alternative base
      URI was indicated, such as with <sgmltag class="attribute">xml:base</sgmltag>, or if part of the document was
      loaded as an external entity or XInclude.</para>

      <para>The document's URI is also used as the "base URI" for resolving
      any relative URI references that may appear within the document itself.
      Relative URI references may occur in a document in places like:</para>

      <itemizedlist>
        <listitem>
          <para><markup>&lt;!DOCTYPE&gt;</markup> or
          <markup>&lt;!ENTITY&gt;</markup>, immediately following the keyword
          <markup>SYSTEM</markup></para>
        </listitem>

        <listitem>
          <para><sgmltag class="element">&lt;xsl:import&gt;</sgmltag> and
          <sgmltag class="element">&lt;xsl:include&gt;</sgmltag>, in the value
          of the <sgmltag class="attribute">href</sgmltag> attribute</para>
        </listitem>

        <listitem>
          <para><sgmltag class="element">&lt;xi:include&gt;</sgmltag>, in the
          value of the <sgmltag class="attribute">href</sgmltag>
          attribute</para>
        </listitem>

        <listitem>
          <para><sgmltag class="element">&lt;exsl:document&gt;</sgmltag>, in
          the value of the <sgmltag class="attribute">href</sgmltag>
          attribute</para>
        </listitem>

        <listitem>
          <para>the arguments to XSLT's <function>document()</function>
          function</para>
        </listitem>
      </itemizedlist>

      <para>It is a common misconception that relative URI references in a
      document's content are considered to be relative to the processor's
      current working directory. They are actually resolved relative to the
      URI of the document that contains the relative URI reference (more
      specifically, relative to the URI of the <emphasis role="em">entity</emphasis> in which the reference occurs, keeping in
      mind that a document may be comprised of multiple entities, i.e.,
      separate files).</para>

      <para>In all cases, the document URI that you supply in the reader API
      must be "absolute", which means that it has a scheme, e.g.
      "<uri>http://spam/eggs.xml</uri>", not just
      "<filename>/spam/eggs.xml</filename>" or
      "<filename>eggs.xml</filename>".</para>

      <para>If you know there are not going to be any relative URI references
      to resolve during initial parsing or during processing of the Domlette
      by other tools, then you can safely omit the argument, or, preferably,
      supply a dummy URI like "<uri>urn:dummy</uri>" or
      "<uri>http://spam/eggs.xml</uri>". If you choose to omit URI arguments
      from APIs that need them, you may get a Python warning, and a random
      URI—which is probably not what you want—will be assigned.</para>
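
      <para>The two rules above, that a document URI must carry a scheme and
      that relative references resolve against the URI of the containing
      entity rather than the working directory, can be tried out with
      Python's standard library alone. The following is a minimal sketch in
      Python 3 using <systemitem class="library">urllib.parse</systemitem>;
      it does not use 4Suite, and the helper names
      <function>is_absolute</function> and
      <function>resolve_reference</function> are hypothetical, introduced
      only for illustration.</para>

      <programlisting><![CDATA[
```python
from urllib.parse import urljoin, urlsplit


def is_absolute(uri):
    """Return True if the URI has a scheme such as http, file or urn."""
    return bool(urlsplit(uri).scheme)


def resolve_reference(base_uri, reference):
    """Resolve a relative URI reference against an absolute base URI."""
    if not is_absolute(base_uri):
        raise ValueError("base URI must be absolute: %r" % base_uri)
    return urljoin(base_uri, reference)


# The entity URI, not the current working directory, decides the result.
resolved = resolve_reference("http://foo/test/spam.xml", "eggs.xml")
print(resolved)
```
]]></programlisting>

      <para>Passing a bare path such as
      <filename>/spam/eggs.xml</filename> as the base raises an error in
      this sketch, which reflects the requirement that the base URI be
      absolute.</para>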

      <para>If you've understood all this and yet you want to just go ahead
      and not specify a base URI, you may have to turn off the likely
      warnings.  You can do so with code such as in the following example.</para>

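      <programlisting>import warnings

import Ft.Xml.Domlette

def disable_warnings(*args):
    # Replacement for warnings.showwarning that silently drops every warning.
    pass

# Ignore all warning categories and silence any that still get through.
warnings.filterwarnings("ignore", category=Warning)
warnings.showwarning = disable_warnings

XML = "&lt;spam/&gt;"
# No base URI is passed here, so the reader assigns one itself.
doc = Ft.Xml.Domlette.NonvalidatingReader.parseString(XML)
Ft.Xml.Domlette.Print(doc)
</programlisting>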

      <para>In such a case you can also use the convenience function
      <methodname>Ft.Xml.Parse</methodname> (see above).</para>
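
      <para>The listing above silences every warning for the rest of the
      process. Where that is too broad, the standard
      <systemitem class="library">warnings</systemitem> module can limit
      suppression to a single call. This is a sketch in Python 3, not part
      of 4Suite: <function>parse_without_base_uri</function> is a
      hypothetical stand-in for a reader call that warns about a missing
      URI, and <function>quiet_call</function> is an illustrative
      helper.</para>

      <programlisting><![CDATA[
```python
import warnings


def parse_without_base_uri():
    # Hypothetical stand-in for a reader call that warns about a missing URI.
    warnings.warn("no base URI supplied", UserWarning)
    return "parsed"


def quiet_call(func):
    """Run func with warnings suppressed only for the duration of the call."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return func()


result = quiet_call(parse_without_base_uri)
print(result)
```
]]></programlisting>

      <para>The warning filters in effect before the call are restored when
      the <literal>with</literal> block exits, so unrelated warnings
      elsewhere in the program remain visible.</para>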

    </section>

    <section>
      <title>Parsing XML that's already a Unicode string</title>

      <para>Because 4Suite is trying to provide as thin a wrapper as possible
      to the underlying parser, and due to complexities in the APIs of these
      parsers, there is no API in 4Suite for parsing Python's Unicode
      strings.</para>

      <para>If your XML is in the form of a Unicode string, you must encode
      the string as bytes so that the underlying parser can read it. Once you
      have an encoded string, you can pass it to the reader's
      <methodname>parseString</methodname>(), or wrap it in an
      <classname>InputSource</classname> using
      <methodname>Ft.Xml.CreateInputSource</methodname>, or the
      <methodname>fromString</methodname>() method of an
      <classname>InputSourceFactory</classname>. If the string is not UTF-16 or
      UTF-8 encoded, then you must tell the reader what encoding it actually
      uses. You can do this either by writing or replacing the XML declaration
      in the string itself, or (much easier) setting the optional encoding
      keyword argument in the reader's <methodname>parseString</methodname>()
      method or the <classname>InputSourceFactory</classname>'s
      <methodname>fromString</methodname>() method. For an example, see <ulink url="http://uche.ogbuji.net/tech/akara/nodes/2004-06-12/external-encoding">the
      Akara article on external encoding declarations</ulink>.</para>
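
      <para>The encoding step described above can be sketched with the
      standard library. The example below is written for Python 3, where
      <type>str</type> is the Unicode type, and it uses
      <systemitem class="library">xml.dom.minidom</systemitem> as a stand-in
      parser rather than a 4Suite reader; the sample document is invented
      for illustration. It shows two of the options: encoding as UTF-8,
      which needs no declaration, and encoding as ISO-8859-1 with a matching
      XML declaration written into the string.</para>

      <programlisting><![CDATA[
```python
from xml.dom import minidom

text = "<spam>\u00e9ggs</spam>"  # a Unicode string, not yet parseable bytes

# Option 1: encode as UTF-8, which XML parsers assume by default.
utf8_bytes = text.encode("utf-8")
doc_utf8 = minidom.parseString(utf8_bytes)

# Option 2: use another encoding, announced by an XML declaration
# that is written into the string before it is encoded.
declared = '<?xml version="1.0" encoding="iso-8859-1"?>' + text
latin1_bytes = declared.encode("iso-8859-1")
doc_latin1 = minidom.parseString(latin1_bytes)

# Both documents carry the same character data after parsing.
content_utf8 = doc_utf8.documentElement.firstChild.data
content_latin1 = doc_latin1.documentElement.firstChild.data
print(content_utf8 == content_latin1)
```
]]></programlisting>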
    </section>

    <section id="NonvalidatingReader">
      <title><varname>NonvalidatingReader</varname></title>

      <para>Use <varname>NonvalidatingReader</varname> for basic parsing.
      <varname>NonvalidatingReader</varname> performs its parsing without
      validating against a DTD.</para>

      <para>The following example will parse an XML source taken from the
      supplied URI, which is treated as a URL by the default resolver.</para>

      <programlisting>from Ft.Xml.Domlette import NonvalidatingReader
doc = NonvalidatingReader.parseUri(
  "http://www.w3.org/2000/08/w3c-synd/home.rss")</programlisting>

      <para>The following example also parses an XML source taken from the
      supplied URI, which is treated as a URL. In this case, the default
      resolver tries to read the XML source from the filesystem.</para>

      <programlisting>from Ft.Xml.Domlette import NonvalidatingReader
doc = NonvalidatingReader.parseUri("file:///tmp/spam.xml")</programlisting>

      <para>The following example parses XML from the filesystem. When given a
      relative file path in the local OS's format, we must first convert that
      path to a URI that our reader objects can use.</para>

      <programlisting>from Ft.Xml.Domlette import NonvalidatingReader
from Ft.Lib import Uri
file_uri = Uri.OsPathToUri('spam.xml')
doc = NonvalidatingReader.parseUri(file_uri)</programlisting>
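
      <para>For comparison, the same kind of conversion can be done with the
      standard library. The sketch below is Python 3 and uses
      <systemitem class="library">pathlib</systemitem>; it is an analogue of
      the step above, not a description of how
      <methodname>Uri.OsPathToUri</methodname> is implemented, and the
      helper name <function>os_path_to_file_uri</function> is
      hypothetical.</para>

      <programlisting><![CDATA[
```python
from pathlib import Path


def os_path_to_file_uri(path):
    """Turn an OS path, relative or absolute, into an absolute file: URI."""
    # as_uri() only works on absolute paths, so anchor relative paths at
    # the current working directory first.
    return Path(path).resolve().as_uri()


file_uri = os_path_to_file_uri("spam.xml")
print(file_uri)
```
]]></programlisting>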

      <para>The following example parses XML from a string. Note that it does
      not provide a document/base URI.</para>

      <programlisting>from Ft.Xml.Domlette import NonvalidatingReader
doc = NonvalidatingReader.parseString("&lt;spam&gt;eggs&lt;/spam&gt;")</programlisting>

      <para>In the following example, we are parsing XML from a string in a
      case where the document does need a base URI to be specified.</para>

      <programlisting>from Ft.Xml.Domlette import NonvalidatingReader
s = """&lt;!DOCTYPE spam [ &lt;!ENTITY eggs "eggs.xml"&gt; ]&gt;
&lt;spam&gt;&amp;eggs;&lt;/spam&gt;"""
doc = NonvalidatingReader.parseString(s, 'http://foo/test/spam.xml')
# during parsing, the replacement text for &amp;eggs;
# will be obtained from http://foo/test/eggs.xml</programlisting>

      <para>In all of the above examples, <varname>doc</varname> is now a
      Domlette document node. 4Suite currently offers one Domlette
      implementation, written in C, called cDomlette.</para>
    </section>

    <section>
      <title><classname>EntityReader</classname> Examples</title>

      <para>Sometimes you need to parse a fragment of XML rather than the full
      document. If operating in non-validating mode is sufficient, Domlette
      has a reader that can handle this case. When parsing such a fragment,
      <property>EntityReader</property> returns a Domlette document fragment
      rather than a document object.</para>

      <programlisting>from Ft.Xml.Domlette import EntityReader
s = """
&lt;spam1&gt;eggs&lt;/spam1&gt;
&lt;spam2&gt;more eggs&lt;/spam2&gt;
"""
docfrag = EntityReader.parseString(s, 'http://foo/test/spam.xml')</programlisting>

      <note>
        <para>The content parsed by <classname>EntityReader</classname> must
        be an XML External Parsed Entity. This means that it can't be just any
        XML document. The main limitation is that it must not have a
        document type declaration.</para>
      </note>
    </section>

    <section>
      <title><classname>ValidatingReader</classname></title>

      <para>If you want to validate a document with a DTD as you parse it, use
      the <classname>ValidatingReader</classname> object instead. If
      <classname>ValidatingReader</classname> discovers that the document that
      it is currently parsing is invalid, then it throws a
      <classname>Ft.Xml.ReaderException</classname> and does not finish
      parsing the document. The following example illustrates these
      concepts.</para>

      <programlisting># ValidatingReader is a global instance
from Ft.Xml.Domlette import ValidatingReader

XML = """&lt;!DOCTYPE a [
  &lt;!ELEMENT a (b, b)&gt;
  &lt;!ELEMENT b EMPTY&gt;
]&gt;
&lt;a&gt;&lt;b/&gt;&lt;b/&gt;&lt;/a&gt;"""

doc = ValidatingReader.parseString(XML, "urn:x-example:valid-a")
# And of course, as with other readers, you can use `parse`, `parseUri`, and
# `parseStream` as well.

# The following document, however, is invalid because its DTD requires an `a`
# element to have exactly two `b` children, and this one has three.
XML = """&lt;!DOCTYPE a [
  &lt;!ELEMENT a (b, b)&gt;
  &lt;!ELEMENT b EMPTY&gt;
]&gt;
&lt;a&gt;&lt;b/&gt;&lt;b/&gt;&lt;b/&gt;&lt;/a&gt;"""

# This throws a `Ft.Xml.ReaderException` when it encounters invalid structure,
# and does not finish parsing the document into `doc`.
doc = ValidatingReader.parseString(XML, "urn:x-example:invalid-a")</programlisting>
    </section>

    <section>
      <title><classname>NoExtDtdReader</classname></title>

      <para>When using <classname>NonvalidatingReader</classname> to parse a
      document, that document's DTD is still opened and read to obtain
      information such as entity declarations and default attribute values.
      You cannot suppress reading of the internal DTD subset, but you can
      prevent the external subset from being accessed by using
      <classname>NoExtDtdReader</classname>. This won't affect the processing
      of external parameter entities defined in the internal DTD subset. Use
      this object as you would use
      <classname>NonvalidatingReader</classname>.</para>
    </section>

    <section>
      <title>Creating your own reader instance</title>

      <para>In some cases you might not want to use the global reader
      instances. For instance in multithreaded use, you might want a reader
      per thread. Or you might want to change some of the parameters on the
      readers. If so, you can create your own reader instance:</para>

      <programlisting>from Ft.Xml.Domlette import NonvalidatingReaderBase
reader = NonvalidatingReaderBase()
doc = reader.parseUri("http://xmlhack.com/read.php?item=1560")</programlisting>

      <para>Instead of <classname>NonvalidatingReaderBase</classname>, you
      could instead use <classname>NoExtDtdReaderBase</classname> or
      <classname>ValidatingReaderBase</classname>, depending on your needs.
      Each of these three reader classes takes an optional
      <parameter>inputSourceFactory</parameter> constructor argument, which
      you can use to supply a custom URI resolver.</para>
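
      <para>The one-reader-per-thread arrangement mentioned above can be
      sketched with <classname>threading.local</classname>. The example is
      Python 3 and creates a standard-library SAX parser as a stand-in for a
      4Suite reader instance; in 4Suite code the construction step would be
      a call such as <classname>NonvalidatingReaderBase</classname>(), as in
      the listing above. The helpers <function>get_reader</function> and
      <function>worker</function> are hypothetical.</para>

      <programlisting><![CDATA[
```python
import threading
import xml.sax

_thread_state = threading.local()


def get_reader():
    """Return this thread's reader, creating it on first use."""
    reader = getattr(_thread_state, "reader", None)
    if reader is None:
        # Stand-in for constructing a reader; each thread gets its own.
        reader = xml.sax.make_parser()
        _thread_state.reader = reader
    return reader


def worker(results, index):
    # Each worker stores the reader it was given so they can be compared.
    results[index] = get_reader()


results = [None, None]
threads = [threading.Thread(target=worker, args=(results, i)) for i in range(2)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(results[0] is not results[1])
```
]]></programlisting>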
    </section>

    <section id="InputSources">
      <title>InputSource objects</title>

      <para>All of the previous examples involve parsing URIs or strings of
      data. You can also handle <classname>InputSource</classname> objects. An
      <classname>InputSource</classname> is an object that encapsulates a
      source of encoded text for parsing, and a URI resolver. The advantage to
      using an <classname>InputSource</classname> is that it provides a
      standard API to the text stream, and—perhaps more importantly—allows you
      to associate a custom URI resolver with the stream.</para>

      <para>Normally, you can just get an <classname>InputSource</classname>
      by calling the convenience function
      <methodname>Ft.Xml.CreateInputSource</methodname>. Its first argument
      is the source: a string (not a Unicode object), a file-like object
      (stream), a file path, or a URI. As the string case in the following
      example shows, a base URI can be supplied as a second argument. You can
      then pass the <classname>InputSource</classname> object to the reader's
      <methodname>parse</methodname>() method.</para>

      <programlisting>from Ft.Xml import InputSource, CreateInputSource
from Ft.Xml.Domlette import NonvalidatingReader

#
# Use CreateInputSource to parse a URL:
#
isrc = CreateInputSource("http://xmlhack.com/read.php?item=1560")
doc1 = NonvalidatingReader.parse(isrc)
#
# Or a string:
#
isrc = CreateInputSource("&lt;spam&gt;eggs&lt;/spam&gt;", "http://spam.com/base")
doc2 = NonvalidatingReader.parse(isrc)
#
# InputSource is a file-like object, so you can treat it as such:
#
isrc = CreateInputSource("http://xmlhack.com/read.php?item=1560")
raw_text = isrc.read()
#
# The uri/system ID you used for it is maintained
#
print isrc.uri
#
# You can also create other InputSources from URIs relative to this one
#
isrc2 = isrc.resolve("read.php?item=1703")</programlisting>

      <para>
      If you need lower-level control you can use an
      <classname>InputSourceFactory</classname> instance, calling the appropriate method:
      <methodname>fromUri</methodname>(<parameter>uri</parameter>),
      <methodname>fromString</methodname>(<parameter>st</parameter>), or
      <methodname>fromStream</methodname>(<parameter>stream</parameter>), much
      like the reader API described earlier.  The following listing is
      functionally equivalent to the above one.</para>

      <programlisting>from Ft.Xml import InputSource
from Ft.Xml.Domlette import NonvalidatingReader

factory = InputSource.DefaultFactory
isrc = factory.fromUri("http://xmlhack.com/read.php?item=1560")
doc1 = NonvalidatingReader.parse(isrc)
#
# The factory is reusable. Here we also parse a string:
#
isrc = factory.fromString("&lt;spam&gt;eggs&lt;/spam&gt;", "http://spam.com/base")
doc2 = NonvalidatingReader.parse(isrc)
#
# InputSource is a file-like object, so you can treat it as such:
#
isrc = factory.fromUri("http://xmlhack.com/read.php?item=1560")
raw_text = isrc.read()
#
# The uri/system ID you used for it is maintained
#
print isrc.uri
#
# You can also create other InputSources from URIs relative to this one
#
isrc2 = isrc.resolve("read.php?item=1703")</programlisting>
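
      <para>Python's standard library has its own, simpler
      <classname>xml.sax.xmlreader.InputSource</classname> class built on
      the same idea: a byte stream bundled with the URI (system ID) that
      identifies it. The Python 3 sketch below uses that class, which is
      distinct from <classname>Ft.Xml.InputSource.InputSource</classname>
      and has no <methodname>resolve</methodname>() method, so resolution of
      a relative reference is shown with
      <function>urllib.parse.urljoin</function> instead. The handler class
      <classname>NameCollector</classname> and the reference
      <filename>other.xml</filename> are hypothetical.</para>

      <programlisting><![CDATA[
```python
import io
import xml.sax
from urllib.parse import urljoin
from xml.sax.xmlreader import InputSource


class NameCollector(xml.sax.ContentHandler):
    """Record the name of every element the parser reports."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


# Bundle the bytes with the URI that identifies them.
source = InputSource("http://spam.com/base")
source.setByteStream(io.BytesIO(b"<spam>eggs</spam>"))

handler = NameCollector()
xml.sax.parse(source, handler)  # reads the supplied stream; no network access

# The system ID stays attached and can serve as a base URI.
sibling_uri = urljoin(source.getSystemId(), "other.xml")
print(handler.names, sibling_uri)
```
]]></programlisting>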
    </section>

    <section id="converting_DOM">
      <title>Converting from other DOM libraries</title>

      <para>You can convert another Python DOM object (e.g. 4DOM or minidom)
      to a Domlette object using the function
      <methodname>ConvertDocument</methodname>:</para>

      <programlisting>from Ft.Xml.Domlette import ConvertDocument
converted_document = ConvertDocument(oldDocument, documentURI=u'http://www.example.org/')</programlisting>

      <para>The <parameter>documentURI</parameter> parameter provides a base
      URI for the converted nodes. If it is not specified, the
      <property>documentURI</property> and then <property>baseURI</property>
      attributes of the source DOM are checked, as defined in <ulink url="http://www.w3.org/TR/DOM-Level-3-Core">DOM Level 3</ulink>. If no
      URI is found in this way, a warning is issued and a UUID URI is
      generated for the new Domlette object.</para>
    </section>
  </section>

  <section id="domlette_API">
    <title>Domlette API summary</title>

    <subtitle>Interacting with Domlette documents</subtitle>

    <para>You will use a large part of the Domlette API to interact with the
    model of your XML documents. The implementation of this part of the API is
    found in the <systemitem class="library">Ft.Xml.cDomlette</systemitem>
    module. This part of the API allows you to navigate around a document and
    modify the content of that document. It is very similar to <ulink url="http://www.w3.org/TR/DOM-Level-2-Core">the DOM Level 2
    specification</ulink> and follows some of <ulink url="http://www.w3.org/TR/DOM-Level-3-Core">the DOM Level 3
    specification</ulink>; feel free to refer to those specifications and the
    4Suite API documentation for details about the intended behavior of this
    API. You can find brief descriptions of the methods and attributes
    provided by this API listed below. This API is also nearly the same as the
    API for <systemitem class="library">xml.dom</systemitem>, which is bundled
    with Python. The node type constants are inherited directly from
    <literal>xml.dom.Node</literal>.</para>

    <para>Many objects that you will work with in the Domlette API are
    descendants of the Domlette <classname>Node</classname> class.
    <classname>Document</classname>s, document fragments (of class
    <classname>DocumentFragment</classname>), <classname>Element</classname>s,
    attributes (class <classname>Attr</classname>), text (class
    <classname>Text</classname>), processing instructions (class
    <classname>ProcessingInstruction</classname>), and comments (class
    <classname>Comment</classname>) are all nodes; any node operations are
    defined on objects of these types, as well. Some operations do not make
    sense on some objects, however. For example, it does not make sense to add
    children to an attribute node.</para>

    <para>In the DOM model of XML documents, there is a
    <classname>Document</classname> node which represents the starting point
    for the other pieces of the document. This node is <emphasis role="bold">not</emphasis> the root element of the document; rather, the
    <classname>Document</classname> node <emphasis role="bold">contains</emphasis> the root element as its only element
    child. The <classname>Document</classname> node may have other children,
    though, such as processing instructions and comments.</para>
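
    <para>As a minimal sketch of this relationship, the following listing uses
    <systemitem class="library">xml.dom.minidom</systemitem> from the Python
    standard library as a stand-in for Domlette, written in Python 3 syntax.
    That substitution is an assumption made so the example runs without
    4Suite installed; the properties used are standard DOM ones, but
    Domlette's behavior may differ in detail.</para>

    <programlisting><![CDATA[
```python
from xml.dom import Node, minidom

# Stand-in for Domlette's Parse(): minidom is an assumption, not 4Suite.
doc = minidom.parseString("<!--before--><foo a='1'/><!--after-->")

# The Document node is the starting point, not the root element.
root = doc.documentElement

# The Document contains the root element as its only element child,
# alongside the two comments.
element_children = [child for child in doc.childNodes
                    if child.nodeType == Node.ELEMENT_NODE]

print(doc.nodeType == Node.DOCUMENT_NODE)
print(len(doc.childNodes), len(element_children))
print(root.parentNode is doc)
```
]]></programlisting>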

    <para>You can easily access properties of a node directly. The following
    properties are available on any node. These properties generally store
    information about the structure of the document in the near "vicinity" of
    the target node.</para>

    <variablelist>
      <title>Properties available on every <classname>Node</classname>
      object</title>

      <varlistentry>
        <term><property>attributes</property></term>

        <listitem>
          <para>This is a Python dictionary containing the attributes defined
          on the target node. Each key is a tuple containing the namespace URI
          and local name of an attribute. The value associated with that key
          is the attribute (of class <classname>Attr</classname>) itself.</para>

          <programlisting>from Ft.Xml import Parse
# Parse returns the Document node; its first child is the foo element
node = Parse("&lt;foo a='1'/&gt;")
print node.childNodes[0].attributes</programlisting>

          <screen>{(None, u'a'): &lt;Attr at 0x40870ecc: name u'a', value u'1'&gt;}</screen>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>baseURI</property></term>

        <listitem>
          <para>This is the base URI in scope for the target node as a Python
          unicode string. It is read-only and is computed dynamically according
          to DOM L3 Core.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>childNodes</property></term>

        <listitem>
          <para>This is a Python list of all the child nodes of the target
          node. Note that in DOM terminology, the attributes of a node are
          <emphasis role="bold">not</emphasis> children of that node.</para>

          <programlisting>node = Parse("&lt;foo a='1'/&gt;")
print node.childNodes</programlisting>

          <screen>[&lt;Element at 0x4086052c: name u'foo', 1 attributes, 0 children&gt;]</screen>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>firstChild</property></term>

        <listitem>
          <para>This is the first child node of the target node. For a node
          that has children, this is equivalent to <code>childNodes[0]</code>,
          and it is a useful property for quickly walking the document tree.</para>

          <programlisting>node = Parse("&lt;foo a='1'/&gt;")
print node.firstChild</programlisting>

          <screen>&lt;Element at 0x40860a6c: name u'foo', 1 attributes, 0 children&gt;</screen>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>lastChild</property></term>

        <listitem>
          <para>This is the last child node of the target node. For a node
          that has children, this is equivalent to
          <code>childNodes[-1]</code>.</para>

          <programlisting>node = Parse("&lt;foo a='1'/&gt;&lt;!--Hi!--&gt;")
print node.lastChild</programlisting>

          <screen>&lt;Comment at 0x4087caf4: u'Hi!'&gt;</screen>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>localName</property></term>

        <listitem>
          <para>This is the local name of the target node as a Python unicode
          string.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>namespaceURI</property></term>

        <listitem>
          <para>This is the namespace URI of the target node as a Python
          unicode string.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>nextSibling</property></term>

        <listitem>
          <para>This is the node immediately following the target node, or
          <literal>None</literal> if the target node is the last child of its parent
          (or if the target node is an attribute, as attributes are
          unordered).</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>nodeValue</property></term>

        <listitem>
          <para>This is the value of the target node as a Python unicode
          string, if the target node has a string value. If not, this is
          <literal>None</literal>. To illustrate some of the possibilities,
          attributes and text nodes have values, while elements and documents
          do not.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>ownerDocument</property></term>

        <listitem>
          <para>This is the <classname>Document</classname> node in which the
          target node is contained.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>parentNode</property></term>

        <listitem>
          <para>This is the parent of the target node. If the target node is a
          <classname>Document</classname> node, then this will be
          <literal>None</literal>; <classname>Document</classname> nodes do not have
          parents.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>prefix</property></term>

        <listitem>
          <para>This is the namespace prefix of the current node, or
          <literal>None</literal> if the current node does not (or cannot) have a
          namespace prefix.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>previousSibling</property></term>

        <listitem>
          <para>This is the node immediately preceding the target node, or
          <literal>None</literal> if the target node is the first child of its
          parent (or if the target node is an attribute, as attributes are
          unordered).</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>rootNode</property></term>

        <listitem>
          <para>This is a synonym for
          <property>ownerDocument</property>.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>xmlBase</property></term>

        <listitem>
          <para>This is a synonym for <property>baseURI</property>.</para>
        </listitem>
      </varlistentry>
    </variablelist>
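
    <para>The navigation properties can be sketched as follows. This is an
    illustration under an assumption: it uses
    <systemitem class="library">xml.dom.minidom</systemitem> (Python 3
    syntax) rather than Domlette, so only the properties that both share are
    shown, and Domlette-specific ones such as <property>rootNode</property>
    and <property>xmlBase</property> are left out.</para>

    <programlisting><![CDATA[
```python
from xml.dom import minidom

# minidom stands in for Domlette here (an assumption, not the 4Suite API).
doc = minidom.parseString(
    "<list><item id='a'>one</item><item id='b'>two</item></list>")

root = doc.documentElement
first = root.firstChild        # the first item element
second = first.nextSibling     # the second item element
text = first.firstChild        # the Text node inside the first item

# Siblings are linked in both directions; the ends of the chain are None.
print(second.previousSibling is first, second.nextSibling)

# Text nodes have a string value; elements do not.
print(text.nodeValue, first.nodeValue)

# Every node knows its parent and its owning document.
print(text.parentNode is first, text.ownerDocument is doc)

# Document nodes have no parent.
print(doc.parentNode)
```
]]></programlisting>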

    <para>In addition to accessing the structure relative to a node, there are
    also a set of operations that we can perform on these structures,
    including a variety of operations for modifying the document. Some of
    these methods allow you to add new nodes in various places; note that in
    the DOM, only <classname>Document</classname> nodes can create new nodes. See <xref linkend="Document_methods"/> for details. The following methods are
    available on any node.</para>

    <variablelist>
      <title>Methods available to every <classname>Node</classname>
      object</title>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>appendChild</methodname>

            <methodparam>
              <parameter>node</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method adds <parameter>node</parameter> as the last child
          of the current instance. This is useful for manually building a
          document, since children appended one after another end up in
          document order.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>insertBefore</methodname>

            <methodparam>
              <parameter>newChild</parameter>
            </methodparam>

            <methodparam>
              <parameter>refChild</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method adds the node <parameter>newChild</parameter> to
          the current instance immediately before child node
          <parameter>refChild</parameter>.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>replaceChild</methodname>

            <methodparam>
              <parameter>newChild</parameter>
            </methodparam>

            <methodparam>
              <parameter>oldChild</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method replaces the child node
          <parameter>oldChild</parameter> with the
          <parameter>newChild</parameter> node.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>removeChild</methodname>

            <methodparam>
              <parameter>oldChild</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method removes the <parameter>oldChild</parameter> node
          as a child of the instance node.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>cloneNode</methodname>

            <methodparam>
              <parameter>deep</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method returns a new copy of the current instance. If
          (and only if) <parameter>deep</parameter> is true, then we copy
          deeply: the node's attributes and children are also copied
          deeply.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>isSameNode</methodname>

            <methodparam>
              <parameter>otherNode</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method determines whether the instance node and
          <parameter>otherNode</parameter> are the same node based upon object
          identity.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>normalize</methodname>

            <void/>
          </methodsynopsis></term>

        <listitem>
          <para>This method merges any adjacent text nodes in the attributes
          or descendants of the current instance.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>hasChildNodes</methodname>

            <void/>
          </methodsynopsis></term>

        <listitem>
          <para>This method returns true if and only if the instance node has
          any child nodes.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>xpath</methodname>

            <methodparam>
              <parameter>expr</parameter>
            </methodparam>

            <methodparam choice="opt">
              <parameter>explicitNss</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method evaluates the XPath expression
          <parameter>expr</parameter> with the current instance as the
          expression context and returns an appropriately-valued result. The
          <parameter>explicitNss</parameter> parameter is optional; it is a
          Python dictionary mapping namespace prefixes to namespaces for use
          in the expression. See <xref linkend="xpath_engine"/> for
          details.</para>
        </listitem>
      </varlistentry>
    </variablelist>
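
    <para>The following sketch exercises the tree-editing methods in
    sequence. It assumes
    <systemitem class="library">xml.dom.minidom</systemitem> (Python 3
    syntax) as a stand-in for Domlette, so it does not cover the
    <methodname>xpath</methodname> method, which minidom lacks. The element
    names are made up for illustration.</para>

    <programlisting><![CDATA[
```python
from xml.dom import minidom

# minidom stands in for Domlette here (an assumption, not the 4Suite API).
doc = minidom.parseString("<list><b/></list>")
root = doc.documentElement
b = root.firstChild

a = doc.createElementNS(None, "a")
c = doc.createElementNS(None, "c")
d = doc.createElementNS(None, "d")

root.insertBefore(a, b)     # children: a, b
root.appendChild(c)         # children: a, b, c
root.replaceChild(d, b)     # children: a, d, c
root.removeChild(a)         # children: d, c
names = [child.nodeName for child in root.childNodes]

# Two appended text nodes stay separate until normalize() merges them.
c.appendChild(doc.createTextNode("spam"))
c.appendChild(doc.createTextNode("eggs"))
count_before = len(c.childNodes)
root.normalize()
count_after = len(c.childNodes)

# A shallow clone has no children; a deep clone copies the subtree.
shallow = root.cloneNode(False)
deep = root.cloneNode(True)

print(names, count_before, count_after)
print(shallow.hasChildNodes(), deep.hasChildNodes(), deep.isSameNode(root))
```
]]></programlisting>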

    <para>In addition to their behavior as nodes,
    <classname>Document</classname> nodes are uniquely responsible for a
    number of tasks. For example, only <classname>Document</classname> nodes
    can create other nodes. The following methods are available only to
    <classname>Document</classname> nodes.</para>

    <variablelist id="Document_methods">
      <title>Methods available to <classname>Document</classname>
      objects</title>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>createElementNS</methodname>

            <methodparam>
              <parameter>namespaceURI</parameter>
            </methodparam>

            <methodparam>
              <parameter>qualifiedName</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method creates and returns a new
          <classname>Element</classname> with the given namespace URI and
          qualified name.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>createAttributeNS</methodname>

            <methodparam>
              <parameter>namespaceURI</parameter>
            </methodparam>

            <methodparam>
              <parameter>qualifiedName</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method creates and returns a new attribute
          (<classname>Attr</classname> object) with the given namespace URI
          and qualified name.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>createTextNode</methodname>

            <methodparam>
              <parameter>data</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method creates and returns a new
          <classname>Text</classname> node with the string value of
          <parameter>data</parameter>.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>createProcessingInstruction</methodname>

            <methodparam>
              <parameter>target</parameter>
            </methodparam>

            <methodparam>
              <parameter>data</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method creates and returns a new processing instruction
          (<classname>ProcessingInstruction</classname> object) with the given
          <parameter>target</parameter> name and contents taken from
          <parameter>data</parameter>.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>createComment</methodname>

            <methodparam>
              <parameter>data</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method creates and returns a new
          <classname>Comment</classname> with the string value of
          <parameter>data</parameter>.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>createDocumentFragment</methodname>

            <void/>
          </methodsynopsis></term>

        <listitem>
          <para>This method creates and returns a new, empty document fragment
          (<classname>DocumentFragment</classname> object).</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>importNode</methodname>

            <methodparam>
              <parameter>importedNode</parameter>
            </methodparam>

            <methodparam>
              <parameter>deep</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>Nodes can only belong to one document at a time. This method
          creates a copy of the node <parameter>importedNode</parameter> that
          belongs to the instance (but which does not yet have a parent). If
          (and only if) <parameter>deep</parameter> is true, then we copy
          deeply: the node's attributes and children are also copied deeply
          and imported.</para>
        </listitem>
      </varlistentry>
    </variablelist>
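
    <para>As a rough sketch of how these factory methods fit together, the
    listing below builds a small document from scratch and then imports a
    node from a second document. It assumes
    <systemitem class="library">xml.dom.minidom</systemitem> (Python 3
    syntax) in place of Domlette; the element names and the processing
    instruction target are hypothetical.</para>

    <programlisting><![CDATA[
```python
from xml.dom import minidom

# minidom stands in for Domlette here (an assumption, not the 4Suite API).
impl = minidom.getDOMImplementation()
doc = impl.createDocument(None, "menu", None)
root = doc.documentElement

# Nodes are created by the Document, then attached where they belong.
item = doc.createElementNS(None, "item")
item.appendChild(doc.createTextNode("spam"))
root.appendChild(item)
root.appendChild(doc.createComment("more to come"))
pi = doc.createProcessingInstruction("app", "mode='demo'")
doc.insertBefore(pi, root)

# A node from another document must be imported before it can be used.
other = minidom.parseString("<item>eggs</item>")
imported = doc.importNode(other.documentElement, True)
was_orphan = imported.parentNode is None    # the copy has no parent yet
root.insertBefore(imported, item.nextSibling)

texts = [node.firstChild.data for node in root.childNodes
         if node.nodeType == node.ELEMENT_NODE]
print(texts, was_orphan)
```
]]></programlisting>

    <para>The original node in the second document is left untouched;
    only the copy becomes part of the first document.</para>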

    <para>Document nodes also have a number of properties that are not found
    on other nodes. These properties are summarized in the following
    list.</para>

    <variablelist id="Document_properties">
      <title>Properties available on <classname>Document</classname>
      objects</title>

      <varlistentry>
        <term><property>doctype</property></term>

        <listitem>
          <para>This is a <classname>DocumentType</classname> object that
          encapsulates information about the document's "type", as described
          in its DOCTYPE declaration. In Domlette, which doesn't use such
          objects, the value of the <property>doctype</property> property
          will always be <literal>None</literal>.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>documentElement</property></term>

        <listitem>
          <para>This is the root element of the document.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>documentURI</property></term>

        <listitem>
          <para>This is the URI that identifies the document.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>implementation</property></term>

        <listitem>
          <para>This is the <classname>DOMImplementation</classname> that
          created the document.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>publicId</property></term>

        <listitem>
          <para>This Domlette-specific property is the public ID of the DTD of
          this document.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>rootNode</property></term>

        <listitem>
          <para>This refers to the current instance.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>systemId</property></term>

        <listitem>
          <para>This Domlette-specific property is the system ID of the DTD of
          this document.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>unparsedEntities</property></term>

        <listitem>
          <para>This is the list of unparsed entities in the current
          document.</para>
        </listitem>
      </varlistentry>
    </variablelist>

    <para>Attributes (<classname>Attr</classname> objects) do not have any
    special methods, but they do have a few additional properties. These
    properties are summarized in the following list.</para>

    <variablelist>
      <title>Properties available on <classname>Attr</classname>
      objects</title>

      <varlistentry>
        <term><property>name</property></term>

        <listitem>
          <para>This is the qualified name of the current instance.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>nodeName</property></term>

        <listitem>
          <para>This is a synonym for the <property>name</property>
          property.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>ownerElement</property></term>

        <listitem>
          <para>This is a synonym for the <property>parentNode</property>
          property.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>specified</property></term>

        <listitem>
          <para>You will probably never need this property. DOM says it
          should be <literal>0</literal> if the attribute is present through
          defaulting, rather than explicitly specified in the document.
          Detecting that is only possible if the DOM implementation preserves
          certain details from DTD processing, which 4Suite never does.
          Therefore the value is always <literal>1</literal>.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>value</property></term>

        <listitem>
          <para>This is a synonym for the <property>nodeValue</property>
          property.</para>
        </listitem>
      </varlistentry>
    </variablelist>
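
    <para>A short sketch of these synonyms follows. It assumes
    <systemitem class="library">xml.dom.minidom</systemitem> (Python 3
    syntax) as a stand-in for Domlette, and the namespace URI is a
    placeholder. The <property>specified</property> property is left out
    because its value is implementation-dependent.</para>

    <programlisting><![CDATA[
```python
from xml.dom import minidom

# minidom stands in for Domlette here (an assumption, not the 4Suite API).
NS = "http://example.org/x"    # placeholder namespace URI
doc = minidom.parseString("<foo xmlns:x='%s' x:a='1'/>" % NS)
elem = doc.documentElement
attr = elem.getAttributeNodeNS(NS, "a")

# name and nodeName both hold the qualified name.
print(attr.name, attr.nodeName)

# value and nodeValue both hold the attribute's string value.
print(attr.value, attr.nodeValue)

# ownerElement points back at the element carrying the attribute.
print(attr.ownerElement is elem)
```
]]></programlisting>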

    <para>Since attributes can only be attached to elements,
    <classname>Element</classname> objects have a set of special methods for
    managing which attributes are attached to them. We describe these methods
    below.</para>

    <variablelist>
      <title>Methods available to <classname>Element</classname>
      objects</title>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>hasAttributeNS</methodname>

            <methodparam>
              <parameter>namespaceURI</parameter>
            </methodparam>

            <methodparam>
              <parameter>localName</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method returns true if the current instance has an
          attribute with the given namespace URI and local name, and false
          otherwise.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>getAttributeNS</methodname>

            <methodparam>
              <parameter>namespaceURI</parameter>
            </methodparam>

            <methodparam>
              <parameter>localName</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method returns the attribute <emphasis role="bold">value</emphasis> of the attribute with the given
          namespace URI and local name, if one exists. If not, this returns
          <literal>None</literal>.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>getAttributeNodeNS</methodname>

            <methodparam>
              <parameter>namespaceURI</parameter>
            </methodparam>

            <methodparam>
              <parameter>localName</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method returns the <classname>Attr</classname> object of
          the attribute with the given namespace URI and local name, if one
          exists. If not, this returns <literal>None</literal>.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>removeAttributeNS</methodname>

            <methodparam>
              <parameter>namespaceURI</parameter>
            </methodparam>

            <methodparam>
              <parameter>localName</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method removes the attribute with the given namespace URI
          and local name from the current instance element.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>removeAttributeNode</methodname>

            <methodparam>
              <parameter>node</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method removes the attribute <parameter>node</parameter>
          from the current instance element.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>setAttributeNS</methodname>

            <methodparam>
              <parameter>namespaceURI</parameter>
            </methodparam>

            <methodparam>
              <parameter>qualifiedName</parameter>
            </methodparam>

            <methodparam>
              <parameter>value</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method adds an attribute or replaces an attribute with
          the specified namespace URI and qualified name and sets the content
          of that attribute to <parameter>value</parameter>.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>setAttributeNodeNS</methodname>

            <methodparam>
              <parameter>node</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method adds or replaces an attribute using the
          <classname>Attr</classname> object
          <parameter>node</parameter>.</para>
        </listitem>
      </varlistentry>
    </variablelist>
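
    <para>The attribute methods can be sketched as below. The listing assumes
    <systemitem class="library">xml.dom.minidom</systemitem> (Python 3
    syntax) in place of Domlette, and the namespace URI is a placeholder.
    One known difference: for a missing attribute, minidom's
    <methodname>getAttributeNS</methodname> returns an empty string rather
    than <literal>None</literal>, so the sketch checks for absence with
    <methodname>getAttributeNodeNS</methodname> instead.</para>

    <programlisting><![CDATA[
```python
from xml.dom import minidom

# minidom stands in for Domlette here (an assumption, not the 4Suite API).
NS = "http://example.org/ns"    # placeholder namespace URI
doc = minidom.parseString("<foo a='1'/>")
elem = doc.documentElement

# Setting takes a qualified name; lookups take the local name.
elem.setAttributeNS(NS, "x:b", "2")      # adds a new namespaced attribute
elem.setAttributeNS(None, "a", "one")    # replaces the value of a

has_b = elem.hasAttributeNS(NS, "b")
b_value = elem.getAttributeNS(NS, "b")
b_node = elem.getAttributeNodeNS(NS, "b")
b_name = b_node.name
a_value = elem.getAttributeNS(None, "a")

# After removal, the node lookup reports that the attribute is gone.
elem.removeAttributeNS(NS, "b")
b_after = elem.getAttributeNodeNS(NS, "b")

print(has_b, b_value, b_name, a_value, b_after)
```
]]></programlisting>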

    <para><classname>Element</classname>s also have several properties above
    and beyond what they get from being <classname>Node</classname>s. See the
    list below for details.</para>

    <variablelist>
      <title>Properties available on <classname>Element</classname>
      objects</title>

      <varlistentry>
        <term><property>nodeName</property></term>

        <listitem>
          <para>This is the qualified name of the current instance.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>tagName</property></term>

        <listitem>
          <para>This is a synonym for <property>nodeName</property>.</para>
        </listitem>
      </varlistentry>
    </variablelist>

    <para>In the DOM, both <classname>Text</classname> and
    <classname>Comment</classname> nodes are specializations of the more
    general <classname>CharacterData</classname> node type.
    <classname>CharacterData</classname> nodes have several
    additional properties and methods for managing the string data that they
    contain. The individual <classname>Text</classname> and
    <classname>Comment</classname> nodes, however, do not add any
    functionality to their general <classname>CharacterData</classname> parent
    class. You can find descriptions of the properties and methods offered by
    <classname>CharacterData</classname> objects below.</para>

    <variablelist>
      <title>Properties available on <classname>CharacterData</classname>
      objects</title>

      <varlistentry>
        <term><property>data</property></term>

        <listitem>
          <para>This is the string content of the current instance.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>length</property></term>

        <listitem>
          <para>This is the length of the string content of the current
          instance.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><property>nodeValue</property></term>

        <listitem>
          <para>This is a synonym for <property>data</property>.</para>
        </listitem>
      </varlistentry>
    </variablelist>

    <variablelist>
      <title>Methods available to <classname>CharacterData</classname>
      objects</title>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>insertData</methodname>

            <methodparam>
              <parameter>offset</parameter>
            </methodparam>

            <methodparam>
              <parameter>data</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method inserts the string <parameter>data</parameter>
          into the content of the current instance at the index specified by
          <parameter>offset</parameter>.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>appendData</methodname>

            <methodparam>
              <parameter>data</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method appends the string <parameter>data</parameter> to
          the end of the value of the current instance.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>replaceData</methodname>

            <methodparam>
              <parameter>offset</parameter>
            </methodparam>

            <methodparam>
              <parameter>count</parameter>
            </methodparam>

            <methodparam>
              <parameter>data</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method replaces <parameter>count</parameter> number of
          characters found at index <parameter>offset</parameter> in the
          current instance with the string <parameter>data</parameter>.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>substringData</methodname>

            <methodparam>
              <parameter>offset</parameter>
            </methodparam>

            <methodparam>
              <parameter>count</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method retrieves and returns the part of the string value
          of the current instance that begins at index
          <parameter>offset</parameter> and extends
          <parameter>count</parameter> characters.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>deleteData</methodname>

            <methodparam>
              <parameter>offset</parameter>
            </methodparam>

            <methodparam>
              <parameter>count</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method deletes the part of the string value of the
          current instance that begins at index <parameter>offset</parameter>
          and extends <parameter>count</parameter> characters.</para>
        </listitem>
      </varlistentry>
    </variablelist>
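
    <para>As a rough illustration of the character-data methods listed above,
    the following sketch exercises each of them on a single text node. It is
    not Domlette code: it assumes Python 3 and uses the standard library's
    <systemitem class="library">xml.dom.minidom</systemitem> as a stand-in, on
    the assumption that both libraries follow the W3C
    <classname>CharacterData</classname> semantics for these calls. The sample
    strings are made up for the example.</para>

    <programlisting>
```python
from xml.dom.minidom import getDOMImplementation

# Stand-in for a Domlette text node: a minidom Text node (assumption:
# the W3C CharacterData methods behave the same way in both libraries).
doc = getDOMImplementation().createDocument(None, 'menu', None)
text = doc.createTextNode('eggs')
doc.documentElement.appendChild(text)

text.appendData(' and ham')        # add to the end of the value
after_append = text.data

text.insertData(0, 'green ')       # insert new data at index 0
after_insert = text.data

part = text.substringData(6, 4)    # 4 characters starting at index 6

text.replaceData(6, 4, 'spam')     # swap those 4 characters for new data
after_replace = text.data

text.deleteData(0, 6)              # drop the first 6 characters
after_delete = text.data

for value in (after_append, after_insert, part, after_replace, after_delete):
    print(value)
```
</programlisting>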

    <para>A few DOM actions are not "owned" by any individual document. In
    effect, they are general-purpose operations. They can be found in
    <classname>DOMImplementation</classname> objects. One such precreated
    instance can be conveniently found at and used from
    <property>Ft.Xml.Domlette.implementation</property>. The general methods
    that such a <classname>DOMImplementation</classname> object offers are
    listed below.</para>

    <variablelist>
      <title><classname>DOMImplementation</classname> methods:</title>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>createDocument</methodname>

            <methodparam>
              <parameter>namespaceURI</parameter>
            </methodparam>

            <methodparam>
              <parameter>qualifiedName</parameter>
            </methodparam>

            <methodparam choice="opt">
              <parameter>doctype</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This standard DOM method creates and returns a
          <classname>Document</classname> object associated with the given
          <classname>DocumentType</classname> object, and having a single
          element child with the given QName and namespace. Since Domlette
          does not use <classname>DocumentType</classname> objects, the
          <parameter>doctype</parameter> argument must be given as <literal>None</literal>.
          </para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>createRootNode</methodname>

            <methodparam>
              <parameter>documentURI</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This Domlette-specific method creates a
          <classname>Document</classname> object with the specified document
          (base) URI. No document element is created. This method is generally
          preferred over <methodname>createDocument</methodname>(); see the
          following section, 'Building a DOM from scratch'.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><methodsynopsis>
            <methodname>hasFeature</methodname>

            <methodparam>
              <parameter>feature</parameter>
            </methodparam>

            <methodparam>
              <parameter>version</parameter>
            </methodparam>
          </methodsynopsis></term>

        <listitem>
          <para>This method tests whether the DOM implementation implements a
          specific feature.</para>
        </listitem>
      </varlistentry>
    </variablelist>
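
    <para>The two standard methods above can be tried out with any DOM
    implementation object. The sketch below assumes Python 3 and uses the
    implementation object from the standard library's
    <systemitem class="library">xml.dom.minidom</systemitem> as a stand-in for
    <property>Ft.Xml.Domlette.implementation</property>; the set of features
    Domlette reports may differ, and the namespace URI and the feature name
    <literal>NoSuchFeature</literal> are hypothetical.
    <methodname>createRootNode</methodname> is Domlette-specific and has no
    counterpart here.</para>

    <programlisting>
```python
from xml.dom.minidom import getDOMImplementation

# minidom's DOMImplementation stands in for Ft.Xml.Domlette.implementation
# (assumption: only the standard DOM methods are comparable).
impl = getDOMImplementation()

has_core = impl.hasFeature('Core', '2.0')             # reported by minidom
has_bogus = impl.hasFeature('NoSuchFeature', '1.0')   # hypothetical name

# The doctype argument is passed as None, which is also what Domlette
# requires according to the text above.
NS = 'http://example.com/ns#'   # hypothetical namespace URI
doc = impl.createDocument(NS, 'ex:article', None)
root = doc.documentElement

print(has_core, has_bogus)
print(root.tagName, root.namespaceURI)
```
</programlisting>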

    <section>
      <title>What about
      <methodname>getElementsByTagName</methodname>()?</title>

      <para>The <methodname>getElementsByTagName</methodname>() method isn't
      supported, because there are better options. In particular, you can just
      use XPath:</para>

      <programlisting>doc.xpath(u"//tagname")</programlisting>

      <para>For more possibilities, see <ulink url="http://uche.ogbuji.net/tech/akara/nodes/2004-06-19/get-elements"><methodname>getElementsByTagName</methodname>
      Alternatives</ulink>.</para>
    </section>
  </section>

  <section id="domlette_serializing">
    <title>Serializing Domlette nodes</title>

    <para>Domlette comes with a couple of very fast printer functions which
    also go to great pains to correctly handle character encoding issues:
    <methodname>Print</methodname> and <methodname>PrettyPrint</methodname>.
    Here are some serialization examples using the Domlette printers, given a
    node '<literal>node</literal>' (it doesn't have to be a document
    node).</para>

    <programlisting>from Ft.Xml.Domlette import Print, PrettyPrint

# basic serialization to sys.stdout
Print(node)

# ... with extra whitespace (indenting)
PrettyPrint(node)

# ... using a single tab, rather than 2 spaces, to indent at each level
PrettyPrint(node, indent='\t')

# serializing to a utf-8 encoded file
f = open('output.xml','w')
Print(node, stream=f)
f.close()

# ... to an iso-8859-1 encoded file
f = open('output.xml','w')
Print(node, stream=f, encoding='iso-8859-1')
f.close()

# ... to an ascii encoded string
import cStringIO
buf = cStringIO.StringIO()
Print(node, stream=buf, encoding='us-ascii')
s = buf.getvalue()  # read the value before closing the buffer
buf.close()

# Normally, output syntax (XML or HTML) is chosen based on the DOM type,
# which is automatically detected. A Domlette or XML DOM can be output in
# HTML syntax if the asHtml=1 argument is given.
PrettyPrint(node, asHtml=1)</programlisting>

    <para>See also: <citetitle><ulink url="http://uche.ogbuji.net/tech/akara/nodes/2003-01-01/dom-printing">Serializing
    XML from DOM or Domlette documents</ulink></citetitle></para>
  </section>

  <section>
    <title>Building a DOM from scratch</title>

    <para>As an alternative to parsing a preexisting XML document, you can
    also build a document model, with certain limitations, from the ground up.
    W3C and Python DOM facilities for doing this are intended mainly for creating
    a temporary document whose nodes will be imported into an existing document,
    and while Domlette does offer a more convenient document creation method,
    it has many of the same limitations. However, for most documents, its
    capabilities should be sufficient.</para>

    <para>The <systemitem class="library">Ft.Xml.Domlette</systemitem> module
    contains a <classname>DOMImplementation</classname> instance named
    <property>implementation</property> which provides a set of methods for
    initializing new <classname>Document</classname>s. The
    <methodname>implementation.createRootNode</methodname> method takes a base URI
    argument and provides a natural approach for creating an XPath model root node.
    This is similar to the DOM idea of a document node and even closer to a DOM
    document fragment (multiple element children are allowed). The
    <methodname>implementation.createDocument</methodname> method, on the
    other hand, is designed to come close to the DOM interface, although its
    doctype argument must be <literal>None</literal>.</para>

    <programlisting>from Ft.Xml.Domlette import implementation
doc = implementation.createRootNode('file:///article.xml')</programlisting>

    <para>is the equivalent of</para>

    <programlisting>from Ft.Xml import EMPTY_NAMESPACE
doc = implementation.createDocument(EMPTY_NAMESPACE, None, None)</programlisting>

    <para>with the added advantage of doc.baseURI being set to
    'file:///article.xml', which is not possible to set via standard DOM interfaces
    (the baseURI attribute is read-only).</para>

    <para>Similarly,</para>

    <programlisting>from Ft.Xml import EMPTY_NAMESPACE
doc = implementation.createRootNode('file:///article.xml')
docelement = doc.createElementNS(EMPTY_NAMESPACE, 'article')
doc.appendChild(docelement)</programlisting>

    <para>is the equivalent of</para>

    <programlisting>from Ft.Xml import EMPTY_NAMESPACE
doc = implementation.createDocument(EMPTY_NAMESPACE, 'article', None)</programlisting>

    <para>plus doc.baseURI being set to 'file:///article.xml'.</para>

    <para>If you want as much fidelity to the DOM API as Domlette offers, use
    <literal>implementation.createDocument</literal>. If you just want to
    create a document or other such root-level node, and never mind the
    strange parameters, use
    <methodname>implementation.createRootNode</methodname>.</para>
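
    <para>The two-step and one-step construction styles can be compared side
    by side in a small sketch. This one assumes Python 3 and the standard
    library's <systemitem class="library">xml.dom.minidom</systemitem>, not
    Domlette: since <methodname>createRootNode</methodname> is
    Domlette-specific, a <methodname>createDocument</methodname> call with no
    qualified name plays the role of the empty root node here, and no base URI
    is set. The title text is made up for the example.</para>

    <programlisting>
```python
from xml.dom.minidom import getDOMImplementation

impl = getDOMImplementation()

# Two-step construction: an empty document, then an explicit root element.
# (createDocument with no qualified name stands in for Domlette's
# createRootNode, which minidom does not have.)
doc_a = impl.createDocument(None, None, None)
doc_a.appendChild(doc_a.createElementNS(None, 'article'))

# One-step construction: the document element is created for us.
doc_b = impl.createDocument(None, 'article', None)

# Either tree can then be extended with the usual DOM calls.
title = doc_b.createElementNS(None, 'title')
title.appendChild(doc_b.createTextNode('Building a DOM from scratch'))
doc_b.documentElement.appendChild(title)

same_root = doc_a.documentElement.tagName == doc_b.documentElement.tagName
print(same_root)
print(doc_b.documentElement.toxml())
```
</programlisting>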
  </section>

  <section id="xpath_query">
    <title>XPath query</title>

    <para>You can easily perform XPath queries by using the
    <methodname>xpath</methodname> method of cDomlette nodes, as
    follows:</para>

    <programlisting>from Ft.Xml.Domlette import NonvalidatingReader
doc = NonvalidatingReader.parseString("&lt;spam&gt;eggs&lt;a/&gt;&lt;a/&gt;&lt;/spam&gt;")
print doc.xpath(u'//a')
print doc.xpath(u'string(/spam)')</programlisting>

    <para>Notice: this is nothing like W3C DOM's XPath query module. The
    emphasis, as usual with Domlette, is on speed, simplicity and
    pythonic-ness.</para>

    <para>The API, in brief:</para>

    <para><literal>node.xpath(expr[, explicitNss])</literal></para>

    <itemizedlist>
      <listitem>
        <para>node - will be used as core of the context for evaluating the
        XPath</para>
      </listitem>

      <listitem>
        <para>expr - XPath expression in string or compiled form</para>
      </listitem>

      <listitem>
        <para>explicitNss - (optional) any additional or overriding namespace
        mappings in the form of a dictionary that maps prefixes to namespace
        URIs. The base namespace mappings are taken from in-scope declarations
        on the given node. This explicit dictionary is superimposed on the
        base mappings.</para>
      </listitem>
    </itemizedlist>

    <para>For additional details, see <xref linkend="xpath_engine"/>.</para>
  </section>

  <section>
    <title>More on base URIs</title>

    <para>For some users, always specifying a base URI feels like an
    inconvenience. Perhaps they always generate XML sources from text or
    streams without naturally associated URIs, and they have to figure out
    schemes to come up with base URIs for the parse. But there is good reason
    for this pickiness. Just ask one of the users <ulink url="http://lists.fourthought.com/pipermail/4suite/2004-January/006064.html">who
    got bitten by carelessness with base URIs in practice</ulink>. It's better
    to always put some amount of thought into base URIs when processing XML,
    and 4Suite encourages this.</para>

    <para>Note that 4Suite only enforces the requirement for base URIs in
    cases where they are needed to make sense of a requested operation. Your
    document must have a valid base URI if you use external entities,
    XInclude, xsl:import, xsl:include, the XSLT document() function, the EXSLT
    exsl:document element, or any other operations that require access to an
    external resource. If your main use for URI resolution is XSLT import and
    includes, you can avoid having to give valid base URIs by using XSLT
    include paths.<!-- FIXME: add internal link--></para>

    <para>A valid base URI starts with a scheme, such as
    <literal>http:</literal>. A simple name, such as "spam", is a valid
    relative URI reference, but not a valid base URI. Without a base URI, a
    relative reference is no more useful than an apartment number given
    without the address of the entire apartment building. Merging a base URI
    with a relative reference is a string operation that is undertaken in a
    standard manner, and is generally only useful when the base URI is
    hierarchical; that is, it is a URL using one of the common schemes that
    have slashes as path separators (e.g., http:, ftp:, gopher:, and most
    file: URLs). The built-in 4Suite URI resolver
    <systemitem class="library">Ft.Lib.Uri.BASIC_RESOLVER</systemitem> knows
    how to perform such resolution.</para>
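
    <para>To make the merging step concrete, here is a small sketch of
    resolving relative references against a hierarchical base URI. It assumes
    Python 3 and uses the standard library's
    <systemitem class="library">urllib.parse</systemitem> rather than the
    4Suite resolver, on the assumption that both follow the standard
    resolution rules; the <literal>example.org</literal> URI is
    hypothetical.</para>

    <programlisting>
```python
from urllib.parse import urljoin, urlsplit

base = 'http://example.org/docs/article.xml'   # hypothetical base URI

# A relative reference replaces the last path segment of the base URI.
sibling = urljoin(base, 'spam')
nested = urljoin(base, 'chapters/intro.xml')

# A bare name has no scheme, so it cannot serve as a base URI itself.
bare_scheme = urlsplit('spam').scheme

print(sibling)
print(nested)
print(repr(bare_scheme))
```
</programlisting>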

    <!--The
    discussion threads that lie behind the major re-write of the parsing and
    URI resolution infrastructure in 4Suite are scattered all over the place,
    but one instructive starting point is <ulink
    url="http://lists.fourthought.com/pipermail/4suite/2002-May/003665.html">this
    message to the 4Suite mailing list in May of 2002</ulink>. Also in this
    thread, Mike Brown <ulink
    url="http://lists.fourthought.com/pipermail/4suite/2002-May/003673.html">points
    to</ulink> a similar situation when using the Saxon XSLT processor, which
    has led to a FAQ for that community.
-->
  </section>

  <section>
    <title>Why does Domlette diverge from the DOM specification?</title>

    <para>Domlette is not a complete or fully conformant DOM implementation,
    but it does provide an interface very close to W3C DOM Level 2 and the
    corresponding Python mapping as laid out in the
    <systemitem class="library">xml.dom</systemitem> API docs.
    </para>

    <para>The areas of divergence are inconsequential for most users,
    and generally reflect decisions made in the interest of eliminating
    redundancy, inefficiency, and, to some degree, un-Pythonic design.
    Also, one of the important design principles for Domlette is that
    where DOM and XPath disagree, XPath wins; aside from making things
    more efficient to implement, this behavior is generally what people
    want in an XML document model.</para>

    <para>It is also worth noting that in the interest of usability,
    all DOM implementations exhibit some degree of variation from the
    specs. Coding a completely implementation-agnostic DOM application
    is difficult and usually unnecessary.</para>
  </section>
</section>

  <section xml:base="file:///Users/uche/share/4Suite/Documentation/CoreManualSections/Saxlette.xml" id="saxlette">
  <title>SAX</title>

  <para>Saxlette is a fast SAX implementation, all written in C. Its API is
  similar to that of <ulink url="http://docs.python.org/lib/content-handler-objects.html">Python's
  built-in SAX</ulink>.</para>

  <programlisting>from xml import sax
from Ft.Xml import CreateInputSource

class element_counter(sax.ContentHandler):
    def startDocument(self):
        self.ecount = 0

    def startElementNS(self, name, qname, attribs):
        self.ecount += 1

parser = sax.make_parser(['Ft.Xml.Sax'])
handler = element_counter()
parser.setContentHandler(handler)
#'file:ot.xml' or file('ot.xml') or file('ot.xml').read() would work just as well, of course
parser.parse(CreateInputSource('ot.xml'))
print "Elements counted:", handler.ecount</programlisting>

  <para>If you don't care about PySax compatibility, you can use the more
  specialized API, which involves the following lines in place of the
  equivalents above:</para>

  <programlisting>from <systemitem class="library">Ft.Xml</systemitem> import <classname>Sax</classname>
...
class element_counter:
....
parser = Sax.CreateParser()</programlisting>

  <para>The biggest API difference between Saxlette and PySax is that
  Saxlette only supports SAX 2. For example,
  <literal>feature_namespaces</literal> is hard-wired to
  <literal>True</literal> and <literal>feature_namespace_prefixes</literal> to
  <literal>False</literal> (which is exactly what SAX 2 says is required).
  Saxlette also combines all adjacent text events, which eliminates one of the
  pain points of PySax.</para>

  <para>The argument to the <function>parse</function> method is a URI, a SAX
  input source or a 4Suite input source. In the example above,
  <function>CreateInputSource</function> was used to build a 4Suite input source.
  The following example shows similar code using the factory in <systemitem class="library">Ft.Xml.InputSource</systemitem> directly.</para>

  <programlisting>from Ft.Xml import InputSource, Sax
factory = InputSource.DefaultFactory
isrc = factory.fromUri("file:ot.xml")

class element_counter:
    def startDocument(self):
        self.ecount = 0

    def startElementNS(self, name, qname, attribs):
        self.ecount += 1

parser = Sax.CreateParser()
handler = element_counter()
parser.setContentHandler(handler)
parser.parse(isrc)
print "Elements counted:", handler.ecount</programlisting>

  <section>
    <title>Validating a document while parsing it using SAX</title>

    <para>To enable validation of your documents while otherwise parsing them
    normally with SAX, set the
    <constant>xml.sax.handler.feature_validation</constant> feature to
    <literal>True</literal> on your parser using a line similar to
    <code>parser.setFeature(xml.sax.handler.feature_validation, True)</code>.
    The parser will then throw an
    <classname>xml.sax._exceptions.SAXParseException</classname> exception if
    it determines that the document is invalid, and it will stop parsing the
    document. Handlers for document components that have been parsed will be
    called, however. The following example illustrates these concepts.</para>

    <programlisting>from Ft.Xml import InputSource, Sax
factory = InputSource.DefaultFactory

XML = """&lt;!DOCTYPE a [
  &lt;!ELEMENT a (b, b)&gt;
  &lt;!ELEMENT b EMPTY&gt;
]&gt;
&lt;a&gt;&lt;b/&gt;&lt;b/&gt;&lt;/a&gt;"""

isrc = factory.fromString(XML, 'urn:x-example:valid-a')

class element_counter:
    def startDocument(self):
        self.scount = 0
        self.ecount = 0

    def startElementNS(self, name, qname, attribs):
        self.scount += 1

    def endElementNS(self, name, qname):
        self.ecount += 1

parser = Sax.CreateParser()
handler = element_counter()
parser.setContentHandler(handler)
# And now, to enable validation...
import xml.sax.handler
parser.setFeature(xml.sax.handler.feature_validation, True)
parser.parse(isrc)
print "Saw", handler.scount, "start tags"
print "Saw", handler.ecount, "end tags"

# And now we show what happens on an invalid document:
XML = """&lt;!DOCTYPE a [
  &lt;!ELEMENT a (b, b)&gt;
  &lt;!ELEMENT b EMPTY&gt;
]&gt;
&lt;a&gt;&lt;b/&gt;&lt;b/&gt;&lt;b/&gt;&lt;/a&gt;"""

isrc = factory.fromString(XML, 'urn:x-example:invalid-a')
try:
    parser.parse(isrc)
except xml.sax.SAXParseException, e:
    # Validation failed, and parsing stopped at the point of the error
    print "Invalid document:", e
print "Saw", handler.scount, "start tags"
print "Saw", handler.ecount, "end tags"
# The above document is invalid; it has one more `b` element than is
# allowed by the DTD.  The handlers have still been called for those
# parts of the document that have been parsed.</programlisting>
  </section>

  <section id="saxlette_domwalker">
    <title>Walking a DOM to fire SAX events</title>

    <para>Saxlette has the ability to walk a Domlette tree, firing off events
    to a handler as if from a source document parse. This ability used to be
    too well hidden, so an API addition now makes it more readily available:
    <classname>Ft.Xml.Domlette.SaxWalker</classname>. The following example
    shows how easy it is to use:</para>

    <programlisting>from Ft.Xml.Domlette import SaxWalker
from Ft.Xml import Parse

XML = "&lt;a&gt;&lt;b/&gt;&lt;b/&gt;&lt;/a&gt;"

class element_counter:
    def startDocument(self):
        self.ecount = 0

    def startElementNS(self, name, qname, attribs):
        self.ecount += 1

#First get a Domlette document node
doc = Parse(XML)
#Then SAX "parse" it
parser = SaxWalker(doc)
handler = element_counter()
parser.setContentHandler(handler)
#You can set any properties or features, or do whatever
#you would to a regular SAX2 parser instance here
parser.parse() #called without any argument
print "Elements counted:", handler.ecount</programlisting>
  </section>

  <section id="saxlette_dombuilder">
    <title>Building a Domlette from SAX events</title>

    <para>Saxlette includes a convenience ContentHandler
    (<classname>Ft.Xml.Sax.DomBuilder</classname>) which listens for SAX
    events and constructs Domlette Documents.</para>
  </section>

  <section id="saxlette_generator">
    <title>Feeding a generator from SAX events</title>

    <para>Python's generators are special functions that can produce a series
    of partial results within the course of running. The calling program can
    start up a generator, which is suspended when a partial result is yielded,
    and resumed explicitly by the program when the next result is required.
    This capability is mirrored in the Expat parser that is the basis of
    Saxlette. Saxlette has a feature, <literal>FEATURE_GENERATOR</literal>,
    which you can set on a parser object to enable generator semantics. If
    this feature is set, the <literal>parse()</literal> method returns an
    iterator. This iterator yields results set by the SAX handlers. The
    handlers specify the partial results by setting the property
    <literal>PROPERTY_YIELD_RESULT</literal> with the value to be yielded. As
    an example, the following code reports the names of all attributes used in
    the document.</para>

    <programlisting>from Ft.Xml import Sax, CreateInputSource

class report_attributes:
    def __init__(self, parser):
        self.parser = parser
        return

    def startElementNS(self, name, qname, attribs):
        #Hand the attributes of this element to the caller's iterator
        self.parser.setProperty(Sax.PROPERTY_YIELD_RESULT, attribs)
        return

parser = Sax.CreateParser()
parser.setFeature(Sax.FEATURE_GENERATOR, True)
handler = report_attributes(parser)
parser.setContentHandler(handler)
attribs_iterator = parser.parse(CreateInputSource('test.xhtml'))
for attribs in attribs_iterator:
    for name in attribs.keys():
        print name</programlisting>
  </section>

  <section>
    <title>SAX filters</title>

    <para>In SAX processing, the parser passes to the application a stream of events that represents the XML content. An important aspect of SAX is the user's ability to create SAX filters, which accept a stream of SAX events and pass on a modified stream. For example, you might use a SAX filter to look for DocBook <literal>sect1</literal>, <literal>sect2</literal>, etc. elements, and rename them to <literal>section</literal> elements before passing them on for further processing (presumably by a SAX handler that only understands how to deal with the latter form).  You can chain SAX filters as well, and the idea behind SAX filters is usually reuse across a broad array of applications, focusing each filter on a single task that can be cleanly separated from upstream and downstream processing.  SAX filters can thus be useful building blocks for XML pipelines.</para>

  <programlisting>from xml import sax
from xml.sax.saxutils import XMLFilterBase
from Ft.Xml import CreateInputSource, XML_NAMESPACE as XMLNS
from Ft.Xml.Sax import SaxPrinter

XML = """&lt;?xml version="1.0" encoding="utf-8"?&gt;
&lt;menu&gt;
  &lt;item id="A" xml:lang="en"&gt;Orange juice&lt;/item&gt;
  &lt;item id="A" xml:lang="es"&gt;Jugo de naranja&lt;/item&gt;
  &lt;item id="B" xml:lang="en"&gt;Toast&lt;/item&gt;
  &lt;item id="B" xml:lang="es"&gt;Pan tostada
    &lt;note xml:lang="en"&gt;Wheat bread only, please&lt;/note&gt;
  &lt;/item&gt;
&lt;/menu&gt;
"""

#Define constants for the two states we care about
ALLOW_CONTENT = 1
SUPPRESS_CONTENT = 2

class english_only_filter(XMLFilterBase):
    def __init__(self, downstream):
        XMLFilterBase.__init__(self, downstream)
        return

    def startDocument(self):
        #Set the initial state, and set up the stack of states
        self._state_stack = [ALLOW_CONTENT]
        XMLFilterBase.startDocument(self)
        return

    def startElementNS(self, name, qname, attrs):
        #Check if there is any language attribute
        lang = attrs.get((XMLNS, 'lang'))
        if lang:
            #Set the state as appropriate
            if lang[:2] == 'en':
                state = ALLOW_CONTENT
            else:
                state = SUPPRESS_CONTENT
        else:
            #No xml:lang here, so inherit the enclosing state
            state = self._state_stack[-1]
        #Always update the stack with the current state
        #Even if it has not changed
        self._state_stack.append(state)

        #Only forward the event if the state warrants it
        if state == ALLOW_CONTENT:
            XMLFilterBase.startElementNS(self, name, qname, attrs)
        return

    def endElementNS(self, name, qname):
        #The end tag shares the state of its matching start tag
        state = self._state_stack.pop()
        #Only forward the event if the state warrants it
        if state == ALLOW_CONTENT:
            XMLFilterBase.endElementNS(self, name, qname)
        return

    def characters(self, content):
        #Only forward the event if the state warrants it
        if self._state_stack[-1] == ALLOW_CONTENT:
            XMLFilterBase.characters(self, content)
        return

if __name__ == "__main__":
    parser = sax.make_parser(['Ft.Xml.Sax'])
    #SaxPrinter is a special SAX handler that merely writes
    #SAX events back into an XML document
    filtered_parser = english_only_filter(parser)
    handler = SaxPrinter()
    filtered_parser.setContentHandler(handler)
    filtered_parser.parse(CreateInputSource(XML))
</programlisting>

    <para>Most SAX handlers operate as state machines, meaning they manage some variables based on the stream of events that come in, and change behavior based on these variables. english_only_filter is set up to be in one of two states: one in which content is passed on to the downstream handler, and one in which content is suppressed. This state is marked in the self._state_stack. The state is initially set to <literal>ALLOW_CONTENT</literal>, and changed to <literal>SUPPRESS_CONTENT</literal> if the filter encounters an xml:lang attribute that represents a language other than English (which can be done by checking the first two characters of the value, according to the rules of standard language codes).  It has to be a stack because XML language specifications are scoped, so that in the example XML at the top of the listing the string "Pan tostada" is within the scope of the element with the attribute xml:lang="es", and so it is marked as being in Spanish. The entire note element, however, is marked as being in English by an overriding xml:lang="en" attribute.</para>
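
    <para>The scoping behaviour of such a state stack can be checked without a
    parser. The following sketch assumes Python 3 and replaces real SAX events
    with plain tuples; the function name <function>filter_events</function>
    and the tuple layout are hypothetical and exist only for this
    illustration. Each start event pushes a state (inherited from the
    enclosing element when there is no language attribute), and each end event
    pops the state of its own element.</para>

    <programlisting>
```python
ALLOW_CONTENT = 1
SUPPRESS_CONTENT = 2

def filter_events(events):
    """Yield only the events that fall within English-language scope."""
    state_stack = [ALLOW_CONTENT]
    for event in events:
        kind = event[0]
        if kind == 'start':
            lang = event[2]
            if lang:
                state = ALLOW_CONTENT if lang[:2] == 'en' else SUPPRESS_CONTENT
            else:
                # No language attribute: inherit the enclosing state
                state = state_stack[-1]
            state_stack.append(state)
            if state == ALLOW_CONTENT:
                yield event
        elif kind == 'end':
            # The end event shares the state of its matching start event
            if state_stack.pop() == ALLOW_CONTENT:
                yield event
        else:
            # Character data uses the state of the innermost open element
            if state_stack[-1] == ALLOW_CONTENT:
                yield event

# Tuple layout (hypothetical): ('start', name, lang), ('text', data), ('end', name)
events = [
    ('start', 'menu', None),
    ('start', 'item', 'en'), ('text', 'Toast'), ('end', 'item'),
    ('start', 'item', 'es'), ('text', 'Pan tostada'),
    ('start', 'note', 'en'), ('text', 'Wheat bread only, please'), ('end', 'note'),
    ('end', 'item'),
    ('end', 'menu'),
]

kept = list(filter_events(events))
for event in kept:
    print(event)
```
</programlisting>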

    <para>The SAX handler is set to <classname>Ft.Xml.Sax.SaxPrinter</classname>, which channels the final SAX events to a 4Suite printer that creates a serialized XML document.  It's quite easy to chain filters.  If you wanted the parser to send events to a filter of class <classname>some_other_filter</classname>, which then passed on events to <classname>english_only_filter</classname>, the relevant line would look as follows:</para>

<!-- Expected Output:
<?xml version="1.0" encoding="utf-8"?>
<menu>
  <item xml:lang="en" id="A">Orange juice</item>
  <item xml:lang="en" id="B">Toast</item>
  <note xml:lang="en">Wheat bread only, please</note>
  </menu>
-->

  <programlisting>    filtered_parser = english_only_filter(some_other_filter(parser))
</programlisting>

  </section>

  <section>
    <title>Streaming canonicalization</title>

  <para>The combination of streaming parsing using Saxlette and streaming serialization using <classname>Ft.Xml.Lib.XmlPrinter.CanonicalXmlPrinter</classname> allows for
very efficient XML canonicalization (c14n).
</para>

  <programlisting>import sys
from xml import sax
from Ft.Xml import CreateInputSource
from Ft.Xml.Sax import SaxPrinter
from Ft.Xml.Lib.XmlPrinter import CanonicalXmlPrinter

parser = sax.make_parser(['Ft.Xml.Sax'])
handler = SaxPrinter(CanonicalXmlPrinter(sys.stdout))
parser.setContentHandler(handler)
parser.parse(CreateInputSource('   &lt;a&gt;&lt;b b="1" a="2"/&gt;&lt;/a&gt;   '))

</programlisting>

<!-- Expected Output:
<a><b a="2" b="1"></b></a>
-->

  </section>

</section>

  <section xml:base="file:///Users/uche/share/4Suite/Documentation/CoreManualSections/XPath.xml" id="xpath_engine">
  <title>XPath queries</title>

  <para>4Suite provides an XPath processing engine, compliant with <ulink url="http://www.w3.org/TR/xpath">the W3C XPath 1.0 specification</ulink>.
  This query engine is accessible through <systemitem class="library">Ft.Xml.XPath</systemitem>.</para>

  <section id="xpath_quick">
    <title>The quickest option</title>

    <para>If you are using Domlette, as described above, the quickest and
    easiest way to use the XPath facility in 4Suite is the
    <methodname>xpath()</methodname> method, which any Domlette
    <classname>Node</classname> supports:</para>

    <programlisting>from Ft.Xml.Domlette import NonvalidatingReader
doc = NonvalidatingReader.parseString("&lt;spam&gt;eggs&lt;a/&gt;&lt;a/&gt;&lt;/spam&gt;")
doc2 = NonvalidatingReader.parseString("&lt;spam&gt;eggs&lt;eggs n='1'&gt; and ham&lt;/eggs&gt;&lt;/spam&gt;")
print doc.xpath(u'(//a)[1]')
print doc.xpath(u'string(/spam)')
print doc2.xpath(u'string(//eggs/@n)')</programlisting>

    <para>The line</para>

    <programlisting>print doc.xpath(u'(//a)[1]')</programlisting>

    <para>is actually a shortcut for the following more involved construct,
    which is described in detail in the next section:</para>

    <programlisting>from Ft.Xml.XPath import Evaluate
print Evaluate(u'(//a)[1]', contextNode=doc)</programlisting>

    <para>This example prints three lines. The first line shows a string
    representation of a list containing a single element. As we see from this
    line, an XPath selection of nodes returns a Python list. In this case, it
    is a list containing a single element—the first element with a local name
    of <sgmltag class="element">a</sgmltag>, which has no attributes and no
    children. The second line shows the correct string value of the selected
    <sgmltag class="element">spam</sgmltag> element, and the third line shows
    the correct string value of the <sgmltag class="attribute">n</sgmltag>
    attribute.</para>

    <screen>[&lt;Element at 0xb7d10bb4: name u'a', 0 attributes, 0 children&gt;]
eggs
1</screen>
  </section>

  <section id="typeMap">
    <title>Type mappings</title>

    <para>4Suite XPath functions return results with Python types that depend
    on the XPath data model type of the query result. The following list shows
    how the five XPath result types (string, number, boolean, node-set and
    object) are mapped to Python types:</para>

    <itemizedlist>
      <listitem>
        <para>XPath string: Python unicode type</para>
      </listitem>

      <listitem>
        <para>XPath number: Python float type (int or long also accepted), or
        instance of Ft.Lib.number.nan (for NaN) or Ft.Lib.number.inf (for
        Infinity)</para>
      </listitem>

      <listitem>
        <para>XPath boolean: Ft.Lib.boolean instance</para>
      </listitem>

      <listitem>
        <para>XPath node-set: Python list of Domlette nodes, in document
        order, with no duplicates</para>
      </listitem>

      <listitem>
        <para>XPath foreign object: any other Python object (you will very
        rarely encounter this case)</para>
      </listitem>
    </itemizedlist>
  </section>

  <section>
    <title>Advanced use</title>

    <para>XPath expressions can refer to both variables and qualified names
    (QNames) that must be defined by the environment that is executing the
    XPath expression. This section describes how to use these advanced
    features of XPath using the 4Suite interface.</para>

    <para>4Suite's XPath implementation uses a Domlette node as the context
    node for XPath operations, so the document must be parsed before XPath can
    be used to access it. The following example parses an XML document and
    explicitly sets up an XPath context to run an XPath query that extracts
    content from it.</para>

    <programlisting>XML = """
&lt;ham&gt;
&lt;eggs n='1'/&gt;
This is the string content with &lt;em&gt;emphasized text&lt;/em&gt; text
&lt;/ham&gt;"""

from Ft.Xml import Parse
from Ft.Xml.XPath.Context import Context
from Ft.Xml.XPath import Evaluate

doc = Parse(XML)
ctx = Context(doc)
nodes = Evaluate(u'//em', ctx)

# The return value, a node set, comes back as a Python list of nodes
# which may be accessed using an iterator
for n in nodes:
    # print dir(n)
    print n.tagName
    print n.firstChild.nodeValue</programlisting>

    <para>XPath always requires a context for execution; a common choice of
    context node is the root of the target document, as in the above
    example. Think of an XPath query as being executed from some location in
    an XML document; that location is a necessary input to the
    evaluation.</para>

    <para>There is more to an XPath context than just the context node, but if
    your needs are as straightforward as those of the above example, you can
    skip creating the context yourself and pass the node directly to
    <methodname>Evaluate</methodname>. For example, the following fragment is
    equivalent to the two lines that create a context and evaluate the
    expression in the above example.</para>

    <programlisting># No need to create a context object
Evaluate(u'//em', contextNode=doc)</programlisting>

    <para>If your source document uses XML Namespaces you will likely need to
    use QNames in your XPath expressions. For this to work, you'll need to
    introduce namespace mappings into your XPath context. For example, if the
    elements of our XML document above are in an XML namespace, then we must
    set up our context slightly differently.</para>

    <programlisting>XML = """&lt;ham xmlns="http://example.com/ns#"&gt;
&lt;eggs n='1'/&gt;
This is the string content with &lt;em type='bold'&gt;emphasized Namespaced Text&lt;/em&gt; text
&lt;/ham&gt;"""

from Ft.Xml import Parse
from Ft.Xml.XPath.Context import Context
from Ft.Xml.XPath import Evaluate

NSS = {u'ex': u'http://example.com/ns#'}
doc = Parse(XML)
ctx = Context(doc, processorNss=NSS)
nodes = Evaluate(u'//ex:em', context=ctx)
for n in nodes:
    # print dir(n)
    print n.tagName
    print n.firstChild.nodeValue</programlisting>

    <para>You define XPath namespace prefixes through a Python dictionary
    (<varname>NSS</varname> in the above example) that maps each prefix, such
    as '<literal>ex</literal>', to the appropriate namespace URI, such as
    '<literal>http://example.com/ns#</literal>'. This prefix mapping is added
    to your XPath context using the <parameter>processorNss</parameter>
    parameter of the <function>Context</function> constructor.</para>

    <para>In a similar way, you can also pass in variable bindings, whose
    values may then be referenced in your XPath expressions. In this case,
    however, the dictionary keys are Python tuples containing the namespace
    URI and local name of the variable.</para>

    <programlisting>from Ft.Xml import EMPTY_NAMESPACE

# node is any previously obtained Domlette node
ctx = Context(node, varBindings=
  {(EMPTY_NAMESPACE, u'date'): u'2003-06-20'})
Evaluate(u'event[@date = $date]', context=ctx)</programlisting>

    <para>This creates a variable named 'date', in no namespace, with a value
    of '<literal>2003-06-20</literal>'; the variable is then compared with the
    date attribute in the XPath expression.</para>

    <para>XPath variable names are QNames, so you pass in variable names as
    (namespace URI, local name) tuples. The values can be numbers, unicode
    objects or boolean objects:</para>

    <programlisting>from Ft.Xml.XPath import boolean
ctx = Context(node, varBindings={(EMPTY_NAMESPACE, u'test'): boolean.true})</programlisting>

    <para>This sets the variable 'test' to the boolean value true (remember
    that this is the XPath boolean type, not the Python one). The variable
    can then be referenced as <literal>$test</literal> in XPath expressions,
    just as a variable would be in an XSLT stylesheet.</para>

    <para>If you need a value only once, you can of course write it directly
    into the expression as a string literal instead of binding a variable, as
    in</para>

    <programlisting>nodes = Evaluate(u'//ex:em[@type="bold"]', context=ctx)</programlisting>

    <para>Note the quoting: the Python string is delimited by single quotes,
    so the XPath string literal inside it uses double quotes.</para>
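
    <para>When the literal value comes from program data rather than being
    typed by hand, balancing the quotes takes a little more care, because
    XPath 1.0 string literals have no escape syntax. The helper below is a
    minimal sketch in plain Python and is not part of 4Suite;
    <function>xpath_string_literal</function> is a hypothetical name chosen
    for illustration. Binding the value to a variable, as described earlier
    in this section, avoids the problem entirely.</para>

    <programlisting>def xpath_string_literal(value):
    """Return value written as an XPath 1.0 string literal."""
    if '"' not in value:
        return '"%s"' % value
    if "'" not in value:
        return "'%s'" % value
    # Both quote characters occur, so glue double-quoted pieces together
    # with concat(), inserting each double quote as the literal '"'
    parts = value.split('"')
    return 'concat(%s)' % ", '\"', ".join('"%s"' % p for p in parts)

expression = u'//ex:em[@type=%s]' % xpath_string_literal(u'bold')</programlisting>

    <para>Here <varname>expression</varname> ends up as the same string that
    was written by hand in the example above.</para>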
  </section>

  <section>
    <title>Reusing parsed XPath queries</title>

    <para>Sometimes you want to reuse an XPath expression and namespace
    mapping multiple times, for efficiency and convenience. The following
    example shows how:</para>

    <programlisting>from Ft.Xml.XPath.Context import Context
from Ft.Xml.XPath import Compile, Evaluate
from Ft.Xml.Domlette import NonvalidatingReader

DOCS = ["&lt;spam xmlns='http://spam.com'&gt;eggs&lt;/spam&gt;",
        "&lt;spam xmlns='http://spam.com'&gt;grail&lt;/spam&gt;",
        "&lt;spam xmlns='http://spam.com'&gt;nicht&lt;/spam&gt;",
       ]

# Pre-compile for efficiency and convenience
expr = Compile(u"/a:spam[contains(., 'i')]")
ctx = Context(None, processorNss={u"a": u"http://spam.com"})

i = 1
for doc in DOCS:
    doc = NonvalidatingReader.parseString(doc.encode('UTF-8'),
                                          "http://spam.com/base")
    retval = Evaluate(expr, doc, ctx)
    if len(retval):
        print "Document", i, "meets our criteria"
    i += 1</programlisting>

    <para>Which should display:</para>

    <screen>Document 2 meets our criteria
Document 3 meets our criteria</screen>
  </section>

  <section>
    <title>Migration from PyXML's XPath</title>

    <para>PyXML includes a usable XPath module, but the XPath library in
    4Suite contains many updates and improvements. (Warning: PyXML's XSLT
    implementation is not usable; use 4Suite if you need XSLT.)</para>

    <para>If you are familiar with PyXML, you may have used a different form
    of imports to load in XPath and XSLT features. The imports are different
    under 4Suite.</para>

    <para>Usage example:</para>

    <orderedlist>
      <listitem>
        <para>PyXML usage (do not use with 4Suite):</para>

        <programlisting>import xml.xslt
import xml.xpath</programlisting>
      </listitem>

      <listitem>
        <para>4Suite usage (use these imports):</para>

        <programlisting>import Ft.Xml.XPath
import Ft.Xml.Xslt</programlisting>
      </listitem>
    </orderedlist>
  </section>
</section>

  <section xml:base="file:///Users/uche/share/4Suite/Documentation/CoreManualSections/XSLT.xml" id="xslt_engine">
  <title>XSLT processing</title>

  <section id="simple_xslt_api">
    <title>The super-simple XSLT API</title>

    <para>For basic XSLT transform needs, or to get started quickly, the
    <systemitem class="library">Ft.Xml.Xslt</systemitem> module offers a quick
    way to apply transforms to XML documents and get back the result as a
    simple string. Within this module, the function of interest is
    <function>Transform</function>.</para>

    <variablelist>
      <varlistentry>
        <term><methodsynopsis>
            <methodname>Transform</methodname>

            <methodparam>
              <parameter>fname_or_uri</parameter>
            </methodparam>

            <methodparam>
              <parameter>string_stream_fname_uri_isrc</parameter>
            </methodparam>

            <methodparam>
              <parameter>[params]</parameter>
            </methodparam>

            <methodparam>
              <parameter>[output]</parameter>
            </methodparam>

          </methodsynopsis></term>

        <listitem>
          <para>The <function>Transform</function> function takes two
          required arguments and two optional ones. The first is the source
          XML for the transform. The second is the XSLT document. Both are
          given as a string, an object like an open file, a local file path on
          your computer, an absolute URI, or an InputSource object. The
          optional <parameter>params</parameter> is a dictionary of stylesheet
          parameters, the keys of which may be given as unicode objects if
          they have no namespace, or as (uri, localname) tuples if they do.
          The values are the overridden parameter values. If you do not supply
          the optional <parameter>output</parameter> parameter, the return
          value is a string with the result of this transform. If you do
          supply this parameter, it must be a file-like object to which the
          output will be written, and then the return value is None.</para>

<!--
      <warning>
        <para>This function will get you started quickly because it
        specifically chooses some default values for some of the more advanced
        parsing features. If you are passing in a string or stream, and the material in <xref linkend="base_URIs" />
        applies to your parsing situation, then you will want to use the
        full-featured API. In brief, if your XML document references external
        resources, you should not use
            this convenience function. See <xref linkend="full_XSLT_API" />,
            instead.</para>
      </warning>
-->

        </listitem>
      </varlistentry>
    </variablelist>

    <programlisting>XML = """
&lt;ham&gt;
&lt;eggs n='1'/&gt;
This is the string content with &lt;em&gt;emphasized text&lt;/em&gt; text
&lt;/ham&gt;"""

from Ft.Xml.Xslt import Transform
# URL for the identity transform: reproduces the input XML in the result
ID_TRANSFORM = 'http://cvs.4suite.org/viewcvs/*checkout*/4Suite/Ft/Data/identity.xslt'

result = Transform(XML, ID_TRANSFORM)
print result

# If the above XML document were located in the file
# "target.xml", we could have used `Transform("target.xml", ID_TRANSFORM)`.

# It's more efficient to send the processor output directly to an output
# stream. In that case Transform returns None, so there is nothing to print.
import sys
Transform(XML, ID_TRANSFORM, output=sys.stdout)</programlisting>
  </section>

  <section id="full_XSLT_API">
    <title>Full XSLT processing API</title>

    <para>Here is the general procedure for using the Python API for XSLT
    processing:</para>

    <orderedlist>
      <listitem>
        <para>Create an <classname>Ft.Xml.Xslt.Processor.Processor</classname>
        instance.</para>
      </listitem>

      <listitem>
        <para>Prepare <classname>Ft.Xml.InputSource</classname> instances (via
        their factory) for the source XML and stylesheet.</para>
      </listitem>

      <listitem>
        <para>Call the Processor's <methodname>appendStylesheet</methodname>
        method, passing it the stylesheet's
        <classname>InputSource</classname>.</para>
      </listitem>

      <listitem>
        <para>Call the Processor's <methodname>run</methodname> method,
        passing it the source document's
        <classname>InputSource</classname>.</para>
      </listitem>
    </orderedlist>

    <para>For input to our transform, we will use the namespaced example as in
    the last section.</para>

    <screen>$ cat testNS.xml
&lt;ham xmlns="http://example.com/ns#"&gt;
&lt;eggs n='1'/&gt;
This is the string content with
 &lt;em type='bold' f='2'&gt;emphasized Namespaced Text&lt;/em&gt;
text
&lt;/ham&gt;</screen>

    <para>For our stylesheet, we will again use one of the simplest useful
    examples, the identity stylesheet.</para>

    <screen>$ cat identity.xsl
&lt;?xml version="1.0" encoding="utf-8"?&gt;
&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"&gt;

  &lt;xsl:template match="@*|node()"&gt;
    &lt;xsl:copy&gt;
      &lt;xsl:apply-templates select="@*|node()"/&gt;
    &lt;/xsl:copy&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;</screen>

    <para>The code below follows the procedure outlined above, after
    converting the file paths of the source document and stylesheet to
    URIs.</para>

    <programlisting>from Ft.Xml.Xslt import Processor
# We use the InputSource architecture
from Ft.Xml import InputSource
from Ft.Lib.Uri import OsPathToUri  # path to URI conversions

processor = Processor.Processor()

# Prepare an InputSource for the source document
# Convert from local file to uri
srcAsUri = OsPathToUri('testNS.xml')
source = InputSource.DefaultFactory.fromUri(srcAsUri)

# Prepare an InputSource for the stylesheet
# Convert from local file to uri
ssAsUri = OsPathToUri('identity.xsl')
transform = InputSource.DefaultFactory.fromUri(ssAsUri)

processor.appendStylesheet(transform)
result = processor.run(source)

# result is a string with the serialized transform result
print result</programlisting>

    <para>You can call <methodname>run</methodname> multiple times on
    different <classname>InputSource</classname>s. When you're done, the
    processor's <methodname>reset</methodname> method can be used to restore a
    clean slate (at which point you would have to append stylesheets to the
    processor again).</para>

    <para>The following example uses our <varname>processor</varname> from the
    previous example to transform a new XML document, this one constructed
    manually.</para>

    <programlisting>XML = """&lt;foo&gt;&lt;bar/&gt;&lt;/foo&gt;"""
source = InputSource.DefaultFactory.fromString(XML, 'http://example.org/foo')

result = processor.run(source)

# result is a string with the serialized transform result
print result</programlisting>

    <para>Because the same <varname>processor</varname> and stylesheet are
    reused for the second document, this form is useful when there is a
    requirement for server-side processing of multiple input documents with a
    common stylesheet.</para>
  </section>

  <section>
    <title>Example</title>

    <para>In the example below, strings are used as the source of the
    transform (stylesheet) and source documents, and we are careful to pass in
    a URI to identify each of them. In the source document, the URI is needed
    for resolving external entity references and XIncludes. In the stylesheet,
    the URI is needed for resolving <function>document</function> function
    calls, <sgmltag class="element">xsl:include</sgmltag>s and <sgmltag class="element">xsl:import</sgmltag>s.</para>

    <para>If you do not provide a URI and you attempt to use any of these
    features, you may get an exception.</para>

    <programlisting># The identity transform: duplicates the input to output
TRANSFORM = """
&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"&gt;

  &lt;xsl:template match="@*|node()"&gt;
    &lt;xsl:copy&gt;
      &lt;xsl:apply-templates select="@*|node()"/&gt;
    &lt;/xsl:copy&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;"""

SOURCE = """&lt;spam id="eggs"&gt;I don't like spam&lt;/spam&gt;"""

# The processor class is the core of the XSLT API
from Ft.Xml.Xslt import Processor
processor = Processor.Processor()

# We use the InputSource architecture
from Ft.Xml import InputSource

# Prepare an InputSource for the transform
transform = InputSource.DefaultFactory.fromString(TRANSFORM,
  "http://spam.com/identity.xslt")

# Prepare an InputSource for the source document
source = InputSource.DefaultFactory.fromString(SOURCE,
  "http://spam.com/doc.xml")
processor.appendStylesheet(transform)
result = processor.run(source)

# result is a string with the serialized transform result
print result</programlisting>
  </section>

  <section>
    <title>Using Domlette objects instead of InputSources</title>

    <para>If your documents are already in the form of Domlette documents, you
    don't need to create <classname>InputSource</classname>s for them; you can
    just use the <classname>Processor</classname>'s
    <methodname>appendStylesheetNode</methodname> and
    <methodname>runNode</methodname> methods instead of
    <methodname>appendStylesheet</methodname> and
    <methodname>run</methodname>, respectively.</para>

    <note>
      <para>It is usually slower to read the stylesheet from a Domlette object
      than to parse a serialized document.</para>
    </note>

    <note>
      <para>The Domlette documents used in the following example are obtained
      by parsing existing XML, but this approach can just as easily be used on
      Domlette documents that are built programmatically (i.e. using the DOM
      API).</para>
    </note>

    <programlisting># The identity transform: duplicates the input to output
TRANSFORM = """
&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"&gt;

  &lt;xsl:template match="@*|node()"&gt;
    &lt;xsl:copy&gt;
      &lt;xsl:apply-templates select="@*|node()"/&gt;
    &lt;/xsl:copy&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;"""

SOURCE = """&lt;spam id="eggs"&gt;I don't like spam&lt;/spam&gt;"""

from Ft.Xml.Xslt import Processor
processor = Processor.Processor()
from Ft.Xml.Domlette import NonvalidatingReader

# Create a DOM for the transform
transform = NonvalidatingReader.parseString(TRANSFORM,
  "http://spam.com/identity.xslt")

# Create a DOM for the source document
source = NonvalidatingReader.parseString(SOURCE, "http://spam.com/doc.xml")
processor.appendStylesheetNode(transform, "http://spam.com/identity.xslt")
result = processor.runNode(source, "http://spam.com/doc.xml")
print result</programlisting>

    <para>If you have objects from another DOM library, you can first convert
    them to Domlette objects as shown in <xref linkend="converting_DOM"/>.</para>
  </section>

  <section>
    <title>Top-level parameters</title>

    <subtitle>Passing parameters to a stylesheet</subtitle>

    <para>You can pass in stylesheet parameters as a Python dictionary. Use
    the parameter names for keys. Values use the 4Suite XPath library's
    standard type mappings, which are described in <xref linkend="typeMap"/>.</para>

    <para>Parameter and variable names in XPath/XSLT are actually
    expanded-names, which we represent as (namespaceURI, localName) tuples. If
    your parameter name is in a namespace, you have to use a tuple as the
    mapping key. Otherwise, you may simply use a unicode string that
    represents the local-name part only; this is equivalent to a tuple whose
    namespace part is <constant>Ft.Xml.EMPTY_NAMESPACE</constant>, which
    denotes the absence of a namespace.</para>

    <para>Here is an example, which passes in the computed "date" parameter to
    the stylesheet from the program:</para>

    <programlisting>SRC = """&lt;?xml version="1.0"?&gt;&lt;dummy/&gt;"""

STY = """&lt;?xml version="1.0"?&gt;
&lt;xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&gt;

  &lt;xsl:param name="date" select="'unknown'"/&gt;

  &lt;xsl:output method="xml" indent="yes" encoding="us-ascii"/&gt;

    &lt;xsl:template match="/"&gt;
      &lt;result&gt;
        &lt;xsl:value-of select="$date"/&gt;
      &lt;/result&gt;
    &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;"""

from Ft.Xml import InputSource
from Ft.Xml.Xslt import Processor
import time
src_isrc = InputSource.DefaultFactory.fromString(SRC, 'http://foo/dummy.xml')
sty_isrc = InputSource.DefaultFactory.fromString(STY, 'http://foo/dummy.xsl')

proc = Processor.Processor()
proc.appendStylesheet(sty_isrc)
params = {u'date': unicode(time.asctime())}
result = proc.run(src_isrc, topLevelParams=params)
print result</programlisting>
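
    <para>The two key styles described above can be normalized to tuples with
    a few lines of plain Python. The following helper is a sketch for
    illustration only: <function>expand_param_names</function> is a
    hypothetical name, not a 4Suite function, and the sketch assumes that
    <constant>Ft.Xml.EMPTY_NAMESPACE</constant> is <literal>None</literal>,
    so it defines its own stand-in rather than importing the constant.</para>

    <programlisting># Assumption: stands in for Ft.Xml.EMPTY_NAMESPACE
EMPTY_NAMESPACE = None

def expand_param_names(params):
    """Return a copy of params keyed only by (namespaceURI, localName)."""
    expanded = {}
    for name, value in params.items():
        if isinstance(name, tuple):
            # Already an expanded-name
            key = name
        else:
            # A bare local name is a name in no namespace
            key = (EMPTY_NAMESPACE, name)
        expanded[key] = value
    return expanded

params = expand_param_names({
    u'date': u'2003-06-20',
    (u'http://example.com/ns#', u'id'): u'eggs',
})</programlisting>

    <para>After this call, the <literal>date</literal> entry is keyed by
    <literal>(None, u'date')</literal>, while the namespaced key is left
    unchanged.</para>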
  </section>

  <section>
    <title>Using xml-stylesheet processing instructions</title>

    <para>4Suite honors the <ulink url="http://www.w3.org/TR/xml-stylesheet/">Associating Stylesheets with
    XML Documents</ulink> W3C Recommendation and <ulink url="http://www.faqs.org/rfcs/rfc3023.html">RFC 3023: XML Media
    Types</ulink>. Instead of (or in addition to) using the processor's
    explicit APIs to establish the stylesheet to be used for the
    transformation, the source document may contain an xml-stylesheet
    processing instruction (PI) that refers to a stylesheet via a URI
    reference.</para>

    <para>The xml-stylesheet PI must meet the following criteria:</para>

    <itemizedlist>
      <listitem>
        <para>It must appear in the document prolog.</para>
      </listitem>

      <listitem>
        <para>It must contain a "type" pseudo-attribute having one of the
        following values: <itemizedlist>
            <listitem>
              <para>application/xslt+xml</para>
            </listitem>

            <listitem>
              <para>application/xslt</para>
            </listitem>

            <listitem>
              <para>text/xml</para>
            </listitem>

            <listitem>
              <para>application/xml</para>
            </listitem>
          </itemizedlist></para>
      </listitem>

      <listitem>
        <para>It must contain an "href" pseudo-attribute that is a URI
        reference for the stylesheet. It will be resolved relative to the base
        URI of the source document that contains the xml-stylesheet PI.</para>
      </listitem>
    </itemizedlist>

    <para>This example shows a PI being used to refer to the identity
    stylesheet mentioned earlier:</para>

    <programlisting>&lt;?xml-stylesheet type="application/xslt" href="identity.xsl"?&gt;</programlisting>
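
    <para>The "type" and "href" criteria listed above can be sketched as a
    small check in plain Python. This is only an illustration of the rules,
    not the code 4Suite uses; the function names are hypothetical, the
    requirement that the PI appear in the prolog is not checked, and
    character references inside pseudo-attribute values are not
    expanded.</para>

    <programlisting>import re

# Media types accepted in the "type" pseudo-attribute, per the list above
XSLT_MEDIA_TYPES = frozenset([
    'application/xslt+xml', 'application/xslt', 'text/xml', 'application/xml',
])

# Matches name="value" or name='value' pairs in the PI data
PSEUDO_ATTR = re.compile(r"""([A-Za-z_][\w.-]*)\s*=\s*("[^"]*"|'[^']*')""")

def parse_pseudo_attrs(pi_data):
    """Return the pseudo-attributes of a PI's data as a dictionary."""
    return dict((name, quoted[1:-1])
                for name, quoted in PSEUDO_ATTR.findall(pi_data))

def stylesheet_href(pi_data):
    """Return the href if the PI data names an XSLT stylesheet, else None."""
    attrs = parse_pseudo_attrs(pi_data)
    if attrs.get('type') in XSLT_MEDIA_TYPES and 'href' in attrs:
        return attrs['href']
    return None</programlisting>

    <para>Given the data of the PI shown above,
    <literal>stylesheet_href('type="application/xslt" href="identity.xsl"')</literal>
    returns <literal>'identity.xsl'</literal>, whereas a PI with the
    nonstandard type "text/xsl" yields <literal>None</literal>.</para>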

    <para>If you need to add to the supported media types, e.g., to add the
    nonstandard "text/xsl", then follow the example given in <ulink url="http://mail.python.org/pipermail/xml-sig/2004-January/010090.html">this
    mailing list message</ulink>.</para>

    <para>If the PI contains "alternate" and "media" pseudo-attributes, the
    package will do its best to handle them. See <ulink url="http://lists.fourthought.com/pipermail/4suite/2003-September/012218.html">this
    message</ulink> for details and examples.</para>
  </section>

  <section>
    <title>Alternative output destinations</title>

    <para>Normally, the processor buffers all output, then returns it as a
    byte string. If you want to write directly to some other stream (any
    Python file-like object that has a <methodname>write</methodname> method),
    you can supply the stream as the optional
    <parameter>outputStream</parameter> argument to the Processor's
    <methodname>run</methodname> method. When you supply your own output
    stream, the <methodname>run</methodname> method will return
    <literal>None</literal>. Here is an example that writes directly to
    <constant>stdout</constant>:</para>

    <example id="ex.stdout">
      <title>Transform output sent to standard out</title>

      <programlisting>SRC = """&lt;?xml version="1.0"?&gt;&lt;dummy/&gt;"""

STY = """&lt;?xml version="1.0"?&gt;
&lt;xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&gt;

  &lt;xsl:output method="xml" indent="yes" encoding="us-ascii"/&gt;

  &lt;xsl:template match="/"&gt;
    &lt;result&gt;hello world&lt;/result&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;"""

import sys
from Ft.Xml import InputSource
from Ft.Xml.Xslt import Processor

src_isrc = InputSource.DefaultFactory.fromString(SRC, 'http://foo/dummy.xml')
sty_isrc = InputSource.DefaultFactory.fromString(STY, 'http://foo/dummy.xsl')

proc = Processor.Processor()
proc.appendStylesheet(sty_isrc)
result = proc.run(src_isrc, outputStream=sys.stdout)</programlisting>
    </example>

    <para>You also have the option of other kinds of output. Just set the
    <literal>writer</literal> argument of the processor's
    <methodname>run</methodname> method to an instance of an XSLT output
    writer, which is a handler of SAX-like events coming from the processor as
    it generates the result tree. 4Suite provides several writer classes for
    alternative output:</para>

    <itemizedlist>
      <listitem>
        <para>If you want the XSLT output as SAX events, use an instance of
        <classname>Ft.Xml.Xslt.SaxWriter.SaxWriter</classname>. Give its
        constructor a <parameter>saxHandler</parameter> keyword argument that
        is your own PyXML SAX2 event handler.</para>
      </listitem>

      <listitem>
        <para>If you want the XSLT output as a Domlette document, use an
        instance of <classname>Ft.Xml.Xslt.RtfWriter.RtfWriter</classname>.
        Give its constructor a second argument: the base URI of the document
        to create. Obtain the document by calling the writer's
        <methodname>getResult</methodname> method after XSLT processing is
        finished.</para>
      </listitem>

      <listitem>
        <para>If you want the XSLT output as any other kind of Python DOM
        document, use an instance of
        <classname>Ft.Xml.Xslt.DomWriter.DomWriter</classname>. Give its
        constructor an <parameter>implementation</parameter> keyword argument
        that is your desired DOM implementation. Also try to set the
        <parameter>ownerDoc</parameter> to an existing Document node (from the
        same implementation) from which a base URI for the new document can be
        obtained.</para>
      </listitem>

      <listitem>
        <para>If you want the XSLT output in a regular file, open a file for
        writing then pass this file object to the
        <function>proc.run</function> as the
        <parameter>outputStream</parameter> parameter value, in the same way
        as the example above which used the <constant>sys.stdout</constant>
        file object. An example is shown below.</para>
      </listitem>

      <listitem>
        <para>If you want to make a custom output writer, just make your class
        extend <classname>Ft.Xml.Xslt.NullWriter.NullWriter</classname>. If it
        needs access to the XSLT output parameters, then the constructor
        should take an instance of
        <classname>Ft.Xml.Xslt.OutputParameters.OutputParameters</classname>,
        which will have the data attributes method, version, encoding,
        omitXmlDeclaration, standalone, doctypeSystem, doctypePublic,
        mediaType, cdataSectionElements, and indent, which your writer can act
        upon, if appropriate. See the <literal>NullWriter</literal> API
        documentation for further info.</para>
      </listitem>
    </itemizedlist>

    <para>Here is an example that uses <classname>DomWriter</classname> with
    the Domlette implementation to obtain the result as a Domlette
    document:</para>

    <programlisting>SRC = """&lt;?xml version="1.0"?&gt;&lt;dummy/&gt;"""

STY = """&lt;?xml version="1.0"?&gt;
&lt;xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&gt;

  &lt;xsl:output method="xml" indent="yes" encoding="us-ascii"/&gt;

  &lt;xsl:template match="/"&gt;
    &lt;result&gt;hello world&lt;/result&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;"""

import sys
from Ft.Xml import InputSource
from Ft.Xml.Xslt import Processor
from Ft.Xml.Xslt.DomWriter import DomWriter
from Ft.Xml.Domlette import PrettyPrint

src_isrc = InputSource.DefaultFactory.fromString(SRC, 'http://foo/dummy.xml')
sty_isrc = InputSource.DefaultFactory.fromString(STY, 'http://foo/dummy.xsl')

from Ft.Xml.Domlette import implementation as impl
domlette_writer = DomWriter(implementation=impl)

proc = Processor.Processor()
proc.appendStylesheet(sty_isrc)
proc.run(src_isrc, writer=domlette_writer)

result_doc = domlette_writer.getResult()
PrettyPrint(result_doc)</programlisting>

    <para>This example writes the transform output to a file. This is a
    variant of <link linkend="ex.stdout">the earlier one</link>. Output is
    written to <filename>tmp.xml</filename>.</para>

    <programlisting>SRC = """&lt;?xml version="1.0"?&gt;&lt;dummy/&gt;"""

STY = """&lt;?xml version="1.0"?&gt;
&lt;xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&gt;

  &lt;xsl:output method="xml" indent="yes" encoding="us-ascii"/&gt;

  &lt;xsl:template match="/"&gt;
    &lt;result&gt;hello world&lt;/result&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;"""

import sys
from Ft.Xml import InputSource
from Ft.Xml.Xslt import Processor

src_isrc = InputSource.DefaultFactory.fromString(SRC, 'http://foo/dummy.xml')
sty_isrc = InputSource.DefaultFactory.fromString(STY, 'http://foo/dummy.xsl')

proc = Processor.Processor()
proc.appendStylesheet(sty_isrc)

f = open('tmp.xml', mode='w')
result = proc.run(src_isrc, outputStream=f)
f.close()</programlisting>

    <para>There are many more options available for customizing XSLT
    processing; see the <classname>Processor</classname> module documentation
    for details:</para>

    <screen>&gt;&gt;&gt; from Ft.Xml.Xslt import Processor
&gt;&gt;&gt; help(Processor)</screen>
  </section>

  <section>
    <title>Transform chaining</title>

    <para>4Suite provides some hooks for scenarios where the output from one
    transform becomes the source document for another. This is called
    transform chaining. The user still has to write the sequence of transform
    invocations in the Python API (the 4xslt command can perform chaining for
    the user). This section shows how.</para>

    <para>In the following example the next transform in the chain is set from
    within XSLT.</para>

    <programlisting># The first transform: just reproduces all para elements within a wrapper
TRANSFORM = """
&lt;xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:f="http://xmlns.4suite.org/ext"
  extension-element-prefixes="f"
&gt;

&lt;!-- Top level param so that user can pass in the next transform in the
     chain.  No default is declared here; the Python code below passes in
     the identity transform --&gt;
&lt;xsl:param name="next-xslt"/&gt;

&lt;!-- grab just the paras for the output --&gt;
&lt;xsl:template match="/"&gt;
  &lt;parawrapper&gt;
    &lt;xsl:apply-templates select="//para"/&gt;
  &lt;/parawrapper&gt;
  &lt;!-- Set the next transform in the chain.  You can also set to a
       hard-coded string --&gt;
  &lt;!-- notice that this is within a template, for instantiation --&gt;
  &lt;f:chain-to href="{$next-xslt}"/&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="para"&gt;
  &lt;xsl:copy-of select="."/&gt;
&lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;"""

DOC = """&lt;doc&gt;a&lt;para&gt;1&lt;/para&gt;b&lt;para&gt;2&lt;/para&gt;c&lt;/doc&gt;"""

from Ft.Xml.Xslt import Processor
from Ft.Xml import InputSource

transform = InputSource.DefaultFactory.fromString(TRANSFORM, "urn:x-bogus:main.xslt")

IDT = u'http://cvs.4suite.org/viewcvs/*checkout*/4Suite/Ft/Data/identity.xslt'

processor = Processor.Processor()
processor.appendStylesheet(transform)
source = InputSource.DefaultFactory.fromString(DOC, "urn:x-bogus:doc.xml")
result = processor.run(source, topLevelParams={(None, 'next-xslt'): IDT})
print result

# processor.chainTo is the fully-resolved absolute URI of the next transform,
# or None if there was no f:chain-to element instantiated in the transform that
# the processor last processed.
next = processor.chainTo

# Feed the result of the first transform to the next transform in the chain
processor = Processor.Processor()
processor.appendStylesheet(InputSource.DefaultFactory.fromUri(next))
source = InputSource.DefaultFactory.fromString(result, "urn:x-bogus:doc.xml")
result = processor.run(source)
print result

next = processor.chainTo                      # Should now be None
print "chainTo:", processor.chainTo</programlisting>

    <para>Note: There is not yet an API for automating the transform chain
    loop above. Ideas were discussed and an experiment was conducted <ulink url="http://mail.python.org/pipermail/xml-sig/2004-February/010146.html">here</ulink>.
    If you have ideas for a good API, please submit them to the mailing
    list.</para>
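
    <para>One possible shape for such a loop is sketched below. It is not part
    of 4Suite: <function>run_chain</function> and its parameter names are
    hypothetical. The processor class and input source factory are passed in
    as arguments, so the sketch relies only on the
    <methodname>appendStylesheet</methodname>, <methodname>run</methodname>,
    <methodname>fromUri</methodname> and <methodname>fromString</methodname>
    methods and the <varname>chainTo</varname> attribute used above. It
    assumes that each result is fed to the next transform, that the same
    top-level parameters are passed to every transform, and it does not guard
    against a chain that loops back on itself.</para>

    <programlisting>def run_chain(processor_class, isrc_factory, text, text_uri,
              first_transform_uri, params=None):
    """Apply a chain of transforms, feeding each result to the next one.

    The loop stops when a transform instantiates no f:chain-to element,
    that is, when the processor's chainTo attribute is None after a run.
    """
    next_uri = first_transform_uri
    while next_uri is not None:
        # A fresh processor for every link in the chain
        processor = processor_class()
        processor.appendStylesheet(isrc_factory.fromUri(next_uri))
        source = isrc_factory.fromString(text, text_uri)
        text = processor.run(source, topLevelParams=params)
        next_uri = processor.chainTo
    return text</programlisting>

    <para>With 4Suite, <parameter>processor_class</parameter> would be
    <classname>Processor.Processor</classname> and
    <parameter>isrc_factory</parameter> would be
    <literal>InputSource.DefaultFactory</literal>.</para>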
  </section>

  <section>
    <title>XSLT patterns</title>

    <para>XSLT defines a pattern language, based on XPath, which is used to
    declare the rules that determine which XSLT templates fire for which
    nodes in the XML source. The pattern implementation that 4Suite's XSLT
    library uses is also exposed as a library of its own. XSLT patterns are
    useful when your task is not so much to compute arbitrary information
    from a given node but, rather, to choose quickly from a collection of
    nodes the ones that meet some basic rules. This might seem a subtle
    difference. The following comparison might help illustrate it.</para>

    <itemizedlist>
      <listitem>
        <para>XPath task: extract the class attribute from all the child
        elements of the context node</para>
      </listitem>

      <listitem>
        <para>XSLT pattern task: given a list of nodes, sort them into piles
        of those that have a class attribute and those that have a title
        child</para>
      </listitem>
    </itemizedlist>

    <para>The main API for pattern processing in 4Suite is
    <classname>Ft.Xml.Xslt.PatternList</classname>. The following is a code
    snippet that takes a node and returns a list of patterns it
    matches.</para>

    <programlisting>from <systemitem class="library">Ft.Xml.Xslt</systemitem> import <classname>PatternList</classname>
from <systemitem class="library">Ft.Xml.Domlette</systemitem> import <classname>NonvalidatingReader</classname>

# the first pattern matches elements with a class attribute
# the second matches elements with a title child
PATTERNS = ["*[@class]", "*[title]"]

# Second parameter is a dictionary of prefix to namespace mappings
plist = PatternList(PATTERNS, {})

DOC = """
&lt;spam&gt;
  &lt;e1 class="1"/&gt;
  &lt;e2&gt;&lt;title&gt;A&lt;/title&gt;&lt;/e2&gt;
  &lt;e3 class="2"&gt;&lt;title&gt;B&lt;/title&gt;&lt;/e3&gt;
&lt;/spam&gt;"""

doc = NonvalidatingReader.parseString(DOC, "file:foo.xml")
for node in doc.documentElement.childNodes:
    # Don't forget that the white space text nodes before and after
    # e1, e2 and e3 elements are also child nodes of the spam element
    if node.nodeName[0] == "e":
        print plist.lookup(node)</programlisting>

    <para>The <classname>PatternList</classname> initializer takes a list of
    strings, which it conveniently converts to a list of compiled pattern
    objects. Such objects have a <methodname>match</methodname> method that
    returns a boolean value, but this example does not use that method
    directly. The <classname>PatternList</classname> initializer also takes a
    dictionary that makes up the namespace mapping. This example uses no
    namespaces, so the dictionary is empty. The
    <methodname>lookup</methodname> method is applied to a selection of the
    children of the <sgmltag class="element">spam</sgmltag> element (all the
    nodes whose name starts with "e", which happens to be all the element
    nodes). The output of the listing above follows:</para>

    <screen>[*[attribute::class]]
[*[child::title]]
[*[attribute::class], *[child::title]]</screen>

    <para>The output is a list of the representations of the pattern objects
    that matched each node. Notice how the axis abbreviations have been
    expanded in the pattern object representation.</para>
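
    <para>The sorting task described at the start of this section can also be
    sketched without 4Suite, to show the idea behind pattern matching in
    isolation. The following is a minimal sketch, assuming Python's standard
    xml.etree.ElementTree module and plain predicate functions as stand-ins
    for compiled patterns. The names PREDICATES and matching_patterns are
    hypothetical and are not part of the 4Suite API.</para>

    <programlisting>
```python
import xml.etree.ElementTree as ET

# Build the same tree as the example document, without parsing markup
spam = ET.Element("spam")
e1 = ET.SubElement(spam, "e1", {"class": "1"})
e2 = ET.SubElement(spam, "e2")
ET.SubElement(e2, "title").text = "A"
e3 = ET.SubElement(spam, "e3", {"class": "2"})
ET.SubElement(e3, "title").text = "B"

# Each "pattern" is a label plus a predicate that tests a single node
PREDICATES = [
    ("*[@class]", lambda node: "class" in node.attrib),
    ("*[title]", lambda node: node.find("title") is not None),
]

def matching_patterns(node):
    """Return the labels of all predicates that the node satisfies."""
    return [label for label, test in PREDICATES if test(node)]

# Sort the child elements into one pile per matching pattern
piles = {}
for child in spam:
    for label in matching_patterns(child):
        piles.setdefault(label, []).append(child.tag)

print(piles)
```
</programlisting>

    <para>As with <methodname>lookup</methodname>, a node can land in more
    than one pile: here e3 satisfies both tests.</para>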
  </section>
</section>

  <section xml:base="file:///Users/uche/share/4Suite/Documentation/CoreManualSections/XPath_and_XSLT_Extensions.xml">
  <title>XPath and XSLT extensions</title>

  <para>Sometimes the built-in facilities of XPath and XSLT aren't quite
  enough to meet your processing needs. Luckily it's easy to extend the
  function of these libraries using user extension functions and elements,
  which are written in Python.</para>

  <section>
    <title>Extension functions (XPath and XSLT)</title>

    <para>To define your own extension functions for XPath and XSLT, you write
    the corresponding Python functions in a module and provide a mapping from the
    desired XPath function names to Python function objects (or any callables). The following is a complete module that defines a single XPath function, <methodname>ord(s)</methodname>, which takes a string and returns the Unicode code point number for the first character in that string.</para>

    <programlisting>#ord.py
from Ft.Xml.XPath import Conversions

def Ord(context, s):
    '''
    Available in XPath as ord() as defined by ExtFunctions mapping below
    Takes an object, which is coerced to string
    Returns the Unicode code point number for the first character in that string
    Or returns -1 if it's an empty string
    '''
    s = Conversions.StringValue(s)  #Coerce the passed object to string
    if s:
        return ord(s[0])
    else:
        return -1

ExtFunctions = {
    (u'urn:x-4suite:x', u'ord'): Ord,
}

</programlisting>

    <para>As this simple example illustrates, the
    basic way to map XPath function names to Python function objects is in a
    dictionary named "ExtFunctions", global to the module in which the
    extension function is defined. The XPath/XSLT extension names are
    expressed as Python tuples of two Unicode objects. If you're familiar
    with XPath, this is just a Python representation of an expanded name.
    The first item in the expanded name tuple is
    the namespace URI for the function, and the second is the local name.
    The namespace URI cannot be an empty string.</para>
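
    <para>The role of the expanded name can be illustrated outside 4Suite. The
    following is a minimal sketch, assuming a dictionary shaped like
    "ExtFunctions" and a prefix map like the one given to an XPath context.
    The names EXT_FUNCTIONS, ord_first and resolve are hypothetical, and the
    lookup shown is only an illustration of the rule, not 4Suite's actual
    implementation.</para>

    <programlisting>
```python
def ord_first(context, s):
    """Return the code point of the first character of s, or -1 if empty."""
    s = str(s)
    if s:
        return ord(s[0])
    return -1

# Same shape as an ExtFunctions dictionary: (namespace URI, local name) keys
EXT_FUNCTIONS = {
    (u"urn:x-4suite:x", u"ord"): ord_first,
}

def resolve(qname, prefixes, functions):
    """Expand a prefixed name and look up the matching callable."""
    prefix, local = qname.split(":", 1)
    expanded = (prefixes[prefix], local)
    return functions[expanded]

# The prefix is arbitrary; only the namespace URI it maps to matters
func_a = resolve("ext:ord", {"ext": u"urn:x-4suite:x"}, EXT_FUNCTIONS)
func_b = resolve("other:ord", {"other": u"urn:x-4suite:x"}, EXT_FUNCTIONS)
print(func_a is func_b)
print(func_a(None, "abc"))
```
</programlisting>

    <para>Both prefixed names resolve to the same callable because they expand
    to the same (namespace URI, local name) pair.</para>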

    <para>You have to actually tell the processor to load your extension modules.  There are several ways to do so.</para>

    <orderedlist>
      <listitem>
        <para>From Python code, you can register them in a context object used for XPath processing
        by using the optional
        <parameter>extModuleList</parameter> argument to pass in a list of module
        objects.</para>

        <programlisting>from Ft.Xml import Parse
from Ft.Xml.XPath.Context import Context
from Ft.Xml.XPath import Evaluate
import ord #The code listed above

doc = Parse('&lt;doc&gt;abc&lt;/doc&gt;')
ctx = Context(doc, extModuleList=[ord],
              processorNss={u'ext': u'urn:x-4suite:x'})
# Pass the context by keyword; the second positional argument is the context node
result = Evaluate(u'ext:ord(.)', context=ctx)
</programlisting>
      </listitem>

      <listitem>
        <para>You can also register particular functions, rather than a
        complete module, in an XPath context object using the
        optional <parameter>extFunctionMap</parameter> argument. It takes
        a mapping dictionary similar to the <constant>ExtFunctions</constant> dictionary shown in the above sample module.</para>

        <programlisting>from Ft.Xml import Parse
from Ft.Xml.XPath.Context import Context
from Ft.Xml.XPath import Evaluate
from ord import Ord #The code listed above

doc = Parse('&lt;doc&gt;abc&lt;/doc&gt;')
ctx = Context(doc, extFunctionMap={(u'urn:x-4suite:x', u'ord'): Ord},
              processorNss={u'ext': u'urn:x-4suite:x'})
# Pass the context by keyword; the second positional argument is the context node
result = Evaluate(u'ext:ord(.)', context=ctx)
</programlisting>
      </listitem>

      <listitem>
        <para>If you are using the XSLT processor, you can register extension functions on a processor object using
        the <methodname>registerExtensionModules()</methodname> method.</para>

        <programlisting>from Ft.Xml.Xslt import Processor
import ord #The code listed above

processor = Processor.Processor()
processor.registerExtensionModules([ord])
#Now continue to append stylesheets and run against source docs.  The
#extensions in the module ord will be available
</programlisting>
      </listitem>

      <listitem>
        <para>When using the XSLT processor, you can also register individual extension functions on a processor object using the
        <methodname>registerExtensionFunction()</methodname> method. It takes
        the namespace and
        local name for the extension function and the callable object that implements it.</para>

        <programlisting>from Ft.Xml.Xslt import Processor
from ord import Ord #The code listed above

processor = Processor.Processor()
processor.registerExtensionFunction(u'urn:x-4suite:x', u'ord', Ord)
#Now continue to append stylesheets and run against source docs.  The
#ext:ord() function will be available
</programlisting>
      </listitem>

      <listitem>
        <para>In some cases the user can list extension modules using
        the environment variable "EXTMODULES". "EXTMODULES" is a
        colon-separated list of Python module names. This works for the 4xslt
        command line and for <classname>Ft.Xml.XPath.Evaluate</classname>. For
        other APIs, use one of the other methods listed here, which can easily be
        extended to read the "EXTMODULES" variable. In general the other methods for registering extensions are preferable.</para>
      </listitem>
    </orderedlist>

    <para>Note that extension modules will automatically be
    searched for XSLT extension elements as well as functions.</para>

    <para>The following is a longer example: a module that implements two functions. One returns
    the current time and the other creates a hash of the context node name:</para>

    <programlisting># demo.py
import time
from Ft.Xml.XPath import Conversions

def GetCurrentTime(context):
    '''available in XPath as get-current-time()'''
    return time.asctime(time.localtime())

def HashContextName(context, maxkey):
    '''
    available in XPath as hash-context-name(maxkey),
    where maxkey is an object converted to number
    '''
    # It is a good idea to use the appropriate core function to coerce
    # arguments to the expected type
    maxkey = Conversions.NumberValue(maxkey)
    # Sum the code points of the characters in the node name
    key = sum([ord(c) for c in context.node.nodeName])
    return key % maxkey

ExtFunctions = {
    ('urn:x-4suite:x', 'get-current-time'): GetCurrentTime,
    ('urn:x-4suite:x', 'hash-context-name'): HashContextName
}

</programlisting>

    <para>You can use this in plain XPath as follows:</para>

    <programlisting>import demo  # The module listed above
from <systemitem class="library">Ft.Xml.XPath.Context</systemitem> import <classname>Context</classname>
from <systemitem class="library">Ft.Xml.XPath</systemitem> import <classname>Compile</classname>, <classname>Evaluate</classname>
from <systemitem class="library">Ft.Xml.Domlette</systemitem> import <classname>NonvalidatingReader</classname>

DOC = "&lt;spam xmlns='http://spam.com'&gt;eggs&lt;/spam&gt;"

# The prefix x is bound to the namespace used in demo.ExtFunctions
ctx = Context(None, extFunctionMap=demo.ExtFunctions,
              processorNss={"a": "http://spam.com", "x": "urn:x-4suite:x"})
expr = Compile("x:get-current-time()")

doc = NonvalidatingReader.parseString(DOC, "http://spam.com/base")
print Evaluate(expr, doc, ctx)</programlisting>

    <para>Notice that you might choose to use None for the extension function
    namespaces. If so, you don't need to specify the processorNss context
    attribute, but you might want to watch out for clashes with other
    extension function names, including the built-in library. Again, if you
    plan to use an extension function from within XSLT, its namespace URI must
    not be None.</para>

    <para>You can use this in XSLT just as easily:</para>

    <programlisting># useextfunc.py

TRANSFORM = """&lt;?xml version="1.0"?&gt;
&lt;xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:s="urn:x-4suite:x"
  version="1.0"&gt;

  &lt;xsl:template match="/"&gt;
    &lt;xsl:value-of select="s:get-current-time()"/&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;
"""

SOURCE = """&lt;dummy/&gt;"""

from <systemitem class="library">Ft.Xml.Xslt</systemitem> import <classname>Processor</classname>
processor = Processor.Processor()

# Register the extension function using method (3)
processor.registerExtensionModules(['demo'])
from Ft.Xml import InputSource
transform = InputSource.DefaultFactory.fromString(TRANSFORM, "http://foo.com")
source = InputSource.DefaultFactory.fromString(SOURCE, "http://foo.com")
processor.appendStylesheet(transform)
result = processor.run(source)
print result</programlisting>

<!-- Not sure why this example is needed, and besides, it hardcodes UNIX environment separator
    <para>You could also add the demo module to the enviroment, in
    "EXTMODULES", and still ensure that it is loaded by using the following
    code:</para>

    <programlisting>module_list = os.environ.get("EXTMODULES")
if module_list:
    processor.registerExtensionModules(moduleList.split(":"))</programlisting>
-->

    <para>For good examples of modules with extension functions, see the source code for the modules
    <classname>Ft.Xml.XPath.BuiltInExtFunctions</classname>,
    <classname>Ft.Xml.Xslt.BuiltInExtFunctions</classname> and the modules in
    <systemitem class="library">Ft.Xml.Xslt.Exslt</systemitem>. The latter are
    especially good examples given their diversity and detailed specifications
    at <ulink url="http://exslt.org">exslt.org</ulink>.</para>

  </section>

  <section>
    <title>Extension elements (XSLT)</title>

    <para>To define your own extension elements, define a class derived from
    <classname>Ft.Xml.Xslt.XsltElement</classname>. The module in which it is
    defined should have a global dictionary named "ExtElements" mapping element
    expanded names to element class objects.</para>

    <para>Finally, modules containing any extension elements used must be
    indicated as such to the processor in one of several ways.</para>

    <orderedlist>
      <listitem>
        <para>You can register all extension functions and elements in a module by using a processor object's
        <methodname>registerExtensionModules()</methodname> method.</para>

        <programlisting>from Ft.Xml.Xslt import Processor
import ord #The module from the previous section

processor = Processor.Processor()
processor.registerExtensionModules([ord])
#Now continue to append stylesheets and run against source docs.  The
#extensions in the module ord will be available
</programlisting>
      </listitem>

      <listitem>
        <para>You can also register individual extension elements on a processor object using the
        <methodname>registerExtensionElement()</methodname> method. It takes
        the namespace and
        local name for the extension element and the class that implements it.</para>
      </listitem>

<!--
      <programlisting><![CDATA[from Ft.Xml.Xslt import Processor
from ord import Ord #The code listed above

processor = Processor.Processor()
processor.registerExtensionElement(u'urn:x-4suite:x', u'ord', Ord)
#Now continue to append stylesheets and run against source docs.  The
#ext:ord() function will be available
      ]]></programlisting>
-->

      <listitem>
        <para>In some cases the user can list extension modules using
        the environment variable "EXTMODULES". "EXTMODULES" is a
        colon-separated list of Python module names. This works for the 4xslt
        command line and for <classname>Ft.Xml.XPath.Evaluate</classname>. For
        other APIs, use one of the other two methods, which can easily be
        extended to read the "EXTMODULES" variable. In general the other methods for registering extensions are preferable.</para>
      </listitem>
    </orderedlist>


    <para>Note that extension modules will automatically be
    searched for XPath extension functions as well as extension
    elements.</para>

<!--
    <para>For example:</para>

    <programlisting># extelement.py

import os
from <systemitem class="library">Ft.Xml.Xslt</systemitem> import <constant>XSL_NAMESPACE</constant>, <classname>XsltElement</classname>, <classname>XsltException</classname>, <classname>Error</classname>
from <systemitem class="library">Ft.Xml.Xslt</systemitem> import <classname>ContentInfo</classname>, <classname>AttributeInfo</classname>

EXT_NAMESPACE = 'http://foo.org/namespaces/ext-xslt'

class SystemElement(XsltElement):
    """
    Execute an arbitrary operating system command.
    Because of the security issues, use at your risk  :-)
    """
    content = ContentInfo.Empty   #Specify that this must be an empty element
    legalAttrs = {
      'command': AttributeInfo.StringAvt(description='The command to be executed'),
    }

    def instantiate(self, context, processor):
        command = self._command.evaluate(context)
        os.system(command)
        return (context,)

# The global dictionary that must be present in all extensions
ExtElements = {
  (EXT_NAMESPACE, 'system'): SystemElement,
}

# And optional dictionary, purely for documentation purposes,
# Which gives the prefix to use for extension namespaces in documentation
ExtNamespaces = {
  EXT_NAMESPACE : 'e',
}</programlisting>

    <para>And a little script that demonstrates using it (with registration
    method (2):</para>

    <programlisting># useext.py

TRANSFORM = """&lt;?xml version="1.0"?&gt;
&lt;xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:ext="http://foo.org/namespaces/ext-xslt"
  extension-element-prefixes="ext"
  version="1.0"&gt;

  &lt;xsl:template match="execute-command"&gt;
    &lt;ext:system command="{@cmd}"/&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;
"""

SOURCE = """&lt;execute-command cmd="dir"/&gt;"""

from <systemitem class="library">Ft.Xml.Xslt</systemitem> import <classname>Processor</classname>
processor = Processor.Processor()
# Register the extension element
processor.registerExtensionModules(['extelement'])
from Ft.Xml import InputSource
transform = InputSource.DefaultFactory.fromString(TRANSFORM, "http://foo.com")
source = InputSource.DefaultFactory.fromString(SOURCE, "http://foo.com")
processor.appendStylesheet(transform)
result = processor.run(source)
print result</programlisting>
-->

  </section>

  <section>
    <title>Extension element API</title>

    <para>There are several aspects of the extension element API worth
    discussing in more detail.</para>

    <para>The class-level "content" variable specifies a content model to be
    enforced by the XSLT processor. If the element is used in a way that
    doesn't meet the specified content model, the user will get an error
    message. The content model is a structure that uses certain special
    classes, including:</para>

    <itemizedlist>
      <listitem>
        <para>ContentInfo.Empty - matches no content at all (empty
        element)</para>
      </listitem>

      <listitem>
        <para>ContentInfo.Text - matches plain text content</para>
      </listitem>

      <listitem>
        <para>ContentInfo.Seq - matches the given sequence of
        sub-patterns</para>
      </listitem>

      <listitem>
        <para>ContentInfo.Alt - matches one of the given choice of
        sub-patterns</para>
      </listitem>

      <listitem>
        <para>ContentInfo.Rep - matches 0 or more repeated instances of the
        given sub-pattern</para>
      </listitem>

      <listitem>
        <para>ContentInfo.Rep1 - matches 1 or more repeated instances of the
        given sub-pattern</para>
      </listitem>

      <listitem>
        <para>ContentInfo.Opt - matches zero or one of the given
        sub-pattern</para>
      </listitem>

      <listitem>
        <para>ContentInfo.ResultElements - matches elements not in the XSL
        namespace</para>
      </listitem>

      <listitem>
        <para>ContentInfo.Instructions - matches any sequence of XSLT elements
        categorized as instructions in the spec</para>
      </listitem>

      <listitem>
        <para>ContentInfo.Template - matches an XSLT template body according
        to the spec</para>
      </listitem>

      <listitem>
        <para>ContentInfo.TopLevelElements - matches any sequence of XSLT
        elements categorized as top level in the spec</para>
      </listitem>

      <listitem>
        <para>ContentInfo.QName - matches a particular element by giving its
        namespace and node name (the prefix in the node name is only used for
        documentation and error messages)</para>
      </listitem>
    </itemizedlist>

    <para>So, for instance, the xsl:choose element would be described
    as</para>

    <programlisting>content = ContentInfo.Seq(
    ContentInfo.Rep1(ContentInfo.QName(XSL_NAMESPACE, 'xsl:when')),
    ContentInfo.Opt(ContentInfo.QName(XSL_NAMESPACE, 'xsl:otherwise')),
    )</programlisting>
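
    <para>To make the meaning of Seq, Rep1 and Opt concrete, the following
    minimal sketch checks a list of child element names against the same
    xsl:choose model using a regular expression. It assumes nothing about how
    4Suite implements ContentInfo; the names CHOOSE_MODEL and is_valid_choose
    are hypothetical.</para>

    <programlisting>
```python
import re

# Seq(Rep1(xsl:when), Opt(xsl:otherwise)) written as a regular expression
# over a list of child element names, each followed by a single space
CHOOSE_MODEL = re.compile(r"(xsl:when )+(xsl:otherwise )?$")

def is_valid_choose(child_names):
    """Return True if the children satisfy the xsl:choose content model."""
    encoded = "".join(name + " " for name in child_names)
    return CHOOSE_MODEL.match(encoded) is not None

print(is_valid_choose(["xsl:when", "xsl:when", "xsl:otherwise"]))
print(is_valid_choose(["xsl:otherwise"]))
```
</programlisting>

    <para>The first call satisfies the model. The second does not, because
    Rep1 requires at least one xsl:when before the optional
    xsl:otherwise.</para>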

    <para>The class-level "legalAttrs" variable specifies the attributes
    allowed or required on the element. It is a Python dictionary mapping
    each attribute name to its specification. The specification is an instance
    of a class chosen according to the type of attribute.</para>

    <para>The following are the supported attribute classes. The parameters
    specified are for the initializer. Note that most general patterns have a
    plain variant and an attribute value template (AVT) variant:</para>

    <itemizedlist>
      <listitem>
        <para>AttributeInfo.String - any XPath string</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.StringAvt - an AVT yielding any string</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.Char - any XPath string of length 1</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.CharAvt - AVT version of Char</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.Choice - a string which must be one of a number of
        given values. The values are given by a list of strings, which is the
        first parameter</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.ChoiceAvt - AVT version of Choice</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.YesNo - Abbreviation for
        AttributeInfo.Choice(['yes', 'no'])</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.YesNoAvt - AVT version of YesNo</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.Number - any XPath number</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.NumberAvt - AVT version of Number</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.UriReference - XPath string that is syntactically
        a URI reference</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.UriReferenceAvt - AVT version of
        UriReference</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.Id - XPath string that is syntactically an XML
        ID</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.IdAvt - AVT version of Id</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.QName - XPath string that is syntactically an XML
        namespaces qualified name</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.QNameAvt - AVT version of QName</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.NCName - XPath string that is syntactically an XML
        namespaces "no colon" name</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.NCNameAvt - AVT version of NCName</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.Prefix - Same as NCName</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.PrefixAvt - Same as NCNameAvt</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.NMToken - XPath string that is syntactically an
        XML Name token</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.NMTokenAvt - AVT version of NMToken</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.QNameButNotNCName - A QName that contains a
        colon</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.QNameButNotNCNameAvt - AVT version of
        QNameButNotNCName</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.Token - XPath string that is syntactically an
        XPath name test (e.g. "foo", "ns:foo", "ns:*" or "*")</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.TokenAvt - AVT version of Token</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.Expression - XPath string that is syntactically an
        XPath expression</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.ExpressionAvt - AVT version of Expression</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.StringExpression - XPath string that is
        syntactically an XPath expression, which would be expected to return a
        string value</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.StringExpressionAvt - AVT version of
        StringExpression</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.NodeSetExpression - XPath string that is
        syntactically an XPath expression, which would be expected to return a
        node set value</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.NodeSetExpressionAvt - AVT version of
        NodeSetExpression</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.NumberExpression - XPath string that is
        syntactically an XPath expression, which would be expected to return a
        number value</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.NumberExpressionAvt - AVT version of
        NumberExpression</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.BooleanExpression - XPath string that is
        syntactically an XPath expression, which would be expected to return a
        boolean value</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.BooleanExpressionAvt - AVT version of
        BooleanExpression</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.Pattern - XPath string that is syntactically an
        XSLT pattern</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.PatternAvt - AVT version of Pattern</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.Tokens - XPath string that is syntactically a
        space-delimited series of tokens</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.TokensAvt - AVT version of Tokens</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.QNames - XPath string that is syntactically a
        space-delimited series of QNames</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.QNamesAvt - AVT version of QNames</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.Prefixes - XPath string that is syntactically a
        space-delimited series of NCNames</para>
      </listitem>

      <listitem>
        <para>AttributeInfo.PrefixesAvt - AVT version of Prefixes</para>
      </listitem>
    </itemizedlist>

    <para>All of these classes take the following optional keyword
    parameters:</para>

    <itemizedlist>
      <listitem>
        <para>description - for documentation</para>
      </listitem>

      <listitem>
        <para>default - the default value of the attribute to be used if
        omitted</para>
      </listitem>
    </itemizedlist>

    <para>Some examples from the XSLT spec:</para>

    <para>xsl:output</para>

    <programlisting>content = ContentInfo.Empty
legalAttrs = {
    'method' : AttributeInfo.QName(),
    'version' : AttributeInfo.NMToken(),
    'encoding' : AttributeInfo.String(),
    'omit-xml-declaration' : AttributeInfo.YesNo(),
    'standalone' : AttributeInfo.YesNo(),
    'doctype-public' : AttributeInfo.String(),
    'doctype-system' : AttributeInfo.String(),
    'cdata-section-elements' : AttributeInfo.QNames(),
    'indent' : AttributeInfo.YesNo(),
    'media-type' : AttributeInfo.String(),
    }</programlisting>

    <para>xsl:sort</para>

    <programlisting>content = ContentInfo.Empty
legalAttrs = {
    'select' : AttributeInfo.StringExpression(default='.'),
    'lang' : AttributeInfo.NMTokenAvt(),
    # We don't support any additional data-types, hence no
    # AttributeInfo.QNameButNotNCName()
    'data-type' : AttributeInfo.ChoiceAvt(['text', 'number'],
                                          default='text'),
    'order' : AttributeInfo.ChoiceAvt(['ascending', 'descending'],
                                      default='ascending'),
    'case-order' : AttributeInfo.ChoiceAvt(['upper-first', 'lower-first']),
    }</programlisting>

    <para>xsl:number</para>

    <programlisting>content = ContentInfo.Empty
legalAttrs = {
    'level' : AttributeInfo.Choice(['single', 'multiple', 'any'],
                                   default='single'),
    'count' : AttributeInfo.Pattern(),
    'from' : AttributeInfo.Pattern(),
    'value' : AttributeInfo.Expression(),
    'format' : AttributeInfo.StringAvt(default='1'),
    'lang' : AttributeInfo.NMToken(),
    'letter-value' : AttributeInfo.ChoiceAvt(['alphabetic', 'traditional']),
    'grouping-separator' : AttributeInfo.CharAvt(),
    'grouping-size' : AttributeInfo.NumberAvt(default=0),
    }</programlisting>

    <para>Of course, it's always a good idea to use descriptions, which the
    above do not.</para>

    <!-- Perhaps add links to modules on viewcvs -->

    <para>For good examples of modules with extension elements, see the source code for the modules
    Ft.Xml.Xslt.BuiltInExtElements and Ft.Xml.Xslt.Exslt.Common. The various
    modules in Ft.Xml.Xslt.Exslt are diverse and make good
    examples, especially given their detailed specifications at <ulink url="http://exslt.org">exslt.org</ulink>.</para>

    <section>
      <title>Controlling output from XSLT extensions</title>

      <para>The most common special need for XSLT extensions is to generate
      XSLT output. For extension elements this is easy enough to do using the
      API on the processor instance that is passed to the instantiate() method
      of extension element classes. For example:</para>

      <programlisting>class SpamElement(XsltElement):
    legalAttrs = {}
    def instantiate(self, context, processor):
        processor.output().startElement('title')
        processor.output().text('Life of Brian')
        processor.output().endElement('title')
        return (context,)</programlisting>

      <para>Extension functions are not passed a processor instance directly,
      but context objects hold a reference to the processor in effect, so the
      following example works:</para>

      <programlisting>def Spam(context):
    context.processor.output().startElement('title')
    context.processor.output().text('Life of Brian')
    context.processor.output().endElement('title')
    return</programlisting>

      <para>However, it is probably better design to reserve such side effects
      as output for extension elements rather than functions.</para>

      <para>In the above examples, the elements and text that are put out just use the
      current output parameters. In order to change output parameters or
      change the output stream, you can stack a new output handler:</para>

      <programlisting>import cStringIO
from Ft.Xml import EMPTY_NAMESPACE

stream = cStringIO.StringIO()

# Clone the current output parameters
op = processor.writers[-1]._outputParams.clone()

# Force the XML output method and omit the XML declaration.
# The output method is a qualified name, so the null namespace
# must be given to select the standard xml method
op.method = (EMPTY_NAMESPACE, 'xml')
op.omitXmlDeclaration = "yes"

# Push the new handler to the top of the writer stack
processor.addHandler(op, stream)
processor.output().startElement('title')
processor.output().text('Life of Brian')
processor.output().endElement('title')

# Pop back to the previous handler; stream.getvalue()
# now contains the new output
processor.removeHandler()</programlisting>
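
      <para>The stacking behavior itself can be shown with a toy model. The
      following is a minimal sketch, assuming a hypothetical ToyProcessor
      class that keeps only a stack of streams. It borrows the method names
      used above for readability, but it is not the 4Suite processor: its
      addHandler takes only a stream, with no output parameters, and its
      output() returns the stream itself rather than a writer.</para>

      <programlisting>
```python
import io

class ToyProcessor(object):
    """Toy stand-in that keeps a stack of output streams."""

    def __init__(self, stream):
        self.writers = [stream]

    def addHandler(self, stream):
        # Push a new stream; output is redirected until it is popped
        self.writers.append(stream)

    def removeHandler(self):
        # Pop back to the previous stream
        self.writers.pop()

    def output(self):
        # Output always goes to the stream on top of the stack
        return self.writers[-1]

main = io.StringIO()
side = io.StringIO()
processor = ToyProcessor(main)

processor.output().write(u"before ")
processor.addHandler(side)
processor.output().write(u"Life of Brian")
processor.removeHandler()
processor.output().write(u"after")

print(main.getvalue())
print(side.getvalue())
```
</programlisting>

      <para>Text written while the second stream is on top of the stack never
      reaches the main stream, which is why the handler must be removed again
      once the redirected output is complete.</para>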
    </section>

    <section>
      <title>Creating result tree fragments</title>

      <para>Another common need is to treat the body of an extension element
      as a template so that something can be done with the RTF that results
      from it. The following example demonstrates this:</para>

      <programlisting>try:
    # Set the output to an RTF writer, which will create an RTF for us
    processor.pushResultTree(self.baseUri)

    # The template is manifested as children of the extension element
    # node.  Instantiate each in turn
    for child in self.children:
        child.instantiate(context, processor)
# You want to be sure you re-balance the stack even in case of error
finally:
    # Retrieve the resulting RTF
    result_rtf = processor.popResult()</programlisting>
    </section>

    <section>
      <title>Communicating with the external code that invokes XSLT</title>

      <para>You can set and communicate state information with external code
      by using the processor.extensionParams attribute. For example, the
      following sets a time stamp of precisely when the extension was
      instantiated, which can later be retrieved from the processor after the
      XSLT process, or even by later extensions. In a similar way, state can
      be set up by calling functions and retrieved by extensions.</para>

      <programlisting># Extension parameters have fully qualified names, so you must come up
# with a namespace to set them
processor.extensionParams[(SPAM_NAMESPACE, 'tstamp')] = time.time()</programlisting>
    </section>
  </section>
</section>

  <section xml:base="file:///Users/uche/share/4Suite/Documentation/CoreManualSections/MarkupWriter.xml" id="mwriter">
  <title>Streaming XML output</title>

  <para><classname>MarkupWriter</classname> is a streaming
  <acronym>API</acronym> for generating <acronym>XML</acronym>. The
  <literal>Ft.Xml.MarkupWriter</literal> class is specialized for creating
  <acronym>XML</acronym> documents from scratch. Documents written with
  <classname>MarkupWriter</classname> are written to the output (standard
  output or another file-like object) as you build them, so if you need to
  process the document in memory, you may need a DOM-like tool instead
  (e.g., Domlette or Amara).</para>

  <para>4Suite partitions <acronym>XML</acronym> serializers into two
  categories: writers and printers.<itemizedlist>
      <listitem>
        <para>A writer is a module that exposes a broad public
        <acronym>API</acronym> for building output incrementally.</para>
      </listitem>

      <listitem>
        <para>A printer is a module that simply takes a <acronym>DOM</acronym>
        and creates output from it as a whole, within one
        <acronym>API</acronym> invocation.</para>
      </listitem>
    </itemizedlist><classname>MarkupWriter</classname> is the primary example
  of this writer category of <acronym>XML</acronym> serializers.</para>
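
  <para>The distinction can be sketched with Python's standard library
  rather than 4Suite itself, as a loose analogy only. Here
  <classname>xml.sax.saxutils.XMLGenerator</classname> plays the writer role
  and <function>xml.etree.ElementTree.tostring</function> plays the printer
  role. Neither is part of 4Suite, the function names
  <function>write_incrementally</function> and
  <function>print_whole_tree</function> are made up for this illustration,
  and the sketch assumes Python 3.</para>

  <programlisting>
```python
import io
import xml.etree.ElementTree as ET
from xml.sax.saxutils import XMLGenerator


def write_incrementally():
    """Writer style: every call emits markup to the stream right away."""
    stream = io.StringIO()
    gen = XMLGenerator(stream, encoding="utf-8")
    gen.startDocument()
    gen.startElement("vendor", {})
    # The start tag is already in the stream, before the element is closed.
    partial = stream.getvalue()
    gen.startElement("name", {})
    gen.characters("Centigrade systems")
    gen.endElement("name")
    gen.endElement("vendor")
    gen.endDocument()
    return partial, stream.getvalue()


def print_whole_tree():
    """Printer style: build the tree first, then serialize it in one call."""
    vendor = ET.Element("vendor")
    ET.SubElement(vendor, "name").text = "Centigrade systems"
    return ET.tostring(vendor, encoding="unicode")


partial_output, writer_output = write_incrementally()
printer_output = print_whole_tree()
print(partial_output)
print(writer_output)
print(printer_output)
```
  </programlisting>

  <para>The writer has produced part of the document before the rest of it
  exists, whereas the printer needs the finished tree before it can emit
  anything.</para>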

  <para>The following example uses this class for generating a simple
  <acronym>XML</acronym> Software Autoupdate (XSA) file. XSA is an
  <acronym>XML</acronym> data format for listing and describing software
  packages.</para>

  <programlisting>from <systemitem class="library">Ft.Xml</systemitem> import <classname>MarkupWriter</classname>

# Set the output doc type details (required by XSA)
SYSID = u"http://www.garshol.priv.no/download/xsa/xsa.dtd"
PUBID = u"-//LM Garshol//DTD XML Software Autoupdate 1.0//EN//XML"
writer = MarkupWriter(indent=u"yes", doctypeSystem=SYSID,
                      doctypePublic=PUBID)
writer.startDocument()
writer.startElement(u'xsa')
writer.startElement(u'vendor')

# Element with simple text (#PCDATA) content
writer.simpleElement(u'name', content=u'Centigrade systems')
writer.simpleElement(u'email', content=u"info@centigrade.bogus")
writer.endElement(u'vendor')

# Element with an attribute
writer.startElement(u'product', attributes={u'id': u"100\u00B0"})
writer.simpleElement(u'name', content=u"100\u00B0 Server")
writer.simpleElement(u'version', content=u"1.0")
writer.simpleElement(u'last-release', content=u"20030401")

# Empty element
writer.simpleElement(u'changes')
writer.endElement(u'product')
writer.endElement(u'xsa')
writer.endDocument()</programlisting>

  <para>This is the output we get from the code above: <screen>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE xsa PUBLIC "-//LM Garshol//DTD XML Software Autoupdate 1.0//EN//XML" "http://www.garshol.priv.no/download/xsa/xsa.dtd"&gt;
&lt;xsa&gt;
  &lt;vendor&gt;
    &lt;name&gt;Centigrade systems&lt;/name&gt;
    &lt;email&gt;info@centigrade.bogus&lt;/email&gt;
  &lt;/vendor&gt;
  &lt;product id="100°"&gt;
    &lt;name&gt;100° Server&lt;/name&gt;
    &lt;version&gt;1.0&lt;/version&gt;
    &lt;last-release&gt;20030401&lt;/last-release&gt;
    &lt;changes/&gt;
  &lt;/product&gt;
&lt;/xsa&gt;</screen></para>

  <para>The above example illustrates some of the basics of using the
  <classname>MarkupWriter</classname> class. The following sections describe
  both the essential and the advanced features of this class. In many cases,
  there is more than one way to output a given part of a
  document.</para>

  <section id="mwriter.begin">
    <title>Starting with MarkupWriter</title>

    <para>After importing the <classname>MarkupWriter</classname> class, you
    have to create a <classname>MarkupWriter</classname> object instance and
    then start the new document. (See below for <link linkend="mwriter.output">output options</link> of
    <classname>MarkupWriter</classname>.) Remember that you are working with a
    streaming <acronym>API</acronym>. You must decide what features you want
    your output to have before you start to write that output.<programlisting>&gt;&gt;&gt; from Ft.Xml import MarkupWriter
&gt;&gt;&gt; writer = MarkupWriter()
&gt;&gt;&gt; writer.startDocument() </programlisting></para>

    <para>You are now ready to add data to the new document.</para>

    <important>
      <para>Make sure that all of your data (element names, attributes,
      content, etc) are Python unicode objects.</para>
    </important>
  </section>

  <section id="mwriter.elements">
    <title>How to insert elements</title>

    <para>There are two ways to add new elements as children of other document
    or element nodes.</para>

    <orderedlist>
      <listitem>
        <para>When you want to add a new element that will itself have child
        elements, you can use the
        <methodname>startElement</methodname>/<methodname>endElement</methodname>
        method combination to signal the beginning and the ending of an
        element, respectively.<programlisting>writer.startElement(u'xsa')
# other document content can be output here
writer.endElement(u'xsa')</programlisting></para>
      </listitem>

      <listitem>
        <para>Alternatively, you can use the
        <methodname>simpleElement</methodname> method, which is a shortcut for
        the
        <methodname>startElement</methodname>/<methodname>endElement</methodname>
        combination and produces an element with no content or with text
        content (if you specify the <parameter>content</parameter> parameter).
        <programlisting>writer.simpleElement(u'xsa')</programlisting></para>
      </listitem>
    </orderedlist>
  </section>

  <section id="mwriter.attributes">
    <title>How to insert attributes</title>

    <para>There are two ways to add attributes to elements:</para>

    <orderedlist>
      <listitem>
        <para>First, you can use the <parameter>attributes</parameter>
        parameter of the <methodname>startElement</methodname> method. This
        parameter is a dictionary which maps each attribute name to the value
        of that attribute. If an attribute's name is in a namespace, then you
        must specify the name as a Python tuple, with the attribute's QName as
        the first member of the tuple, and the namespace URI as the second
        member of the tuple. For an example of this advanced syntax, see <xref linkend="mwriter.examples.xhtml"/>.<programlisting>writer.startElement(u'product', attributes={u'id': u"100\u00B0"})</programlisting></para>
      </listitem>

      <listitem>
        <para>Alternatively, you can use a distinct
        <methodname>attribute</methodname> method with two parameters: the
        attribute's name and the attribute's value. As with the dictionary
        approach above, if the attribute's name is in a namespace, then the
        whole name should be a Python tuple. <programlisting>writer.startElement(u'product')
writer.attribute(u'id', u"100\u00B0")</programlisting></para>
      </listitem>
    </orderedlist>
  </section>

  <section id="mwriter.text">
    <title>How to insert text nodes</title>

    <para>Similarly, there are two ways to add text nodes to elements.</para>

    <orderedlist>
      <listitem>
        <para>First, the <methodname>simpleElement</methodname> method takes a
        <parameter>content</parameter> parameter, which can be used to create
        a single text node child of the node with the specified
        name.<programlisting>writer.simpleElement(u'name', content=u'Centigrade systems')</programlisting></para>
      </listitem>

      <listitem>
        <para>Alternatively, instances of the
        <classname>MarkupWriter</classname> class, such as
        <varname>writer</varname>, have a <methodname>text</methodname> method
        that inserts a single text node as the <emphasis>next</emphasis> child
        of the element which was last started with the
        <methodname>startElement</methodname> method and which has not yet
        been closed with the <methodname>endElement</methodname>
        method.<programlisting>writer.startElement(u'product')
writer.text(u'Centigrade systems')
writer.endElement(u'product')</programlisting></para>
      </listitem>
    </orderedlist>
  </section>

  <section id="mwriter.chunk">
    <title>How to insert a complete chunk</title>

    <para><classname>MarkupWriter</classname> also allows you to insert
    well-formed <acronym>XML</acronym> entities as complete chunks in the
    output. This is a very convenient way to emit boilerplate
    <acronym>XML</acronym> without breaking it down into all the separate
    element/attribute/content bits. For example, the lines:</para>

    <programlisting>writer.simpleElement(u'name', content=u"100\u00B0 Server")
writer.simpleElement(u'version', content=u"1.0")
writer.simpleElement(u'last-release', content=u"20030401")</programlisting>

    <para>could instead be written:</para>

    <programlisting>writer.xmlFragment("""
&lt;name&gt;100° Server&lt;/name&gt;
&lt;version&gt;1.0&lt;/version&gt;
&lt;last-release&gt;20030401&lt;/last-release&gt;""")</programlisting>

    <important>
      <para>The parameter of <methodname>xmlFragment</methodname> is a string,
      not a unicode object.</para>
    </important>
  </section>

  <section id="mwriter.pi">
    <title>How to insert processing instructions and comments</title>

    <para>The API provides the <methodname>comment</methodname> and
    <methodname>processingInstruction</methodname> methods for inserting
    comments and processing instructions, respectively. The <methodname>comment</methodname>
    method takes a unicode string, which is the intended value of the comment.
    The <methodname>processingInstruction</methodname> method takes two
    unicode strings. The first is the name of the processing instruction, and
    the second is the value of the processing instruction. For example, the
    following code:<programlisting>writer.comment(u"This is a processing instruction")
writer.processingInstruction(u'xml-stylesheet', u'type="text/xsl" href="akara.xsl"')</programlisting>produces
    the following output:<screen>&lt;!--This is a processing instruction--&gt;
&lt;?xml-stylesheet type="text/xsl" href="akara.xsl"?&gt;</screen></para>
  </section>

  <section id="mwriter.ns">
    <title>Using namespaces</title>

    <para>When you create a new element or attribute, you can place it in a
    namespace by supplying the namespace URI along with the QName, as the
    following program shows:</para>

    <programlisting>from <systemitem class="library">Ft.Xml</systemitem> import <classname>MarkupWriter</classname>

writer = <classname>MarkupWriter</classname>(indent=u'yes')
writer.startDocument()

RDFNS = u"http://www.w3.org/1999/02/22-rdf-syntax-ns#"

writer.startElement(u"rdf:RDF", RDFNS)
writer.startElement(u"rdf:Description", RDFNS,
    attributes={(u'rdf:about', RDFNS): u'http://media.example.com/audio/guide.ra'})
writer.endElement(u'rdf:Description', RDFNS)
writer.endElement(u'rdf:RDF', RDFNS)
writer.endDocument()</programlisting>

    <para>And this is the output:</para>

    <screen>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"&gt;
    &lt;rdf:Description rdf:about="http://media.example.com/audio/guide.ra"/&gt;
&lt;/rdf:RDF&gt;</screen>
  </section>

  <section id="mwriter.output">
    <title>Setting up the output</title>

    <para>In the above example, you can see how parameters that control the
    output are passed into the <classname>MarkupWriter</classname>
    initializer, including document type info and whether to indent (pretty
    print).</para>

    <para>You can pass any of the usual controls for XSLT output into the
    initializer this way.</para>

    <variablelist>
      <varlistentry>
        <term><parameter>stream</parameter></term>

        <listitem>
          <para>By default <classname>MarkupWriter</classname> sends its
          output to <varname>sys.stdout</varname>, but you can substitute any
          file-like object by passing in an initializer parameter. This stream
          parameter should be the first argument to the
          <classname>MarkupWriter</classname> constructor. For example:
          <programlisting>output_file = open('output.xml', 'w')
writer = MarkupWriter(output_file, indent=u"yes")</programlisting></para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><parameter>indent</parameter></term>

        <listitem>
          <para>The indent named parameter controls whether or not the output
          will have whitespace inserted to indent tags in the output. The
          default is "no".</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><parameter>doctypeSystem</parameter>,
        <parameter>doctypePublic</parameter></term>

        <listitem>
          <para>These two named parameters control the system and public
          identifiers that will be included in the output.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><parameter>omitXmlDeclaration</parameter></term>

        <listitem>
          <para>Set this named parameter to "yes" to suppress output of the
          <acronym>XML</acronym> declaration. The default is "no".</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><parameter>encoding</parameter></term>

        <listitem>
          <para>This named parameter controls the character encoding to use.
          (The default is UTF-8.) The writer automatically emits numeric
          character references for characters that the chosen encoding cannot
          represent.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><parameter>standalone</parameter></term>

        <listitem>
          <para>Set this named parameter to "yes" to set standalone in the
          <acronym>XML</acronym> declaration.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><parameter>mediaType</parameter></term>

        <listitem>
          <para>This parameter sets the media type of the output. You will
          probably never need this.</para>
        </listitem>
      </varlistentry>

      <varlistentry>
        <term><parameter>cdataSectionElements</parameter></term>

        <listitem>
          <para>This named parameter is a list of element names whose output
          will be wrapped in a CDATA section. This can provide for friendlier
          output in some cases.</para>
        </listitem>
      </varlistentry>
    </variablelist>
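
    <para>To give a feel for what the <parameter>encoding</parameter>
    parameter implies, the following sketch uses plain Python string encoding
    rather than <classname>MarkupWriter</classname>. It is not how the writer
    is implemented; it only shows the kind of substitution that happens when
    a character such as the degree sign does not exist in the target
    encoding.</para>

    <programlisting>
```python
# Characters the target encoding can represent are encoded directly.
as_utf8 = u"100\u00B0 Server".encode("utf-8")

# ASCII has no degree sign, so the "xmlcharrefreplace" error handler
# substitutes a numeric character reference for it.
as_ascii = u"100\u00B0 Server".encode("ascii", "xmlcharrefreplace")

print(as_utf8)
print(as_ascii)
```
    </programlisting>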

    <para>The <acronym>XSLT</acronym> spec also defines a method parameter to
    choose between <acronym>XML</acronym>, <acronym>HTML</acronym> or plain
    text output rules, but for <classname>MarkupWriter</classname> at the
    moment you should stick to <acronym>XML</acronym>. The result of changing
    the method is undefined. We'll probably relax this restriction in later
    releases.</para>
  </section>

  <section id="mwriter.examples">
    <title>More examples</title>

    <section id="mwriter.examples.xhtml">
      <title>Writing XHTML with <classname>MarkupWriter</classname></title>

      <para>Uche Ogbuji provides <ulink url="http://copia.ogbuji.net/blog/2005-08-01/Another_sm">this
      example</ulink>, which writes a simple XHTML file, in his blog:</para>

      <programlisting>from Ft.Xml.MarkupWriter import MarkupWriter
from xml.dom import XHTML_NAMESPACE, XML_NAMESPACE

XHTML_NS = unicode(XHTML_NAMESPACE)
XML_NS = unicode(XML_NAMESPACE)
XHTML11_SYSID = u"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"
XHTML11_PUBID = u"-//W3C//DTD XHTML 1.1//EN"

writer = MarkupWriter(indent=u"yes", doctypeSystem=XHTML11_SYSID,
                      doctypePublic=XHTML11_PUBID)
writer.startDocument()
writer.startElement(u'html', XHTML_NS, attributes={(u'xml:lang', XML_NS): u'en'})
writer.startElement(u'head', XHTML_NS)
writer.simpleElement(u'title', XHTML_NS, content=u'Virtual Library')
writer.endElement(u'head', XHTML_NS)
writer.startElement(u'body', XHTML_NS)
writer.startElement(u'p', XHTML_NS)
writer.text(u'Moved to ')
writer.simpleElement(u'a', XHTML_NS,
                     attributes={u'href': u'http://vlib.org/'},
                     content=u'vlib.org')
writer.text(u'.')
writer.endElement(u'p', XHTML_NS)
writer.endElement(u'body', XHTML_NS)
writer.endElement(u'html', XHTML_NS)
writer.endDocument()</programlisting>

      <para>This example results in the following XHTML document:</para>

      <screen>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"&gt;
  &lt;head&gt;
    &lt;title&gt;Virtual Library&lt;/title&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;p&gt;Moved to &lt;a href="http://vlib.org/"&gt;vlib.org&lt;/a&gt;.&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;  </screen>
    </section>

    <section id="mwriter.examples.dirlist">
      <title>Writing a directory listing as an
      <acronym>XML</acronym> document</title>

      <para>This recursive example builds an <acronym>XML</acronym> document
      that describes a directory tree. The example has two
      functions. The first initializes the writer. The second walks through
      the filesystem and outputs what it finds as
      <acronym>XML</acronym>. The complete <ulink url="http://copia.ogbuji.net/files/dirlist.py">dirlist.py
      program</ulink> can be found on Uche Ogbuji's blog.</para>

      <programlisting>import os
from Ft.Xml import MarkupWriter

def genXML(dir, out):
    print "Processing %s" % dir
    writer = MarkupWriter(out, indent=u"yes")
    writer.startDocument()
    recurse_dir(dir, writer)
    writer.endDocument()

def recurse_dir(path, writer):
    # os.walk yields the top directory first. Handle only that level here
    # and recurse explicitly, so the XML nesting mirrors the directory tree.
    for cdir, subdirs, files in os.walk(path):
        writer.startElement(u'directory', attributes={u'name': unicode(cdir)})
        for f in files:
            writer.simpleElement(u'file', attributes={u'name': unicode(f)})
        for subdir in subdirs:
            recurse_dir(os.path.join(cdir, subdir), writer)
        writer.endElement(u'directory')
        break</programlisting>
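
      <para>For readers who want to try the idea without 4Suite installed,
      the same recursion can be sketched with the standard library's
      <classname>xml.sax.saxutils.XMLGenerator</classname> standing in for
      <classname>MarkupWriter</classname>. This is an approximation under
      stated assumptions, not a port of dirlist.py: it targets Python 3, the
      function names <function>write_dir</function> and
      <function>dir_to_xml</function> are invented here, and it builds a small
      temporary directory so that it is self-contained.</para>

      <programlisting>
```python
import io
import os
import tempfile
from xml.sax.saxutils import XMLGenerator


def write_dir(path, gen):
    """Emit one directory element for path, recursing into subdirectories."""
    gen.startElement("directory", {"name": os.path.basename(path)})
    for entry in sorted(os.listdir(path)):
        full = os.path.join(path, entry)
        if os.path.isdir(full):
            write_dir(full, gen)
        else:
            gen.startElement("file", {"name": entry})
            gen.endElement("file")
    gen.endElement("directory")


def dir_to_xml(path):
    """Return the listing of path as an XML string."""
    out = io.StringIO()
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    write_dir(path, gen)
    gen.endDocument()
    return out.getvalue()


# Build a tiny throwaway tree so the sketch does not depend on local files.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    open(os.path.join(root, "a.txt"), "w").close()
    open(os.path.join(root, "sub", "b.txt"), "w").close()
    listing = dir_to_xml(root)

print(listing)
```
      </programlisting>

      <para>Because each start tag is written before the recursive call and
      each end tag after it, the element nesting follows the directory
      nesting without any tree being held in memory.</para>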
    </section>

    <section>
      <title>Building a bot</title>

      <para>As a more complex example, the <ulink url="http://metacognition.info/Emeka/Emeka.py">Emeka
      <acronym>IRC</acronym> bot</ulink> uses
      <classname>MarkupWriter</classname> to build an RDF document whose
      elements are in namespaces, as this excerpt shows: <programlisting>DCE_NS = u'http://purl.org/dc/elements/1.1/'
for nada,category in item['categories']:
    if len(category.split(' ')) &gt; 0:
        for category in category.split(' '):
            writer.startElement(u"dc:subject", DCE_NS)
            writer.text(category)
            writer.endElement(u"dc:subject", DCE_NS)
    else:
        writer.startElement(u"dc:subject", DCE_NS)
        writer.text(category)
        writer.endElement(u"dc:subject", DCE_NS)</programlisting></para>
    </section>
  </section>
</section>

  <section xml:base="file:///Users/uche/share/4Suite/Documentation/CoreManualSections/RELAX_NG_Validation.xml" id="RELAXNG">
  <title>Validation using RELAX NG</title>

  <para>4Suite has RELAX NG support based on a bundling of Eric van der
  Vlist's <ulink url="http://www.advogato.org/proj/xvif/">XVIF
  implementation</ulink>.</para>

  <para>First of all, you can use the 4xml command-line tool for RELAX NG
  validation with the --rng flag. For instance, take the following RELAX NG
  schema (rng-tut3.rng):</para>

  <programlisting>&lt;element name="addressBook" xmlns="[http://relaxng.org/ns/structure/1.0][13]"&gt;
  &lt;zeroOrMore&gt;
    &lt;element name="card"&gt;
      &lt;element name="name"&gt;
        &lt;text/&gt;
      &lt;/element&gt;
      &lt;element name="email"&gt;
        &lt;text/&gt;
      &lt;/element&gt;
    &lt;/element&gt;
  &lt;/zeroOrMore&gt;
&lt;/element&gt;</programlisting>

  <para>The following document (rng-tut1.xml) is valid against the
  schema:</para>

  <programlisting>&lt;addressBook&gt;
  &lt;card&gt;
    &lt;name&gt;John Smith&lt;/name&gt;
    &lt;email&gt;js@example.com&lt;/email&gt;
  &lt;/card&gt;
  &lt;card&gt;
    &lt;name&gt;Fred Bloggs&lt;/name&gt;
    &lt;email&gt;fb@example.net&lt;/email&gt;
  &lt;/card&gt;
&lt;/addressBook&gt;</programlisting>

  <para>You can check this as follows:</para>

  <screen>$ 4xml --rng=rng-tut3.rng rng-tut1.xml
&lt;?xml version="1.0" encoding="utf-8"?&gt;
&lt;addressBook&gt;
  &lt;card&gt;
    &lt;name&gt;John Smith&lt;/name&gt;
    &lt;email&gt;js@example.com&lt;/email&gt;
  &lt;/card&gt;
  &lt;card&gt;
    &lt;name&gt;Fred Bloggs&lt;/name&gt;
    &lt;email&gt;fb@example.net&lt;/email&gt;
  &lt;/card&gt;
&lt;/addressBook&gt;</screen>

  <para>Since it passes the schema, 4xml continues normal operation,
  re-serializing the XML back to stdout.</para>

  <para>The following document (rng-tut7.xml) is not valid against the
  schema:</para>

  <programlisting>&lt;addressBook&gt;
  &lt;card&gt;
    &lt;givenName&gt;John&lt;/givenName&gt;
    &lt;familyName&gt;Smith&lt;/familyName&gt;
    &lt;email&gt;js@example.com&lt;/email&gt;
  &lt;/card&gt;
  &lt;card&gt;
    &lt;name&gt;Fred Bloggs&lt;/name&gt;
    &lt;email&gt;fb@example.net&lt;/email&gt;
  &lt;/card&gt;
&lt;/addressBook&gt;</programlisting>

  <para>Which you can check as follows:</para>

  <screen>$ 4xml --rng=rng-tut3.rng rng-tut7.xml
Traceback (most recent call last):
  File "/home/uogbuji/lib/python2.2/site-packages/Ft/Share/Bin/4xml", line 5, in ?
    XmlCommandLineApp().run()
  File "/home/uogbuji/lib/python2.2/site-packages/Ft/Lib/CommandLine/CommandLineApp.py", line 90, in run
    cmd.run_command(self.authenticationFunction)
  File "/home/uogbuji/lib/python2.2/site-packages/Ft/Lib/CommandLine/Command.py", line 83, in run_command
    self.function(self.clOptions, self.clArguments)
  File "/home/uogbuji/lib/python2.2/site-packages/Ft/Xml/_4xml.py", line 89, in Run
    raise RngInvalid(result)
Ft.Xml.Xvif.RngInvalid: _Pattern Empty, no content expected, 
node &lt;cElement at 0x838d7f4: name u'card', 0 attributes, 7 children&gt;</screen>

  <para>The RngInvalid exception reports the pattern that failed to match:
  a card element contains content that the schema does not expect.</para>

  <para>You can also access validation through the Python API using the new
  Ft.Xml.Xvif.RelaxNgValidator class. For example:</para>

  <programlisting>from Ft.Xml.Xvif import RelaxNgValidator
from Ft.Xml import InputSource
from Ft.Lib import Uri
factory = InputSource.DefaultFactory
rng_uri = Uri.OsPathToUri("rng-tut3.rng", attemptAbsolute=1)
src_uri = Uri.OsPathToUri("rng-tut1.xml", attemptAbsolute=1)
rng_isrc = factory.fromUri(rng_uri)
src_isrc = factory.fromUri(src_uri)

validator = RelaxNgValidator(rng_isrc)
result = validator.isValid(src_isrc)
if result:
    print "Valid"
else:
    print "Invalid"</programlisting>

  <para>The isValid() method returns a 1 or 0 for validity. To get the actual
  structure returned by the validator, use the validate() method instead. This
  structure can easily be turned into an exception object. The following
  variation prints "Valid" if valid, and raises an exception if not:</para>

  <programlisting>from Ft.Xml.Xvif import RelaxNgValidator, RngInvalid
from Ft.Xml import InputSource
from Ft.Lib import Uri
factory = InputSource.DefaultFactory
rng_uri = Uri.OsPathToUri("rng-tut3.rng", attemptAbsolute=1)
src_uri = Uri.OsPathToUri("rng-tut1.xml", attemptAbsolute=1)
rng_isrc = factory.fromUri(rng_uri)
src_isrc = factory.fromUri(src_uri)

validator = RelaxNgValidator(rng_isrc)
result = validator.validate(src_isrc)
if result.nullable():
    print "Valid"
else:
    raise RngInvalid(result)</programlisting>

  <para>If you want to use the validation error message without raising an
  exception:</para>

  <programlisting># Set-up as above
result = validator.validate(src_isrc)
if result.nullable():
    print "Valid"
else:
    print result.msg</programlisting>

  <para>Xvif does not report the location of validation errors, and stops after the first error. It does not support RELAX
    NG compact syntax (RNC) or nameClasses (<code>name</code>, <code>anyName</code>, <code>nsName</code>, and
    <code>except</code> elements in the schema). In addition, its support of XML Schema datatypes is incomplete, but has been
    extended by 4Suite to accommodate a number of types, including the following (asterisk indicates support is exclusive to
    4Suite):</para>
  
  <itemizedlist>
    <listitem><para><literal>xs:string</literal></para></listitem>
    <listitem><para><literal>xs:normalizedString</literal></para></listitem>
    <listitem><para><literal>xs:token</literal></para></listitem>
    <listitem><para><literal>xs:ID</literal> *</para></listitem>
    <listitem><para><literal>xs:IDREF</literal> *</para></listitem>
    <listitem><para><literal>xs:integer</literal></para></listitem>
    <listitem><para><literal>xs:nonPositiveInteger</literal></para></listitem>
    <listitem><para><literal>xs:nonNegativeInteger</literal></para></listitem>
    <listitem><para><literal>xs:positiveInteger</literal></para></listitem>
    <listitem><para><literal>xs:negativeInteger</literal></para></listitem>
    <listitem><para><literal>xs:unsignedLong</literal></para></listitem>
    <listitem><para><literal>xs:unsignedInt</literal></para></listitem>
    <listitem><para><literal>xs:long</literal></para></listitem>
    <listitem><para><literal>xs:int</literal></para></listitem>
    <listitem><para><literal>xs:short</literal></para></listitem>
    <listitem><para><literal>xs:unsignedShort</literal></para></listitem>
    <listitem><para><literal>xs:byte</literal></para></listitem>
    <listitem><para><literal>xs:unsignedByte</literal></para></listitem>
    <listitem><para><literal>xs:decimal</literal></para></listitem>
    <listitem><para><literal>xs:date</literal> *</para></listitem>
    <listitem><para><literal>xs:boolean</literal> *</para></listitem>
    <listitem><para><literal>xs:time</literal> *</para></listitem>
    <listitem><para><literal>xs:dateTime</literal> *</para></listitem>
    <listitem><para><literal>xs:anyURI</literal> *</para></listitem>
  </itemizedlist>
  
  <para>The numeric types all support the <literal>totalDigits</literal>, <literal>minInclusive</literal>,
    <literal>maxInclusive</literal>, <literal>minExclusive</literal>, and <literal>maxExclusive</literal> facets.
    <literal>xs:decimal</literal> also supports the <literal>fractionDigits</literal> facet.</para>
  
  <para>The <literal>xs:string</literal>, <literal>xs:normalizedString</literal>, and <literal>xs:token</literal> types
    support the <literal>length</literal> facet. In 4Suite only, <literal>xs:string</literal> and
    <literal>xs:normalizedString</literal> support <literal>minLength</literal>, <literal>maxLength</literal>, and
    <literal>pattern</literal> facets.</para>
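
  <para>As a reminder of what the numeric facets constrain, the following
  sketch checks a decimal literal against <literal>totalDigits</literal>,
  <literal>fractionDigits</literal>, <literal>minInclusive</literal>, and
  <literal>maxInclusive</literal>. It follows the facet definitions in XML
  Schema Part 2 as understood here and is not taken from Xvif or 4Suite. The
  function names <function>digit_counts</function> and
  <function>satisfies</function> are invented for the illustration, and the
  code assumes Python 3.</para>

  <programlisting>
```python
from decimal import Decimal


def digit_counts(lexical):
    """Return (total digits, fraction digits) of a decimal's value."""
    # normalize() drops trailing zeros, since the facets constrain the
    # value rather than its spelling: "12.50" counts the same as "12.5".
    _sign, digits, exponent = Decimal(lexical).normalize().as_tuple()
    fraction = max(-exponent, 0)
    total = max(len(digits) + max(exponent, 0), fraction)
    return total, fraction


def satisfies(lexical, total_digits=None, fraction_digits=None,
              min_inclusive=None, max_inclusive=None):
    """Check one decimal literal against a few illustrative facets."""
    value = Decimal(lexical)
    total, fraction = digit_counts(lexical)
    if total_digits is not None and total > total_digits:
        return False
    if fraction_digits is not None and fraction > fraction_digits:
        return False
    if min_inclusive is not None and Decimal(min_inclusive) > value:
        return False
    if max_inclusive is not None and value > Decimal(max_inclusive):
        return False
    return True


price_ok = satisfies("12.50", total_digits=4, fraction_digits=2)
too_precise = satisfies("12.505", total_digits=5, fraction_digits=2)
out_of_range = satisfies("150", min_inclusive="0", max_inclusive="100")
print(price_ok, too_precise, out_of_range)
```
  </programlisting>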

</section>

  <section xml:base="file:///Users/uche/share/4Suite/Documentation/CoreManualSections/XUpdate.xml" id="xupdate">
  <title>XUpdate processing</title>

  <para><ulink url="http://xmldb-org.sourceforge.net/xupdate/">XUpdate</ulink> is
  a community specification for using an XML vocabulary to express
  modifications to XML documents. It is essentially an XPath-based XML
  transformation language, like XSLT. An XUpdate document is an XML document
  that specifies what changes should be made to another XML document. XUpdate
  is supported by many XML processing tools, especially in the open source
  category, even though it is neither a W3C Recommendation nor an ISO or IETF
  standard. It is just a project of the XML:DB Initiative's XUpdate Working
  Group, and it never advanced beyond a Working Draft published in September
  2000. It is not very well specified, but it is very convenient and enables a
  basic level of functionality, so it has enjoyed popularity in a number of
  implementations.</para>

  <para>4Suite's XUpdate implementation, 4XUpdate, consists of a Python API
  (via the Ft.Xml.XUpdate module) and a command-line script (4xupdate). The
  APIs involve taking a source document (the XML to be updated) and an XUpdate
  document (the changes to apply), and either producing a new document or
  updating the source document in-place. The command line tool can be used,
  for example, as a patching utility for XML. All of XUpdate (such as it's
  specified) is currently implemented.</para>

  <para>The Python API can be invoked directly on Domlette objects or on
  InputSources. Here is an example of using the ApplyXUpdate convenience
  function, which takes InputSources:</para>

  <programlisting>from Ft.Xml.Domlette import PrettyPrint
from Ft.Xml.InputSource import DefaultFactory
try:
    from Ft.Xml.XUpdate import ApplyXUpdate
except ImportError:
    # the function name changed between 1.0a3 and 1.0b1
    from Ft.Xml.XUpdate import ApplyXupdate as ApplyXUpdate

SOURCE='''&lt;?xml version = "1.0"?&gt;
&lt;ADDRBOOK xmlns="http://bogus/"&gt;
  &lt;ENTRY ID="fr"&gt;
    &lt;NAME&gt;fred&lt;/NAME&gt;
  &lt;/ENTRY&gt;
&lt;/ADDRBOOK&gt;'''

XU='''&lt;?xml version="1.0"?&gt;
&lt;xu:modifications version="1.0" xmlns:xu="http://www.xmldb.org/xupdate"
  xmlns:myns="http://bogus/"&gt;
  &lt;xu:append select="/myns:ADDRBOOK" child="last()"&gt;
    &lt;ENTRY ID="vz"&gt;
      &lt;NAME&gt;Vasia Zhugenev&lt;/NAME&gt;
    &lt;/ENTRY&gt;
  &lt;/xu:append&gt;
&lt;/xu:modifications&gt;'''

src_isrc = DefaultFactory.fromString(SOURCE, "http://test1/")
xup_isrc = DefaultFactory.fromString(XU, "http://test2/")

result_dom = ApplyXUpdate(src_isrc, xup_isrc)
PrettyPrint(result_dom)

#expected:
#&lt;?xml version="1.0" encoding="UTF-8"?&gt;
#&lt;ADDRBOOK xmlns="http://bogus/"&gt;
#  &lt;ENTRY ID="fr"&gt;
#    &lt;NAME&gt;fred&lt;/NAME&gt;
#  &lt;/ENTRY&gt;
#&lt;ENTRY ID="vz"&gt;
#    &lt;NAME&gt;Vasia Zhugenev&lt;/NAME&gt;
#  &lt;/ENTRY&gt;
#&lt;/ADDRBOOK&gt;</programlisting>

  <para>If you have both the source document and XUpdate document as Domlette
  nodes already, you can use the XUpdate processor directly:</para>

  <programlisting># add to the above script...
from Ft.Xml.Domlette import NonvalidatingReader
from Ft.Xml.XUpdate import Processor
src_isrc = DefaultFactory.fromString(SOURCE, "http://test1/")
xup_isrc = DefaultFactory.fromString(XU, "http://test2/")
src_dom = NonvalidatingReader.parse(src_isrc)
xup_dom = NonvalidatingReader.parse(xup_isrc)
proc = Processor()
proc.execute(src_dom, xup_dom)

# src_dom has been modified in-place
PrettyPrint(src_dom)</programlisting>

  <para>Using the processor directly allows you to set XPath variables, if
  needed:</para>

  <programlisting>from Ft.Xml import EMPTY_NAMESPACE

# execute with $x='foo'
proc.execute(src_dom, xup_dom, {(EMPTY_NAMESPACE, u'x'): u'foo'})</programlisting>

  <para>The command-line script works on local files or even URIs, if
  resolvable, and normally sends the result XML to standard output, although
  it can also be made to write to a file. See "4xupdate -h" for usage
  instructions.</para>

  <section>
    <title>XUpdate and namespaces</title>

    <para>In order to show how to use XUpdate to make namespace-aware
    modifications, the following tasks will be demonstrated:</para>

    <orderedlist>
      <listitem>
        <para>Add a new element in the products namespace, but using no
        prefix.</para>
      </listitem>

      <listitem>
        <para>Add a new element with a prefix and in the products
        namespace.</para>
      </listitem>

      <listitem>
        <para>Add a new element that is not in any namespace.</para>
      </listitem>

      <listitem>
        <para>Add a new global attribute in the XHTML namespace.</para>
      </listitem>

      <listitem>
        <para>Add a new global attribute in the special XML namespace.</para>
      </listitem>

      <listitem>
        <para>Add a new attribute in no namespace.</para>
      </listitem>

      <listitem>
        <para>Remove only the <literal>code</literal> element in the XHTML
        namespace.</para>
      </listitem>

      <listitem>
        <para>Remove a global attribute.</para>
      </listitem>

      <listitem>
        <para>Remove an attribute that is not in any namespace.</para>
      </listitem>
    </orderedlist>

    <para>Modification in place can always be simulated with an addition and
    then a removal. The following code shows how these tasks can be performed
    in XUpdate.</para>

    <programlisting>&lt;xup:modifications version="1.0"
  xmlns:xup="http://www.xmldb.org/xupdate"
  xmlns:p="http://example.com/product-info"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xmlns:xl="http://www.w3.org/1999/xlink"
&gt;

  &lt;!-- Task 1 --&gt;
  &lt;xup:append select="/products/p:product[1]"&gt;
    &lt;xup:element
      name="launch-date"
      namespace="http://example.com/product-info"/&gt;
  &lt;/xup:append&gt;

  &lt;!-- Task 2 --&gt;
  &lt;xup:append select="/products/p:product[1]"&gt;
    &lt;xup:element
      name="p:launch-date"
      namespace="http://example.com/product-info"/&gt;
  &lt;/xup:append&gt;

  &lt;!-- Can also be accomplished using literal result elements:
  &lt;xup:append select="/products/p:product[1]"&gt;
    &lt;p:launch-date/&gt;
  &lt;/xup:append&gt;
  --&gt;

  &lt;!-- Task 3 --&gt;
  &lt;xup:append select="/products/p:product[1]"&gt;
    &lt;xup:element name="island"/&gt;
  &lt;/xup:append&gt;

  &lt;!-- Can also be accomplished using literal result elements:
  &lt;xup:append select="/products/p:product[1]"&gt;
    &lt;island/&gt;
  &lt;/xup:append&gt;
  --&gt;

  &lt;!-- Task 4 --&gt;
  &lt;xup:append select="/products/p:product/p:description/html:div"&gt;
    &lt;xup:attribute name="global"
      namespace="http://www.w3.org/1999/xhtml"&gt;spam&lt;/xup:attribute&gt;
  &lt;/xup:append&gt;

  &lt;!-- Task 5 --&gt;
  &lt;xup:append select="/products/p:product/p:description/html:div"&gt;
    &lt;xup:attribute name="xml:lang"&gt;en&lt;/xup:attribute&gt;
  &lt;/xup:append&gt;

  &lt;!-- Task 6 --&gt;
  &lt;xup:append select="/products/p:product/p:description/html:div"&gt;
    &lt;xup:attribute name="class"&gt;eggs&lt;/xup:attribute&gt;
  &lt;/xup:append&gt;

  &lt;!-- Task 7 --&gt;
  &lt;xup:remove select="//html:code"/&gt;

  &lt;!-- Task 8 --&gt;
  &lt;xup:remove select="/products/p:product/p:description/html:div/ref/@xl:href"/&gt;

  &lt;!-- Task 9 --&gt;
  &lt;xup:remove select="/products/p:product[1]/@id"/&gt;

&lt;/xup:modifications&gt;</programlisting>

    <para>If you're familiar with XSLT, you'll notice XUpdate's resemblance
    to it at first glance. The envelope element for modifications expressed
    in XUpdate is <literal>xup:modifications</literal>, similar to
    <literal>xsl:transform</literal> or <literal>xsl:stylesheet</literal>. The
    namespace declarations on this element assign prefixes for use in the
    XUpdate script and <emphasis role="strong">have no connection</emphasis>
    to the prefixes used in the document being modified (the <emphasis role="strong">source document</emphasis>), even though they happen to be
    the same. If you want to access elements in a namespace declared as the
    default in the source document, then just as in XSLT you must declare and
    use a prefix for the namespace in the XUpdate script.</para>
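
    <para>The same prefix independence can be observed outside XUpdate. The
    following is a minimal sketch that uses Python's standard
    <literal>xml.etree.ElementTree</literal> module rather than 4Suite (an
    assumption, made so that the snippet runs without 4Suite installed). The
    sample document and the <literal>pi</literal> prefix are invented for
    illustration. The query binds its own prefix to the product namespace and
    matches both elements, whether the source document used a default
    namespace declaration or a different prefix.</para>

    <programlisting><![CDATA[
```python
import xml.etree.ElementTree as ET

# Hypothetical source document: the same namespace is bound once as the
# default namespace and once to the prefix "p".
SOURCE = """<products xmlns:p="http://example.com/product-info">
  <product xmlns="http://example.com/product-info" id="a"/>
  <p:product id="b"/>
</products>"""

root = ET.fromstring(SOURCE)

# The prefix "pi" is declared only on the query side; it never appears in
# the source document.
query_namespaces = {"pi": "http://example.com/product-info"}
matched_ids = [e.get("id") for e in root.findall("pi:product", query_namespaces)]

# An unprefixed name test looks for elements in no namespace, so it misses
# the element that relies on a default namespace declaration.
unprefixed = root.findall("product")

print(matched_ids)
print(len(unprefixed))
```
]]></programlisting>

    <para>Only the namespace name has to agree between the query and the
    document; the prefixes are local to each side.</para>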

    <para>Each modification request is expressed as an XUpdate instruction.
    This example demonstrates <literal>xup:append</literal> and
    <literal>xup:remove</literal>. Other instructions provide further
    types of modification, such as <literal>xup:insert-before</literal> and
    <literal>xup:update</literal>, and there are also control constructs such
    as <literal>xup:if</literal>, which is similar to
    <literal>xsl:if</literal>. Instructions usually have a
    <literal>select</literal> attribute containing an XPath expression that
    specifies the node to be used as a reference for modification. In the case
    of <literal>xup:append</literal>, <literal>select</literal> specifies the
    node to which the new XML will be appended as a child. In the case of
    <literal>xup:remove</literal>, <literal>select</literal> identifies nodes
    to be removed. When an instruction needs to specify a chunk of XML to be
    used in the modification, that chunk is expressed as the content of the
    instruction, in a similar fashion to XSLT templates. In the case of
    <literal>xup:append</literal>, this template expresses the chunk of XML to
    be inserted into the document. To generate elements and
    attributes, XUpdate provides output instructions such as
    <literal>xup:element</literal> and <literal>xup:attribute</literal>, which
    are very similar to their XSLT equivalents. In another idea borrowed from
    XSLT, XUpdate allows you to create elements by placing literal result
    elements in the templates. If you'd like to get a closer look at XUpdate,
    the best way is by browsing the very clear examples in the <ulink url="http://www.xmldatabases.org/projects/XUpdate-UseCases/">XUpdate Use
    Cases</ulink> compiled by Kimbro Staken. The following listing is Python
    code that can be used to apply an XUpdate script. It's a simplified
    version of the code for the 4xupdate command line.</para>

    <programlisting>import sys
from Ft.Xml import XUpdate
from Ft.Xml import Domlette, InputSource
from Ft.Lib import Uri

# Set up reader objects for parsing the XML files
reader = Domlette.NonvalidatingReader
xureader = XUpdate.Reader()

# Parse the source file
source_uri = Uri.OsPathToUri(sys.argv[1], attemptAbsolute=1)
source = reader.parseUri(source_uri)

# Parse the XUpdate file
xupdate_uri = Uri.OsPathToUri(sys.argv[2], attemptAbsolute=1)
isrc = InputSource.DefaultFactory.fromUri(xupdate_uri)
xupdate = xureader.fromSrc(isrc)

# Set up the XUpdate processor and run against the source file
# The Domlette for the source is modified in place
processor = XUpdate.Processor()
processor.execute(source, xupdate)

# Print the updated DOM node to standard output
Domlette.Print(source)</programlisting>

    <para>Notice the use of <literal>Uri.OsPathToUri</literal> to convert file
    system paths to proper URIs for use in 4Suite. I strongly recommend this
    convention as one way to help minimize confusion between file
    specifications and URIs -- the basis of many frequently asked questions.
    The <literal>XUpdate.Processor</literal> class defines the environment for
    running XUpdate commands and <literal>execute()</literal> is the method
    that actually kicks off the processing. It operates on a Domlette
    instance, modifying it in place (so be careful when using XUpdate in
    this way). The updated document object is printed to standard output using
    <literal>Domlette.Print</literal>.</para>

    <para>The following snippet shows how to run the test script, along with
    the resulting output.</para>

    <programlisting>$ python listing4.py products.xml listing3.xup
&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;products xmlns:p="http://example.com/product-info"
xmlns:html="http://www.w3.org/1999/xhtml"
xmlns:xl="http://www.w3.org/1999/xlink"
&gt;
  &lt;product xmlns="http://example.com/product-info"&gt;
    &lt;name xml:lang="en"&gt;Python Perfect IDE&lt;/name&gt;
    &lt;description&gt;
      Uses mind-reading technology to anticipate and accommodate
      all user needs in Python development.  Implements all
       features though
      the year 3000.  Works well with &lt;code&gt;1166&lt;/code&gt;.
    &lt;/description&gt;
  &lt;launch-date/&gt;&lt;p:launch-date/&gt;&lt;island/&gt;&lt;/product&gt;
  &lt;p:product id="1166"&gt;
    &lt;p:name&gt;XSLT Perfect IDE&lt;/p:name&gt;
    &lt;p:description&gt;
      &lt;p:code&gt;red&lt;/p:code&gt;
      &lt;html:code&gt;blue&lt;/html:code&gt;
      &lt;html:div global="spam" class="eggs" xml:lang="en"&gt;
        &lt;ref xl:type="simple"&gt;A link&lt;/ref&gt;
      &lt;/html:div&gt;
    &lt;/p:description&gt;
  &lt;/p:product&gt;
&lt;/products&gt;</programlisting>
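
    <para>To experiment with the append and remove semantics without 4Suite,
    the following sketch mimics three of the tasks using Python's standard
    <literal>xml.etree.ElementTree</literal> module. It is only an analogy,
    not an XUpdate implementation, and the trimmed-down source document is an
    assumption made for the example.</para>

    <programlisting><![CDATA[
```python
import xml.etree.ElementTree as ET

P = "http://example.com/product-info"

# Hypothetical, trimmed-down source document.
doc = ET.fromstring(
    '<products xmlns:p="%s">'
    '<p:product id="1166"><p:name>XSLT Perfect IDE</p:name></p:product>'
    '</products>' % P)

product = doc.find("p:product", {"p": P})

# Like Task 2: append a new element in the products namespace.
ET.SubElement(product, "{%s}launch-date" % P)

# Like Task 3: append a new element that is not in any namespace.
ET.SubElement(product, "island")

# Like Task 9: remove an attribute that is not in any namespace.
del product.attrib["id"]

child_tags = [child.tag for child in product]
print(child_tags)
print(product.attrib)
```
]]></programlisting>

    <para>As with <literal>xup:append</literal>, the new elements end up as
    the last children of the selected element, and the tree is modified in
    place.</para>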
  </section>
</section>

  <section xml:base="file:///Users/uche/share/4Suite/Documentation/CoreManualSections/XInclude.xml" id="xinclude">
  <title>XInclude processing</title>

  <section id="about_xinclude">
    <title>About XInclude</title>

    <para><ulink url="http://www.w3.org/TR/xinclude/">XML Inclusions
    (XInclude)</ulink> is a W3C Recommendation that provides XML document
    authors with a robust way of supporting document modularity via the use
    of <ulink url="http://en.wikipedia.org/wiki/Transclusion">transclusions</ulink>
    (inclusions by reference). Such modularity would otherwise require using
    references to external entities declared in a DTD, a system which has
    various limitations inherited from SGML.</para>

    <para>Unlike XML's built-in entity-reference system, the processing of
    XIncludes is, fundamentally, an XML Infoset transformation, not strictly
    an operation performed on the serialized (textual) form of a document.
    Therefore, there is no requirement for when and where XInclude
    processing should occur; it could happen at parse time if the parser
    supports it, or could occur on an already-parsed document.</para>

    <para>XInclude references consist of two special elements that are
    placed in the XML document into which external content is to be
    included: <code>&lt;include&gt;</code> and
    <code>&lt;fallback&gt;</code>, both in the namespace
    <literal>http://www.w3.org/2001/XInclude</literal>. When processed,
    these elements are replaced with the content they reference, which can
    be XML or any other text.</para>
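
    <para>Because XInclude is defined on the Infoset, it can be applied to a
    tree that has already been parsed. The sketch below illustrates that idea
    with Python's standard <literal>xml.etree.ElementInclude</literal> module,
    not with 4Suite. The <literal>ex:section</literal> URI, the
    <literal>RESOURCES</literal> dictionary, and the
    <literal>loader</literal> function are hypothetical names chosen for the
    example. The standard-library module offers only limited XInclude
    support, so the example leaves out <literal>xi:fallback</literal>.</para>

    <programlisting><![CDATA[
```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

XINCLUDE_NS = "http://www.w3.org/2001/XInclude"

# Hypothetical in-memory resources, keyed by URI.
RESOURCES = {
    "ex:section": "<section><title>Section 1</title></section>",
}

ARTICLE = """<article xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>My important article</title>
  <xi:include href="ex:section"/>
</article>"""


def loader(href, parse, encoding=None):
    """Return the parsed element (or the raw text) for a known URI."""
    text = RESOURCES[href]
    if parse == "xml":
        return ET.fromstring(text)
    return text


# Step 1: parse. The xi:include element is still an ordinary element.
root = ET.fromstring(ARTICLE)
includes_before = len(root.findall("{%s}include" % XINCLUDE_NS))

# Step 2: expand the inclusions on the already-parsed tree, in place.
ElementInclude.include(root, loader=loader)
includes_after = len(root.findall("{%s}include" % XINCLUDE_NS))

titles = [title.text for title in root.iter("title")]
print(includes_before)
print(includes_after)
print(titles)
```
]]></programlisting>

    <para>After the second step the <literal>xi:include</literal> element is
    gone and the included <literal>section</literal> element has taken its
    place.</para>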
  </section>

  <section id="xinclude_in_4suite">
    <title>XInclude support in 4Suite</title>

    <para>4Suite supports XInclude processing only at parse time, as an
    optional feature of the Domlette readers. It is turned on by default, so
    if you want to suppress it, you must use the full parsing API — not
    the <methodname>Ft.Xml.Parse</methodname> and
    <methodname>Ft.Xml.CreateInputSource</methodname> convenience functions
    — and set the parameter <parameter>processIncludes</parameter> to
    <literal>False</literal> either when creating an
    <classname>InputSource</classname> or when calling the
    <methodname>parseString</methodname>, <methodname>parseUri</methodname>,
    or <methodname>parseStream</methodname> method of the Domlette
    reader.</para>
  </section>

  <section revisionflag="added" id="xinclude_examples">
    <title>Examples</title>

    <para>The following example includes one section stub into a larger
    article but has to use the fallback for the second section stub, where
    resolution fails. <xref linkend="xinclude.example.main_doc"/> lists the
    contents of the file <filename>article.xml</filename>, which references
    two sections using XInclude and provides a fallback for each in case
    they fail to load. <xref linkend="xinclude.example.included_section"/>
    lists the contents of <filename>section1.xml</filename>, but this
    example purposefully does not provide a
    <filename>section2.xml</filename> in order to illustrate the fallback
    behaviour. <xref linkend="xinclude.example.load"/> lists the Python
    code used to parse and print this document; note that XInclude
    processing is done automatically by default.</para>

    <figure id="xinclude.example.main_doc">
      <title>Document using XInclude</title>

      <programlisting>&lt;article&gt;
  &lt;title&gt;My important article&lt;/title&gt;
  &lt;xi:include href="section1.xml" xmlns:xi="http://www.w3.org/2001/XInclude"&gt;
    &lt;xi:fallback&gt;&lt;!-- Section 1 failed to load! --&gt;&lt;/xi:fallback&gt;
  &lt;/xi:include&gt;
  &lt;xi:include href="section2.xml" xmlns:xi="http://www.w3.org/2001/XInclude"&gt;
    &lt;xi:fallback&gt;&lt;!-- Section 2 failed to load! --&gt;&lt;/xi:fallback&gt;
  &lt;/xi:include&gt;
&lt;/article&gt;</programlisting>
    </figure>

    <figure id="xinclude.example.included_section">
      <title>Section to be included</title>

      <programlisting>&lt;section&gt;
  &lt;title&gt;Section 1&lt;/title&gt;
  &lt;!-- Write me! --&gt;
&lt;/section&gt;</programlisting>
    </figure>

    <figure id="xinclude.example.load">
      <title>Loading the document</title>

      <programlisting>from Ft.Xml import Parse
from Ft.Xml.Domlette import PrettyPrint
doc = Parse("article.xml")
PrettyPrint(doc)</programlisting>
    </figure>

    <para><xref linkend="xinclude.example.complete"/> is very similar to
    the above example, only this version is self-contained; the resources
    are stored in Python strings and resolved using a custom
    resolver.</para>

    <figure id="xinclude.example.complete">
      <title>Self-contained example</title>

      <programlisting>article = """&lt;article&gt;&lt;title&gt;My important article&lt;/title&gt;
&lt;xi:include href="ex:section" xmlns:xi="http://www.w3.org/2001/XInclude"&gt;
  &lt;xi:fallback&gt;&lt;!-- Section 1 failed to load! --&gt;&lt;/xi:fallback&gt;
&lt;/xi:include&gt;
&lt;xi:include href="ex:section2" xmlns:xi="http://www.w3.org/2001/XInclude"&gt;
  &lt;xi:fallback&gt;&lt;!-- Section 2 failed to load! --&gt;&lt;/xi:fallback&gt;
&lt;/xi:include&gt;
&lt;/article&gt;"""

section = "&lt;section&gt;&lt;title&gt;Section 1&lt;/title&gt;&lt;!-- Write me! --&gt;&lt;/section&gt;"

from Ft.Lib.Uri import FtUriResolver, Absolutize
from Ft.Lib import UriException
from cStringIO import StringIO
class MyResolver (FtUriResolver):
  def normalize(self, uriRef, baseUri):
    return Absolutize(uriRef, baseUri)
  def resolve(self, uri):
    if uri == "ex:article":
      return StringIO(article)
    elif uri == "ex:section":
      return StringIO(section)
    else:
      raise UriException(UriException.RESOURCE_ERROR,
                         loc=uri, msg="not found, sorry")

myResolver = MyResolver()

from Ft.Xml.InputSource import InputSourceFactory
from Ft.Xml.Domlette import NonvalidatingReader, PrettyPrint
factory = InputSourceFactory(resolver=myResolver)
isrc = factory.fromUri("ex:article")
doc = NonvalidatingReader.parse(isrc)
PrettyPrint(doc)</programlisting>
    </figure>

    <para>To turn off XInclude behavior in <xref linkend="xinclude.example.complete"/>, replace the last three lines
    with these three lines:</para>

    <programlisting>isrc = factory.fromUri("ex:article", processIncludes=False)
doc = NonvalidatingReader.parse(isrc)
PrettyPrint(doc)</programlisting>

    <para><xref linkend="xinclude.example.load"/> uses the "super simple"
    parsing API; we need to use the full parsing API in order to disable
    XInclude expansion (which, paradoxically, takes one less line):</para>

    <programlisting>from Ft.Xml.Domlette import NonvalidatingReader, PrettyPrint
doc = NonvalidatingReader.parseStream(file("article.xml"), processIncludes=False)
PrettyPrint(doc)</programlisting>
  </section>
</section>

  <section xml:base="file:///Users/uche/share/4Suite/Documentation/CoreManualSections/XPointer.xml" id="xpointer">
  <title>XPointer processing</title>

  <section id="about_xpointer">
    <title>About XPointer</title>

    <para><ulink url="http://www.w3.org/TR/xptr-framework/">XPointer</ulink>
    is a set of W3C specifications (one part of which is, as of 2006, still
    a Working Draft) that provide a means of identifying and referring to a
    portion of an XML document. The portion being referenced need not be
    contiguous, and need not constitute a well-formed general entity.
    XPointers were originally intended to be used in the fragment component
    of a URI or IRI (the fragment being the part after
    "<literal>#</literal>"), but the specifications actually place no
    restrictions on where they can be used.</para>

    <para>An example of an XPointer embedded in a URI would be</para>

    <para><literal>http://example.com/inventory.xml#xpointer(//part%5Bstarts-with(sku,%20'999')%5D)</literal></para>

    <para>The XPointer in that example is actually</para>

    <para><literal>xpointer(//part[starts-with(sku, '999')])</literal></para>

    <para>but the URI syntax requires further encoding of some data. The
    result of evaluating this XPointer would be the same as evaluating the
    XPath expression <code>//part[starts-with(sku, '999')]</code> against
    the document identified by the URI
    <literal>http://example.com/inventory.xml</literal>.</para>
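
    <para>The relationship between the two forms can be checked with Python's
    standard URI helpers. This is a small sketch that does not involve
    4Suite; the choice of characters passed as <literal>safe</literal> is an
    assumption made to reproduce the exact encoding shown above.</para>

    <programlisting><![CDATA[
```python
try:
    from urllib.parse import quote, unquote, urldefrag  # Python 3
except ImportError:
    from urllib import quote, unquote                   # Python 2
    from urlparse import urldefrag

uri = ("http://example.com/inventory.xml"
       "#xpointer(//part%5Bstarts-with(sku,%20'999')%5D)")

# Split the URI into the document URI and the fragment component.
document_uri, fragment = urldefrag(uri)[:2]

# Percent-decoding the fragment recovers the XPointer itself.
pointer = unquote(fragment)

# Going the other way: percent-encode "[", "]" and the space, leaving the
# characters listed in `safe` untouched.
encoded = quote(pointer, safe="()/,'")

print(document_uri)
print(pointer)
print(encoded == fragment)
```
]]></programlisting>

    <para>The decoded fragment is the XPointer, and encoding it again yields
    the fragment that appears in the URI.</para>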

    <para>XPointer syntax is simple: a <firstterm>shorthand
    XPointer</firstterm> is just a name, and refers to the element with that
    ID (as determined by a DTD or other schema, typically), much like the
    XPath 1.0 expression <code>id('somename')</code>, but with a little more
    flexibility, since <code>id()</code> is limited to DTD-based data
    typing.</para>

    <para>A <firstterm>scheme-based XPointer</firstterm> consists of a
    series of one or more <firstterm baseform="part">parts</firstterm>,
    separated by optional whitespace, with each part looking like a function
    call. What appear to be function names are actually syntactic and
    semantic <firstterm>schemes</firstterm>, of which the most common is the
    ID-oriented <literal>element</literal> scheme, and of which the most
    versatile is the XPath-oriented <literal>xpointer</literal>
    scheme.</para>

    <para>If a scheme-based XPointer contains more than one part, then the
    parts are evaluated from left to right, skipping any
    unsupported/unrecognized schemes, until one is found that identifies
    something that exists in the document. Some schemes, like the
    namespace/prefix-binding <literal>xmlns</literal>, identify nothing (by
    design), and instead just influence the interpretation of subsequent
    parts. It's possible for an XPointer to produce different results with
    different processors, if the author doesn't take care to ensure each
    part identifies the same thing.</para>
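
    <para>The left-to-right evaluation rule can be sketched as follows. This
    is a simplified illustration, not 4Suite's implementation:
    <literal>split_parts</literal>, <literal>evaluate</literal>, and the
    <literal>handlers</literal> mapping are hypothetical names, circumflex
    escaping of parentheses is ignored, and the scheme handlers are stand-ins
    supplied by the caller, so the context-setting role of
    <literal>xmlns</literal> is not modelled.</para>

    <programlisting><![CDATA[
```python
def split_parts(pointer):
    """Split a scheme-based XPointer into (scheme, data) pairs.

    Assumes balanced parentheses and no circumflex (^) escapes.
    """
    parts = []
    i = 0
    while i != len(pointer):
        if pointer[i].isspace():
            i += 1
            continue
        open_at = pointer.index("(", i)
        scheme = pointer[i:open_at]
        depth = 1
        j = open_at + 1
        while depth:
            if pointer[j] == "(":
                depth += 1
            elif pointer[j] == ")":
                depth -= 1
            j += 1
        parts.append((scheme, pointer[open_at + 1:j - 1]))
        i = j
    return parts


def evaluate(pointer, handlers):
    """Return the result of the first part that identifies something."""
    for scheme, data in split_parts(pointer):
        handler = handlers.get(scheme)
        if handler is None:
            continue  # unsupported scheme: skip this part
        result = handler(data)
        if result:
            return result
    return []


parts = split_parts(
    "xmlns(xhtml=http://www.w3.org/1999/xhtml)xpointer(//xhtml:a[@href])")
print(parts)

# Stand-in handlers: "element" finds nothing, "xpointer" finds one node.
handlers = {"element": lambda data: [], "xpointer": lambda data: ["node"]}
result = evaluate("foo(bar) element(intro) xpointer(id('intro'))", handlers)
print(result)
```
]]></programlisting>

    <para>In the second call the unknown <literal>foo</literal> part is
    skipped, the <literal>element</literal> part identifies nothing, and the
    result therefore comes from the <literal>xpointer</literal> part.</para>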

    <para>Here are some more examples:</para>

    <para>The XPath 1.0 expression <code>id('somename')</code> means the same
    thing as the XPointer <code>xpointer(id('somename'))</code>, and nearly
    the same thing as the XPointers <code>element(somename)</code> and
    <code>somename</code>, which just have more flexibility in where the ID
    can be drawn from.</para>

    <para>The XPointer <code>element(somename/3/1)</code> means nearly the
    same thing as the XPath expression
    <code>id('somename')/*[3]/*[1]</code>.</para>
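
    <para>The correspondence between the <literal>element</literal> scheme
    and XPath can be written down as a small translation function. This is
    an illustrative sketch only: the function name
    <literal>element_pointer_to_xpath</literal> is hypothetical, and the
    translation ignores the difference in how IDs are determined.</para>

    <programlisting><![CDATA[
```python
def element_pointer_to_xpath(pointer):
    """Translate an element() scheme part into a roughly equivalent XPath."""
    prefix, suffix = "element(", ")"
    if not (pointer.startswith(prefix) and pointer.endswith(suffix)):
        raise ValueError("not an element() pointer: %r" % pointer)
    steps = pointer[len(prefix):-len(suffix)].split("/")
    # An optional leading name selects the element with that ID.
    xpath = "id('%s')" % steps[0] if steps[0] else ""
    # Each following integer selects the n-th child element.
    for step in steps[1:]:
        xpath += "/*[%d]" % int(step)
    return xpath


print(element_pointer_to_xpath("element(somename/3/1)"))
print(element_pointer_to_xpath("element(somename)"))
print(element_pointer_to_xpath("element(/1/2)"))
```
]]></programlisting>

    <para>A pointer without a leading name, such as
    <code>element(/1/2)</code>, is a pure child sequence that starts at the
    document root.</para>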

    <para>The XPointer
    <code>xmlns(xhtml=http://www.w3.org/1999/xhtml)xpointer(//xhtml:a[@href])</code>
    could be used to refer to all of the links in an XHTML 1.0
    document.</para>
  </section>

  <section id="xpointer_in_4suite">
    <title>XPointer support in 4Suite</title>

    <para>4Suite's XPointer implementation, sometimes called 4XPointer, has
    no command-line interface, but can be used within Python applications.
    It supports XPointers to different degrees, depending on the
    circumstances:</para>

    <orderedlist>
      <listitem>
        <para>When an XML document is being parsed into a Domlette with
        XInclude processing enabled, any XPointer encountered in an
        <code>xi:include</code> element is automatically evaluated when the
        included document is parsed. In this mode the XPointer must use an
        XPath LocationPath that only uses steps along the child axis.
        Furthermore, any predicates must be literal numbers, or must be of
        the specific form <code>[@attname='attvalue']</code>. For example,
        <code>/foo[3]</code> and <code>/foo[@bar='baz']</code> will work,
        but <code>../foo</code> and <code>foo[.='bar']</code> will not.
        Function calls are not allowed.</para>
      </listitem>

      <listitem>
        <para>If you have not yet parsed an XML document, but have a URI for
        it, then you can use
        <methodname>Ft.Xml.XPointer.SelectUri</methodname>() to parse the
        document and evaluate an XPointer embedded in the URI's fragment
        component. The parsing is performed with Domlette's default
        <classname>NonvalidatingReader</classname> instance. There are some
        implementation gaps to note when using the
        <literal>xpointer</literal> scheme: the only additional function
        fully supported is <methodname>here</methodname>(), and the
        following functions always return empty location-sets:
        <methodname>string-range</methodname>(),
        <methodname>range-to</methodname>(),
        <methodname>start-point</methodname>(),
        <methodname>end-point</methodname>(), and
        <methodname>origin</methodname>(). <methodname>origin</methodname>
        is illegal to use outside of extended XLinks, anyway.</para>
      </listitem>

      <listitem>
        <para>If you have already parsed the document into a Domlette, then
        you can evaluate an arbitrary XPointer against it by using
        <methodname>Ft.Xml.XPointer.SelectNode</methodname>(). The same
        implementation gaps as noted in the description of
        <methodname>Ft.Xml.XPointer.SelectUri</methodname>() apply.</para>
      </listitem>
    </orderedlist>

    <para>Ranges are not supported because Domlette does not support DOM
    Level 2 Ranges. Uche Ogbuji posted <ulink url="http://lists.fourthought.com/pipermail/4suite/2003-September/005813.html">some
    thoughts about this topic</ulink> in September 2003. Also note that although
    the <literal>element</literal> scheme is streamable, it is not yet
    supported in XIncludes due to ID-related limitations in Domlette. Since
    <literal>element</literal> and shorthand pointer support are
    requirements for full XInclude conformance, they will probably be
    implemented in the future.</para>

    <para>In 4Suite 1.0b1 and earlier, the implementation was based on older
    versions of the specs, and several additional restrictions were in
    effect: the <literal>element</literal> scheme was not even an option,
    XPointers in XIncludes had to be given via URIs (not attributes) and
    couldn't contain NameTests involving "<code>*</code>", and all other
    XPointers were only allowed to identify a single node.</para>
  </section>

  <section revisionflag="added" id="xpointer_examples">
    <title>Examples</title>

    <para>The following example uses XInclude with XPointer references to
    include various sections from one document into another document. <xref linkend="xpointer.example.main_doc"/> lists the contents of the file
    <filename>article.xml</filename>, which references one section using a
    shorthand pointer and then references any sections that have their
    <sgmltag class="attribute">condition</sgmltag> attribute set to <sgmltag class="attvalue">unfinished</sgmltag>. <xref linkend="xpointer.example.referenced_doc"/> lists the contents of the
    file <filename>article2.xml</filename>, which is referenced from
    <filename>article.xml</filename>. <xref linkend="xpointer.example.load"/> lists the Python code used to parse
    and print this document; note that XPointer processing is driven from
    XInclude processing, which is done automatically by default.</para>

    <figure id="xpointer.example.main_doc">
      <title><filename>article.xml</filename>: Document using XInclude with
      XPointer references</title>

      <programlisting>&lt;article&gt;
  &lt;title&gt;My important article&lt;/title&gt;
  &lt;xi:include href="article2.xml"
              xpointer="woo"
              xmlns:xi="http://www.w3.org/2001/XInclude"/&gt;
  &lt;xi:include href="article2.xml"
              xpointer="xpointer(article/section[@condition='unfinished'])"
              xmlns:xi="http://www.w3.org/2001/XInclude"/&gt;
&lt;/article&gt;</programlisting>
    </figure>

    <figure id="xpointer.example.referenced_doc">
      <title><filename>article2.xml</filename>: Document with content
      referenced from <filename>article.xml</filename></title>

      <programlisting>&lt;article&gt;
  &lt;section condition="unfinished"&gt;
    &lt;title&gt;Section 1&lt;/title&gt;
    &lt;!-- Write me! --&gt;
  &lt;/section&gt;
  &lt;section xml:id="woo"&gt;
    &lt;title&gt;Section 2&lt;/title&gt;
    &lt;para&gt;Yeah, content.&lt;/para&gt;
  &lt;/section&gt;
  &lt;section condition="unfinished"&gt;
    &lt;title&gt;Section 3&lt;/title&gt;
    &lt;!-- Write me, too! --&gt;
  &lt;/section&gt;
&lt;/article&gt;</programlisting>
    </figure>

    <figure id="xpointer.example.load">
      <title>Loading the document</title>

      <programlisting>from Ft.Xml import Parse
from Ft.Xml.Domlette import PrettyPrint
doc = Parse("article.xml")
PrettyPrint(doc)</programlisting>
    </figure>

    <para>As mentioned earlier, XPointer is most commonly used along with
    XInclude, but 4Suite provides an API for using XPointer directly from
    Python. Using <filename>article2.xml</filename> as listed above in <xref linkend="xpointer.example.referenced_doc"/>, <xref linkend="xpointer.example.direct"/> loads two of the nodes loaded
    previously with XInclude. Note that when using the standalone interface,
    the code is able to take advantage of more of the XPointer
    syntax.</para>

    <figure id="xpointer.example.direct">
      <title>Using XPointer directly from Python</title>

      <programlisting>from Ft.Xml import Parse
from Ft.Xml.Domlette import PrettyPrint
from Ft.Xml.XPointer import SelectNode

article2 = Parse("article2.xml")
# Shorthand XPointer syntax
node = SelectNode(article2, "woo")[0]
PrettyPrint(node)
# Scheme-based XPointer syntax
node = SelectNode(article2,
                  "xpointer(//section[@condition='unfinished'][2])")[0]
PrettyPrint(node)</programlisting>
    </figure>

    <para><xref linkend="xpointer.example.complete"/> is very similar to
    the examples above, only this version is self-contained; the resources
    are stored in Python strings and resolved using a custom
    resolver.</para>

    <figure id="xpointer.example.complete">
      <title>Self-contained example</title>

      <programlisting>article = """&lt;article&gt;&lt;title&gt;My important article&lt;/title&gt;
&lt;xi:include href="ex:article2"
            xpointer="woo"
            xmlns:xi="http://www.w3.org/2001/XInclude"/&gt;
&lt;xi:include href="ex:article2"
            xpointer="xpointer(article/section[@condition='unfinished'])"
            xmlns:xi="http://www.w3.org/2001/XInclude"/&gt;
&lt;/article&gt;"""

article2 = """&lt;article&gt;
&lt;section condition="unfinished"&gt;&lt;title&gt;Section 1&lt;/title&gt;&lt;!-- Write me! --&gt;&lt;/section&gt;
&lt;section xml:id="woo"&gt;&lt;title&gt;Section 2&lt;/title&gt;&lt;para&gt;Yeah, content.&lt;/para&gt;&lt;/section&gt;
&lt;section condition="unfinished"&gt;&lt;title&gt;Section 3&lt;/title&gt;&lt;!-- Write me, too! --&gt;&lt;/section&gt;
&lt;/article&gt;"""

from Ft.Lib.Uri import FtUriResolver, Absolutize
from Ft.Lib import UriException
from cStringIO import StringIO
class MyResolver (FtUriResolver):
  def normalize(self, uriRef, baseUri):
    return Absolutize(uriRef, baseUri)
  def resolve(self, uri):
    if uri == "ex:article":
      return StringIO(article)
    elif uri == "ex:article2":
      return StringIO(article2)
    else:
      raise UriException(UriException.RESOURCE_ERROR,
                         loc=uri, msg="not found, sorry")

myResolver = MyResolver()

from Ft.Xml.InputSource import InputSourceFactory
from Ft.Xml.Domlette import NonvalidatingReader, PrettyPrint
factory = InputSourceFactory(resolver=myResolver)
isrc = factory.fromUri("ex:article")
doc = NonvalidatingReader.parse(isrc)
PrettyPrint(doc)

from Ft.Xml.XPointer import SelectNode

isrc = factory.fromUri("ex:article2")
article2 = NonvalidatingReader.parse(isrc)
node = SelectNode(article2, "woo")[0]
PrettyPrint(node)
node = SelectNode(article2,
                  "xpointer(//section[@condition='unfinished'][2])")[0]
PrettyPrint(node)
</programlisting>
    </figure>
  </section>
</section>

  <section xml:base="file:///Users/uche/share/4Suite/Documentation/CoreManualSections/Examples.xml">
  <title>Comprehensive examples</title>

  <para>This section contains a set of examples that transcend the boundaries
  of individual topics. These examples combine several techniques
  and often address more common use-cases found "in the wild".</para>

  <section>
    <title>Transforming DocBook using the DocBook XSL stylesheets</title>

    <para>In the XML universe, one common use-case is converting <ulink url="http://www.docbook.org/">DocBook</ulink> (a common XML application)
    to various output formats for publishing using the <ulink url="http://docbook.sourceforge.net/projects/xsl/index.html">DocBook
    XSL</ulink> set of XSLT scripts. If you have the DocBook XSL distribution
    installed (or if you have an Internet connection), you can transform
    DocBook files completely within the 4Suite XML API. The following example
    illustrates how this can be done, and in the process this example touches
    on a wide variety of 4Suite XML techniques. These are listed below.</para>

    <itemizedlist>
      <title>4Suite techniques used in this example</title>

      <listitem>
        <para>Building a Domlette XML model manually</para>
      </listitem>

      <listitem>
        <para>Parsing XML into a Domlette XML model</para>
      </listitem>

      <listitem>
        <para>Using XSLT in 4Suite XML</para>
      </listitem>

      <listitem>
        <para>Using <classname>InputSource</classname>s with automatic XML
        Catalog resolution</para>
      </listitem>

      <listitem>
        <para>Managing URIs</para>
      </listitem>

      <listitem>
        <para>Writing XML from a Domlette XML model</para>
      </listitem>

      <listitem>
        <para>And a bonus feature unrelated to 4Suite: i18n with the DocBook
        XSL scripts!</para>
      </listitem>
    </itemizedlist>

    <programlisting>from Ft.Xml.Domlette import implementation, PrettyPrint, NonvalidatingReader
from Ft.Xml.Xslt import Processor
from Ft.Xml import Catalog, InputSource, EMPTY_NAMESPACE
from Ft.Lib import Uri, UriException

# New processor
processor = Processor.Processor()

# If you have the DocBook XSL scripts installed in your system, then they are likely
# integrated into the system catalog, which is often at `/etc/xml/catalog` on
# Unix-like systems.  If the catalog that resolves the DocBook XSL URIs is stored
# under a different filename, change the filename below.  If no catalog is found,
# this example will access the DocBook XSL scripts directly (i.e. over the network).
catalog_filename = '/etc/xml/catalog'
# Turn the catalog filename into the corresponding `file` URI.
catalog_URI = Uri.OsPathToUri(catalog_filename)
# Try to load the catalog, moving right along if it doesn't exist.
theCatalog = None
try:
  theCatalog = Catalog.Catalog(catalog_URI)
except UriException, e:
  pass

# Create a new `InputSourceFactory` object to use our catalog.
inputSourceFactory = InputSource.InputSourceFactory(catalog = theCatalog)
docbook_xsl_URI = 'http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl'
# Set up an `InputSource` for the DocBook XSL stylesheets.
docbook_xsl_source = inputSourceFactory.fromUri(docbook_xsl_URI)
# Build a DOM of our stylesheet, then load the stylesheet into the XSLT processor.
transform = NonvalidatingReader.parse(docbook_xsl_source)
processor.appendStylesheetNode(transform, docbook_xsl_URI)

# Now we build our DocBook DOM, with a document root of myDoc.
myDoc = implementation.createRootNode('file:///article.xml')
article = myDoc.createElementNS(EMPTY_NAMESPACE, 'article')
myDoc.appendChild(article)
article.setAttributeNS(EMPTY_NAMESPACE, 'lang', "es")
myDoc.publicId = "-//OASIS//DTD DocBook XML V4.2//EN"
myDoc.systemId = "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"

element = myDoc.createElementNS(EMPTY_NAMESPACE, 'title')
element.appendChild(myDoc.createTextNode('Title of article'))
article.appendChild(element)

section = myDoc.createElementNS(EMPTY_NAMESPACE, 'section')
article.appendChild(section)

element = myDoc.createElementNS(EMPTY_NAMESPACE, 'title')
element.appendChild(myDoc.createTextNode('Title of section'))
section.appendChild(element)

element = myDoc.createElementNS(EMPTY_NAMESPACE, 'para')
element.appendChild(myDoc.createTextNode('paragraph of section'))
section.appendChild(element)

print '************************ xml *******************************'
# Serialize the source document as XML.
PrettyPrint(myDoc)

print '************************ html *******************************'
# Print the result of transforming the document.
result = processor.runNode(myDoc)
print result</programlisting>
  </section>
</section>

  <section xml:base="file:///Users/uche/share/4Suite/Documentation/CoreManualSections/Resources.xml">
  <title>Resources</title>

  <para>The following resources provide additional information.</para>

  <para>More on DOMs in Python: <ulink url="http://uche.ogbuji.net/tech/akara/nodes/2003-01-01/dom">Basic DOM
  processing</ulink></para>

  <para><ulink url="http://uche.ogbuji.net/tech/akara/nodes/2004-06-12/external-encoding">External
  encoding declarations</ulink></para>

  <para><ulink url="http://uche.ogbuji.net/tech/akara/nodes/2004-06-12/external-encoding">XML
  Catalogs</ulink></para>

  <para>There is more coverage of the 4Suite XPath package in this <ulink url="http://www.xml.com/pub/a/2002/10/16/py-xml.html">Tour of
  4Suite</ulink>.</para>

  <para><ulink url="http://www.logilab.org/XMLTutorial/diap60.html">This slide
  and the following</ulink> from Alexandre Fayolle's excellent <ulink url="http://www.logilab.org/XMLTutorial/">EuroPython 2002 tutorial on
  Python/XML processing</ulink> are a great introduction to XPath and XSLT
  processing in Python.</para>

  <para>This <ulink url="http://www-105.ibm.com/developerworks/education.nsf/xml-onlinecourse-bytitle/BE1A7E60838F9F7686256AF400523C58">XPath
  and 4XPath tutorial</ulink> is a bit dated, but very comprehensive. Free
  registration is required.</para>

  <para>You can use EXSLT's node-set extension to provide functionality much
  like transform chaining. For more details, see <ulink url="http://www-128.ibm.com/developerworks/xml/library/x-tipxsltmp.html">"Tip:
  Multi-pass XSLT"</ulink>.</para>

  <para>For more on RELAX NG in general, see <ulink url="http://www.oasis-open.org/committees/relax-ng/tutorial-20011203.html">The
  official RELAX NG tutorial</ulink>.</para>

  <para>For more on XVIF, see this <ulink url="http://lists.fourthought.com/pipermail/4suite/2002-October/004432.html">follow-up
  by Eric</ulink>.</para>

  <para>I use 4xml's --rng option in examples in <ulink url="http://www-106.ibm.com/developerworks/xml/library/x-xmptron/">my
  article on Examplotron</ulink>.</para>

  <para>To try out 4Suite and RELAX NG online, go to <ulink url="http://www.defuze.org/oss/tree/">http://www.defuze.org/oss/tree/</ulink>.</para>

  <para><ulink url="http://www.xml.com/pub/a/2005/04/20/py-xml.html">This
  article</ulink> discusses MarkupWriter.</para>

  <para>For more examples of MarkupWriter, see:</para>

  <itemizedlist>
    <listitem>
      <para><ulink url="http://copia.ogbuji.net/blog/2005-08-01/Another_sm">"Another small
      4Suite MarkupWriter example: XHTML 1.1"</ulink></para>
    </listitem>

    <listitem>
      <para><ulink url="http://copia.ogbuji.net/blog/2005-05-09/XML_recurs">"XML recursive
      directory listing, part 2"</ulink></para>
    </listitem>
  </itemizedlist>

  <para>See <ulink url="http://uche.ogbuji.net/tech/akara/?xslt=irc.xslt&amp;date=2003-05-28#00:05:17">this
  #4suite blog entry</ulink> for another example of XPath extensions.</para>

  <para>Tamito KAJIYAMA responds to a thread discussing the grouped sorting
  XSLT FAQ in 4XSLT, <ulink url="http://mail.python.org/pipermail/xml-sig/2003-May/009430.html">offering
  an extension function as a possible solution</ulink>.</para>
</section>
</article>

4Suite/Manual (last edited 2008-11-24 18:46:29 by localhost)