This is a collaborative FAQ for Amara.

XPath (e.g. xml_xpath method) and XSLT patterns do not work with default namespaces?

Quoted from "xml_xpath does not work properly when xmlns is not empty"

import amara

#works
a = '<abc xmlns="foo"><xyz>bar</xyz></abc>'
b = amara.parse(a)
c = b.xml_xpath(u'*[1]')
print c
# prints: [<amara.bindery.abc object at...>]

#doesn't work
a = '<abc xmlns="foo"><xyz>bar</xyz></abc>'
b = amara.parse(a)
c = b.xml_xpath(u'abc')
print c
# prints: []

#works
a = '<abc xmlns=""><xyz>bar</xyz></abc>'
b = amara.parse(a)
c = b.xml_xpath(u'abc')
print c
# prints: [<amara.bindery.abc object at...>]


response:

I suggest:

a = '<abc xmlns="foo"><xyz>bar</xyz></abc>'
b = amara.parse(a, prefixes={u'f': u'foo'})
c = b.xml_xpath(u'f:abc')
print c

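And yes, other compliant XPath 1.0 tools have to deal with this restriction too: in XPath 1.0 an unprefixed name in an expression always means a name in no namespace, so elements in a default namespace can only be matched through a declared prefix.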
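The same behavior can be seen outside Amara. As a minimal sketch, here is the equivalent lookup with ElementTree from the Python standard library (this is not Amara, and ElementTree supports only a subset of XPath). The prefix f is an arbitrary choice, as in the suggestion above.

```python
import xml.etree.ElementTree as ET

# Same document as above: abc and xyz are both in the default namespace "foo".
root = ET.fromstring('<abc xmlns="foo"><xyz>bar</xyz></abc>')

# An unprefixed name in the path means "no namespace", so nothing matches.
unprefixed = root.findall('xyz')

# Binding a prefix (f is arbitrary) to the namespace URI makes the match work.
prefixed = root.findall('f:xyz', {'f': 'foo'})

print(len(unprefixed))
print(prefixed[0].text)
```

The prefix used in the expression does not have to appear in the document; only the namespace URI it is bound to matters.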

I can't access Bindery objects with certain name patterns in the XML

I have XML such as:

<note prefix="spam">
  Hello World
</note>

or

<xsl:if xmlns:xsl="http://www.w3.org/1999/XSL/Transform" test="foo">
  Hello World
</xsl:if>

But I get unusual results or errors from e.g. doc.note.prefix or doc.if.


Amara uses a name-mangling scheme to deal with clashes between XML names and names that are already taken on the Python side. The clash may be with a Python reserved word such as if, or with a name reserved by Amara itself, such as prefix, which is reserved for DOM compatibility.

You can access these objects by using their mangled names:

print doc.note.prefix_
print doc.if_

or using the mapping protocol APIs:

XSLTNS = u"http://www.w3.org/1999/XSL/Transform"
print doc.note[None, u'prefix']
print doc[XSLTNS, u'if']

or using XPath

XSLTNS = u"http://www.w3.org/1999/XSL/Transform"
print doc.note.xml_xpath(u'@prefix')
print doc.xml_xpath(u'xsl:if')  #Make sure the xsl prefix is defined, usually no problem for sane documents

Also watch out for cases where XML names contain characters illegal in Python, such as dashes. These are also mangled:

<note x-id="spam">
  Hello World
</note>

You would use:

print note.x_id #"spam"
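To make the mangling rules described so far concrete, here is a rough sketch of the idea. It is not Amara's actual implementation: mangle_name and RESERVED_BY_LIBRARY are hypothetical names, and the reserved set contains only prefix because that is the one reserved name mentioned above.

```python
import keyword
import re

# Assumption: only "prefix" is listed because it is the one reserved name
# mentioned in this FAQ; Amara's real list is not reproduced here.
RESERVED_BY_LIBRARY = {'prefix'}

def mangle_name(xml_local_name):
    """Hypothetical helper: map an XML local name to a Python attribute name."""
    # Replace characters that cannot appear in a Python identifier
    # (simplified to ASCII letters, digits and underscore).
    name = re.sub(r'[^0-9A-Za-z_]', '_', xml_local_name)
    # Append an underscore to Python keywords and library-reserved names.
    if keyword.iskeyword(name) or name in RESERVED_BY_LIBRARY:
        name += '_'
    return name

print(mangle_name('if'))
print(mangle_name('prefix'))
print(mangle_name('x-id'))
```

Under these assumptions the three calls give if_, prefix_ and x_id, matching the names used in the examples above.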

And watch out for name clashes (which are very rare in the real world):

<note xmlns:msg="urn:bogus:message">
  <msg:id>spam</msg:id>
  Hello <id>World</id>
</note>

Amara disambiguates, favoring the first instance in document order:

note.id #"spam"
note.id_ #"World"

Or even:

<note id="spam">
  Hello <id>World</id>
</note>

Amara disambiguates favoring the attribute:

note.id #"spam"
note.id_ #"World"
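The two disambiguation rules above can be modeled roughly as follows. This is only an illustrative sketch, not Amara's code: assign_names is a hypothetical function, and it assumes the caller passes attributes first, followed by elements in document order.

```python
def assign_names(nodes):
    """Hypothetical sketch of clash resolution.

    nodes is a list of (kind, namespace, local_name, value) tuples,
    with attributes first, then elements in document order.
    """
    taken = {}   # Python name -> (kind, namespace, local_name) that owns it
    values = {}  # Python name -> first value seen under that name
    for kind, namespace, local_name, value in nodes:
        identity = (kind, namespace, local_name)
        name = local_name
        # Keep appending underscores while a different XML name owns this one.
        while name in taken and taken[name] != identity:
            name += '_'
        if name not in taken:
            taken[name] = identity
            values[name] = value
    return values

# <msg:id> comes before <id> in document order, so it gets the plain name.
by_order = assign_names([
    ('element', 'urn:bogus:message', 'id', 'spam'),
    ('element', None, 'id', 'World'),
])

# The id attribute is listed before the id element, so it gets the plain name.
attr_first = assign_names([
    ('attribute', None, 'id', 'spam'),
    ('element', None, 'id', 'World'),
])
```

In both cases the plain name id ends up bound to "spam" and id_ to "World", as in the two examples above.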

As Uche sometimes tells users:

Amara can't handle file paths on Windows

After installing amara on my win xp box with easy_install, I tried the example in the Amara manual and got the following error:

>>> import amara
>>> doc = amara.parse("f:\\monty.xml")
Traceback (most recent call last):
[SNIP]
Ft.Lib.UriException: The URI scheme f is not supported by resolver FtUriResolver


Turn that weird OS path into a proper URL:

from Ft.Lib import Uri
doc = amara.parse(Uri.OsPathToUri("f:\\monty.xml"))
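To see what a file URL for such a path looks like, here is a small sketch using pathlib from the Python 3 standard library. It is shown only for illustration: it is not part of Amara or 4Suite, and it is not claimed that OsPathToUri produces exactly the same string.

```python
from pathlib import PureWindowsPath

# PureWindowsPath parses Windows-style paths on any platform,
# so this runs the same way on Windows, Linux or macOS.
uri = PureWindowsPath("f:\\monty.xml").as_uri()
print(uri)  # file:///f:/monty.xml
```

The point is that the drive letter ends up inside the path part of a file: URL, so it is no longer mistaken for a URI scheme named f.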

Amara1/FAQ (last edited 2010-12-03 17:53:27 by LmMorillas)