XIST.xfind
==========
Tree iteration and filtering
============================
This module contains XFind and CSS selectors and related classes
and functions.
A selector is a XIST tree traversal filter that traverses the
complete XML tree and outputs those nodes specified by the
selector. Selectors can be combined with various operations and
form a language comparable to XPath but implemented as Python
expressions.
class WalkFilter(object):
==========================
A WalkFilter can be passed to the walk method of nodes to specify
how to traverse the tree and which nodes to output.
def filternode(self, *args, **kwargs):
=======================================
def filterpath(self, path):
============================
def walk(self, node):
======================
def walknodes(self, node):
===========================
def walkpaths(self, node):
===========================
def _walk(self, path):
=======================
class FindType(WalkFilter):
============================
Tree traversal filter that finds nodes of a certain type on the
first level of the tree without descending further down.
def __init__(self, *types):
============================
def filternode(self, node):
============================
class FindTypeAll(WalkFilter):
===============================
Tree traversal filter that finds nodes of a certain type searching
the complete tree.
def __init__(self, *types):
============================
def filternode(self, node):
============================
class FindTypeAllAttrs(WalkFilter):
====================================
Tree traversal filter that finds nodes of a certain type searching
the complete tree (including attributes).
def __init__(self, *types):
============================
def filternode(self, node):
============================
class FindTypeTop(WalkFilter):
===============================
Tree traversal filter that finds nodes of a certain type searching
the complete tree, but traversal of the children of a node is
skipped if this node is of the specified type.
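The difference between the FindType, FindTypeAll and FindTypeTop
traversal strategies can be illustrated with a small standalone
sketch. It does not use XIST: Node, find_type, find_type_all and
find_type_top are hypothetical stand-ins written for Python 3, nodes
are matched by name instead of by class, and the starting node itself
is never tested.

```python
class Node:
    """Hypothetical stand-in for an XIST node: a name plus child nodes."""
    def __init__(self, name, *children):
        self.name = name
        self.children = list(children)

def find_type(root, name):
    # First level only: look at the direct children, never descend.
    return [child for child in root.children if child.name == name]

def find_type_all(root, name):
    # Complete tree: report every match and keep descending into it.
    found = []
    for child in root.children:
        if child.name == name:
            found.append(child)
        found.extend(find_type_all(child, name))
    return found

def find_type_top(root, name):
    # Complete tree, but do not look inside a node that already matched.
    found = []
    for child in root.children:
        if child.name == name:
            found.append(child)
        else:
            found.extend(find_type_top(child, name))
    return found

inner = Node("div")
outer = Node("div", inner)
nested = Node("div")
tree = Node("body", outer, Node("p", nested))

first_level = find_type(tree, "div")     # only `outer`
everything = find_type_all(tree, "div")  # `outer`, `inner`, `nested`
topmost = find_type_top(tree, "div")     # `outer`, `nested`
```

In this toy tree the inner div is reported only by the "all"
strategy, because the "top" strategy stops descending as soon as it
has found the outer div.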
def __init__(self, *types):
============================
def filternode(self, node):
============================
class ConstantWalkFilter(WalkFilter):
======================================
Tree traversal filter that returns the same value for all nodes.
def __init__(self, value):
===========================
def filterpath(self, path):
============================
class Selector(WalkFilter):
============================
Base class for all tree traversal filters that visit the complete
tree. Whether a node gets output can be specified by overriding
the matchpath method. Selectors can be combined with various
operations (see methods below).
def matchpath(self, *args, **kwargs):
======================================
def filterpath(self, path):
============================
def __div__(self, other):
==========================
Create a ChildCombinator with self as the left hand selector and
other as the right hand selector.
def __floordiv__(self, other):
===============================
Create a DescendantCombinator with self as the left hand selector
and other as the right hand selector.
def __mul__(self, other):
==========================
Create an AdjacentSiblingCombinator with self as the left hand
selector and other as the right hand selector.
def __pow__(self, other):
==========================
Create a GeneralSiblingCombinator with self as the left hand
selector and other as the right hand selector.
def __and__(self, other):
==========================
Create an AndCombinator from self and other.
def __or__(self, other):
=========================
Create an OrCombinator from self and other.
def __invert__(self):
======================
Create a NotCombinator inverting self.
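How operator overloading turns selectors into a small query language
can be sketched without XIST. The following toy model is an
assumption-laden illustration, not the library's implementation: Node,
Sel, named and walk are hypothetical names, a path is modelled as a
tuple of nodes from the root down to the current node, and the sibling
operators (* and **) are left out. Because it is written for Python 3,
the division operator is implemented as __truediv__ rather than
__div__.

```python
class Node:
    """Hypothetical stand-in for an XIST node: a name plus child nodes."""
    def __init__(self, name, *children):
        self.name = name
        self.children = list(children)

class Sel:
    """Toy selector wrapping a predicate over a path (root first)."""
    def __init__(self, match):
        self.match = match

    def __truediv__(self, other):
        # self / other: node matches `other`, its parent matches `self`.
        return Sel(lambda path: len(path) > 1
                   and other.match(path) and self.match(path[:-1]))

    def __floordiv__(self, other):
        # self // other: node matches `other`, some ancestor matches `self`.
        return Sel(lambda path: other.match(path)
                   and any(self.match(path[:i]) for i in range(1, len(path))))

    def __and__(self, other):
        return Sel(lambda path: self.match(path) and other.match(path))

    def __or__(self, other):
        return Sel(lambda path: self.match(path) or other.match(path))

    def __invert__(self):
        return Sel(lambda path: not self.match(path))

def named(name):
    # Selector matching nodes whose name equals `name`.
    return Sel(lambda path: path[-1].name == name)

def walk(node, selector, path=()):
    # Depth-first traversal yielding every node whose path matches.
    path = path + (node,)
    if selector.match(path):
        yield node
    for child in node.children:
        yield from walk(child, selector, path)

img1 = Node("img")
a1 = Node("a", img1)
a2 = Node("a")
img2 = Node("img")
div = Node("div", a1)
p = Node("p", a2, img2)
tree = Node("body", div, p)

linked_images = list(walk(tree, named("a") / named("img")))
images_in_div = list(walk(tree, named("div") // named("img")))
unlinked_images = list(walk(tree, named("img") & ~(named("a") / named("img"))))
blocks = list(walk(tree, named("div") | named("p")))
```

Each operator returns a new selector object, so expressions of any
depth can be built up front and handed to the traversal as a single
value.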
class AnySelector(Selector):
=============================
Selector that selects all nodes.
def matchpath(self, path):
===========================
class IsInstanceSelector(Selector):
====================================
Selector that selects all nodes that are instances of the
specified type. You can either create an IsInstanceSelector object
directly or simply pass a class to a function that expects a walk
filter (this class will be automatically wrapped in an
IsInstanceSelector):
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(html.a):
... print node.attrs.href, node.attrs.title
...
http://www.python.org/
http://www.python.org/#left%2Dhand%2Dnavigation
http://www.python.org/#content%2Dbody
http://www.python.org/search
http://www.python.org/about/ About The Python Language
http://www.python.org/news/ Major Happenings Within the Python Community
http://www.python.org/doc/ Tutorials, Library Reference, C API
http://www.python.org/download/ Start Running Python Under Windows, Mac, Linux and Others
...
def __init__(self, *types):
============================
def matchpath(self, path):
===========================
def __or__(self, other):
=========================
def __getitem__(self, index):
==============================
Return an nthoftype selector that uses index as the index and
self.types as the types.
def __str__(self):
===================
class hasname(Selector):
=========================
Selector that selects all nodes that have a specified Python name
(which only selects elements, processing instructions and
entities). A namespace name can also be specified as a second
argument, which will only select elements from the specified
namespace:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(xfind.hasname("img")):
... print node.bytes()
...
def __init__(self, name, xmlns=None):
======================================
def matchpath(self, path):
===========================
def __str__(self):
===================
class hasname_xml(Selector):
=============================
hasname_xml works similarly to hasname, except that the specified
name is treated as the XML name, not the Python name.
def __init__(self, name, xmlns=None):
======================================
def matchpath(self, path):
===========================
def __str__(self):
===================
class IsSelector(Selector):
============================
Selector that selects one specific node in the tree. This can be
combined with other selectors via ChildCombinator or
DescendantCombinator selectors to select children of this specific
node. You can either create an IsSelector directly or simply pass
a node to a function that expects a walk filter:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(doc[0]/xsc.Element):
... print repr(node)
...
def __init__(self, node):
==========================
def matchpath(self, path):
===========================
def __str__(self):
===================
class IsRootSelector(Selector):
================================
Selector that selects the node that is the root of the traversal.
def matchpath(self, path):
===========================
class IsEmptySelector(Selector):
=================================
Selector that selects all empty elements or fragments:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(xfind.empty):
... print node.bytes()
...
...
def matchpath(self, path):
===========================
class OnlyChildSelector(Selector):
===================================
Selector that selects all nodes that are the only child of their
parents:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(xfind.onlychild & html.a):
... print node.bytes()
...
Quick Links (2.5.1)Documentation
...
def matchpath(self, path):
===========================
def __str__(self):
===================
class OnlyOfTypeSelector(Selector):
====================================
Selector that selects all nodes that are the only nodes of their
type among their siblings:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(xfind.onlyoftype & xsc.Element):
... print repr(node)
...
...
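The distinction between "only child" and "only of type" can be shown
with a standalone sketch that does not depend on XIST. Node,
only_children and only_of_type are hypothetical helpers written for
Python 3, and the node name stands in for the node type, which is an
assumption made to keep the example small.

```python
class Node:
    """Hypothetical stand-in for an XIST node: a name plus child nodes."""
    def __init__(self, name, *children):
        self.name = name
        self.children = list(children)

def only_children(root):
    # Nodes whose parent has exactly one child.
    found = []
    for child in root.children:
        if len(root.children) == 1:
            found.append(child)
        found.extend(only_children(child))
    return found

def only_of_type(root):
    # Nodes that have no sibling with the same name.
    found = []
    for child in root.children:
        same = [s for s in root.children if s.name == child.name]
        if len(same) == 1:
            found.append(child)
        found.extend(only_of_type(child))
    return found

tree = Node("ul",
    Node("li", Node("a")),
    Node("li", Node("a"), Node("em"), Node("a")),
    Node("p"),
)

only_child_names = [node.name for node in only_children(tree)]
only_of_type_names = [node.name for node in only_of_type(tree)]
```

The em and p nodes have siblings, so they are not only children, but
no sibling shares their name, so they are still the only nodes of
their type.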
def matchpath(self, path):
===========================
def __str__(self):
===================
class hasattr(Selector):
=========================
Selector that selects all element nodes that have an attribute
with one of the specified Python names. For selecting nodes with
global attributes the attribute class can be passed:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(xfind.hasattr(xml.Attrs.lang)):
... print repr(node)
...
def __init__(self, *attrnames):
================================
def matchpath(self, path):
===========================
def __str__(self):
===================
class hasattr_xml(Selector):
=============================
hasattr_xml works similarly to hasattr, except that the specified
names are treated as XML names instead of Python names.
def __init__(self, *attrnames):
================================
def matchpath(self, path):
===========================
def __str__(self):
===================
class attrhasvalue(Selector):
==============================
Selector that selects all element nodes where an attribute with
the specified Python name has one of the specified values. For
global attributes the attribute class can be passed. Note that
"fancy" attributes (i.e. those containing non-text) will not be
considered:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(xfind.attrhasvalue("rel", "stylesheet")):
... print node.attrs.href
...
http://www.python.org/styles/screen-switcher-default.css
http://www.python.org/styles/netscape4.css
http://www.python.org/styles/print.css
def __init__(self, attrname, *attrvalues):
===========================================
def matchpath(self, path):
===========================
def __str__(self):
===================
class attrhasvalue_xml(Selector):
==================================
attrhasvalue_xml works similarly to attrhasvalue, except that the
specified name is treated as an XML name instead of a Python name.
def __init__(self, attrname, *attrvalues):
===========================================
def matchpath(self, path):
===========================
def __str__(self):
===================
class attrcontains(Selector):
==============================
Selector that selects all element nodes where an attribute with
the specified Python name contains one of the specified substrings
in its value. For global attributes the attribute class can be
passed. Note that "fancy" attributes (i.e. those containing
non-text) will not be considered:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(xfind.attrcontains("rel", "stylesheet")):
... print node.attrs.rel, node.attrs.href
...
stylesheet http://www.python.org/styles/screen-switcher-default.css
stylesheet http://www.python.org/styles/netscape4.css
stylesheet http://www.python.org/styles/print.css
alternate stylesheet http://www.python.org/styles/largestyles.css
alternate stylesheet http://www.python.org/styles/defaultfonts.css
def __init__(self, attrname, *attrvalues):
===========================================
def matchpath(self, path):
===========================
def __str__(self):
===================
class attrcontains_xml(Selector):
==================================
attrcontains_xml works similarly to attrcontains, except that the
specified name is treated as an XML name instead of a Python name.
def __init__(self, attrname, *attrvalues):
===========================================
def matchpath(self, path):
===========================
def __str__(self):
===================
class attrstartswith(Selector):
================================
Selector that selects all element nodes where an attribute with
the specified Python name starts with any of the specified
strings. For global attributes the attribute class can be passed.
Note that "fancy" attributes (i.e. those containing non-text) will
not be considered:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(xfind.attrstartswith("class_", "input-")):
... print node.bytes()
...
def __init__(self, attrname, *attrvalues):
===========================================
def matchpath(self, path):
===========================
def __str__(self):
===================
class attrstartswith_xml(Selector):
====================================
attrstartswith_xml works similarly to attrstartswith, except that the
specified name is treated as an XML name instead of a Python name.
def __init__(self, attrname, *attrvalues):
===========================================
def matchpath(self, path):
===========================
def __str__(self):
===================
class attrendswith(Selector):
==============================
Selector that selects all element nodes where an attribute with
the specified Python name ends with one of the specified strings.
For global attributes the attribute class can be passed. Note that
"fancy" attributes (i.e. those containing non-text) will not be
considered:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(xfind.attrendswith("href", ".css")):
... print node.attrs.href
...
http://www.python.org/styles/screen-switcher-default.css
http://www.python.org/styles/netscape4.css
http://www.python.org/styles/print.css
http://www.python.org/styles/largestyles.css
http://www.python.org/styles/defaultfonts.css
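The attribute selectors attrhasvalue, attrcontains, attrstartswith
and attrendswith differ only in the string test applied to the
attribute value. The following sketch shows the four tests on plain
strings. It is not XIST code: the function names are hypothetical, and
the sample values merely echo the rel attributes from the examples.

```python
def attr_has_value(value, *candidates):
    # Exact match against any of the candidates.
    return value in candidates

def attr_contains(value, *substrings):
    # Any of the substrings occurs somewhere in the value.
    return any(sub in value for sub in substrings)

def attr_starts_with(value, *prefixes):
    # str.startswith accepts a tuple of alternatives.
    return value.startswith(prefixes)

def attr_ends_with(value, *suffixes):
    # str.endswith accepts a tuple of alternatives.
    return value.endswith(suffixes)

rels = ["stylesheet", "alternate stylesheet", "icon"]

exact = [rel for rel in rels if attr_has_value(rel, "stylesheet")]
partial = [rel for rel in rels if attr_contains(rel, "stylesheet")]
prefixed = [rel for rel in rels if attr_starts_with(rel, "alt")]
suffixed = [rel for rel in rels if attr_ends_with(rel, "sheet")]
```

This is why the attrcontains example lists the "alternate stylesheet"
links while the attrhasvalue example does not.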
def __init__(self, attrname, *attrvalues):
===========================================
def matchpath(self, path):
===========================
def __str__(self):
===================
class attrendswith_xml(Selector):
==================================
attrendswith_xml works similarly to attrendswith, except that the
specified name is treated as an XML name instead of a Python name.
def __init__(self, attrname, *attrvalues):
===========================================
def matchpath(self, path):
===========================
def __str__(self):
===================
class hasid(Selector):
=======================
Selector that selects all element nodes where the id attribute has
one of the specified values:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(xfind.hasid("logo")):
... print node.bytes()
...
def __init__(self, *ids):
==========================
def matchpath(self, path):
===========================
def __str__(self):
===================
class hasclass(Selector):
==========================
Selector that selects all element nodes where the class attribute
contains one of the specified values:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(xfind.hasclass("reference")):
... print node.bytes()
...
Advanced SearchRackspaceIndustrial Light and MagicAstraZeneca
...
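A class test is usually a token test rather than a substring test.
The sketch below contrasts the two on plain strings. It assumes that
hasclass compares against the whitespace-separated class names in the
attribute value, which the description above does not spell out.
has_class and contains_substring are hypothetical helpers, not part of
XIST.

```python
def has_class(class_value, *classnames):
    # Assumed semantics: compare against whitespace-separated tokens.
    tokens = class_value.split()
    return any(name in tokens for name in classnames)

def contains_substring(value, *substrings):
    # Plain substring test, for comparison.
    return any(sub in value for sub in substrings)

value = "external reference-list"

token_match = has_class(value, "reference")               # no such token
substring_match = contains_substring(value, "reference")  # substring exists
either_class = has_class(value, "reference", "external")  # "external" is a token
```

Under that assumption, an element with class "reference-list" would
be found by a substring search for "reference" but not by a class
test for it.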
def __init__(self, *classnames):
=================================
def matchpath(self, path):
===========================
def __str__(self):
===================
class InAttrSelector(Selector):
================================
Selector that selects all attribute nodes and nodes inside of
attributes:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(xfind.inattr & xsc.Text):
... print node.bytes()
...
text/html; charset=utf-8
content-type
python programming language object oriented web free source
...
def matchpath(self, path):
===========================
def __str__(self):
===================
class Combinator(Selector):
============================
A Combinator is a selector that transforms one or combines two or
more other selectors in a certain way.
class BinaryCombinator(Combinator):
====================================
A BinaryCombinator is a combinator that combines two selectors: the
left hand selector and the right hand selector.
def __init__(self, left, right):
=================================
def __str__(self):
===================
class ChildCombinator(BinaryCombinator):
=========================================
A ChildCombinator is a BinaryCombinator. To match the
ChildCombinator the node must match the right hand selector and
its immediate parent must match the left hand selector (i.e. it
works similarly to the > combinator in CSS or the / combinator in
XPath).
ChildCombinator objects can be created via the division operator
(/):
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(html.a/html.img):
... print node.bytes()
...
def matchpath(self, path):
===========================
class DescendantCombinator(BinaryCombinator):
==============================================
A DescendantCombinator is a BinaryCombinator. To match the
DescendantCombinator the node must match the right hand selector
and any of its ancestor nodes must match the left hand selector
(i.e. it works similarly to the descendant combinator in CSS or the
// combinator in XPath).
DescendantCombinator objects can be created via the floor division
operator (//):
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(html.div//html.img):
... print node.bytes()
...
def matchpath(self, path):
===========================
class AdjacentSiblingCombinator(BinaryCombinator):
===================================================
An AdjacentSiblingCombinator is a BinaryCombinator. To match the
AdjacentSiblingCombinator the node must match the right hand
selector and the immediately preceding sibling must match the left
hand selector.
AdjacentSiblingCombinator objects can be created via the
multiplication operator (*). The following example outputs all
links inside those p elements that immediately follow a h2
element:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(html.h2*html.p/html.a):
... print node.bytes()
...
SciPy Conferenceearly registrationOnline registrationEuroPython 2007Call For PapersDLS 2007The Python PapersPyCon UKproposals for talksregistration online
def matchpath(self, path):
===========================
class GeneralSiblingCombinator(BinaryCombinator):
==================================================
A GeneralSiblingCombinator is a BinaryCombinator. To match the
GeneralSiblingCombinator the node must match the right hand
selector and any of the preceding siblings must match the left
hand selector.
GeneralSiblingCombinator objects can be created via the
exponentiation operator (**). The following example outputs all
links that are not the first links inside their parent (i.e. they
have another link among their preceding siblings):
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(html.a**html.a):
... print node.bytes()
...
Industrial Light and MagicAstraZenecaHoneywelland many othersZope
...
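The two sibling combinators differ only in how far back they look
among the preceding siblings. The following standalone sketch models
a parent's children as a list of names. adjacent_sibling and
general_sibling are hypothetical helpers, not XIST functions, and how
XIST treats intervening text nodes is not modelled here.

```python
def adjacent_sibling(siblings, index, left, right):
    # Node matches `right`, the immediately preceding sibling matches `left`.
    return (right(siblings[index])
            and index > 0 and left(siblings[index - 1]))

def general_sibling(siblings, index, left, right):
    # Node matches `right`, any earlier sibling matches `left`.
    return (right(siblings[index])
            and any(left(sibling) for sibling in siblings[:index]))

def is_h2(name):
    return name == "h2"

def is_p(name):
    return name == "p"

def is_a(name):
    return name == "a"

children = ["h2", "p", "p", "h2", "ul", "p"]
positions = range(len(children))

# Paragraphs directly after a heading (h2 * p in selector notation).
adjacent = [i for i in positions
            if adjacent_sibling(children, i, is_h2, is_p)]

# Paragraphs with a heading anywhere before them (h2 ** p).
general = [i for i in positions
           if general_sibling(children, i, is_h2, is_p)]

# Links that are not the first link of their parent (a ** a).
links = ["a", "em", "a", "a"]
later_links = [i for i in range(len(links))
               if general_sibling(links, i, is_a, is_a)]
```

The last paragraph is preceded by a ul, so only the general variant
selects it.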
def matchpath(self, path):
===========================
class ChainedCombinator(Combinator):
=====================================
A ChainedCombinator combines any number of other selectors.
def __init__(self, *selectors):
================================
def __str__(self):
===================
class OrCombinator(ChainedCombinator):
=======================================
An OrCombinator is a ChainedCombinator where the node must match
at least one of the selectors to match the OrCombinator. An
OrCombinator can be created with the binary or operator (|):
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(xfind.hasattr("href") | xfind.hasattr("src")):
... print node.attrs.href if "href" in node.Attrs else node.attrs.src
...
http://www.python.org/channews.rdf
http://aspn.activestate.com/ASPN/Cookbook/Python/index_rss
http://python-groups.blogspot.com/feeds/posts/default
http://www.showmedo.com/latestVideoFeed/rss2.0?tag=python
http://www.awaretek.com/python/index.xml
http://pyfound.blogspot.com/feeds/posts/default
http://www.python.org/dev/peps/peps.rss
http://www.python.org/community/jobs/jobs.rss
http://www.reddit.com/r/Python/.rss
http://www.python.org/styles/screen-switcher-default.css
http://www.python.org/styles/netscape4.css
http://www.python.org/styles/print.css
http://www.python.org/styles/largestyles.css
http://www.python.org/styles/defaultfonts.css
...
def matchpath(self, path):
===========================
def __or__(self, other):
=========================
class AndCombinator(ChainedCombinator):
========================================
An AndCombinator is a ChainedCombinator where the node must match
all of the combined selectors to match the AndCombinator. An
AndCombinator can be created with the binary and operator (&):
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(html.input & xfind.hasattr("id")):
... print node.bytes()
...
def matchpath(self, path):
===========================
def __and__(self, other):
==========================
class NotCombinator(Combinator):
=================================
A NotCombinator inverts the selection logic of the underlying
selector, i.e. a node matches only if it does not match the
underlying selector. A NotCombinator can be created with the unary
inversion operator (~).
The following example outputs all images that don't have a border
attribute:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(html.img & ~xfind.hasattr("border")):
... print node.bytes()
...
def __init__(self, selector):
==============================
def matchpath(self, path):
===========================
def __str__(self):
===================
class CallableSelector(Selector):
==================================
A CallableSelector is a selector that calls a user-specified
callable to select nodes. The callable gets passed the path and
must return a bool specifying whether this path is selected. A
CallableSelector is created implicitly whenever a callable is
passed to a method that expects a walk filter.
The following example outputs all links that point outside the
python.org domain:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> def foreignlink(path):
... return path and isinstance(path[-1], html.a) and not path[-1].attrs.href.asURL().server.endswith(".python.org")
...
>>> for node in doc.walknodes(foreignlink):
... print node.bytes()
...
YouTube.comZopeDjangoTurboGearsXML
..
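The idea of a callable that receives the path and decides whether the
current node is selected can be tried out without XIST. In the
following Python 3 sketch, Node, walk and foreign_link are
hypothetical stand-ins: the path is a tuple of nodes from the root
down to the current node, attributes are kept in a plain dict, and
URLs are parsed with the standard library instead of XIST's URL
class.

```python
from urllib.parse import urlsplit

class Node:
    """Hypothetical stand-in for an XIST element."""
    def __init__(self, name, attrs=None, children=()):
        self.name = name
        self.attrs = dict(attrs or {})
        self.children = list(children)

def walk(node, accept, path=()):
    # Depth-first traversal; `accept` sees the path ending in the current node.
    path = path + (node,)
    if accept(path):
        yield node
    for child in node.children:
        yield from walk(child, accept, path)

def foreign_link(path):
    node = path[-1]
    if node.name != "a" or "href" not in node.attrs:
        return False
    host = urlsplit(node.attrs["href"]).hostname
    if host is None:
        # Relative links stay on the same site.
        return False
    return not (host == "python.org" or host.endswith(".python.org"))

doc = Node("body", children=[
    Node("a", {"href": "http://www.python.org/doc/"}),
    Node("p", children=[Node("a", {"href": "http://www.djangoproject.com/"})]),
    Node("a", {"href": "/about/"}),
    Node("a"),
])

foreign = [node.attrs["href"] for node in walk(doc, foreign_link)]
```

Because the callable receives the whole path and not just the node, it
can also base its decision on ancestors, for example by inspecting
path[-2].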
def __init__(self, func):
==========================
def matchpath(self, path):
===========================
def __str__(self):
===================
class nthchild(Selector):
==========================
An nthchild object is a selector that selects every node that is
the n-th child of its parent. E.g. nthchild(0) selects every first
child, nthchild(-1) selects each last child. Furthermore
nthchild("even") selects each first, third, fifth, ... child and
nthchild("odd") selects each second, fourth, sixth, ... child.
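The index conventions described above follow from zero-based
indexing: "even" covers the indices 0, 2, 4, ... (the first, third,
fifth child) and "odd" covers 1, 3, 5, ... The sketch below applies
the same conventions to a plain list. nth_child is a hypothetical
helper operating on one parent's children, not the XIST selector
itself.

```python
def nth_child(children, index):
    # Return the children selected by `index` under zero-based indexing.
    if index == "even":
        return children[0::2]   # indices 0, 2, 4, ...
    if index == "odd":
        return children[1::2]   # indices 1, 3, 5, ...
    try:
        return [children[index]]  # negative indices count from the end
    except IndexError:
        return []

children = ["a", "b", "c", "d", "e"]

first = nth_child(children, 0)
last = nth_child(children, -1)
even = nth_child(children, "even")
odd = nth_child(children, "odd")
missing = nth_child(children, 10)
```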
def __init__(self, index):
===========================
def matchpath(self, path):
===========================
def __str__(self):
===================
class nthoftype(Selector):
===========================
An nthoftype object is a selector that selects every node that is
the n-th node of a specified type among its siblings. Similar to
nthchild nthoftype supports negative and positive indices as well
as "even" and "odd". Which types are checked can be passed
explicitly. If no types are passed the type of the node itself is
used:
>>> from ll.xist import xsc, parse, xfind
>>> from ll.xist.ns import xml, html
>>> doc = parse.tree(parse.URL("http://www.python.org"), parse.Tidy(), parse.NS(html), parse.Node(pool=xsc.Pool(xml, html)))
>>> for node in doc.walknodes(xfind.nthoftype(0, html.h2)):
... print node.bytes()
...