Why lxml?
=========
.. contents::
..
   1  Motto
   2  Aims


Motto
-----
"the thrills without the strangeness"

To explain the motto:

"Programming with libxml2 is like the thrilling embrace of an exotic stranger.
It seems to have the potential to fulfill your wildest dreams, but there's a
nagging voice somewhere in your head warning you that you're about to get
screwed in the worst way." (`a quote by Mark Pilgrim`_)

Mark Pilgrim was describing in particular the experience a Python programmer
has when dealing with libxml2. The default Python bindings of libxml2 are
fast, thrilling, and powerful, but your code might fail in some horrible way
that you really shouldn't have to worry about when writing Python code. lxml
combines the power of libxml2 with the ease of use of Python.

.. _`a quote by Mark Pilgrim`: http://diveintomark.org/archives/2004/02/18/libxml2


Aims
----
The C libraries libxml2_ and libxslt_ have huge benefits:

* Standards-compliant XML support.

* Support for (broken) HTML.

* Full-featured.

* Actively maintained by XML experts.

* Fast. Fast! FAST!

.. _libxml2: http://www.xmlsoft.org

.. _libxslt: http://xmlsoft.org/XSLT


These libraries already ship with Python bindings, but these bindings
mimic the C-level interface. This causes a number of problems:

* Very low-level and C-ish (not Pythonic).

* Underdocumented and huge, so you get lost in them.

* UTF-8 in the API, instead of Python unicode strings.

* Can easily cause segfaults from Python.

* Require manual memory management!


lxml is a new Python binding for libxml2 and libxslt, completely independent
of these existing Python bindings. Its aims:

* Pythonic API.

* Documented.

* Use Python unicode strings in the API.

* Safe (no segfaults).

* No manual memory management!

lxml aims to provide a Pythonic API by following the `ElementTree API`_ as
closely as possible. We're trying to avoid inventing too many new APIs, or
making you learn new things -- XML is complicated enough.

.. _`ElementTree API`: http://effbot.org/zone/element-index.htm
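
As a rough illustration of what an ElementTree-style API looks like in
practice, the sketch below uses the standard library's
``xml.etree.ElementTree`` module as a stand-in, so that it runs without lxml
installed. It assumes that ``lxml.etree`` accepts the same calls, in which case
swapping the import for ``from lxml import etree as ET`` should behave the same
way here. The sample document and all variable names are invented for this
example.

```python
# Stand-in for lxml.etree: the standard library's ElementTree implementation.
# Assumption: with lxml installed, "from lxml import etree as ET" should
# accept the same calls used below.
import xml.etree.ElementTree as ET

# Hypothetical sample document, made up for this sketch.
XML = (
    "<library>"
    "<book lang='de'>Die Verwandlung</book>"
    "<book lang='en'>Ulysses</book>"
    "</library>"
)

root = ET.fromstring(XML)

# Elements behave like lists of their child elements.
titles = [book.text for book in root]

# Attributes are read through a dict-like get().
languages = [book.get("lang") for book in root.findall("book")]

# Text goes in and comes out as ordinary Python strings, non-ASCII included.
new_book = ET.SubElement(root, "book", lang="fr")
new_book.text = "L'Étranger"

# encoding="unicode" asks for a Python string instead of encoded bytes.
serialized = ET.tostring(root, encoding="unicode")
```

Nothing in this sketch allocates or frees memory by hand, and no UTF-8
encoded byte strings cross the API boundary, which is the kind of interface
the aims above describe.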