From b74dd5f56bf76def5aa8ea34e7d986ca0d582af2 Mon Sep 17 00:00:00 2001
From: Packit Service
Date: Jan 05 2021 22:32:03 +0000
Subject: python-lxml-4.2.3 base
---

diff --git a/CHANGES.txt b/CHANGES.txt
new file mode 100644
index 0000000..2acc179
--- /dev/null
+++ b/CHANGES.txt
@@ -0,0 +1,3899 @@
+==============
+lxml changelog
+==============
+
+4.2.3 (2018-06-27)
+==================
+
+Bugs fixed
+----------
+
+* Reverted GH#265: lxml links against zlib as a shared library again.
+
+
+4.2.2 (2018-06-22)
+==================
+
+Bugs fixed
+----------
+
+* GH#266: Fix sporadic crash during GC when parse-time schema validation is used
+  and the parser participates in a reference cycle.
+  Original patch by Julien Greard.
+
+* GH#265: lxml no longer links against zlib as a shared library, only on static builds.
+  Patch by Nehal J Wani.
+
+
+4.2.1 (2018-03-21)
+==================
+
+Bugs fixed
+----------
+
+* LP#1755825: ``iterwalk()`` failed to return the 'start' event for the initial
+  element if a tag selector is used.
+
+* LP#1756314: Failure to import 4.2.0 into PyPy due to a missing library symbol.
+
+* LP#1727864, GH#258: Add "-isysroot" linker option on MacOS as needed by XCode 9.
+
+
+4.2.0 (2018-03-13)
+==================
+
+Features added
+--------------
+
+* GH#255: ``SelectElement.value`` returns more standard-compliant and
+  browser-like defaults for non-multi-selects. If no option is selected, the
+  value of the first option is returned (instead of None). If multiple options
+  are selected, the value of the last one is returned (instead of that of the
+  first one). If no options are present (not standard-compliant)
+  ``SelectElement.value`` still returns ``None``.
+
+* GH#261: The ``HTMLParser()`` now supports the ``huge_tree`` option.
+  Patch by stranac.
+
+Bugs fixed
+----------
+
+* LP#1551797: Some XSLT messages were not captured by the transform error log.
+
+* LP#1737825: Crash at shutdown after an interrupted iterparse run with XMLSchema
+  validation.
+
+Other changes
+-------------
+
+
+4.1.1 (2017-11-04)
+==================
+
+* Rebuild with Cython 0.27.3 to improve support for Py3.7.
+
+
+4.1.0 (2017-10-13)
+==================
+
+Features added
+--------------
+
+* ElementPath supports text predicates for current node, like "[.='text']".
+
+* ElementPath allows spaces in predicates.
+
+* Custom Element classes and XPath functions can now be registered with a
+  decorator rather than explicit dict assignments.
+
+* Static Linux wheels are now built with link time optimisation (LTO) enabled.
+  This should have a beneficial impact on the overall performance by providing
+  a tighter compiler integration between lxml and libxml2/libxslt.
+
+Bugs fixed
+----------
+
+* LP#1722776: Requesting non-Element objects like comments from a document with
+  ``PythonElementClassLookup`` could fail with a TypeError.
+
+
+4.0.0 (2017-09-17)
+==================
+
+Features added
+--------------
+
+* The ElementPath implementation is now compiled using Cython,
+  which speeds up the ``.find*()`` methods quite significantly.
+
+* The modules ``lxml.builder``, ``lxml.html.diff`` and ``lxml.html.clean``
+  are also compiled using Cython in order to speed them up.
+
+* ``xmlfile()`` supports async coroutines using ``async with`` and ``await``.
+
+* ``iterwalk()`` has a new method ``skip_subtree()`` that prevents walking into
+  the descendants of the current element.
+
+* ``RelaxNG.from_rnc_string()`` accepts a ``base_url`` argument to
+  allow relative resource lookups.
+
+* The XSLT result object has a new method ``.write_output(file)`` that serialises
+  output data into a file according to the ``<xsl:output>`` configuration.
+
+Bugs fixed
+----------
+
+* GH#251: HTML comments were handled incorrectly by the soupparser.
+  Patch by mozbugbox.
+
+* LP#1654544: The html5parser no longer passes the ``useChardet`` option
+  if the input is a Unicode string, unless explicitly requested. When parsing
+  files, the default is to enable it when a URL or file path is passed (because
+  the file is then opened in binary mode), and to disable it when reading from
+  a file(-like) object.
+
+  Note: This is a backwards incompatible change of the default configuration.
+  If your code parses byte strings/streams and depends on character detection,
+  please pass the option ``guess_charset=True`` explicitly, which already worked
+  in older lxml versions.
+
+* LP#1703810: ``etree.fromstring()`` failed to parse UTF-32 data with BOM.
+
+* LP#1526522: Some RelaxNG errors were not reported in the error log.
+
+* LP#1567526: Empty and plain text input raised a TypeError in soupparser.
+
+* LP#1710429: Uninitialised variable usage in HTML diff.
+
+* LP#1415643: The closing tags context manager in ``xmlfile()`` could continue
+  to output end tags even after writing failed with an exception.
+
+* LP#1465357: ``xmlfile.write()`` now accepts and ignores None as input argument.
+
+* Compilation under Py3.7-pre failed due to a modified function signature.
+
+Other changes
+-------------
+
+* The main module source files were renamed from ``lxml.*.pyx`` to plain
+  ``*.pyx`` (e.g. ``etree.pyx``) to simplify their handling in the build
+  process. Care was taken to keep the old header files as fallbacks for
+  code that compiles against the public C-API of lxml, but it might still
+  be worth validating that third-party code does not notice this change.
+
+
+3.8.0 (2017-06-03)
+==================
+
+Features added
+--------------
+
+* ``ElementTree.write()`` has a new option ``doctype`` that writes out a
+  doctype string before the serialisation, in the same way as ``tostring()``.
+
+* GH#220: ``xmlfile`` allows switching output methods at an element level.
+  Patch by Burak Arslan.
+
+* LP#1595781, GH#240: added a PyCapsule Python API and C-level API for
+  passing externally generated libxml2 documents into lxml.
+
+* GH#244: error log entries have a new property ``path`` with an XPath
+  expression (if known, None otherwise) that points to the tree element
+  responsible for the error. Patch by Bob Kline.
+
+* The namespace prefix mapping that can be used in ElementPath now injects
+  a default namespace when passing a None prefix.
+
+Bugs fixed
+----------
+
+* GH#238: Character escapes were not hex-encoded in the ``xmlfile`` serialiser.
+  Patch by matejcik.
+
+* GH#229: fix for externally created XML documents. Patch by Theodore Dubois.
+
+* LP#1665241, GH#228: Form data handling in lxml.html no longer strips the
+  option values specified in form attributes but only the text values.
+  Patch by Ashish Kulkarni.
+
+* LP#1551797: revert previous fix for XSLT error logging as it breaks
+  multi-threaded XSLT processing.
+
+* LP#1673355, GH#233: ``fromstring()`` html5parser failed to parse byte strings.
+
+Other changes
+-------------
+
+* The previously undocumented ``docstring`` option in ``ElementTree.write()``
+  produces a deprecation warning and will eventually be removed.
+
+
+3.7.4 (2017-??-??)
+==================
+
+Bugs fixed
+----------
+
+* LP#1551797: revert previous fix for XSLT error logging as it breaks
+  multi-threaded XSLT processing.
+
+* LP#1673355, GH#233: ``fromstring()`` html5parser failed to parse byte strings.
+
+
+3.7.3 (2017-02-18)
+==================
+
+Bugs fixed
+----------
+
+* GH#218 was ineffective in Python 3.
+
+* GH#222: ``lxml.html.submit_form()`` failed in Python 3.
+  Patch by Jakub Wilk.
+
+
+3.7.2 (2017-01-08)
+==================
+
+* GH#220: ``xmlfile`` allows switching output methods at an element level.
+  Patch by Burak Arslan.
+
+Bugs fixed
+----------
+
+* Work around installation problems in recent Python 2.7 versions
+  due to FTP download failures.
+
+* GH#219: ``xmlfile.element()`` was not properly quoting attribute values.
+  Patch by Burak Arslan.
+
+* GH#218: ``xmlfile.element()`` was not properly escaping text content of
+  script/style tags. Patch by Burak Arslan.
+
+
+3.7.1 (2016-12-23)
+==================
+
+* No source changes, issued only to solve problems with the
+  binary packages released for 3.7.0.
+
+
+3.7.0 (2016-12-10)
+==================
+
+Features added
+--------------
+
+* GH#217: ``XMLSyntaxError`` now behaves more like its ``SyntaxError``
+  baseclass. Patch by Philipp A.
+
+* GH#216: ``HTMLParser()`` now supports the same ``collect_ids`` parameter
+  as ``XMLParser()``. Patch by Burak Arslan.
+
+* GH#210: Allow specifying a serialisation method in ``xmlfile.write()``.
+  Patch by Burak Arslan.
+
+* GH#203: New option ``default_doctype`` in ``HTMLParser`` that allows
+  disabling the automatic doctype creation. Patch by Shadab Zafar.
+
+* GH#201: Calling the method ``.set('attrname')`` without value argument
+  (or ``None``) on HTML elements creates an attribute without value that
+  serialises like ``<div attrname>``. Patch by Daniel Holth.
+
+* GH#197: Ignore form input fields in ``form_values()`` when they are
+  marked as ``disabled`` in HTML. Patch by Kristian Klemon.
+
+Bugs fixed
+----------
+
+* GH#206: File name and line number were missing from XSLT error messages.
+  Patch by Marcus Brinkmann.
+
+Other changes
+-------------
+
+* Log entries no longer allow anything but plain string objects as message text
+  and file name.
+
+* ``zlib`` is included in the list of statically built libraries.
+
+
+3.6.4 (2016-08-20)
+==================
+
+* GH#204, LP#1614693: build fix for MacOS-X.
+
+
+3.6.3 (2016-08-18)
+==================
+
+* LP#1614603: change linker flags to build multi-linux wheels
+
+
+3.6.2 (2016-08-18)
+==================
+
+* LP#1614603: release without source changes to provide cleanly built Linux wheels
+
+
+3.6.1 (2016-07-24)
+==================
+
+Features added
+--------------
+
+* GH#180: Separate option ``inline_style`` for Cleaner that only removes ``style``
+  attributes instead of all styles. Patch by Christian Pedersen.
+
+* GH#196: Windows build support for Python 3.5. Contribution by Maximilian Hils.
+
+Bugs fixed
+----------
+
+* GH#199: Exclude ``file`` fields from ``FormElement.form_values`` (as browsers do).
+  Patch by Tomas Divis.
+
+* GH#198, LP#1568167: Try to provide base URL from ``Resolver.resolve_string()``.
+  Patch by Michael van Tellingen.
+
+* GH#191: More accurate float serialisation in ``objectify.FloatElement``.
+  Patch by Holger Joukl.
+
+* LP#1551797: Repair XSLT error logging. Patch by Marcus Brinkmann.
+
+
+3.6.0 (2016-03-17)
+==================
+
+Features added
+--------------
+
+* GH#187: Now supports (only) version 5.x and later of PyPy.
+  Patch by Armin Rigo.
+
+* GH#181: Direct support for ``.rnc`` files in `RelaxNG()` if ``rnc2rng``
+  is installed. Patch by Dirkjan Ochtman.
+
+Bugs fixed
+----------
+
+* GH#189: Static builds honour FTP proxy configurations when downloading
+  the external libs. Patch by Youhei Sakurai.
+
+* GH#186: Soupparser failed to process entities in Python 3.x.
+  Patch by Duncan Morris.
+
+* GH#185: Rare encoding related ``TypeError`` on import was fixed.
+  Patch by Petr Demin.
+
+
+3.5.0 (2015-11-13)
+==================
+
+Bugs fixed
+----------
+
+* Unicode string results failed XPath queries in PyPy.
+
+* LP#1497051: HTML target parser failed to terminate on exceptions
+  and continued parsing instead.
+
+* Deprecated API usage in doctestcompare.
+
+
+3.5.0b1 (2015-09-18)
+====================
+
+Features added
+--------------
+
+* ``cleanup_namespaces()`` accepts a new argument ``keep_ns_prefixes``
+  that does not remove definitions of the provided prefix-namespace
+  mapping from the tree.
+
+* ``cleanup_namespaces()`` accepts a new argument ``top_nsmap`` that
+  moves definitions of the provided prefix-namespace mapping to the
+  top of the tree.
+
+* LP#1490451: ``Element`` objects gained a ``cssselect()`` method as
+  known from ``lxml.html``. Patch by Simon Sapin.
+
+* API functions and methods behave and look more like Python functions,
+  which allows introspection on them etc. One side effect to be aware of
+  is that the functions now bind as methods when assigned to a class
+  variable. A quick fix is to wrap them in ``staticmethod()`` (as for
+  normal Python functions).
+
+* ISO-Schematron support gained an option ``error_finder`` that allows
+  passing a filter function for picking validation errors from reports.
+
+* LP#1243600: Elements in ``lxml.html`` gained a ``classes`` property
+  that provides a set-like interface to the ``class`` attribute.
+  Original patch by masklinn.
+
+* LP#1341964: The soupparser now handles DOCTYPE declarations, comments
+  and processing instructions outside of the root element.
+  Patch by Olli Pottonen.
+
+* LP#1421512: The ``docinfo`` of a tree was made editable to allow
+  setting and removing the public ID and system ID of the DOCTYPE.
+  Patch by Olli Pottonen.
+
+* LP#1442427: More work-arounds for quirks and bugs in pypy and pypy3.
+
+* ``lxml.html.soupparser`` now uses BeautifulSoup version 4 instead
+  of version 3 if available.
+
+Bugs fixed
+----------
+
+* Memory errors that occur during tree adaptations (e.g. moving subtrees
+  to foreign documents) could leave the tree in a crash prone state.
+
+* Calling ``process_children()`` in an XSLT extension element without
+  an ``output_parent`` argument failed with a ``TypeError``.
+  Fix by Jens Tröger.
+
+* GH#162: Image data in HTML ``data`` URLs is considered safe and
+  no longer removed by ``lxml.html.clean`` JavaScript cleaner.
+
+* GH#166: Static build could link libraries in wrong order.
+
+* GH#172: Rely a bit more on libxml2 for encoding detection rather than
+  rolling our own in some cases. Patch by Olli Pottonen.
+
+* GH#159: Validity checks for names and string content were tightened
+  to detect the use of illegal characters early. Patch by Olli Pottonen.
+
+* LP#1421921: Comments/PIs before the DOCTYPE declaration were not
+  serialised. Patch by Olli Pottonen.
+
+* LP#659367: Some HTML DOCTYPE declarations were not serialised.
+  Patch by Olli Pottonen.
+
+* LP#1238503: lxml.doctestcompare is now consistent with stdlib's doctest
+  in how it uses ``+`` and ``-`` to refer to unexpected and missing output.
+
+* Empty prefixes are explicitly rejected when a namespace mapping is used
+  with ElementPath to avoid hiding bugs in user code.
+
+* Several problems with PyPy were fixed by switching to Cython 0.23.
+
+
+3.4.4 (2015-04-25)
+==================
+
+Bugs fixed
+----------
+
+* An ElementTree compatibility test added in lxml 3.4.3 that failed in
+  Python 3.4+ was removed again.
+
+
+3.4.3 (2015-04-15)
+==================
+
+Bugs fixed
+----------
+
+* Expression cache in ElementPath was ignored. Fix by Changaco.
+
+* LP#1426868: Passing a default namespace and a prefixed namespace mapping
+  as nsmap into ``xmlfile.element()`` raised a ``TypeError``.
+
+* LP#1421927: DOCTYPE system URLs were incorrectly quoted when containing
+  double quotes. Patch by Olli Pottonen.
+
+* LP#1419354: meta-redirect URLs were incorrectly processed by
+  ``iterlinks()`` if preceded by whitespace.
+
+
+3.4.2 (2015-02-07)
+==================
+
+Bugs fixed
+----------
+
+* LP#1415907: Crash when creating an XMLSchema from a non-root element
+  of an XML document.
+
+* LP#1369362: HTML cleaning failed when hitting processing instructions
+  with pseudo-attributes.
+
+* ``CDATA()`` wrapped content was rejected for tail text.
+
+* CDATA sections were not serialised as tail text of the top-level element.
+
+
+3.4.1 (2014-11-20)
+==================
+
+Features added
+--------------
+
+* New ``htmlfile`` HTML generator to accompany the incremental ``xmlfile``
+  serialisation API. Patch by Burak Arslan.
+
+Bugs fixed
+----------
+
+* ``lxml.sax.ElementTreeContentHandler`` did not initialise its superclass.
+
+
+3.4.0 (2014-09-10)
+==================
+
+Features added
+--------------
+
+* ``xmlfile(buffered=False)`` disables output buffering and flushes the
+  content after each API operation (starting/ending element blocks or writes).
+  A new method ``xf.flush()`` can alternatively be used to explicitly flush
+  the output.
+
+* ``lxml.html.document_fromstring`` has a new option ``ensure_head_body=True``
+  which will add an empty head and/or body element to the result document if
+  missing.
+
+* ``lxml.html.iterlinks`` now returns links inside meta refresh tags.
+
+* New ``XMLParser`` option ``collect_ids=False`` to disable ID hash table
+  creation. This can substantially speed up parsing of documents with many
+  different IDs that are not used.
+
+* The parser uses per-document hash tables for XML IDs. This reduces the
+  load of the global parser dict and speeds up parsing for documents with
+  many different IDs.
+
+* ``ElementTree.getelementpath(element)`` returns a structural ElementPath
+  expression for the given element, which can be used for lookups later.
+
+* ``xmlfile()`` accepts a new argument ``close=True`` to close file(-like)
+  objects after writing to them. Before, ``xmlfile()`` only closed the file
+  if it had opened it internally.
+
+* Allow "bytearray" type for ASCII text input.
+
+Bugs fixed
+----------
+
+Other changes
+-------------
+
+* LP#400588: decoding errors have become hard errors even in recovery mode.
+  Previously, they could lead to an internal tree representation in a mixed
+  encoding state, which lead to very late errors or even silently incorrect
+  behaviour during tree traversal or serialisation.
+
+* Requires Python 2.6, 2.7, 3.2 or later. No longer supports
+  Python 2.4, 2.5 and 3.1, use lxml 3.3.x for those.
+
+* Requires libxml2 2.7.0 or later and libxslt 1.1.23 or later,
+  use lxml 3.3.x with older versions.
+
+
+3.3.6 (2014-08-28)
+==================
+
+Bugs fixed
+----------
+
+* Prevent tree cycle creation when adding Elements as siblings.
+
+* LP#1361948: crash when deallocating Element siblings without parent.
+
+* LP#1354652: crash when traversing internally loaded documents in XSLT
+  extension functions.
+
+
+3.3.5 (2014-04-18)
+==================
+
+Bugs fixed
+----------
+
+* HTML cleaning could fail to strip javascript links that mix control
+  characters into the link scheme.
+
+
+3.3.4 (2014-04-03)
+==================
+
+Features added
+--------------
+
+* Source line numbers above 65535 are available on Elements when
+  using libxml2 2.9 or later.
+
+Bugs fixed
+----------
+
+* ``lxml.html.fragment_fromstring()`` failed for bytes input in Py3.
+
+Other changes
+-------------
+
+
+3.3.3 (2014-03-04)
+==================
+
+Bugs fixed
+----------
+
+* LP#1287118: Crash when using Element subtypes with ``__slots__``.
+
+Other changes
+-------------
+
+* The internal classes ``_LogEntry`` and ``_Attrib`` can no longer be
+  subclassed from Python code.
+
+
+3.3.2 (2014-02-26)
+==================
+
+Bugs fixed
+----------
+
+* The properties ``resolvers`` and ``version``, as well as the methods
+  ``set_element_class_lookup()`` and ``makeelement()``, were lost from
+  ``iterparse`` objects in 3.3.0.
+
+* LP#1222132: instances of ``XMLSchema``, ``Schematron`` and ``RelaxNG``
+  did not clear their local ``error_log`` before running a validation.
+
+* LP#1238500: lxml.doctestcompare mixed up "expected" and "actual" in
+  attribute values.
+
+* Some file I/O tests were failing in MS-Windows due to non-portable temp
+  file usage. Initial patch by Gabi Davar.
+
+* LP#910014: duplicate IDs in a document were not reported by DTD validation.
+
+* LP#1185332: ``tostring(method="html")`` did not use HTML serialisation
+  semantics for trailing tail text. Initial patch by Sylvain Viollon.
+
+* LP#1281139: ``.attrib`` value of Comments lost its mutation methods
+  in 3.3.0. Even though it is empty and immutable, it should still
+  provide the same interface as that returned for Elements.
+
+
+3.3.1 (2014-02-12)
+==================
+
+Features added
+--------------
+
+Bugs fixed
+----------
+
+* LP#1014290: HTML documents parsed with ``parser.feed()`` failed to find
+  elements during tag iteration.
+
+* LP#1273709: Building in PyPy failed due to missing support for
+  ``PyUnicode_Compare()`` and ``PyByteArray_*()`` in PyPy's C-API.
+
+* LP#1274413: Compilation in MSVC failed due to missing "stdint.h" standard
+  header file.
+
+* LP#1274118: iterparse() failed to parse BOM prefixed files.
+
+Other changes
+-------------
+
+
+3.3.0 (2014-01-26)
+==================
+
+Features added
+--------------
+
+Bugs fixed
+----------
+
+* The heuristic that distinguishes file paths from URLs was tightened
+  to produce less false negatives.
+
+Other changes
+-------------
+
+
+3.3.0beta5 (2014-01-18)
+=======================
+
+Features added
+--------------
+
+* The PEP 393 unicode parsing support gained a fallback for wchar strings
+  which might still be somewhat common on Windows systems.
+
+Bugs fixed
+----------
+
+* Several error handling problems were fixed throughout the code base that
+  could previously lead to exceptions being silently swallowed or not
+  properly reported.
+
+* The C-API function ``appendChild()`` is now deprecated as it does not
+  propagate exceptions (its return type is ``void``). The new function
+  ``appendChildToElement()`` was added as a safe replacement.
+
+* Passing a string into ``fromstringlist()`` raises an exception instead of
+  parsing the string character by character.
+
+Other changes
+-------------
+
+* Document cleanup code was simplified using the new GC features in
+  Cython 0.20.
+
+
+3.3.0beta4 (2014-01-12)
+=======================
+
+Features added
+--------------
+
+Bugs fixed
+----------
+
+* The (empty) value returned by the ``attrib`` property of Entity and Comment
+  objects was mutable.
+
+* Element class lookup wasn't available for the new pull parsers or when using
+  a custom parser target.
+
+* Setting Element attributes on instantiation with both the ``attrib`` argument
+  and keyword arguments could modify the mapping passed as ``attrib``.
+
+* LP#1266171: DTDs instantiated from internal/external subsets (i.e. through
+  the docinfo property) lost their attribute declarations.
+
+Other changes
+-------------
+
+* Built with Cython 0.20pre (gitrev 012ae82eb) to prepare support for
+  Python 3.4.
+
+
+3.3.0beta3 (2014-01-02)
+=======================
+
+Features added
+--------------
+
+* Unicode string parsing was optimised for Python 3.3 (PEP 393).
+
+Bugs fixed
+----------
+
+* HTML parsing of Unicode strings could misdecode the input on some platforms.
+
+* Crash in xmlfile() when closing open elements out of order in an error case.
+
+Other changes
+-------------
+
+
+3.3.0beta2 (2013-12-20)
+=======================
+
+Features added
+--------------
+
+* ``iterparse()`` supports the ``recover`` option.
+
+Bugs fixed
+----------
+
+* Crash in ``iterparse()`` for HTML parsing.
+
+* Crash in target parsing with attributes.
+
+Other changes
+-------------
+
+* The safety check in the read-only tree implementation (e.g. used by
+  ``PythonElementClassLookup``) raises a more appropriate ``ReferenceError``
+  for illegal access after tree disposal instead of an ``AssertionError``.
+  This should only impact test code that specifically checks the original
+  behaviour.
+
+
+3.3.0beta1 (2013-12-12)
+=======================
+
+Features added
+--------------
+
+* New option ``handle_failures`` in ``make_links_absolute()`` and
+  ``resolve_base_href()`` (lxml.html) that enables ignoring or
+  discarding links that fail to parse as URLs.
+
+* New parser classes ``XMLPullParser`` and ``HTMLPullParser`` for
+  incremental parsing, as implemented for ElementTree in Python 3.4.
+
+* ``iterparse()`` enables recovery mode by default for HTML parsing
+  (``html=True``).
+
+Bugs fixed
+----------
+
+* LP#1255132: crash when trying to run validation over non-Element (e.g.
+  comment or PI).
+
+* Error messages in the log and in exception messages that originated
+  from libxml2 could accidentally be picked up from preceding warnings
+  instead of the actual error.
+
+* The ``ElementMaker`` in lxml.objectify did not accept a dict as
+  argument for adding attributes to the element it's building. This
+  works as in lxml.builder now.
+
+* LP#1228881: ``repr(XSLTAccessControl)`` failed in Python 3.
+
+* Raise ``ValueError`` when trying to append an Element to itself or
+  to one of its own descendants, instead of running into an infinite
+  loop.
+
+* LP#1206077: htmldiff discarded whitespace from the output.
+
+* Compressed plain-text serialisation to file-like objects was broken.
+
+* lxml.html.formfill: Fix textarea form filling.
+  The textarea used to be cleared before the new content was set,
+  which removed the name attribute.
+
+
+Other changes
+-------------
+
+* Some basic API classes use freelists internally for faster
+  instantiation. This can speed up some ``iterparse()`` scenarios,
+  for example.
+
+* ``iterparse()`` was rewritten to use the new ``*PullParser``
+  classes internally instead of being a parser itself.
+
+
+3.2.5 (2014-01-02)
+==================
+
+Features added
+--------------
+
+Bugs fixed
+----------
+
+* Crash in xmlfile() when closing open elements out of order in an error case.
+
+* Crash in target parsing with attributes.
+
+* LP#1255132: crash when trying to run validation over non-Element (e.g.
+  comment or PI).
+
+Other changes
+-------------
+
+
+3.2.4 (2013-11-07)
+==================
+
+Features added
+--------------
+
+Bugs fixed
+----------
+
+* Memory leak when creating an XPath evaluator in a thread.
+
+* LP#1228881: ``repr(XSLTAccessControl)`` failed in Python 3.
+
+* Raise ``ValueError`` when trying to append an Element to itself or
+  to one of its own descendants.
+
+* LP#1206077: htmldiff discarded whitespace from the output.
+
+* Compressed plain-text serialisation to file-like objects was broken.
+
+Other changes
+-------------
+
+
+3.2.3 (2013-07-28)
+==================
+
+Bugs fixed
+----------
+
+* Fix support for Python 2.4 which was lost in 3.2.2.
+
+
+3.2.2 (2013-07-28)
+==================
+
+Features added
+--------------
+
+Bugs fixed
+----------
+
+* LP#1185701: spurious XMLSyntaxError after finishing iterparse().
+
+* Crash in lxml.objectify during xsi annotation.
+
+Other changes
+-------------
+
+* Return values of user provided element class lookup methods are now
+  validated against the type of the XML node they represent to prevent
+  API class mismatches.
+
+
+3.2.1 (2013-05-11)
+==================
+
+Features added
+--------------
+
+* The methods ``apply_templates()`` and ``process_children()`` of XSLT
+  extension elements have gained two new boolean options ``elements_only``
+  and ``remove_blank_text`` that discard either all strings or whitespace-only
+  strings from the result list.
+
+Bugs fixed
+----------
+
+* When moving Elements to another tree, the namespace cleanup mechanism
+  no longer drops namespace prefixes from attributes for which it finds
+  a default namespace declaration, to prevent them from appearing as
+  unnamespaced attributes after serialisation.
+
+* Returning non-type objects from a custom class lookup method could lead
+  to a crash.
+
+* Instantiating and using subtypes of Comments and ProcessingInstructions
+  crashed.
+
+Other changes
+-------------
+
+
+3.2.0 (2013-04-28)
+==================
+
+Features added
+--------------
+
+Bugs fixed
+----------
+
+* LP#690319: Leading whitespace could change the behaviour of the string
+  parsing functions in ``lxml.html``.
+
+* LP#599318: The string parsing functions in ``lxml.html`` are more robust
+  in the face of uncommon HTML content like framesets or missing body tags.
+  Patch by Stefan Seelmann.
+
+* LP#712941: I/O errors while trying to access files with paths that contain
+  non-ASCII characters could raise ``UnicodeDecodeError`` instead of properly
+  reporting the ``IOError``.
+
+* LP#673205: Parsing from in-memory strings disabled network access in the
+  default parser and made subsequent attempts to parse from a URL fail.
+
+* LP#971754: lxml.html.clean appends 'nofollow' to 'rel' attributes instead
+  of overwriting the current value.
+
+* LP#715687: lxml.html.clean no longer discards scripts that are explicitly
+  allowed by the user provided whitelist. Patch by Christine Koppelt.
+
+Other changes
+-------------
+
+
+3.1.2 (2013-04-12)
+==================
+
+Features added
+--------------
+
+Bugs fixed
+----------
+
+* LP#1136509: Passing attributes through the namespace-unaware API of
+  the sax bridge (i.e. the ``handler.startElement()`` method) failed
+  with a ``TypeError``. Patch by Mike Bayer.
+
+* LP#1123074: Fix serialisation error in XSLT output when converting
+  the result tree to a Unicode string.
+
+* GH#105: Replace illegal usage of ``xmlBufLength()`` in libxml2 2.9.0
+  by properly exported API function ``xmlBufUse()``.
+
+Other changes
+-------------
+
+
+3.1.1 (2013-03-29)
+==================
+
+Features added
+--------------
+
+Bugs fixed
+----------
+
+* LP#1160386: Write access to ``lxml.html.FormElement.fields`` raised
+  an AttributeError in Py3.
+
+* Illegal memory access during cleanup in incremental xmlfile writer.
+
+Other changes
+-------------
+
+* The externally useless class ``lxml.etree._BaseParser`` was removed
+  from the module dict.
+
+
+3.1.0 (2013-02-10)
+==================
+
+Features added
+--------------
+
+* GH#89: lxml.html.clean allows overriding the set of attributes that it
+  considers 'safe'. Patch by Francis Devereux.
+
+Bugs fixed
+----------
+
+* LP#1104370: ``copy.copy(el.attrib)`` raised an exception. It now returns
+  a copy of the attributes as a plain Python dict.
+
+* GH#95: When used with namespace prefixes, the ``el.find*()`` methods
+  always used the first namespace mapping that was provided for each
+  path expression instead of using the one that was actually passed
+  in for the current run.
+
+* LP#1092521, GH#91: Fix undefined C symbol in Python runtimes compiled
+  without threading support. Patch by Ulrich Seidl.
+
+Other changes
+-------------
+
+
+3.1beta1 (2012-12-21)
+=====================
+
+Features added
+--------------
+
+* New build-time option ``--with-unicode-strings`` for Python 2 that
+  makes the API always return Unicode strings for names and text
+  instead of byte strings for plain ASCII content.
+
+* New incremental XML file writing API ``etree.xmlfile()``.
+
+* E factory in lxml.objectify is callable to simplify the creation of
+  tags with non-identifier names without having to resort to getattr().
+
+Bugs fixed
+----------
+
+* When starting from a non-namespaced element in lxml.objectify, searching
+  for a child without explicitly specifying a namespace incorrectly found
+  namespaced elements with the requested local name, instead of restricting
+  the search to non-namespaced children.
+
+* GH#85: Deprecation warnings were fixed for Python 3.x.
+
+* GH#33: lxml.html.fromstring() failed to accept bytes input in Py3.
+
+* LP#1080792: Static build of libxml2 2.9.0 failed due to missing file.
+
+Other changes
+-------------
+
+* The externally useless class ``_ObjectifyElementMakerCaller`` was
+  removed from the module API of lxml.objectify.
+
+* LP#1075622: lxml.builder is faster for adding text to elements with
+  many children. Patch by Anders Hammarquist.
+
+
+3.0.2 (2012-12-14)
+==================
+
+Features added
+--------------
+
+Bugs fixed
+----------
+
+* Fix crash during interpreter shutdown by switching to Cython 0.17.3 for building.
+
+Other changes
+-------------
+
+
+3.0.1 (2012-10-14)
+==================
+
+Features added
+--------------
+
+Bugs fixed
+----------
+
+* LP#1065924: Element proxies could disappear during garbage collection
+  in PyPy without proper cleanup.
+
+* GH#71: Failure to work with libxml2 2.6.x.
+
+* LP#1065139: static MacOS-X build failed in Py3.
+
+Other changes
+-------------
+
+
+3.0 (2012-10-08)
+================
+
+Features added
+--------------
+
+Bugs fixed
+----------
+
+* End-of-file handling was incorrect in iterparse() when reading from
+  a low-level C file stream and failed in libxml2 2.9.0 due to its
+  improved consistency checks.
+
+Other changes
+-------------
+
+* The build no longer uses Cython by default unless the generated C files
+  are missing. To use Cython, pass the option "--with-cython". To ignore
+  the fatal build error when Cython is required but not available (e.g. to
+  run special setup.py commands that do not actually run a build), pass
+  "--without-cython".
+
+
+3.0beta1 (2012-09-26)
+=====================
+
+Features added
+--------------
+
+* Python level access to (optional) libxml2 memory debugging features
+  to simplify debugging of memory leaks etc.
+
+Bugs fixed
+----------
+
+* Fix a memory leak in XPath by switching to Cython 0.17.1.
+
+* Some tests were adapted to work with PyPy.
+
+Other changes
+-------------
+
+* The code was adapted to work with the upcoming libxml2 2.9.0 release.
+
+
+3.0alpha2 (2012-08-23)
+======================
+
+Features added
+--------------
+
+* The ``.iter()`` method of elements now accepts ``tag`` arguments like
+  ``"{*}name"`` to search for elements with a given local name in any
+  namespace. With this addition, all combinations of wildcards now work
+  as expected:
+  ``"{ns}name"``, ``"{}name"``, ``"{*}name"``, ``"{ns}*"``, ``"{}*"``
+  and ``"{*}*"``. Note that ``"name"`` is equivalent to ``"{}name"``,
+  but ``"*"`` is ``"{*}*"``.
+  The same change applies to the ``.getiterator()``, ``.itersiblings()``,
+  ``.iterancestors()``, ``.iterdescendants()``, ``.iterchildren()``
+  and ``.itertext()`` methods;the ``strip_attributes()``,
+  ``strip_elements()`` and ``strip_tags()`` functions as well as the
+  ``iterparse()`` class. Patch by Simon Sapin.
+ +* C14N allows specifying the inclusive prefixes to be promoted + to top-level during exclusive serialisation. + +Bugs fixed +---------- + +* Passing long Unicode strings into the ``feed()`` parser interface + failed to read the entire string. + +Other changes +------------- + + +3.0alpha1 (2012-07-31) +====================== + +Features added +-------------- + +* Initial support for building in PyPy (through cpyext). + +* DTD objects gained an API that allows read access to their + declarations. + +* ``xpathgrep.py`` gained support for parsing line-by-line (e.g. + from grep output) and for surrounding the output with a new root + tag. + +* ``E-factory`` in ``lxml.builder`` accepts subtypes of known data + types (such as string subtypes) when building elements around them. + +* Tree iteration and ``iterparse()`` with a selective ``tag`` + argument supports passing a set of tags. Tree nodes will be + returned by the iterators if they match any of the tags. + +Bugs fixed +---------- + +* The ``.find*()`` methods in ``lxml.objectify`` no longer use XPath + internally, which makes them faster in many cases (especially when + short circuiting after a single or couple of elements) and fixes + some behavioural differences compared to ``lxml.etree``. Note that + this means that they no longer support arbitrary XPath expressions + but only the subset that the ``ElementPath`` language supports. + The previous implementation was also redundant with the normal + XPath support, which can be used as a replacement. + +* ``el.find('*')`` could accidentally return a comment or processing + instruction that happened to be in the wrong spot. (Same for the + other ``.find*()`` methods.) + +* The error logging is less intrusive and avoids a global setup where + possible. + +* Fixed undefined names in html5lib parser. + +* ``xpathgrep.py`` did not work in Python 3. + +* ``Element.attrib.update()`` did not accept an ``attrib`` of + another Element as parameter. 
+ +* For subtypes of ``ElementBase`` that make the ``.text`` or ``.tail`` + properties immutable (as in objectify, for example), inserting text + when creating Elements through the E-Factory feature of the class + constructor would fail with an exception, stating that the text + cannot be modified. + +Other changes +-------------- + +* The code base was overhauled to properly use 'const' where the API + of libxml2 and libxslt requests it. This also has an impact on the + public C-API of lxml itself, as defined in ``etreepublic.pxd``, as + well as the provided declarations in the ``lxml/includes/`` directory. + Code that uses these declarations may have to be adapted. On the + plus side, this fixes several C compiler warnings, also for user + code, thus making it easier to spot real problems again. + +* The functionality of "lxml.cssselect" was moved into a separate PyPI + package called "cssselect". To continue using it, you must install + that package separately. The "lxml.cssselect" module is still + available and provides the same interface, provided the "cssselect" + package can be imported at runtime. + +* Element attributes passed in as an ``attrib`` dict or as keyword + arguments are now sorted by (namespaced) name before being created + to make their order predictable for serialisation and iteration. + Note that adding or deleting attributes afterwards does not take + that order into account, i.e. setting a new attribute appends it + after the existing ones. 
+ +* Several classes that are for internal use only were removed + from the ``lxml.etree`` module dict: + ``_InputDocument, _ResolverRegistry, _ResolverContext, _BaseContext, + _ExsltRegExp, _IterparseContext, _TempStore, _ExceptionContext, + __ContentOnlyElement, _AttribIterator, _NamespaceRegistry, + _ClassNamespaceRegistry, _FunctionNamespaceRegistry, + _XPathFunctionNamespaceRegistry, _ParserDictionaryContext, + _FileReaderContext, _ParserContext, _PythonSaxParserTarget, + _TargetParserContext, _ReadOnlyProxy, _ReadOnlyPIProxy, + _ReadOnlyEntityProxy, _ReadOnlyElementProxy, _OpaqueNodeWrapper, + _OpaqueDocumentWrapper, _ModifyContentOnlyProxy, + _ModifyContentOnlyPIProxy, _ModifyContentOnlyEntityProxy, + _AppendOnlyElementProxy, _SaxParserContext, _FilelikeWriter, + _ParserSchemaValidationContext, _XPathContext, + _XSLTResolverContext, _XSLTContext, _XSLTQuotedStringParam`` + +* Several internal classes can no longer be inherited from: + ``_InputDocument, _ResolverRegistry, _ExsltRegExp, _ElementUnicodeResult, + _IterparseContext, _TempStore, _AttribIterator, _ClassNamespaceRegistry, + _XPathFunctionNamespaceRegistry, _ParserDictionaryContext, + _FileReaderContext, _PythonSaxParserTarget, _TargetParserContext, + _ReadOnlyPIProxy, _ReadOnlyEntityProxy, _OpaqueDocumentWrapper, + _ModifyContentOnlyPIProxy, _ModifyContentOnlyEntityProxy, + _AppendOnlyElementProxy, _FilelikeWriter, _ParserSchemaValidationContext, + _XPathContext, _XSLTResolverContext, _XSLTContext, _XSLTQuotedStringParam, + _XSLTResultTree, _XSLTProcessingInstruction`` + + +2.3.6 (2012-09-28) +================== + +Features added +-------------- + +Bugs fixed +---------- + +* Passing long Unicode strings into the ``feed()`` parser interface + failed to read the entire string. + +Other changes +-------------- + + +2.3.5 (2012-07-31) +================== + +Features added +-------------- + +Bugs fixed +---------- + +* Crash when merging text nodes in ``element.remove()``. 
+ +* Crash in sax/target parser when reporting empty doctype. + +Other changes +-------------- + + +2.3.4 (2012-03-26) +================== + +Features added +-------------- + +Bugs fixed +---------- + +* Crash when building an nsmap (Element property) with empty + namespace URIs. + +* Crash due to race condition when errors (or user messages) occur + during threaded XSLT processing. + +* XSLT stylesheet compilation could ignore compilation errors. + +Other changes +-------------- + + +2.3.3 (2012-01-04) +================== + +Features added +-------------- + +* ``lxml.html.tostring()`` gained new serialisation options + ``with_tail`` and ``doctype``. + +Bugs fixed +---------- + +* Fixed a crash when using ``iterparse()`` for HTML parsing and + requesting start events. + +* Fixed parsing of more selectors in cssselect. Whitespace before + pseudo-elements and pseudo-classes is significant as it is a + descendant combinator. + "E :pseudo" should parse the same as "E \*:pseudo", not "E:pseudo". + Patch by Simon Sapin. + +* lxml.html.diff no longer raises an exception when hitting + 'img' tags without 'src' attribute. + +Other changes +-------------- + + +2.3.2 (2011-11-11) +================== + +Features added +-------------- + +* ``lxml.objectify.deannotate()`` has a new boolean option + ``cleanup_namespaces`` to remove the objectify namespace + declarations (and generally clean up the namespace declarations) + after removing the type annotations. + +* ``lxml.objectify`` gained its own ``SubElement()`` function as a + copy of ``etree.SubElement`` to avoid an otherwise redundant import + of ``lxml.etree`` on the user side. + +Bugs fixed +---------- + +* Fixed the "descendant" bug in cssselect a second time (after a first + fix in lxml 2.3.1). The previous change resulted in a serious + performance regression for the XPath based evaluation of the + translated expression. 
Note that this breaks the usage of some of + the generated XPath expressions as XSLT location paths that + previously worked in 2.3.1. + +* Fixed parsing of some selectors in cssselect. Whitespace after combinators + ">", "+" and "~" is now correctly ignored. Previously it was parsed as + a descendant combinator. For example, "div> .foo" was parsed the same as + "div>* .foo" instead of "div>.foo". Patch by Simon Sapin. + +Other changes +-------------- + + +2.3.1 (2011-09-25) +================== + +Features added +-------------- + +* New option ``kill_tags`` in ``lxml.html.clean`` to remove specific + tags and their content (i.e. their whole subtree). + +* ``pi.get()`` and ``pi.attrib`` on processing instructions to parse + pseudo-attributes from the text content of processing instructions. + +* ``lxml.get_include()`` returns a list of include paths that can be + used to compile external C code against lxml.etree. This is + specifically required for statically linked lxml builds when code + needs to compile against the exact same header file versions as lxml + itself. + +* ``Resolver.resolve_file()`` takes an additional option + ``close_file`` that configures if the file(-like) object will be + closed after reading or not. By default, the file will be closed, + as the user is not expected to keep a reference to it. + +Bugs fixed +---------- + +* HTML cleaning didn't remove 'data:' links. + +* The html5lib parser integration now uses the 'official' + implementation in html5lib itself, which makes it work with newer + releases of the library. + +* In ``lxml.sax``, ``endElementNS()`` could incorrectly reject a plain + tag name when the corresponding start event inferred the same plain + tag name to be in the default namespace. + +* When an open file-like object is passed into ``parse()`` or + ``iterparse()``, the parser will no longer close it after use. This + reverts a change in lxml 2.3 where all files would be closed. 
It is + the user's responsibility to properly close the file(-like) object, + also in error cases. + +* Assertion error in lxml.html.cleaner when discarding top-level elements. + +* In lxml.cssselect, use the XPath 'A//B' (short for + 'A/descendant-or-self::node()/B') instead of 'A/descendant::B' for + the CSS descendant selector ('A B'). This makes a few edge cases + like ``"div *:last-child"`` consistent with the selector behavior in + WebKit and Firefox, and makes more CSS expressions valid location + paths (for use in xsl:template match). + +* In lxml.html, non-selected ``