19 May, 2013

HTML5Lib (Python) 0.95 / 1.0b1


Developer: HTML5Lib Development Team
Website: code.google.com
License / Price: MIT License
Platforms: Windows / Linux / Mac OS / BSD / Solaris
Databases: N/A
Language: Python
Last Updated: May 20th, 2013, 00:51 GMT
Category: Development Tools \ HTML and HTML5 Tools

HTML5Lib is a Python library for parsing HTML that follows the WHATWG HTML5 specification.

The parser is designed to handle all flavours of HTML and parses invalid documents using well-defined error handling rules compatible with the behaviour of major desktop web browsers.

The output is placed in a tree structure.

It supports output to ElementTree, DOM and lxml tree formats as well as a simple custom format.
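
As a rough illustration of the tree output described above, the sketch below parses a deliberately sloppy fragment into an ElementTree tree and into a minidom document. It assumes a recent html5lib release installed from PyPI, which may differ in detail from the 0.95 / 1.0b1 builds listed here. The `parse()` function and its `treebuilder` and `namespaceHTMLElements` arguments come from the library; `SAMPLE`, `parse_to_etree` and `parse_to_dom` are names made up for this example. The import guard is only there so the snippet loads on machines without the package.

```python
try:
    import html5lib  # third-party package, e.g. "pip install html5lib"
except ImportError:
    html5lib = None  # keeps the snippet importable where html5lib is absent

# Hypothetical input: no doctype, no <html>/<body>, unclosed <p> tags.
SAMPLE = "<title>Hi</title><p>One<p>Two"


def parse_to_etree(markup):
    """Return the <html> element as an xml.etree.ElementTree Element."""
    # namespaceHTMLElements=False keeps tags as plain "p" instead of
    # "{http://www.w3.org/1999/xhtml}p", which makes find() paths shorter.
    return html5lib.parse(markup, treebuilder="etree",
                          namespaceHTMLElements=False)


def parse_to_dom(markup):
    """Return an xml.dom.minidom Document for the same markup."""
    return html5lib.parse(markup, treebuilder="dom")


if html5lib is not None:
    root = parse_to_etree(SAMPLE)
    print(root.find("head/title").text)
    print([p.text for p in root.findall("body/p")])
    print(len(parse_to_dom(SAMPLE).getElementsByTagName("p")))
```

The parser supplies the missing html, head and body elements and closes the first paragraph when the second one opens, which is the kind of recovery a browser applies to the same input.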

HTML5Lib is packaged with distutils.

HTML5Lib is also available for Ruby, Python and PHP.

What's New in This Release:

· Parses valid and invalid HTML documents to a tree
· Support for minidom, ElementTree (including cElementTree and lxml.etree), BeautifulSoup (deprecated) and custom simpletree output formats
· DOM to SAX converter
· Reports parse errors
· Character encoding detection
· Filtering and serializing of trees
· HTML+CSS sanitizer
· Many unit tests
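
Two of the features listed above, parse error reporting and tree serialization, can be sketched as follows. This is a minimal example assuming a recent html5lib release from PyPI rather than the exact builds listed here. `HTMLParser`, its `errors` list, `getTreeBuilder` and `serialize` are part of the library; `collect_parse_errors` and `round_trip` are hypothetical helper names introduced for illustration.

```python
try:
    import html5lib  # third-party package, e.g. "pip install html5lib"
except ImportError:
    html5lib = None  # keeps the snippet importable where html5lib is absent


def collect_parse_errors(markup):
    """Parse markup and return the error codes the parser recorded."""
    parser = html5lib.HTMLParser(tree=html5lib.getTreeBuilder("etree"))
    parser.parse(markup)
    # Each entry in parser.errors is a (position, error_code, data) tuple.
    return [code for _position, code, _data in parser.errors]


def round_trip(markup):
    """Parse markup, then serialize the resulting tree back to a string."""
    tree = html5lib.parse(markup, treebuilder="etree")
    return html5lib.serialize(tree, tree="etree")


if html5lib is not None:
    # A fragment without a doctype should yield at least one parse error.
    print(collect_parse_errors("<p>One<p>Two"))
    print(round_trip("<p>One<p>Two"))
```

Parsing is not strict by default, so errors are collected rather than raised and a tree is still produced for invalid input.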

