htmltidy: Tidy Up and Test XPath Queries on HTML and XML Content

HTML documents can be beautiful and pristine. They can also be wretched, evil, malformed demon-spawn. This package tidies up HTML and XHTML before you process it with your favorite angle-bracket crunching tools, going beyond the limited tidying that 'libxml2' affords in the 'XML' and 'xml2' packages and taming even the ugliest HTML code generated by the likes of Google Docs and Microsoft Word. The tidying functions can also format ("pretty print") HTML content as it is being tidied. Utilities are included for viewing formatted, pretty-printed HTML/XML content from HTML/XML document objects, nodes, node sets and plain character HTML/XML using 'vkbeautify' (by Vadim Kiryukhin) and 'highlight.js' (by Ivan Sagalaev). Optionally, the viewers can filter nodes via XPath or show an HTML/XML document in "tree" view using 'XMLDisplay' (by Lev Muchnik). See <> and <> for more information about 'vkbeautify' and 'XMLDisplay', respectively.
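The workflow described above can be sketched roughly as follows. This is a minimal, unofficial example: the function names (tidy_html(), xml_view(), xml_tree_view()), the add_filter argument, and the option names in opts are assumptions based on the package's documented interface and should be checked against the reference manual. The input string is made up for illustration.

```r
library(htmltidy)

# Hypothetical malformed input: unclosed <p> and <b> tags, no doctype.
messy <- "<html><head><title>Demo</title></head><body><p>Hello <b>world</body></html>"

# Assumed option names; they mirror settings of the underlying HTML Tidy library.
opts <- list(
  TidyDocType       = "html5",  # emit an HTML5 doctype
  TidyIndentContent = TRUE,     # indent ("pretty print") block content
  TidyWrapLen       = 200       # wrap lines at 200 characters
)

# Tidy the character input; the result is a character string.
tidied <- tidy_html(messy, options = opts)
cat(tidied)

# The viewers are htmlwidgets, so only open them in an interactive session.
if (interactive()) {
  xml_view(tidied, add_filter = TRUE)  # highlighted view with an XPath filter box
  xml_tree_view(tidied)                # collapsible tree view
}
```

Passing character input to tidy_html() returns character output, which keeps the example independent of whether the content is later parsed with 'xml2' or 'XML'.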

Version: 0.5.0
Depends: R (≥ 3.2.0)
Imports: Rcpp, xml2, XML, htmlwidgets, htmltools
LinkingTo: Rcpp
Suggests: testthat, httr, rvest
Published: 2019-10-03
Author: Bob Rudis [aut, cre], Dave Raggett [ctb, aut] (Original HTML Tidy library), Charles Reitzel [ctb, aut] (Modern HTML Tidy library), Björn Höhrmann [ctb, aut] (HTML5 Support), Kenton Russell [aut, ctb] (xml-viewer integration), Vadim Kiryukhin [ctb, cph] (vkbeautify library), Ivan Sagalaev [ctb, cph] (highlight.js library), Lev Muchnik [ctb, cph] (xml-viewer library)
Maintainer: Bob Rudis <bob at>
License: MIT + file LICENSE
Copyright: file inst/COPYRIGHTS (htmltidy copyright details)
NeedsCompilation: yes
Materials: NEWS
In views: WebTechnologies
CRAN checks: htmltidy results


Reference manual: htmltidy.pdf


Package source: htmltidy_0.5.0.tar.gz
Windows binaries: r-devel:, r-release:, r-oldrel:
macOS binaries: r-release (arm64): htmltidy_0.5.0.tgz, r-release (x86_64): htmltidy_0.5.0.tgz, r-oldrel: htmltidy_0.5.0.tgz
Old sources: htmltidy archive

