DOM benchmarks
Overview
The DOM benchmarks measure how fast different DOM implementations are when building and traversing documents:
- Css4j-DOM4J module (which subclasses DOM4J).
- Css4j's native DOM.
- Stand-alone DOM4J.
- JDK's bundled DOM implementation.
For reference, some of the benchmarks also include Jsoup, which does not implement the DOM API but is a popular HTML parser and pseudo-DOM library. It is only benchmarked when the document's markup is (X)HTML.
The following software versions were used:
- Java: AdoptOpenJDK* 15.
- JMH: 1.26.
- css4j: 3.2.
- validator.nu htmlparser: 1.4.16.
- dom4j: 2.1.3.
- Jsoup: 1.13.1.
The tables and charts were generated by Carte (look for the BenchmarkChartWriter class and the examples folder in the carte-jmh module). The grey lines at the top of the bars give the amplitude of the estimated error.
The computer has an Intel® Core™ i5-1035G7 CPU and 8 GB of RAM. Unfortunately, as a laptop processor it is subject to thermal management, which may explain some relatively large error intervals.
(*) Note: one of the tested DOM implementations is the one that comes bundled with the JDK (identified as "JDK" in the graphics), and it has been observed that the version shipped with the Oracle JDK may be faster.
Build HTML documents
Measures the speed at which the validator.nu HTML parser can parse a small document (38 KB HTML) into several DOM implementations. For reference, the results are compared with Jsoup parsing the same document.
Numeric results (higher is better):
Implementation | Score | Error | Unit |
---|---|---|---|
Css4j-DOM4J | 340.34 | ±11.21 | ops/s |
Css4j DOM | 324.86 | ±2.16 | ops/s |
DOM4J | 327.60 | ±2.21 | ops/s |
Css4j DOM (own builder) | 318.08 | ±2.01 | ops/s |
Jsoup | 3,965.42 | ±47.87 | ops/s |
JDK | 373.52 | ±2.22 | ops/s |
The standard DOM implementations perform similarly in this test, but Jsoup is much faster: roughly 10 times the speed of the JDK and about 12 times the speed of the others.
Build small XML documents
The SAX parser that comes bundled with the JDK is used to parse and build a document from a small XHTML file (38 KB).
Numeric results (higher is better):
Implementation | Score | Error | Unit |
---|---|---|---|
Css4j-DOM4J | 493.95 | ±18.75 | ops/s |
Css4j DOM | 502.77 | ±3.36 | ops/s |
DOM4J | 526.70 | ±16.10 | ops/s |
JDK (css4j builder) | 478.38 | ±4.05 | ops/s |
JDK | 504.02 | ±2.49 | ops/s |
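For readers unfamiliar with the operation being timed, the following is a minimal sketch of building a JDK DOM document with the parser bundled in the JDK. It is not the benchmark code: the class name `JdkDomBuildSketch`, the `build` method and the sample markup are assumptions made for illustration, and the other implementations use their own builders.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Illustrative only: class and method names are hypothetical,
// they are not taken from the benchmark suite.
class JdkDomBuildSketch {

    /** Parses the given XML text into a JDK DOM document. */
    static Document build(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // XHTML elements live in a namespace
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Checked parser exceptions are wrapped to keep the sketch short
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = build("<html xmlns='http://www.w3.org/1999/xhtml'>"
                + "<head><title>Sample</title></head><body><p>Hello</p></body></html>");
        System.out.println(doc.getDocumentElement().getLocalName());
    }
}
```

A real measurement would read the 38 KB file and run under JMH rather than in a `main` method.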
Build XML documents
The SAX parser that comes bundled with the JDK is used to parse and build a document (1MB file).
Numeric results (higher is better):
Implementation | Score | Error | Unit |
---|---|---|---|
Css4j-DOM4J | 97.624 | ±4.932 | ops/s |
Css4j DOM | 91.260 | ±0.833 | ops/s |
DOM4J | 103.412 | ±5.790 | ops/s |
JDK (css4j builder) | 74.366 | ±1.288 | ops/s |
JDK | 100.270 | ±1.335 | ops/s |
DOM traversal: getFirstChild()/getNextSibling()
Count the nodes of an XML document, using a combination of getFirstChild()/getNextSibling() to traverse it.
Numeric results (higher is better):
Implementation | Score | Error | Unit |
---|---|---|---|
Css4j DOM | 3,159.4 | ±204.6 | ops/s |
Css4j-DOM4J | 211.8 | ±57.1 | ops/s |
JDK | 6,432.7 | ±181.4 | ops/s |
Note: for unknown reasons, the usual procedure of building the JDK document with a DocumentBuilderFactory could not be used to initialize the document traversed in the benchmark, because that document is somehow left in an inconsistent state with no child nodes. This happened both when the initialization code was executed in a Scope.Benchmark class and when it was placed in a static initialization block. When the same code is executed in a JUnit test, the problem does not appear. For this reason the JDK DOM document was built with css4j's XMLDocumentBuilder, although that process is not part of the timed benchmark.
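The traversal pattern measured in this section can be sketched as follows. This is an assumption about the general shape of such a count, not the benchmark's actual code; `ForwardTraversalSketch`, `countNodes` and `parse` are hypothetical names, and the sample document is parsed with the JDK outside of JMH.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Illustrative only: names are hypothetical, not taken from the benchmark suite.
class ForwardTraversalSketch {

    /** Counts the node itself plus all its descendants, in document order. */
    static int countNodes(Node node) {
        int count = 1;
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += countNodes(child);
        }
        return count;
    }

    /** Helper for the usage example: parses XML text with the JDK's DOM. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<a><b>text</b><c/><!--x--></a>");
        // Counted nodes: the document, a, b, the text node, c and the comment
        System.out.println(countNodes(doc)); // prints 6
    }
}
```

The recursion is only as deep as the document tree, which is usually shallow; attributes are not child nodes and therefore are not counted.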
DOM traversal: getLastChild()/getPreviousSibling()
Count the nodes of an XML document, using a combination of getLastChild()/getPreviousSibling() to traverse it.
Numeric results (higher is better):
Implementation | Score | Error | Unit |
---|---|---|---|
Css4j DOM | 2,748.7 | ±312.8 | ops/s |
Css4j-DOM4J | 163.0 | ±12.7 | ops/s |
JDK | 7,179.1 | ±179.9 | ops/s |
The Css4j-DOM4J results are also representative of plain DOM4J.
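A reverse traversal of this kind can be sketched without recursion, walking from each node to its last child and then to previous siblings. The code below is a hedged illustration under that assumption, not the benchmark's implementation; `BackwardTraversalSketch` and its methods are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Illustrative only: names are hypothetical, not taken from the benchmark suite.
class BackwardTraversalSketch {

    /** Counts root and its descendants, visiting children from last to first. */
    static int countNodes(Node root) {
        int count = 0;
        Node node = root;
        while (node != null) {
            count++;
            Node last = node.getLastChild();
            if (last != null) {
                node = last; // descend first
            } else {
                // Climb until a previous sibling exists or we are back at the root
                while (node != root && node.getPreviousSibling() == null) {
                    node = node.getParentNode();
                }
                node = (node == root) ? null : node.getPreviousSibling();
            }
        }
        return count;
    }

    /** Helper for the usage example: parses XML text with the JDK's DOM. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<a><b>text</b><c/><!--x--></a>");
        System.out.println(countNodes(doc)); // prints 6
    }
}
```

The count is the same as for a forward traversal; only the visiting order differs.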
DOM traversal: NodeIterator (small file)
Count the elements of a 38kB XHTML document traversed by a NodeIterator.
Numeric results (higher is better):
Implementation | Score | Error | Unit |
---|---|---|---|
Css4j DOM | 114,626 | ±1,354 | ops/s |
JDK | 82,837 | ±1,624 | ops/s |
DOM4J and Jsoup are not included as they lack a NodeIterator.
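As an illustration of the API involved, this is a minimal sketch of counting elements with a NodeIterator on the JDK's DOM, whose documents can be cast to `DocumentTraversal`. The class and method names (`NodeIteratorSketch`, `countElements`, `parse`) are hypothetical and the code is not the benchmark itself.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

// Illustrative only: names are hypothetical, not taken from the benchmark suite.
class NodeIteratorSketch {

    /** Counts the elements of the document with a NodeIterator. */
    static int countElements(Document doc) {
        DocumentTraversal traversal = (DocumentTraversal) doc;
        // SHOW_ELEMENT hides every node type except elements; no extra filter is set
        NodeIterator it = traversal.createNodeIterator(doc, NodeFilter.SHOW_ELEMENT, null, true);
        int count = 0;
        while (it.nextNode() != null) {
            count++;
        }
        it.detach(); // release the iterator once finished
        return count;
    }

    /** Helper for the usage example: parses XML text with the JDK's DOM. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<a><b>text</b><c/><!--x--></a>");
        System.out.println(countElements(doc)); // prints 3
    }
}
```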
DOM traversal: NodeIterator
Count the elements of an XML document traversed by a NodeIterator.
Numeric results (higher is better):
Implementation | Score | Error | Unit |
---|---|---|---|
Css4j DOM | 2,214.1 | ±39.9 | ops/s |
JDK | 4,425.9 | ±141.5 | ops/s |
DOM4J is not included as it lacks a NodeIterator.
Note: sometimes the NodeIterator created by the JDK is in an inconsistent state, and fails with an exception like:
    # Warmup Iteration 1: <failure>
    java.lang.ArrayIndexOutOfBoundsException: Index 34 out of bounds for length 33
        at java.base/java.util.ArrayList.add(ArrayList.java:455)
        at java.base/java.util.ArrayList.add(ArrayList.java:467)
        at java.xml/com.sun.org.apache.xerces.internal.dom.DocumentImpl.createNodeIterator(DocumentImpl.java:255)
        at io.sf.carte.mark.dom.DOMIteratorMark.markNodeIteratorJdk(DOMIteratorMark.java:46)
    </failure>
But I have observed this only while benchmarking, and not in other cases.
DOM traversal: TreeWalker (small file)
Count the elements of a 38kB XHTML document traversed by a TreeWalker.
Numeric results (higher is better):
Implementation | Score | Error | Unit |
---|---|---|---|
Css4j DOM | 56,486 | ±487 | ops/s |
JDK | 98,074 | ±9,500 | ops/s |
Neither DOM4J nor Jsoup provides a TreeWalker.
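The TreeWalker counterpart on the JDK's DOM can be sketched in the same way. Again this is only an assumed illustration with hypothetical names (`TreeWalkerSketch`, `countElements`, `parse`), not the code that produced the numbers above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.TreeWalker;
import org.xml.sax.InputSource;

// Illustrative only: names are hypothetical, not taken from the benchmark suite.
class TreeWalkerSketch {

    /** Counts the elements of the document with a TreeWalker. */
    static int countElements(Document doc) {
        DocumentTraversal traversal = (DocumentTraversal) doc;
        // The walker starts at the document node, which SHOW_ELEMENT does not expose,
        // so every call to nextNode() returns an element (or null at the end)
        TreeWalker walker = traversal.createTreeWalker(doc, NodeFilter.SHOW_ELEMENT, null, true);
        int count = 0;
        while (walker.nextNode() != null) {
            count++;
        }
        return count;
    }

    /** Helper for the usage example: parses XML text with the JDK's DOM. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<a><b>text</b><c/><!--x--></a>");
        System.out.println(countElements(doc)); // prints 3
    }
}
```

Unlike a NodeIterator, a TreeWalker also offers tree-style navigation (`parentNode()`, `firstChild()`, `nextSibling()`), which this sketch does not exercise.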
DOM traversal: TreeWalker
Count the elements of an XML document traversed by a TreeWalker.
Numeric results (higher is better):
Implementation | Score | Error | Unit |
---|---|---|---|
Css4j DOM | 2,170.8 | ±37.8 | ops/s |
JDK | 4,558.8 | ±183.0 | ops/s |
As mentioned, neither DOM4J nor Jsoup provides a TreeWalker.
DOM traversal: iterator() (small file)
Traverse a 38kB XHTML document using the css4j native DOM's iterable getChildNodes(), DOM4J's nodeIterator() and Jsoup's iterable childNodes().
Numeric results (higher is better):
Implementation | Score | Error | Unit |
---|---|---|---|
Css4j DOM | 111,245.8 | ±1,274.0 | ops/s |
DOM4J | 17,267.2 | ±82.7 | ops/s |
Jsoup | 100,902.9 | ±1,250.8 | ops/s |
The JDK's DOM provides no iterable child collections.
DOM traversal: iterator()
Traverse an XML document using the css4j native DOM's iterable getChildNodes() and DOM4J's nodeIterator().
Numeric results (higher is better):
Implementation | Score | Error | Unit |
---|---|---|---|
Css4j DOM | 1,996.5 | ±130.5 | ops/s |
DOM4J | 510.7 | ±14.7 | ops/s |
The JDK's DOM provides no iterable child collections, and Jsoup is not suitable for XML documents (neither was included).
DOM traversal: elementIterator() (small file)
Traverse a 38kB XHTML document using the css4j native DOM's elementIterator(), DOM4J's elementIterator() and Jsoup's iterable Elements.
Numeric results (higher is better):
Implementation | Score | Error | Unit |
---|---|---|---|
Css4j DOM | 100,430 | ±2,733 | ops/s |
DOM4J | 17,604 | ±105 | ops/s |
Jsoup | 157,065 | ±2,585 | ops/s |
The JDK's DOM provides no element iterator.
DOM traversal: elementIterator()
Traverse an XML document using the css4j native DOM's elementIterator() and DOM4J's elementIterator().
Numeric results (higher is better):
Implementation | Score | Error | Unit |
---|---|---|---|
Css4j DOM | 2,348.0 | ±218.6 | ops/s |
DOM4J | 516.7 | ±10.6 | ops/s |
The JDK's DOM provides no element iterator, and Jsoup is not suitable for XML documents (neither was included).
DOM traversal: getElementsByTagName() (small file)
Traverse the list given by getElementsByTagName() from a 38kB document (it is an XHTML file, so Jsoup is included in the comparison). In the case of css4j's native DOM, there are two results: one iterating the NodeList with the item() method, and another via the iterator (the returned NodeList implements Iterable).
Numeric results (higher is better):
Implementation | Score | Error | Unit |
---|---|---|---|
Css4j DOM | 7,697.8 | ±218.0 | ops/s |
Css4j-DOM4J | 12,801.6 | ±78.9 | ops/s |
Css4j DOM (iterator) | 139,647.5 | ±1,166.2 | ops/s |
JDK | 85,690.6 | ±5,013.1 | ops/s |
Jsoup | 96,601.6 | ±598.2 | ops/s |
Jsoup (iterator) | 81,676.9 | ±5,227.3 | ops/s |
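For reference, index-based traversal of the list returned by getElementsByTagName() looks roughly like the sketch below, written against the JDK's DOM. The names `TagNameListSketch`, `countByItem` and `parse` are hypothetical, and the iterator variant is not shown because the standard `NodeList` interface does not extend `Iterable`.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only: names are hypothetical, not taken from the benchmark suite.
class TagNameListSketch {

    /** Visits every element with the given tag name through item(int). */
    static int countByItem(Document doc, String tagName) {
        NodeList list = doc.getElementsByTagName(tagName);
        int visited = 0;
        // The length is read once; the document is not modified during the loop
        for (int i = 0, len = list.getLength(); i < len; i++) {
            if (list.item(i) != null) {
                visited++;
            }
        }
        return visited;
    }

    /** Helper for the usage example: parses XML text with the JDK's DOM. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<r><p/><div><p>x</p></div><p/></r>");
        System.out.println(countByItem(doc, "p")); // prints 3
    }
}
```

How expensive each `item(i)` call is depends on the implementation, which is what the item() and iterator rows of the table compare.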
DOM traversal: getElementsByTagName()
Traverse the list given by getElementsByTagName(), this time from a 1MB document. Again, for css4j's native DOM there are two results: one iterating the NodeList with the item() method, and another via the iterator (the returned NodeList implements Iterable).
Numeric results (higher is better):
Implementation | Score | Error | Unit |
---|---|---|---|
Css4j DOM | 2.2340 | ±0.0607 | ops/s |
Css4j-DOM4J | 461.7707 | ±5.3620 | ops/s |
Css4j DOM (iterator) | 1,704.4536 | ±208.2431 | ops/s |
JDK | 2,108.4757 | ±8.5393 | ops/s |
Note: the iterator is documented as the recommended way to traverse the ElementList in css4j since version 3.2. Css4j versions prior to 3.2 perform much better in this benchmark, but the implementation was switched to one that is more lightweight (and the iterator performance is good enough).
DOM modification: appendChild()/removeChild()
Modify the nodes of an XML document by appending elements with appendChild() and later removing them with removeChild().
Numeric results (higher is better):
Implementation | Score | Error | Unit |
---|---|---|---|
Css4j DOM | 475.5690 | ±9.6185 | ops/s |
Css4j-DOM4J | 119.6076 | ±1.5586 | ops/s |
JDK | 739.1279 | ±35.0935 | ops/s |
Jsoup | 0.5387 | ±0.0113 | ops/s |
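The kind of modification being timed can be sketched with the standard DOM API as follows. The number of elements, the element name `item` and the names `ModificationSketch`, `appendThenRemove` and `parse` are assumptions made for the example; the benchmark's actual workload may differ.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Illustrative only: names and workload are hypothetical, not the benchmark's.
class ModificationSketch {

    /**
     * Appends n elements to the root element and then removes them again.
     * Returns the number of child nodes of the root at its peak.
     */
    static int appendThenRemove(Document doc, int n) {
        Element root = doc.getDocumentElement();
        List<Element> added = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            Element item = doc.createElement("item");
            item.setAttribute("index", Integer.toString(i));
            root.appendChild(item);
            added.add(item);
        }
        int peak = root.getChildNodes().getLength();
        for (Element item : added) {
            root.removeChild(item);
        }
        return peak;
    }

    /** Helper for the usage example: parses XML text with the JDK's DOM. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<root><keep/></root>");
        System.out.println(appendThenRemove(doc, 100)); // prints 101
    }
}
```

After the call the document is back to its initial shape, so the same document can be reused across iterations.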
Analysis
Building a document with DOM4J (both plain DOM4J and the CSS-enabled subclasses that css4j provides) is fast, but has an important scalability problem: it uses a synchronized cache of QName objects, as shown by the benchmark profiler (e.g. java -jar build/benchmarks.jar XMLBuildBenchmark -prof stack:lines=5;top=3;detailLine=true;period=1):
    Secondary result "io.sf.carte.mark.dom.XMLBuildBenchmark.markBuildDOM4J: stack":
    Stack profiler:
    ....[Thread state distributions]....................................................................
     72,3%         BLOCKED
     27,6%         RUNNABLE
    ....[Thread state: BLOCKED].........................................................................
     72,3% 100,0% java.util.Collections$SynchronizedMap.get
                  org.dom4j.tree.QNameCache.get
                  org.dom4j.DocumentFactory.createQName
                  org.dom4j.tree.NamespaceStack.createQName
                  org.dom4j.tree.NamespaceStack.pushQName
Because threads spend most of their time blocked on that cache, DOM4J is not recommended for multi-core systems.
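To make the contention pattern concrete, here is a hedged sketch of a name cache backed by `Collections.synchronizedMap`, next to a variant backed by `ConcurrentHashMap`. It is not DOM4J's code: the `Name` class, the cache fields and both lookup methods are hypothetical stand-ins, and the second variant is shown only for contrast, not as a patch that DOM4J applies.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: this is NOT DOM4J's implementation, just the general pattern.
class NameCacheSketch {

    /** Hypothetical stand-in for a cached qualified-name object. */
    static final class Name {
        final String local;

        Name(String local) {
            this.local = local;
        }
    }

    // Every get() and put() on this map acquires the same monitor, so threads
    // that parse documents in parallel queue up behind each other.
    static final Map<String, Name> SYNCHRONIZED = Collections.synchronizedMap(new HashMap<>());

    static Name getSynchronized(String local) {
        Name name = SYNCHRONIZED.get(local);
        if (name == null) {
            // Check-then-act: two threads may both create a Name for the same key
            name = new Name(local);
            SYNCHRONIZED.put(local, name);
        }
        return name;
    }

    // ConcurrentHashMap retrievals do not block, so cache hits are not serialized.
    static final Map<String, Name> CONCURRENT = new ConcurrentHashMap<>();

    static Name getConcurrent(String local) {
        Name name = CONCURRENT.get(local);
        return name != null ? name : CONCURRENT.computeIfAbsent(local, Name::new);
    }

    public static void main(String[] args) {
        System.out.println(getConcurrent("p") == getConcurrent("p")); // prints true
    }
}
```

With a single thread both variants behave the same; the difference only shows up when several threads look up names at once, which is the situation the profiler output describes.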
Css4j's native DOM has more features than the other contenders (which implies a bit of overhead) but is still quite fast, although it is often slower than the JDK's DOM, which is the fastest standard DOM. The latter looks like a good choice for applications that do not require handling styles (or you could use the read-only DOM wrapper with it, if that fits your needs). Finally, DOM4J is lagging behind in performance for anything other than document build-up.
For users who do not need to handle CSS and do not mind dealing with a non-standard API, Jsoup excels at parsing and has good traversal speeds, although it is two to three orders of magnitude slower in the document modification benchmark.