CSS4J
DOM mark

DOM benchmarks

Overview

The DOM benchmarks measure how fast different DOM implementations are when building, traversing and modifying documents.

As a reference, some of the benchmarks were also run with Jsoup, which does not implement DOM but is a popular HTML parser and pseudo-DOM library. It is only benchmarked when the document's markup is (X)HTML.

The following software versions were used:

  • Java: AdoptOpenJDK* 15.
  • JMH: 1.26.
  • css4j: 3.2.
  • validator.nu htmlparser: 1.4.16.
  • dom4j: 2.1.3.
  • Jsoup: 1.13.1.

The tables and charts were generated by Carte (look for the BenchmarkChartWriter class and the examples folder in the carte-jmh module). The grey lines at the top of the bars give the amplitude of the estimated error.

The computer has an Intel® Core™ i5-1035G7 CPU and 8 GB of RAM. Because it is a laptop processor, it is subject to thermal management, which may explain some relatively large error intervals.

(*) Note: one of the tested DOM implementations is the one that comes bundled with the JDK (identified as "JDK" in the graphics), and it has been observed that the version shipped with the Oracle JDK may be faster.

Build HTML documents

Measures how fast the validator.nu HTML parser can build a small document (a 38 KB HTML file) in several DOM implementations. As a reference, the results are compared to Jsoup parsing the same document.

[Chart: HTML build benchmark; 38 KB HTML file, validator.nu parser (except Jsoup); throughput (ops/s) by DOM implementation]

Numeric results (higher is better):

Implementation             Score       Error     Unit
Css4j-DOM4J                340.34      ±11.21    ops/s
Css4j DOM                  324.86      ±2.16     ops/s
DOM4J                      327.60      ±2.21     ops/s
Css4j DOM (own builder)    318.08      ±2.01     ops/s
Jsoup                      3,965.42    ±47.87    ops/s
JDK                        373.52      ±2.22     ops/s

The standard DOM implementations are roughly even in this test, but Jsoup is much faster: about 10.6 times the throughput of the JDK and about 12 times that of the others.

Build small XML documents

The SAX parser that comes bundled with the JDK is used to parse and build a document from a small XHTML file (38 KB).

[Chart: XML build benchmark; 38kB file; throughput (ops/s) by DOM implementation]

Numeric results (higher is better):

Implementation         Score     Error     Unit
Css4j-DOM4J            493.95    ±18.75    ops/s
Css4j DOM              502.77    ±3.36     ops/s
DOM4J                  526.70    ±16.10    ops/s
JDK (css4j builder)    478.38    ±4.05     ops/s
JDK                    504.02    ±2.49     ops/s

Build XML documents

The SAX parser that comes bundled with the JDK is used to parse and build a document (1MB file).

[Chart: XML build benchmark; 1MB file; throughput (ops/s) by DOM implementation]

Numeric results (higher is better):

Implementation         Score      Error     Unit
Css4j-DOM4J            97.624     ±4.932    ops/s
Css4j DOM              91.260     ±0.833    ops/s
DOM4J                  103.412    ±5.790    ops/s
JDK (css4j builder)    74.366     ±1.288    ops/s
JDK                    100.270    ±1.335    ops/s

DOM traversal: getFirstChild()/getNextSibling()

Count the nodes of an XML document, using a combination of getFirstChild()/getNextSibling() to traverse it.
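The traversal pattern described above can be sketched as follows. This is a minimal illustration using the DOM bundled with the JDK, not the benchmark's actual code; the class name SiblingTraversal, its helper methods and the sample document are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of css4j or the benchmark suite.
final class SiblingTraversal {

    // Parse an XML string with the JDK's bundled parser and DOM.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample document", e);
        }
    }

    // Count the given node plus all of its descendants, walking each
    // child list forward with getFirstChild()/getNextSibling().
    static int countNodes(Node node) {
        int count = 1;
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += countNodes(child);
        }
        return count;
    }

    public static void main(String[] args) {
        // Document node + root + a + text + b
        Document doc = parse("<root><a>text</a><b/></root>");
        System.out.println(countNodes(doc)); // prints 5
    }
}
```

The count includes the Document node itself and text nodes, but not attributes, which are not children in the DOM tree.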

[Chart: DOM Traversal: NextSibling; 1MB file traversed by getFirstChild()/getNextSibling(); throughput (ops/s) by DOM implementation]

Numeric results (higher is better):

Implementation    Score      Error     Unit
Css4j DOM         3,159.4    ±204.6    ops/s
Css4j-DOM4J       211.8      ±57.1     ops/s
JDK               6,432.7    ±181.4    ops/s

Note: for unknown reasons, the usual procedure of building the JDK document with a DocumentBuilderFactory could not be used to initialize the document traversed in the benchmark, because that document was somehow left in an inconsistent state with no child nodes. This happened both when the initialization code ran in a Scope.Benchmark class and when it ran in a static initialization block. When the same code is executed in a JUnit test, the problem does not appear.

For this reason the JDK DOM document was built with css4j's XMLDocumentBuilder, although that process is not part of the timed benchmark.

DOM traversal: getLastChild()/getPreviousSibling()

Count the nodes of an XML document, using a combination of getLastChild()/getPreviousSibling() to traverse it.
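A reverse traversal of this kind might look like the sketch below. It uses the JDK's DOM and an iterative walk that climbs through getParentNode() instead of recursing; the benchmark's own implementation may differ, and the class name ReverseSiblingTraversal and its methods are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of css4j or the benchmark suite.
final class ReverseSiblingTraversal {

    // Parse an XML string with the JDK's bundled parser and DOM.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample document", e);
        }
    }

    // Count root and its descendants, visiting children from last to first.
    static int countNodesBackwards(Node root) {
        int count = 0;
        Node node = root;
        while (node != null) {
            count++;
            // Descend to the last child first.
            Node next = node.getLastChild();
            // No children: try the previous sibling, climbing towards
            // the root whenever a child list is exhausted.
            while (next == null && node != root) {
                next = node.getPreviousSibling();
                if (next == null) {
                    node = node.getParentNode();
                }
            }
            node = next;
        }
        return count;
    }

    public static void main(String[] args) {
        Document doc = parse("<root><a>text</a><b/></root>");
        System.out.println(countNodesBackwards(doc)); // prints 5
    }
}
```

The walk never looks at the siblings of the starting node, so it can also be applied to a subtree.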

[Chart: DOM Traversal: PreviousSibling; 1MB file traversed by getLastChild()/getPreviousSibling(); throughput (ops/s) by DOM implementation]

Numeric results (higher is better):

Implementation    Score      Error     Unit
Css4j DOM         2,748.7    ±312.8    ops/s
Css4j-DOM4J       163.0      ±12.7     ops/s
JDK               7,179.1    ±179.9    ops/s

The Css4j-DOM4J results are representative of DOM4J as well.

DOM traversal: NodeIterator (small file)

Count the elements of a 38kB XHTML document traversed by a NodeIterator.
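Counting elements with a NodeIterator can be sketched as below, here with the JDK's DOM. The example assumes that the parsed Document also implements DocumentTraversal (it checks before casting); the class name NodeIteratorCount and its methods are hypothetical and this is not the benchmark's code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of css4j or the benchmark suite.
final class NodeIteratorCount {

    // Parse an XML string with the JDK's bundled parser and DOM.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample document", e);
        }
    }

    // Count all elements in document order with a NodeIterator.
    static int countElements(Document doc) {
        if (!(doc instanceof DocumentTraversal)) {
            throw new UnsupportedOperationException("This DOM has no traversal support");
        }
        // SHOW_ELEMENT makes the iterator skip text, comments, etc.
        NodeIterator it = ((DocumentTraversal) doc)
                .createNodeIterator(doc, NodeFilter.SHOW_ELEMENT, null, true);
        int count = 0;
        while (it.nextNode() != null) {
            count++;
        }
        it.detach(); // release the iterator once done
        return count;
    }

    public static void main(String[] args) {
        Document doc = parse("<root><a>text</a><b/></root>");
        System.out.println(countElements(doc)); // prints 3
    }
}
```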

[Chart: DOM Traversal: NodeIterator (small file); 38kB file traversed by NodeIterator; throughput (ops/s) by DOM implementation]

Numeric results (higher is better):

Implementation    Score      Error     Unit
Css4j DOM         114,626    ±1,354    ops/s
JDK               82,837     ±1,624    ops/s

DOM4J and Jsoup are not included as they lack a NodeIterator.

DOM traversal: NodeIterator

Count the elements of an XML document traversed by a NodeIterator.

[Chart: DOM Traversal: NodeIterator; 1MB file traversed by NodeIterator; throughput (ops/s) by DOM implementation]

Numeric results (higher is better):

Implementation    Score      Error     Unit
Css4j DOM         2,214.1    ±39.9     ops/s
JDK               4,425.9    ±141.5    ops/s

DOM4J is not included as it lacks a NodeIterator.

Note: sometimes the NodeIterator created by the JDK is in an inconsistent state, and fails with an exception like:

# Warmup Iteration   1: <failure>

java.lang.ArrayIndexOutOfBoundsException: Index 34 out of bounds for length 33
        at java.base/java.util.ArrayList.add(ArrayList.java:455)
        at java.base/java.util.ArrayList.add(ArrayList.java:467)
        at java.xml/com.sun.org.apache.xerces.internal.dom.DocumentImpl.createNodeIterator(DocumentImpl.java:255)
        at io.sf.carte.mark.dom.DOMIteratorMark.markNodeIteratorJdk(DOMIteratorMark.java:46)
</failure>

But I have observed this only while benchmarking, and not in other cases.

DOM traversal: TreeWalker (small file)

Count the elements of a 38kB XHTML document traversed by a TreeWalker.
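A TreeWalker-based count could be written as in the following sketch, again with the JDK's DOM and under the assumption that the Document implements DocumentTraversal. The class name TreeWalkerCount and its methods are hypothetical; the benchmark's code may be organized differently.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.TreeWalker;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of css4j or the benchmark suite.
final class TreeWalkerCount {

    // Parse an XML string with the JDK's bundled parser and DOM.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample document", e);
        }
    }

    // Count all elements in document order with a TreeWalker.
    static int countElements(Document doc) {
        if (!(doc instanceof DocumentTraversal)) {
            throw new UnsupportedOperationException("This DOM has no traversal support");
        }
        // The walker starts positioned on its root (the Document node here),
        // so the first nextNode() call moves to the document element.
        TreeWalker walker = ((DocumentTraversal) doc)
                .createTreeWalker(doc, NodeFilter.SHOW_ELEMENT, null, true);
        int count = 0;
        while (walker.nextNode() != null) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Document doc = parse("<root><a>text</a><b/></root>");
        System.out.println(countElements(doc)); // prints 3
    }
}
```

Unlike a NodeIterator, a TreeWalker also exposes parentNode(), firstChild() and nextSibling() over the filtered view, which this counting loop does not need.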

[Chart: DOM Traversal: TreeWalker (small file); 38kB file traversed by TreeWalker; throughput (ops/s) by DOM implementation]

Numeric results (higher is better):

Implementation    Score     Error     Unit
Css4j DOM         56,486    ±487      ops/s
JDK               98,074    ±9,500    ops/s

Neither DOM4J nor Jsoup provides a TreeWalker.

DOM traversal: TreeWalker

Count the elements of an XML document traversed by a TreeWalker.

[Chart: DOM Traversal: TreeWalker; 1MB file traversed by TreeWalker; throughput (ops/s) by DOM implementation]

Numeric results (higher is better):

Implementation    Score      Error     Unit
Css4j DOM         2,170.8    ±37.8     ops/s
JDK               4,558.8    ±183.0    ops/s

As mentioned, neither DOM4J nor Jsoup provides a TreeWalker.

DOM traversal: iterator() (small file)

Traverse a 38kB XHTML document using the iterable getChildNodes() of css4j's native DOM, DOM4J's nodeIterator() and Jsoup's iterable childNodes().

[Chart: DOM Traversal: child node iterators (small file); 38kB file traversed by child iterators; throughput (ops/s) by DOM implementation]

Numeric results (higher is better):

Implementation    Score        Error       Unit
Css4j DOM         111,245.8    ±1,274.0    ops/s
DOM4J             17,267.2     ±82.7       ops/s
Jsoup             100,902.9    ±1,250.8    ops/s

The JDK's DOM provides no iterable child collections.

DOM traversal: iterator()

Traverse an XML document using the iterable getChildNodes() of css4j's native DOM and DOM4J's nodeIterator().

[Chart: DOM Traversal: child node iterators; 1MB file traversed by child iterators; throughput (ops/s) by DOM implementation]

Numeric results (higher is better):

Implementation    Score      Error     Unit
Css4j DOM         1,996.5    ±130.5    ops/s
DOM4J             510.7      ±14.7     ops/s

The JDK's DOM provides no iterable child collections, and Jsoup is not suitable for XML documents (neither was included).

DOM traversal: elementIterator() (small file)

Traverse a 38kB XHTML document using the elementIterator() of css4j's native DOM, DOM4J's elementIterator() and Jsoup's iterable Elements.

[Chart: DOM Traversal: element iterators (small file); 38kB file traversed by elementIterator(); throughput (ops/s) by DOM implementation]

Numeric results (higher is better):

Implementation    Score      Error     Unit
Css4j DOM         100,430    ±2,733    ops/s
DOM4J             17,604     ±105      ops/s
Jsoup             157,065    ±2,585    ops/s

The JDK's DOM provides no element iterator.

DOM traversal: elementIterator()

Traverse an XML document using the elementIterator() of css4j's native DOM and DOM4J's elementIterator().

[Chart: DOM Traversal: element iterators; 1MB file traversed by elementIterator(); throughput (ops/s) by DOM implementation]

Numeric results (higher is better):

Implementation    Score      Error     Unit
Css4j DOM         2,348.0    ±218.6    ops/s
DOM4J             516.7      ±10.6     ops/s

The JDK's DOM provides no element iterator, and Jsoup is not suitable for XML documents (neither was included).

DOM traversal: getElementsByTagName() (small file)

Traverse the list given by getElementsByTagName() from a 38kB document (it is an XHTML file so Jsoup is included in the comparison). In the case of css4j's native DOM, there are two results: one iterating the NodeList by the item() method, and another via the iterator (the returned NodeList implements Iterable).
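The index-based variant of this traversal can be sketched with the JDK's DOM as follows. Only the standard NodeList.item() loop is shown, since the iterator variant depends on css4j's or Jsoup's own APIs; the class name TagNameListing, its methods and the sample markup are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of css4j or the benchmark suite.
final class TagNameListing {

    // Parse an XML string with the JDK's bundled parser and DOM.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample document", e);
        }
    }

    // Visit every element returned by getElementsByTagName() through item(i).
    static int countByTagName(Document doc, String tagName) {
        NodeList list = doc.getElementsByTagName(tagName);
        int count = 0;
        // Read the length once: the list is live, and some implementations
        // may recompute it on each call.
        for (int i = 0, n = list.getLength(); i < n; i++) {
            if (list.item(i).getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Document doc = parse("<root><p>a</p><div><p>b</p></div></root>");
        System.out.println(countByTagName(doc, "p")); // prints 2
        System.out.println(countByTagName(doc, "*")); // prints 4
    }
}
```

How expensive item(i) is depends on the implementation, which is why the item() and iterator results can differ so much for the same DOM.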

[Chart: DOM getElementsByTagName() (small file); traverse 713 elements given by getElementsByTagName(); throughput (ops/s) by DOM implementation]

Numeric results (higher is better):

Implementation          Score        Error       Unit
Css4j DOM               7,697.8      ±218.0      ops/s
Css4j-DOM4J             12,801.6     ±78.9       ops/s
Css4j DOM (iterator)    139,647.5    ±1,166.2    ops/s
JDK                     85,690.6     ±5,013.1    ops/s
Jsoup                   96,601.6     ±598.2      ops/s
Jsoup (iterator)        81,676.9     ±5,227.3    ops/s

DOM traversal: getElementsByTagName()

Traverse the list given by getElementsByTagName(), this time from a 1MB document. Again, for css4j's native DOM there are two results: one iterating the NodeList by the item() method, and another via the iterator (the returned NodeList implements Iterable).

[Chart: DOM getElementsByTagName(); traverse 3152 elements given by getElementsByTagName(); throughput (ops/s) by DOM implementation]

Numeric results (higher is better):

Implementation          Score         Error        Unit
Css4j DOM               2.2340        ±0.0607      ops/s
Css4j-DOM4J             461.7707      ±5.3620      ops/s
Css4j DOM (iterator)    1,704.4536    ±208.2431    ops/s
JDK                     2,108.4757    ±8.5393      ops/s

Note: the iterator is documented as the recommended way to traverse the ElementList in css4j since version 3.2. Css4j versions prior to 3.2 perform much better in this benchmark, but the implementation was switched to one that is more lightweight (and the iterator performance is good enough).

DOM modification: appendChild()/removeChild()

Modify the nodes of an XML document by appending elements with appendChild() and later removing them with removeChild().
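The kind of modification being timed can be illustrated with the sketch below, written against the JDK's DOM. The number of elements, the element name "item" and the class DomModification are assumptions made for the example and are not taken from the benchmark.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of css4j or the benchmark suite.
final class DomModification {

    // Parse an XML string with the JDK's bundled parser and DOM.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample document", e);
        }
    }

    // Append n new elements to the document element, then remove them again.
    // Returns the number of child nodes at the peak, before removal.
    static int appendAndRemove(Document doc, int n) {
        Element root = doc.getDocumentElement();
        List<Element> added = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Element item = doc.createElement("item");
            root.appendChild(item);
            added.add(item);
        }
        int peak = root.getChildNodes().getLength();
        for (Element item : added) {
            root.removeChild(item);
        }
        return peak;
    }

    public static void main(String[] args) {
        Document doc = parse("<root><a/></root>");
        System.out.println(appendAndRemove(doc, 3)); // prints 4
        System.out.println(doc.getDocumentElement().getChildNodes().getLength()); // prints 1
    }
}
```

Because the added elements are removed at the end, the document is left in its initial state and the same operation can be repeated.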

[Chart: DOM document modification; appendChild()/removeChild(); throughput (ops/s) by DOM implementation]

Numeric results (higher is better):

Implementation    Score       Error       Unit
Css4j DOM         475.5690    ±9.6185     ops/s
Css4j-DOM4J       119.6076    ±1.5586     ops/s
JDK               739.1279    ±35.0935    ops/s
Jsoup             0.5387      ±0.0113     ops/s

Analysis

Building a document with DOM4J (both plain DOM4J and the CSS-enabled subclasses that CSS4J provides) is fast, but it has a significant scalability problem: it uses a synchronized cache of QName objects, as shown by the benchmark profiler (e.g. java -jar build/benchmarks.jar XMLBuildBenchmark -prof stack:lines=5;top=3;detailLine=true;period=1):

Secondary result "io.sf.carte.mark.dom.XMLBuildBenchmark.markBuildDOM4J: stack":
Stack profiler:

....[Thread state distributions]....................................................................
 72,3%         BLOCKED
 27,6%         RUNNABLE

....[Thread state: BLOCKED].........................................................................
 72,3% 100,0% java.util.Collections$SynchronizedMap.get
              org.dom4j.tree.QNameCache.get
              org.dom4j.DocumentFactory.createQName
              org.dom4j.tree.NamespaceStack.createQName
              org.dom4j.tree.NamespaceStack.pushQName

With 72.3% of the sampled thread states BLOCKED on that cache, its usage is not recommended on multi-core systems.
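To illustrate why a synchronized cache shows up as BLOCKED threads, here is a generic sketch of the pattern. It is not DOM4J's code: the class NameCacheSketch, its methods and the sample names are hypothetical, and the ConcurrentHashMap variant is shown only as a common alternative, not as something DOM4J offers.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of a shared name cache, unrelated to DOM4J's sources.
final class NameCacheSketch {

    // Every get() on this map takes the same lock, so threads that look up
    // names concurrently queue behind each other.
    static Map<String, String> synchronizedCache() {
        return Collections.synchronizedMap(new HashMap<>());
    }

    // Reads on a ConcurrentHashMap do not take a shared lock.
    static Map<String, String> concurrentCache() {
        return new ConcurrentHashMap<>();
    }

    // Return the cached instance for a name, adding it on first use.
    static String intern(Map<String, String> cache, String name) {
        String cached = cache.get(name);
        return cached != null ? cached : cache.computeIfAbsent(name, k -> k);
    }

    // Look up a few names repeatedly from several threads; returns the cache size.
    static int hammer(Map<String, String> cache, int threads, int lookupsPerThread) {
        String[] names = {"html", "head", "body", "div", "p", "span"};
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < lookupsPerThread; i++) {
                    intern(cache, names[i % names.length]);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(hammer(synchronizedCache(), 4, 100_000)); // prints 6
        System.out.println(hammer(concurrentCache(), 4, 100_000));   // prints 6
    }
}
```

Both caches produce the same contents; the difference is only in how much the threads wait for each other, which is the kind of contention a stack profiler would report as BLOCKED time.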

Css4j's native DOM has more features than the other contenders (which implies a bit of overhead) but is still quite fast, although it is often slower than the JDK's DOM, which is the fastest standard DOM. The latter looks like a good choice for applications that do not require handling styles (or you could use the read-only DOM wrapper with it, if that fits your needs). Finally, DOM4J is lagging behind in performance for anything other than document build-up.

For users who do not need to handle CSS and do not mind dealing with a non-standard API, Jsoup excels at parsing and has good traversal speeds, although it is two to three orders of magnitude slower than the other implementations in the document modification benchmark.