css4j 2.0.6 API

This project implements an API very similar to W3C's CSS Object Model API in the Java™ language, and also adds CSS support to the DOM4J package. It targets several different use cases, with functionalities from style sheet error detection to style computation.

Overview

This implementation can be used in several ways: with stand-alone style sheets, with its own DOM implementation, combined with DOM4J, or by wrapping a pre-existing DOM tree.

You can work with independent style sheets created with the createStyleSheet("title", "media") method of the CSSStyleSheetFactory interface. There are three implementations of that interface, one for each document back-end: the native DOM, DOM4J and the DOM wrapper.

The document back-end is only important if you plan to use the sheets inside a document. The resulting style sheets are empty, but you can load a style sheet with the AbstractCSSStyleSheet.parseStyleSheet(source) method (see Parsing a Style Sheet).

One of the most important features of the library is the ability to compute styles for a given element. In practice, obtaining the 'computed' or 'used' values required for actual rendering also needs a box model implementation and device information. The library provides a simple box model that could be used, but the details of the rendering device can be more difficult to obtain.

Depending on the use case, the target device may not be the one where the library is running, so some exact details may not be available. To help in the computation, the library defines the DeviceFactory interface, which supplies device- and media-specific data and also provides the objects that media queries require.

Using css4j's native DOM implementation

You can create a DOM document from scratch and build it programmatically with the related DOM methods: just use the provided DOM implementation (io.sf.carte.doc.dom.CSSDOMImplementation). If you want to parse an existing document, the procedure depends on the type of document. To parse an XML document (including XHTML documents), you can use this library's XMLDocumentBuilder. To parse an HTML document (or an XHTML document that does not use namespace prefixes), you can use the validator.nu HTML5 parser.

An example with that parser follows:

import java.io.Reader;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import io.sf.carte.doc.dom.CSSDOMImplementation;
import io.sf.carte.doc.dom.DOMElement;
import io.sf.carte.doc.dom.HTMLDocument;
import io.sf.carte.doc.style.css.CSSComputedProperties;
import io.sf.carte.doc.style.css.CSSTypedValue;
import io.sf.carte.doc.style.css.RGBAColor;
import io.sf.carte.doc.xml.dtd.DefaultEntityResolver;
import nu.validator.htmlparser.dom.HtmlDocumentBuilder;
[...]

// Instantiate DOM implementation (with default settings: no IE hacks accepted)
// and configure it
CSSDOMImplementation impl = new CSSDOMImplementation();
// Alternatively, impl = new CSSDOMImplementation(flags);
// Now load default HTML user agent sheets
impl.setDefaultHTMLUserAgentSheet();
// Prepare builder
HtmlDocumentBuilder builder = new HtmlDocumentBuilder(impl);
// Read the document to parse, and prepare source object
Reader re = ... [reader for HTML document]
InputSource source = new InputSource(re);
// Parse. If the document is not HTML, you want to use DOMDocument instead
HTMLDocument document = (HTMLDocument) builder.parse(source);
re.close();
// Set document URI
document.setDocumentURI("http://www.example.com/mydocument.html");

Then you have a CSS-enabled document. To compute styles, use getComputedStyle:

DOMElement element = document.getElementById("someId");
CSSComputedProperties style = element.getComputedStyle(null);

// Next line could be 'String display = style.getDisplay();'
String display = style.getPropertyValue("display");

// If you use a factory that has been set to setLenientSystemValues(false), next
// line may throw an exception if the 'color' property was not specified.
// The default value for lenientSystemValues is TRUE.
RGBAColor color = ((CSSTypedValue) style.getPropertyCSSValue("color")).getRGBColorValue();

// Suppose that the linked style sheet located at 'css/sheet.css' declares:
// background-image: url('foo.png');

String image_css = style.getPropertyValue("background-image");
String image_uri = ((CSSTypedValue) style.getPropertyCSSValue("background-image")).getStringValue();

// Then, because we already set the document URI to "http://www.example.com/mydocument.html",
// image_css will be set to "url('http://www.example.com/css/foo.png')",
// and image_uri to "http://www.example.com/css/foo.png"
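
The URL resolution described in the comments above follows the usual rules for relative references. The sketch below reproduces it with the JDK's java.net.URI alone; it does not use css4j, and the class and method names are made up for illustration. It assumes that the sheet is linked from the document as 'css/sheet.css', so the url() value is resolved against the sheet's URI rather than the document's.

```java
import java.net.URI;

// Illustrative only: a hypothetical helper, not part of css4j.
final class UrlResolutionSketch {

    // Resolve a (possibly relative) reference against a base URI.
    static String resolve(String base, String reference) {
        return URI.create(base).resolve(reference).toString();
    }

    public static void main(String[] args) {
        String documentUri = "http://www.example.com/mydocument.html";
        // The sheet href is resolved against the document URI
        String sheetUri = resolve(documentUri, "css/sheet.css");
        // url('foo.png') is resolved against the sheet URI
        String imageUri = resolve(sheetUri, "foo.png");
        System.out.println(sheetUri);
        System.out.println(imageUri);
    }
}
```

This is why setting the document URI before computing styles matters: without a base, relative sheet and image references cannot be made absolute.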

The CSSComputedProperties interface extends an interface similar to W3C's CSSStyleDeclaration, adding methods like getComputedFontSize() or getComputedLineHeight().

If you are computing styles for a specific medium, tell the document about it (see "Media Handling"):

document.setTargetMedium("print");

Conformance with the DOM specification

This library's native DOM implementation differs in some minor ways from the behavior described in the DOM Level 3 Core Specification. For example, on elements and attributes the Node.getLocalName() method returns the tag name instead of null when the node was created with a DOM Level 1 method such as Document.createElement(). Read the io.sf.carte.doc.dom package description for additional information.

Usage with the DOM Wrapper

If you choose to build your document with your favorite DOM implementation instead of the CSS4J one or the DOM4J back-end, you can use the DOMCSSStyleSheetFactory.createCSSDocument(document) method to wrap a pre-existing DOM Document. Example:

DOMCSSStyleSheetFactory factory = new DOMCSSStyleSheetFactory();
CSSDocument document = factory.createCSSDocument(otherDOMdocument);

Unlike the native DOM or the DOM4J back-end, the DOM resulting from the DOM wrapper is read-only, although you can change the values of some nodes.

Consistency of different DOM implementations

Beware that the computed styles found by each method (native DOM, DOM4J back-end or DOM wrapper) may not be completely identical, due to differences in the underlying document DOM implementations. Behaviour may vary due to, for example, a pseudo-class like :target being used (DOM4J has no documentURI support so it never matches).

Such cases should be rare, though (no real-world test has shown one so far). If you find a difference in styles computed from different back-ends that you believe to be a bug, please report it.

Configuring the cascade

Depending on your use case, you may need to set the user agent (UA) style sheet. The library supports two different UA sheets, one for STRICT ('standards') mode and another for QUIRKS. To set either of those sheets, first obtain an instance of the factory that you are using:

// Instantiate the new factory or get it from an object that you are already using.
AbstractCSSStyleSheetFactory cssFactory = ...

If you are using the DOM4J classes, you may want to do:

AbstractCSSStyleSheetFactory cssFactory = XHTMLDocumentFactory.getInstance().getStyleSheetFactory();

If you are processing HTML, css4j's default HTML5 UA sheet (based on W3C/WHATWG recommendations) should be appropriate for you:

cssFactory.setDefaultHTMLUserAgentSheet();

But if you want to set your own UA sheet, first obtain a reference to the sheet:

BaseCSSStyleSheet sheet = cssFactory.getUserAgentStyleSheet(CSSDocument.ComplianceMode.STRICT);

This assumes the STRICT mode, i.e. that you use documents with a DOCTYPE. Otherwise, use QUIRKS (or you may want to set both UA sheets).

If the UA sheet already contains rules (it is empty by default), clean it:

sheet.getCssRules().clear();

And now load the new sheet:

Reader reader = ... [reads the UA sheet]
sheet.parseStyleSheet(reader, CSSStyleSheet.COMMENTS_IGNORE);
reader.close();

As the UA sheet's comments are rarely of interest at the OM level, they were ignored during the parse process (notice the COMMENTS_IGNORE flag).

Note: this implementation does not support important style declarations in the UA sheet.

Setting the user style sheet

You can also set a user style sheet (with 'user' origin) via setUserStyleSheet:

Reader reader = ... [reads the user sheet]
cssFactory.setUserStyleSheet(reader);
reader.close();

Both important and normal declarations are supported in the user style sheet.

Media Handling

By default, computed styles only take into account generic styles that are common to all media. If you want to target a more specific medium, you have to use the CSSDocument.setTargetMedium("medium") method. For example, if your document has the following style sheets linked:

<link href="http://www.example.com/css/sheet.css" rel="stylesheet" type="text/css" />
<link href="http://www.example.com/css/sheet_for_print.css" rel="stylesheet" media="print" type="text/css" />

Computed styles will initially take into account only the "sheet.css" style sheet. However, if you execute the following method:

document.setTargetMedium("print");

Then all subsequently computed styles will account for the merged style sheet from "sheet.css" and "sheet_for_print.css".
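
The selection described above can be modeled in a few lines. The sketch below is a simplified illustration that uses only the JDK: the Link class and the method names are hypothetical (not css4j types), each media attribute is treated as a single medium name, and media queries are ignored. Treating "all" as generic is also an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only: these types are hypothetical, not css4j classes.
final class MediaSelectionSketch {

    // A linked sheet with its 'media' attribute (null when absent).
    static final class Link {
        final String href;
        final String media;

        Link(String href, String media) {
            this.href = href;
            this.media = media;
        }
    }

    // The two links from the example above (file names only).
    static List<Link> exampleLinks() {
        return Arrays.asList(
            new Link("sheet.css", null),
            new Link("sheet_for_print.css", "print"));
    }

    // Sheets that apply to the target medium (null: no medium was set).
    static List<String> activeSheets(List<Link> links, String targetMedium) {
        List<String> active = new ArrayList<>();
        for (Link link : links) {
            // Assumption: a missing media attribute or "all" means generic
            boolean generic = link.media == null || "all".equals(link.media);
            if (generic || link.media.equals(targetMedium)) {
                active.add(link.href);
            }
        }
        return active;
    }

    public static void main(String[] args) {
        System.out.println(activeSheets(exampleLinks(), null));
        System.out.println(activeSheets(exampleLinks(), "print"));
    }
}
```

With no target medium only the generic sheet is selected; with "print" both sheets contribute, which mirrors the merged style sheet mentioned above.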

This way of tying a document to a medium is not entirely standard, as the W3C APIs would probably expect a DeviceFactory-related object implementing the ViewCSS interface and referencing the document. However, this approach keeps DOM logic inside DOM objects and reserves the DeviceFactory for media-specific information only.

Style Sheet Sets

The library supports alternative style sheets. Use CSSDocument's methods enableStyleSheetsForSet, getStyleSheetSets, getSelectedStyleSheetSet and setSelectedStyleSheetSet. For example, if you have a document with these linked sheets:

<link href="http://www.example.com/commonsheet.css" rel="stylesheet" type="text/css" />
<link href="http://www.example.com/alter1.css" rel="alternate stylesheet" type="text/css" title="Alter 1" />
<link href="http://www.example.com/alter2.css" rel="alternate stylesheet" type="text/css" title="Alter 2" />
<link href="http://www.example.com/default.css" rel="stylesheet" type="text/css" title="Default" />

Initially, sheets 'alter1.css' and 'alter2.css' will not be used to compute styles. But then you can write code like the following:

String defset = document.getSelectedStyleSheetSet();  // Sets 'defset' to "Default"
document.setSelectedStyleSheetSet("Alter 1");  // Selects the set with title "Alter 1"
document.setSelectedStyleSheetSet("Alter 2");  // Selects the set with title "Alter 2"

Unfortunately, these methods have been removed from the DOM standard, but you can still read their specification in the CSSOM Working Draft of 5 December 2013.
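
To make the selection rules concrete, here is a simplified model written with the JDK only. It is a sketch of the general idea of style sheet sets (untitled sheets always apply, the first titled non-alternate sheet names the default set, titled sheets apply only when their set is selected), not css4j's implementation. The Link class and method names are hypothetical, and edge cases may differ in the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only: these types are hypothetical, not css4j classes.
final class StyleSheetSetSketch {

    static final class Link {
        final String href;
        final String title; // null when the link has no title
        final boolean alternate; // true for rel="alternate stylesheet"

        Link(String href, String title, boolean alternate) {
            this.href = href;
            this.title = title;
            this.alternate = alternate;
        }
    }

    // The four links from the example above (file names only).
    static List<Link> exampleLinks() {
        return Arrays.asList(
            new Link("commonsheet.css", null, false),
            new Link("alter1.css", "Alter 1", true),
            new Link("alter2.css", "Alter 2", true),
            new Link("default.css", "Default", false));
    }

    // The default set is named by the first titled, non-alternate sheet.
    static String preferredSet(List<Link> links) {
        for (Link link : links) {
            if (link.title != null && !link.alternate) {
                return link.title;
            }
        }
        return null;
    }

    // Sheets used to compute styles when the given set is selected.
    static List<String> enabledSheets(List<Link> links, String selectedSet) {
        List<String> enabled = new ArrayList<>();
        for (Link link : links) {
            boolean persistent = link.title == null && !link.alternate;
            boolean inSet = link.title != null && link.title.equals(selectedSet);
            if (persistent || inSet) {
                enabled.add(link.href);
            }
        }
        return enabled;
    }

    public static void main(String[] args) {
        List<Link> links = exampleLinks();
        System.out.println(preferredSet(links));
        System.out.println(enabledSheets(links, "Alter 1"));
    }
}
```

In this model, selecting "Alter 1" replaces 'default.css' with 'alter1.css' while 'commonsheet.css' stays enabled.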

Style Sheet Error Checking

You can check for errors and warnings in the document's sheets using the non-standard getErrorHandler method. Example:

if (document.getStyleSheets().item(0).getErrorHandler().hasSacErrors()) {
        // ... error processing / reporting
}

The merged style sheet obtained from the getStyleSheet method has the merged error/warning state from the document's active sheets.

A typical source of errors is non-compliant IE hacks, such as prefixing property names with an asterisk (you may want to use the proper NSAC flags when creating the factory, see Compatibility with legacy browsers), or charset rules found in the wrong place.

Override Styles

Override styles, which come after the author style sheet in the cascade algorithm, are supported. For example:

element.getOverrideStyle(null).setCssText("padding: 6pt;");

Override styles are defined in the DocumentCSS interface.

Parsing a Style Sheet

Although this library fetches and parses a document's style sheets automatically, you can also parse rules manually from a source stream with the CSSStyleSheet.parseStyleSheet(Reader, short) method:

Reader re = ...
sheet.parseStyleSheet(re, CSSStyleSheet.COMMENTS_AUTO);
re.close();

The new CSS rules found in the source stream are added to those already present, i.e. the sheet is not reset by this method (although the error handler is). When the second argument is COMMENTS_IGNORE, the comments in the source stream are ignored when parsing (see Accessing Style Sheet Comments).

If you want to refill a sheet, you need to clear its rules before parsing:

sheet.getCssRules().clear();

Compatibility with legacy browsers

Today's style sheets often contain non-conformant styles that target specific versions of old web browsers, like Internet Explorer. Several web sites document these hacks.

Although it could be argued whether those hacks should be used or not, the point is that actual style sheets do contain them, so this library supports them.

By default, a factory is configured to use a flagless NSAC parser which would produce an error on any of those non-standard constructs, but a set of compatibility flags can be specified in the constructors for the factory implementations. The different flags are documented in the NSAC javadocs.

The flags available at the time of this guide's update are the following:

  • STARHACK. When set, the parser will handle asterisk-prefixed property names as accepted, normal names.
  • IEVALUES supports values ending with \9 or \0, as well as progid filters and IE expressions.
  • IEPRIO allows values ending with the '!ie' priority hack.
  • IEPRIOCHAR accepts values with an '!important!' priority hack (note the '!' at the end).
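
As a rough illustration of the four hacks listed above, the following sketch maps a single declaration to the name of the flag that would be needed to accept it. It is a crude string-matching heuristic written with the JDK only. It is not how the NSAC parser works, and the class and method names are hypothetical.

```java
import java.util.Locale;

// Illustrative heuristic only: not part of css4j or NSAC.
final class LegacyHackSketch {

    // Returns the name of the flag associated with the hack found in a
    // single "name: value" declaration, or null if no hack is detected.
    static String requiredFlag(String declaration) {
        String decl = declaration.trim();
        if (decl.endsWith(";")) {
            decl = decl.substring(0, decl.length() - 1).trim();
        }
        String lower = decl.toLowerCase(Locale.ROOT);
        if (decl.startsWith("*")) {
            return "STARHACK"; // asterisk-prefixed property name
        }
        if (lower.endsWith("!important!")) {
            return "IEPRIOCHAR"; // note the trailing '!'
        }
        if (lower.endsWith("!ie")) {
            return "IEPRIO";
        }
        if (decl.endsWith("\\9") || decl.endsWith("\\0")
                || lower.contains("progid:") || lower.contains("expression(")) {
            return "IEVALUES";
        }
        return null; // looks like a standard declaration
    }

    public static void main(String[] args) {
        System.out.println(requiredFlag("*width: 890px;"));
        System.out.println(requiredFlag("width: 890px\\9;"));
        System.out.println(requiredFlag("width: 900px;"));
    }
}
```

A check like this can help decide which flags to pass to a factory constructor before parsing a legacy sheet, although the sheet error handler remains the authoritative source of parse errors.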

The object model manages these compatibility values in parallel to standard ones. For example, after parsing this declaration with IEVALUES set:

width: 900px; width: 890px\9;

its serialization would be identical (if the flag was set correctly):

width: 900px; width: 890px\9;

but the declaration's length would be only 1. Computed styles use only the standard values unless no standard value was set. IEPRIO and IEPRIOCHAR work similarly:

width: 890px !ie;
width: 890px !important!;

with the last one being handled as having important priority. Values created by IEPRIOCHAR are never used in computed styles.

In contrast, declarations that include asterisk-prefixed property names (created by STARHACK) always increase the declaration's length. For example, the length of the following declaration would be 2:

width: 900px;
*width: 890px;

If you want to use these flags at the NSAC level (instead of the Object Model), you may want to read the 'Parser Flags' section in the NSAC package description, as well as the documentation for the individual flags in Parser.Flag.

CSS Style Formatting

The serialization of the cssText attribute in rules and style declarations can be customized with an implementation of the StyleFormattingContext interface. You can set your StyleFormattingFactory (which produces your customized formatting context) to the sheet factory with the CSSStyleSheetFactory.setStyleFormattingFactory method, or subclass your base factory and override the createDefaultStyleFormattingFactory method.

Look at the DefaultStyleFormattingContext class for an example of a formatting context implementation.

You can also customize the default serialization of string values with the CSSStyleSheetFactory.setFactoryFlag(byte) method. You can set one of two flags that govern which quotation marks you prefer, or keep the default behaviour:

  • Default: Try to keep the original quotation (single or double quotes), unless the alternative is more efficient.
  • STRING_DOUBLE_QUOTE: Use double quotes unless single quotes are more efficient (when the string contains more double quotes than single).
  • STRING_SINGLE_QUOTE: Use single quotes unless double quotes are more efficient (when the string contains more single quotes than double).
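
The "more efficient" rule in the two flags above can be read as choosing the quote character that needs fewer escapes. The sketch below models that reading with the JDK only. It is an assumption about the rule's intent rather than the library's serializer, the names are hypothetical, and the default mode is not modeled because it depends on the original quotation.

```java
// Illustrative model only: not the css4j serializer.
final class QuoteSketch {

    // Count the occurrences of a character in a string.
    static int count(String s, char c) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == c) {
                n++;
            }
        }
        return n;
    }

    // preferDouble = true models STRING_DOUBLE_QUOTE,
    // preferDouble = false models STRING_SINGLE_QUOTE.
    static String quote(String s, boolean preferDouble) {
        int doubles = count(s, '"');
        int singles = count(s, '\'');
        char q;
        if (preferDouble) {
            // Fall back to single quotes only if that saves escapes
            q = doubles > singles ? '\'' : '"';
        } else {
            q = singles > doubles ? '"' : '\'';
        }
        StringBuilder buf = new StringBuilder(s.length() + 2);
        buf.append(q);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Escape the chosen quote character and backslashes
            if (c == q || c == '\\') {
                buf.append('\\');
            }
            buf.append(c);
        }
        buf.append(q);
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(quote("Times New Roman", true));
        System.out.println(quote("say \"hi\"", true));
        System.out.println(quote("it's", false));
    }
}
```

In this model a string such as say "hi" is written with single quotes even when double quotes are preferred, because that avoids two escapes.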

Accessing Style Sheet Comments

CSS style sheets often have comments, like:

/* This is a preceding comment */
p {color: blue; } /* This is a trailing comment */

(XML-style comments can also be present in a style sheet, but both NSAC and the CSSOM skip them.)

There is no standard CSSOM API for accessing comments in style sheets, but the AbstractCSSRule.getPrecedingComments() and getTrailingComments() methods are provided for that:

List<String> comments = document.getStyleSheets().item(0).getCssRules().item(3).getPrecedingComments();
List<String> tcomments = document.getStyleSheets().item(0).getCssRules().item(3).getTrailingComments();

By default, comments are parsed with the COMMENTS_AUTO mode, which should be appropriate for human-readable sheets like the one shown above. But many sheets are serialized with few or no newline characters. For these cases, COMMENTS_PRECEDING could be used in parseStyleSheet(Reader,short), and all the comments will be considered as belonging to the next rule. With COMMENTS_IGNORE, all comments found while parsing the sheet will be ignored.

The comments preceding a rule will be included in the text returned by the sheet's AbstractCSSStyleSheet.toString() and toStyleString() methods, while other comments (located at places that cannot be easily related to a rule) are lost.

Comments in the default HTML style sheet are not available, as the parser is instructed to ignore them when parsing.

Rendering-oriented interfaces

To help with the determination of 'used' values and the actual rendering, this library provides a few helper interfaces. The most important are:

  • DeviceFactory. It is not a "factory of devices", but instead the object that delivers the relevant abstractions for a requested medium: StyleDatabase and CSSCanvas.
  • StyleDatabase. Provides medium-specific information like available fonts and colors.
  • CSSCanvas. Has knowledge of medium-specific information that depends (or may depend) on a specific viewport, like supported media features. It is linked to a viewport, if there is any. This interface is used to determine the state of active pseudo-classes.
  • Viewport. It represents a viewport defined as per the CSS specifications.

The differences between style databases and canvases can be subtle, and for some media features it could be argued that they belong to one or the other. The basic idea is that style databases should be relatively easy to implement for a given medium, while canvases are probably only going to exist if there is an actual rendering engine implemented.

Java™ Runtime Environment Requirements

The classes in the binary packages have been compiled with a Java SE compiler with 1.8 compiler compliance level, except for the module-info file which targets Java 11.

Packages

Package                             Description
io.sf.carte.doc                     Basic classes and interfaces used by documents.
io.sf.carte.doc.agent               User agent classes.
io.sf.carte.doc.dom                 This package provides an implementation of the Document Object Model (DOM) Level 3 Core Specification that can be used for XML or HTML documents, albeit with a few deviations from the specification.
io.sf.carte.doc.style.css           This package and its subpackages provide an implementation of the CSS Object Model API.
io.sf.carte.doc.style.css.nsac      NSAC: a non-standard revision to W3C's SAC API.
io.sf.carte.doc.style.css.om        Implementation classes for the CSS Object Model API.
io.sf.carte.doc.style.css.parser    Classes related to CSS parsing.
io.sf.carte.doc.style.css.property  Implementations of CSS property values.
io.sf.carte.doc.xml.dtd             DTD-related helper classes for XML parsing.
io.sf.carte.util                    Utility classes.