Saxon xslt 2 implementation - Nesterovsky bros

Tuesday, 04 November 2008

Why have we turned our attention to the Saxon implementation?

A considerable part (~75%) of the project we're working on at present is creating xslt(s). These are not stylesheets that create page presentations, but rather the project's business logic. To carry out the project we needed an xslt 2.0 processor. In the current state of affairs I doubt anyone can point to a good alternative to the Saxon implementation.

The open source nature of the SaxonB project and intrinsic curiosity act like a hook for a species like ourselves.

We want to say that we're rather sceptical observers of code: the code should prove it has merits. Saxon looks consistent. It doesn't take much time to grasp the implementation concepts, given that the code routinely follows the xpath/xslt/xquery specifications. Observing the code and practising with live xslt tasks helped us form an opinion of Saxon itself. That's why we dare to critique it.

1. Compilation is fused with execution.

Before being executed, an xslt passes through several stages, including an xpath data model and a graph of expressions: objects implementing parts of the runtime logic.

The expression graph is optimized to achieve better runtime performance. The optimization logic is distributed throughout the code, and in particular lives in the expression objects. This means that an expression plays two roles: runtime execution and optimization.

I would prefer to see smaller and cleaner runtime objects (expressions), with the optimization logic kept separately. On the other hand, I can guess why Michael Kay fused these roles: to ease lazy optimizations (at runtime).
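To illustrate the kind of separation meant here, below is a minimal sketch in Python (not Saxon's Java, and not Saxon's actual design). The names Literal, Variable, Add, optimize and evaluate are hypothetical: expression objects are plain data, one function rewrites the tree, and another one only executes it.

```python
from dataclasses import dataclass
from typing import Union


# Expression objects carry data only; they know nothing about optimization.
@dataclass(frozen=True)
class Literal:
    value: int


@dataclass(frozen=True)
class Variable:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Literal, Variable, Add]


def optimize(expr):
    """Separate rewrite pass: fold additions whose operands are both literals."""
    if isinstance(expr, Add):
        left, right = optimize(expr.left), optimize(expr.right)
        if isinstance(left, Literal) and isinstance(right, Literal):
            return Literal(left.value + right.value)
        return Add(left, right)
    return expr


def evaluate(expr, env):
    """Runtime execution: walks the tree and performs no rewriting."""
    if isinstance(expr, Literal):
        return expr.value
    if isinstance(expr, Variable):
        return env[expr.name]
    return evaluate(expr.left, env) + evaluate(expr.right, env)


# Hypothetical expression: $x + (2 + 3)
tree = Add(Variable("x"), Add(Literal(2), Literal(3)))
optimized = optimize(tree)
```

The price of this layout is the one guessed at above: a lazy, runtime-triggered optimization needs some way to swap a rewritten subtree back into a running expression, which is easier when the expression optimizes itself.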

2. Optimizations are xslt 1.0 by origin

This is a kind of heritage. There are two main techniques: cached sequences, and global indices of rooted nodes.

This might be enough in xslt 1.0, but not in 2.0, where there is a diverse set of types, where sequences extend node sets to other types, and where sequences may logically be grouped into pairs, triples, and so on.

The XPath data model operates with sequences only (in the mathematical sense). On the other hand, it defines many set-based functions (operators), such as $a intersect $b, $a except $b, $a = $b, $a != $b. In these examples XPath sequences are better considered as sets, or maps of items.
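A rough sketch of the difference, in Python and under simplifying assumptions: items are hashable values of one type (so atomization and type promotion are ignored), and nodes are stood in for by integers that give their document order. The function names are hypothetical and this is not how Saxon implements these operators.

```python
def general_eq_nested(a, b):
    """Existential '=' over two sequences, comparing every pair."""
    return any(x == y for x in a for y in b)


def general_eq_hashed(a, b):
    """Same result, but one operand is treated as a set of items."""
    members = set(b)
    return any(x in members for x in a)


def intersect(a, b, order):
    """Items found in both sequences, de-duplicated, sorted by `order`."""
    members = set(b)
    return sorted({x for x in a if x in members}, key=order)


def except_(a, b, order):
    """Items of `a` not found in `b`, de-duplicated, sorted by `order`."""
    members = set(b)
    return sorted({x for x in a if x not in members}, key=order)


# Hypothetical data: integers play the role of nodes in document order.
a = [5, 1, 3, 1]
b = [3, 5, 7]
common = intersect(a, b, order=lambda n: n)
rest = except_(a, b, order=lambda n: n)
```

The nested version costs on the order of len(a) * len(b) comparisons, while the hashed versions cost roughly len(a) + len(b) plus the final sort, which is where a set-based representation pays off on large sequences.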

Another example: for $i in index-of($names, $name) return $values[$i], where $names as xs:string* and $values as element()*, shows that the closure of ($names, $values) is in fact a map, and $names might be implemented as a composition of a sequence and a map of strings to indices.
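The idea of a sequence paired with a map of strings to indices can be sketched as follows. This is an illustration in Python with hypothetical names (build_index, lookup), not a description of anything in Saxon; strings stand in for the elements of $values.

```python
from collections import defaultdict


def build_index(names):
    """Map each name to the 1-based positions at which it occurs."""
    index = defaultdict(list)
    for position, name in enumerate(names, start=1):  # XPath positions are 1-based
        index[name].append(position)
    return index


def lookup(index, values, name):
    """Equivalent of: for $i in index-of($names, $name) return $values[$i]."""
    return [values[i - 1] for i in index.get(name, [])]


# Hypothetical data standing in for $names and $values.
names = ["a", "b", "a"]
values = ["v1", "v2", "v3"]
index = build_index(names)
matches = lookup(index, values, "a")
```

The index is built once in a single pass over the names; after that each lookup avoids rescanning the whole sequence, which is what a plain index-of call has to do every time.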

There are other use cases that lead me to think that Saxon lacks set-based operators. Global indices are a poor substitute, and work for rooted trees only.

Again, I can guess why Michael Kay is not implementing these operators: not everyone loads xslt with stressful tasks that require these features. I think xslt is mostly used to render pages, and one rarely deviates from rooted trees.

In spite of these objections, we think that Saxon is a good xslt 2.0 implementation, which unfortunately lacks competitors.

Tuesday, 04 November 2008 11:30:36 UTC
