Why we've turned our attention to the Saxon implementation?
A considerable part (~75%) of project we're working on at present is creating
xslt(s). That's not stylesheets to create page presentations, but rather
project's business logic. To fulfill the project we were in need of xslt 2.0
processor. In the current state of affairs I doubt someone can point to a good
alternative to the Saxon implementation.
The open source nature of the SaxonB project and intrinsic curiosity act like a
hook for such species like ourselves.
We want to say that we're rather sceptical
observers of a code: the code should prove it have merits. Saxon looks
consistent. It takes not too much time to grasp implementation concepts taking
into account that the code routinely follows xpath/xslt/xquery specifications.
These code observation and practice with live xslt tasks helped us to form an
opinion on the Saxon itself. That's why we dare to critique it.
1. Compilation is fused with execution.
An xslt before being executed passes several stages including xpath data model, and a graph of expressions - objects implementing
parts of runtime logic.
Expression graph is optimized to achieve better runtime performace. The
optimization logic is distributed throughout the code, and in particular lives
in expression objects. This means that expression completes two roles: runtime
execution and optimization.
I would prefer to see a smaller and cleaner run time objects (expressions),
and optimization logic separately. On the other hand I can guess why Michael Kay
fused these roles: to ease lazy optimizations (at runtime).
2. Optimizations are xslt 1.0 by origin
This is like a heritage. There are two main techniques: cached
sequences, and global indices of rooted nodes.
This might be enough in xslt 1.0, but in 2.0 where there are diverse set of
types, where sequences extend node sets to other types, where sequences may
logically be grouped by pairs, tripples, and so on, this is not enough.
XPath data model operates with sequences only (in math sense). On the other hand it
defines many set based functions (operators) like: $a intersect $b, $a except $b,
$a = $b, $a != $b. In these examples XPath sequences are better to consider as sets, or maps of items.
Other example: for $i in index-of($names, $name) return $values[$i], where
$names as xs:string*, $values as element()* shows that
a closure of ($names, $values) is in fact a map, and
$names might be implemented as a composition of a sequence and a map of
strings to indices.
There are other use case examples, which lead me to think that Saxon lacks set
based operators. Global indices are poor substitution, which work for rooted
trees only.
Again, I guess why Michael Kay is not implementing these operators: not everyone
loads xslt with stressful tasks requiring these features. I think xslt is mostly
used to render pages, and one rarely deviates from rooted trees.
In spite of the objections we think that Saxon is a good xslt 2.0 implementation,
which unfortunately lacks competitors.