Recently we've read an article "A Proposal for XSLT 4.0", and thought it worth to suggest one more idea. We have written a message to Michael Kay, author of this proposal. Here it is:
A&V Historically xslt, xquery and xpath were dealing with trees. Nowadays it became much common to process graphs. Many tasks can be formulated in terms of graphs, and in particular any task processing trees is also graph task. I suggest to take a deeper look in this direction. As an inspiration I may suggest to look at "P1709R2: Graph Library" - the C++ proposal. Michael Kay I have for many years found it frustrating that XML is confined to hierarchic relationships (things like IDREF and XLink are clumsy workarounds); also the fact that the arbitrary division of data into "documents" plays such a decisive role: documents should only exist in the serialized representation of the model, not in the model itself. I started my career working with the Codasyl-defined network data model. It's a fine and very flexible data model; its downfall was the (DOM-like) procedural navigation language. So I've often wondered what one could do trying to re-invent the Codasyl model in a more modern idiom, coupling it with an XPath-like declarative access language extended to handle networks (graphs) rather than hierarchies. I've no idea how close a reinventiion of Codasyl would be to some of the modern graph data models; it would be interesting to see. The other interesting aspect of this is whether you can make it work for schema-less data. But I don't think that would be an incremental evolution of XSLT; I think it would be something completely new. A&V I was not so radical in my thoughts. Even C++ API is not so radical, as they do not impose hard requirements on internal graph representation but rather define template API that will work both with third party representations (they even mention Fortran) or several built-in implementations that uses standard vectors. Their strong point is in algorithms provided as part of library and not graph internal structure (I think authors of that paper have structured it not the best way). E.g. in the second part they list graph algorithms: Depth First Search (DFS); Breadth First Search (BFS); Topological Sort (TopoSort); Shortest Paths Algorithms; Dijkstra Algorithms; and so on. If we shall try to map it to xpath world them graph on API level might be represented as a user function or as a map of user functions. On a storage level user may implement graph using a sequence of maps or map of maps, or even using xdm elements. So, my approach is evolutional. In fact I suggest pure API that could even be implemented now. Michael Kay Yes, there's certainly scope for graph-oriented functions such as closure($origin, $function) and is-reachable($origin, $function) and find-path($origin, $destination, $function) where we use the existing data model, treating any item as a node in a graph, and representing the arcs using functions. There are a few complications, e.g. what's the identity comparison between arbitrary items, but it can probably be done. A&V > There are a few complications, e.g. what's the identity comparison between arbitrary items, but it can probably be done. One approach to address this is through definition of graph API. E.g. to define graph as a map (interface analogy) of functions, with equality functions, if required: map { vertices: function(), edges: function(), value: function(vertex), in-vertex: function(edge), out-vertex: function(edge), edges: function(vertex), is-in-vertex: function(edge, vertex), is-out-vertex: function(edge, vertex) ... }
Historically xslt, xquery and xpath were dealing with trees. Nowadays it became much common to process graphs. Many tasks can be formulated in terms of graphs, and in particular any task processing trees is also graph task.
I suggest to take a deeper look in this direction.
As an inspiration I may suggest to look at "P1709R2: Graph Library" - the C++ proposal.
I have for many years found it frustrating that XML is confined to hierarchic relationships (things like IDREF and XLink are clumsy workarounds); also the fact that the arbitrary division of data into "documents" plays such a decisive role: documents should only exist in the serialized representation of the model, not in the model itself.
I started my career working with the Codasyl-defined network data model. It's a fine and very flexible data model; its downfall was the (DOM-like) procedural navigation language. So I've often wondered what one could do trying to re-invent the Codasyl model in a more modern idiom, coupling it with an XPath-like declarative access language extended to handle networks (graphs) rather than hierarchies.
I've no idea how close a reinventiion of Codasyl would be to some of the modern graph data models; it would be interesting to see. The other interesting aspect of this is whether you can make it work for schema-less data.
But I don't think that would be an incremental evolution of XSLT; I think it would be something completely new.
I was not so radical in my thoughts.
Even C++ API is not so radical, as they do not impose hard requirements on internal graph representation but rather define template API that will work both with third party representations (they even mention Fortran) or several built-in implementations that uses standard vectors.
Their strong point is in algorithms provided as part of library and not graph internal structure (I think authors of that paper have structured it not the best way). E.g. in the second part they list graph algorithms: Depth First Search (DFS); Breadth First Search (BFS); Topological Sort (TopoSort); Shortest Paths Algorithms; Dijkstra Algorithms; and so on.
If we shall try to map it to xpath world them graph on API level might be represented as a user function or as a map of user functions.
On a storage level user may implement graph using a sequence of maps or map of maps, or even using xdm elements.
So, my approach is evolutional. In fact I suggest pure API that could even be implemented now.
Yes, there's certainly scope for graph-oriented functions such as closure($origin, $function) and is-reachable($origin, $function) and find-path($origin, $destination, $function) where we use the existing data model, treating any item as a node in a graph, and representing the arcs using functions. There are a few complications, e.g. what's the identity comparison between arbitrary items, but it can probably be done.
closure($origin, $function)
is-reachable($origin, $function)
find-path($origin, $destination, $function)
> There are a few complications, e.g. what's the identity comparison between arbitrary items, but it can probably be done.
One approach to address this is through definition of graph API. E.g. to define graph as a map (interface analogy) of functions, with equality functions, if required:
map { vertices: function(), edges: function(), value: function(vertex), in-vertex: function(edge), out-vertex: function(edge), edges: function(vertex), is-in-vertex: function(edge, vertex), is-out-vertex: function(edge, vertex) ... }
Not sure how far this will go but who knows.