# Friday, 09 May 2008

A project I'm currently working on requires me to manipulate a large number of documents. This includes accessing these documents with the key() function.

I never thought this task would pose any problem until I discovered that Saxon caches documents loaded with the document() function to preserve their identities. The specification of the related doc() function states:

By default, this function is ·stable·. Two calls on this function return the same document node if the same URI Reference (after resolution to an absolute URI Reference) is supplied to both calls. Thus, the following expression (if it does not raise an error) will always be true:

doc("foo.xml") is doc("foo.xml")

However, for performance reasons, implementations may provide a user option to evaluate the function without a guarantee of stability. The manner in which any such option is provided is implementation-defined. If the user has not selected such an option, a call of the function must either return a stable result or must raise an error: [err:FODC0003].

Saxon provides the saxon:discard-document() function to release documents from the cache. A typical use looks like this:

<xsl:variable name="document" as="document-node()"
   select="saxon:discard-document(document(...))"/>
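For context, the fragment above can be placed into a complete stylesheet roughly as follows. This is a minimal sketch, not code from the project: the `uris` parameter, the `main` template and the `report`/`entry` output elements are hypothetical names used only for illustration, and whether saxon:discard-document() is available depends on the Saxon version and edition in use.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:saxon="http://saxon.sf.net/"
  exclude-result-prefixes="xs saxon">

  <!-- Hypothetical parameter: URIs of the documents to process. -->
  <xsl:param name="uris" as="xs:string*" select="('a.xml', 'b.xml')"/>

  <xsl:template name="main">
    <report>
      <xsl:for-each select="$uris">
        <!--
          Load the document and ask Saxon to drop it from its cache
          right away. The function returns the document it was given.
        -->
        <xsl:variable name="document" as="document-node()"
          select="saxon:discard-document(document(.))"/>

        <!-- Do some work with the document while it is still referenced. -->
        <entry uri="{.}" elements="{count($document//*)}"/>
      </xsl:for-each>
    </report>
  </xsl:template>

</xsl:stylesheet>
```

Note that the discard happens at the single place where the document is loaded, which is exactly the coupling discussed next.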

As you can see, saxon:discard-document() is tied to the place where the document is loaded. In my case this is inefficient, because my code repeatedly accesses the same documents from different places. To release the loaded documents, I need to collect them and discard them after the main processing is done.

Another issue in Saxon is that the processor may keep references to documents through xsl:key, so saxon:discard-document() provides no guarantee that the documents will be garbage collected.
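The kind of access in question is a key() call whose third argument is an externally loaded document. A minimal sketch, with hypothetical names throughout (the `item-by-id` key, the `item` element with its `id` attribute, the `t:find-item` function and the `urn:example:t` namespace are all illustrative assumptions):

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:t="urn:example:t"
  exclude-result-prefixes="xs t">

  <!-- Hypothetical key: index item elements by their id attribute. -->
  <xsl:key name="item-by-id" match="item" use="@id"/>

  <!--
    Looks up items by id in the document at $uri.
    The three-argument form of key() searches the subtree rooted at
    the node passed as the third argument, here the loaded document.
  -->
  <xsl:function name="t:find-item" as="element(item)*">
    <xsl:param name="uri" as="xs:string"/>
    <xsl:param name="id" as="xs:string"/>

    <xsl:sequence select="key('item-by-id', $id, document($uri))"/>
  </xsl:function>

</xsl:stylesheet>
```

Every call like this from a different place in the code touches the same document, so there is no single natural point at which to discard it.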

To deal with this, I've designed a (Saxon-specific) API to manage document pools:

t:begin-document-pool-scope() as item()
  Begins a document pool scope.
    Returns the scope id.

t:end-document-pool-scope(scope as item())
  Terminates a document pool scope.
    $scope - the scope id.

t:put-document-in-pool(document as document-node()) as document-node()
  Puts a document into the current scope of the document pool.
    $document - a document to put into the document pool.
    Returns the same document node.

The use case is:

<xsl:variable name="scope" select="t:begin-document-pool-scope()"/>

<xsl:sequence select="t:assert($scope)"/>

...
<xsl:variable name="document" as="document-node()"
  select="t:put-document-in-pool(...)"/>
...

<xsl:sequence select="t:end-document-pool-scope($scope)"/>
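Put together in one stylesheet, the use case might look like the sketch below. Several things here are assumptions rather than facts about document-pool.xslt: the namespace URI bound to the `t` prefix is a placeholder (use the one the downloaded file declares), t:assert() is assumed to be available alongside the pool functions, and the `uris` parameter, the `main` template and the `entry` element are hypothetical.

```xml
<!-- The namespace URI for "t" below is a placeholder, not the real one. -->
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:t="urn:example:placeholder-for-document-pool-namespace"
  exclude-result-prefixes="xs t">

  <xsl:import href="document-pool.xslt"/>

  <!-- Hypothetical parameter: URIs of the documents to process. -->
  <xsl:param name="uris" as="xs:string*"/>

  <xsl:template name="main">
    <!-- Open a scope; documents put in the pool belong to it. -->
    <xsl:variable name="scope" select="t:begin-document-pool-scope()"/>

    <xsl:sequence select="t:assert($scope)"/>

    <xsl:for-each select="$uris">
      <!-- Register each loaded document in the current scope. -->
      <xsl:variable name="document" as="document-node()"
        select="t:put-document-in-pool(document(.))"/>

      <entry uri="{.}" elements="{count($document//*)}"/>
    </xsl:for-each>

    <!-- Close the scope after the main processing. -->
    <xsl:sequence select="t:end-document-pool-scope($scope)"/>
  </xsl:template>

</xsl:stylesheet>
```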

Download document-pool.xslt to use this API.

Friday, 09 May 2008 06:58:29 UTC | xslt
© 2024, Nesterovsky bros