# Wednesday, February 17, 2010

The very same simple tasks tend to appear in different languages (e.g. C# Haiku). Now we have to find:

  • integer and fractional part of a decimal;
  • length and precision of a decimal.

These tasks have no trivial solutions in xslt 2.0.

At present we have come up with the following answers:

Fractional part:

<xsl:function name="t:fraction" as="xs:decimal">
  <xsl:param name="value" as="xs:decimal"/>

  <xsl:sequence select="$value mod 1"/>
</xsl:function>

Integer part v1:

<xsl:function name="t:integer" as="xs:decimal">
  <xsl:param name="value" as="xs:decimal"/>

  <xsl:sequence select="$value - t:fraction($value)"/>
</xsl:function>

Integer part v2:

<xsl:function name="t:integer" as="xs:decimal">
  <xsl:param name="value" as="xs:decimal"/>

  <xsl:sequence select="
    if ($value ge 0) then
      floor($value)
    else
      -floor(-$value)"/>
</xsl:function>

Length and precision:

<!--
  Gets a decimal specification as a closure:
    ($length as xs:integer, $precision as xs:integer).
-->
<xsl:function name="t:decimal-spec" as="xs:integer+">
  <xsl:param name="value" as="xs:decimal"/>

  <xsl:variable name="text" as="xs:string" select="
    if ($value lt 0) then
      xs:string(-$value)
    else
      xs:string($value)"/>

  <xsl:variable name="length" as="xs:integer"
    select="string-length($text)"/>
  <xsl:variable name="integer-length" as="xs:integer"
    select="string-length(substring-before($text, '.'))"/>
 
  <xsl:sequence select="
    if ($integer-length) then
      ($length - 1, $length - $integer-length - 1)
    else
      ($length, 0)"/>
</xsl:function>

The last function looks odious. In many other languages such an implementation would be considered embarrassing.
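For comparison, here is a minimal sketch of the same four tasks in java, assuming the value arrives as a BigDecimal. The class and method names (DecimalParts, integerPart, fraction, decimalSpec) are ours, made up for illustration. The digit counting is meant to mirror t:decimal-spec, which counts a leading "0" before the decimal point and ignores trailing zeros.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class DecimalParts {
  // Integer part: drop the fraction, rounding toward zero (like "Integer part v2").
  static BigDecimal integerPart(BigDecimal value) {
    return value.setScale(0, RoundingMode.DOWN);
  }

  // Fractional part keeps the sign of the value, like "$value mod 1".
  static BigDecimal fraction(BigDecimal value) {
    return value.subtract(integerPart(value));
  }

  // Returns { length, precision }, counting digits the way t:decimal-spec does.
  static int[] decimalSpec(BigDecimal value) {
    BigDecimal v = value.stripTrailingZeros();

    if (v.scale() < 0) {
      // stripTrailingZeros() turns 100 into 1E+2; bring it back to scale 0.
      v = v.setScale(0);
    }

    int fractionDigits = v.scale();
    // A value like 0.5 has no integer digits in BigDecimal terms,
    // but its text form "0.5" has one, so count at least one.
    int integerDigits = Math.max(v.precision() - fractionDigits, 1);

    return new int[] { integerDigits + fractionDigits, fractionDigits };
  }
}
```

With these assumptions, decimalSpec for -12.345 gives a length of 5 and a precision of 3, the same pair the xslt function returns.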

Wednesday, February 17, 2010 7:29:55 AM UTC  #    Comments [0] -
Tips and tricks | xslt
# Sunday, February 07, 2010

Given:

public class N
{
  public readonly N next;
}

What needs to be done to construct a ring of N: n1 refers to n2, n2 to n3, ... nk to n1? Is it possible?

Sunday, February 07, 2010 7:57:08 AM UTC  #    Comments [0] -
Thinking aloud | Tips and tricks
# Saturday, February 06, 2010

To end with immutable trees, at least for now, we've implemented IDictionary<K, V>. It's named Map<K, V>. Functionally it looks very much like SortedDictionary<K, V>. There are some differences, however:

  • Map, in contrast to SortedDictionary, is very cheap to copy.
  • Because Map is based on an AVL tree, which is more rigidly balanced than an RB tree (worst-case height of about 1.44·log2(N) versus 2·log2(N)), in theory it's a little bit faster for lookup than SortedDictionary, and a little bit slower on modification.
  • Due to the storage structure (node + navigator), Map consumes less memory than SortedDictionary, and is probably cheaper for GC (simple garbage graphs).
  • As the AVL tree stores left and right subtree sizes, in contrast to a "color" in an RB tree, we are able to index data in two ways: by integer index, and by key value.
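To illustrate the last point, here is a minimal java sketch of index access through subtree sizes. It is not the Map source: SizedNode and keyAt are hypothetical names, balancing is left out, and each node keeps one total size instead of separate left and right sizes.

```java
final class SizedNode {
  final SizedNode left;
  final SizedNode right;
  final String key;
  final int size; // number of nodes in this subtree, this node included

  SizedNode(SizedNode left, String key, SizedNode right) {
    this.left = left;
    this.key = key;
    this.right = right;
    this.size = size(left) + 1 + size(right);
  }

  static int size(SizedNode node) {
    return node == null ? 0 : node.size;
  }

  // Returns the key at the given position in sorted order, in O(height) steps.
  static String keyAt(SizedNode root, int index) {
    SizedNode node = root;

    while (node != null) {
      int leftSize = size(node.left);

      if (index < leftSize) {
        node = node.left;
      } else if (index == leftSize) {
        return node.key;
      } else {
        // Skip the left subtree and the current node.
        index -= leftSize + 1;
        node = node.right;
      }
    }

    throw new IndexOutOfBoundsException("index is outside of the tree");
  }
}
```

Lookup by key walks the same nodes comparing keys, so one structure serves both kinds of access.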

Sources are:

Update:

It was impossible to withstand the temptation to run some primitive performance comparison. Map outperforms SortedDictionary both in population and in access. This does not agree with pure algorithm theory, but there might be other unaccounted factors: memory consumption, quality of implementation, and so on.

Program.cs is updated with measurements.

Update 2:

More accurate tests show that for some key types Map is faster, for others SortedDictionary is faster. Usually Map is slower during population (a mutable AVL tree navigator may fix this). The odd thing is that Map<string, int> is faster than SortedDictionary<string, int> both for allocation and for access. See excel report.

Update 3:

Interesting observation. The following table shows maximal and average tree heights for different tree sizes (node counts) in AVL and RB trees after a random population:

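| Size | AVL max | AVL avg | RB max | RB avg |
|---|---|---|---|---|
| 10 | 4 | 2.90 | 5 | 3.00 |
| 50 | 7 | 4.94 | 8 | 4.94 |
| 100 | 8 | 5.84 | 9 | 5.86 |
| 500 | 11 | 8.14 | 14 | 8.39 |
| 1000 | 12 | 9.14 | 16 | 9.38 |
| 5000 | 15 | 11.51 | 18 | 11.47 |
| 10000 | 16 | 12.53 | 20 | 12.47 |
| 50000 | 19 | 14.89 | 23 | 14.72 |
| 100000 | 20 | 15.90 | 25 | 15.72 |
| 500000 | 25 | 18.26 | 28 | 18.27 |
| 1000000 | 25 | 19.28 | 30 | 19.27 |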

Here, in accordance with theory, the maximal height of the AVL tree is smaller than that of the RB tree. But what is most interesting is the depth of an "average node". This value describes the number of steps required to find a random key. In this regard the RB tree is very close to AVL, and is often better.

Saturday, February 06, 2010 6:31:13 PM UTC  #    Comments [0] -
Thinking aloud | Tips and tricks
# Wednesday, February 03, 2010

It was obvious as hell from day one of generics that obscure long names would appear once you start to parametrize your types. It was the easiest thing in the world to take care of this in advance. Alas, C# inherits C++'s bad practices.

Read Associative containers in a functional languages and Program.cs to see what we're talking about.

Briefly, there is a pair (string, int), which in C# should be declared as:

System.Collections.Generic.KeyValuePair<string, int>

Obviously we would like to write it in a short way. These are our attempts, which fail:

1. Introduce generic alias Pair<K, V>:

using System.Collections.Generic;
using Pair<K, V> = KeyValuePair<K, V>;

2. Introduce type alias for a generic type with specific types.

using System.Collections.Generic;
using Pair = KeyValuePair<string, int>;

And this is the only one that works:

using Pair = System.Collections.Generic.KeyValuePair<string, int>;

Do you think it is bearable? Well, consider the following:

  • There is a generic type ValueNode<T>, where T should be Pair.
  • There is a generic type TreeNavigator<N>, where N should be ValueNode<Pair>.

The declaration looks like this:

using Pair = System.Collections.Generic.KeyValuePair<string, int>;
using Node = NesterovskyBros.Collections.AVL.ValueNode<
  System.Collections.Generic.KeyValuePair<string, int>>;
using Navigator = NesterovskyBros.Collections.AVL.TreeNavigator<
  NesterovskyBros.Collections.AVL.ValueNode<
    System.Collections.Generic.KeyValuePair<string, int>>>;

Do you still think it is acceptable?

P.S. Legacy thinking led the designers of C# and java to use the word "new" for object construction. It is not required at all. Consider new Pair("A", 1) vs Pair("A", 1). C++ prefers the second form. C# and java always use the first one.

Wednesday, February 03, 2010 11:59:19 AM UTC  #    Comments [0] -
Thinking aloud | Tips and tricks
# Wednesday, January 27, 2010

Continuing with the post "Ongoing xslt/xquery spec update" we would like to articulate which options for associative containers we have in functional languages (e.g. xslt, xquery), assuming that variables are immutable and the implementation is efficient (in some sense).

There are three common implementation techniques:

  • store data (key, value pairs) in a sorted array, and use binary search to access values by key;
  • store data in a hash map;
  • store data in a binary tree (usually RB or AVL trees).

The choice of implementation depends considerably on the operations performed over the container. Usually these are:

  1. construction;
  2. value lookup by key;
  3. key enumeration (ordered or not);
  4. container modification (adding data to and removing data from the container);
  5. access elements by index;

Note that modification in functional programming means the creation of a new container, so here is a division:

  1. If the container's use pattern does not include modification, then probably the simplest solution is to build it as an ordered sequence of pairs, and use binary search to access the data. Alternatively, one could implement the associative container as a hash map.
  2. If modification is essential, then neither an ordered sequence of pairs, a hash map, nor a classical tree implementation can be used, as they are either too slow or too memory hungry, whether during modification or during access.
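The first case can be sketched in java as follows. This is only an illustration under our own assumptions: keys are strings, values are ints, and SortedArrayMap is a made-up name. The container is built once and never modified afterwards.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

final class SortedArrayMap {
  private final String[] keys; // sorted in natural order
  private final int[] values;  // values[i] belongs to keys[i]

  SortedArrayMap(Map<String, Integer> source) {
    // TreeMap is used here only to sort the entries once, during construction.
    TreeMap<String, Integer> sorted = new TreeMap<>(source);

    keys = sorted.keySet().toArray(new String[0]);
    values = new int[keys.length];

    int index = 0;

    for (int value : sorted.values()) {
      values[index++] = value;
    }
  }

  // Binary search over the sorted keys; returns null when the key is absent.
  Integer get(String key) {
    int index = Arrays.binarySearch(keys, key);

    return index >= 0 ? Integer.valueOf(values[index]) : null;
  }
}
```

Lookup takes a logarithmic number of comparisons, and the arrays also give ordered key enumeration and access by index for free.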

On the other hand to deal with container's modifications one can build an implementation, which uses "top-down" RB or AVL trees. To see the difference consider a classical tree structure and its functional variant:

| | Classical | Functional |
|---|---|---|
| Node structure: | node: parent, left, right, other data | node: left, right, other data |
| Node reference: | node itself | node path from a root of a tree |
| Modification: | either mutable or requires a completely new tree | O(LnN) nodes are created |
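The functional variant can be sketched in java like this. It is a deliberately simplified illustration, not our AVL implementation: PNode is a hypothetical name, keys are plain ints, and rebalancing is omitted, so only the path copying is shown.

```java
final class PNode {
  final PNode left;
  final PNode right;
  final int key;

  PNode(PNode left, int key, PNode right) {
    this.left = left;
    this.key = key;
    this.right = right;
  }

  // Returns the root of a new tree. Only nodes on the path from the root
  // to the insertion point are created; all other nodes are shared.
  static PNode insert(PNode root, int key) {
    if (root == null) {
      return new PNode(null, key, null);
    }

    if (key < root.key) {
      return new PNode(insert(root.left, key), root.key, root.right);
    }

    if (key > root.key) {
      return new PNode(root.left, root.key, insert(root.right, key));
    }

    return root; // the key is already there, nothing to copy
  }

  static boolean contains(PNode root, int key) {
    while (root != null && root.key != key) {
      root = key < root.key ? root.left : root.right;
    }

    return root != null;
  }
}
```

Since a node has no parent reference, the old root still describes the old tree after an insert, which is exactly what an immutable variable needs.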

Here we observe that:

  1. one can implement efficient map (lookup time no worse than O(LnN)) with no modification support, using ordered array;
  2. one can implement efficient map with support of modification, using immutable binary tree;
  3. one can implement all these algorithms purely in xslt and xquery (provided that inline functions are supported);
  4. any such implementation will lose against the same implementation written in C++, C#, java;
  5. the best implementation would probably start from sorted array and will switch to binary tree after some size threshold.

Here we provide a C# implementation of a functional AVL tree, which also supports element indexing:

Our intention was to show that the usual algorithms for associative containers apply in functional programming; thus a feature complete functional language must support associative containers to make development more conscious, and to free a developer from reinventing basic things that have existed for almost half a century.

Wednesday, January 27, 2010 7:00:55 AM UTC  #    Comments [0] -
Thinking aloud | Tips and tricks | xslt
# Friday, December 11, 2009

A client asked us to produce Excel reports in an ASP.NET application. They've given us Excel templates, and also defined what they want to show.

What are our options?

  • Work with Office COM API;
  • Use Office Open XML SDK (which is a set of pure .NET API);
  • Try to apply xslt somehow;
  • Macro, other?

For us, biased towards xslt, it's hard to make a fair choice. To judge, we've tried to formalize the client's request and to look into future support.

So, we have defined sql stored procedures to provide the data. This way data can be represented either as an ADO.NET DataSet, a set of classes, as xml, or in another reasonable format. We do not foresee any considerable problem with data representation if the client decides to modify reports in the future.

It's not so easy when we think about Excel generation.

Out of ignorance we thought that Excel is much like xslt in some regard, and that it's possible to provide tabular data in some form and create an Excel template, which will consume the data to form the final output. To some extent it's possible, indeed, but you have to start creating macros or vb scripts to achieve acceptable results.

When we mentioned macros to the client, they immediately stated that such a solution won't work for security reasons.

Comparing the COM API and the Open XML SDK, we can see that both provide almost the same level of service for us, except that the latter is much lighter and supports only the Open XML format, while the former is a heavy API exposing MS Office and supports earlier versions as well.

Both solutions have a considerable drawback: it's not easy to create an Excel report in C#, and it will be a pain to support such a solution if the client asks, say in half a year, to modify something in the Excel template or to create one more report.

Thus we've turned to xslt. There we've found two more directions:

  • generate data for Office Open XML;
  • generate xml in format of MS Office 2003.

It turned out that generating data for Open XML is a rather nontrivial task, and not because of the format, which is not a single xml at all but a zipped folder containing xml files. The problem is in the complex schemas and in the many complex relations between the files constituting an Open XML document. In contrast, the MS Office 2003 format allows us to create a single xml file for the spreadsheet.

Choosing between the standard, up to date format and the older proprietary one, the latter looks more attractive for development and support.

At present we're inclined to use xslt and to generate files in the MS Office 2003 format. Are there better options?

Friday, December 11, 2009 9:28:32 AM UTC  #    Comments [4] -
Tips and tricks | xslt
# Saturday, December 05, 2009

Did you ever hear that double numbers may cause roundings, and that many financial institutions are very sensitive to those roundings?

Sure you did! We're also aware of this kind of problem, and we thought we'd taken care of it. But things are not that simple, as you don't always know what impact the problem can have.

To understand the context it's enough to say that we're converting (using xslt by the way) programs written in a CASE tool called Cool:GEN into java and into C#. Originally, Cool:GEN generated COBOL and C programs as deliverables. Formally, clients compare COBOL results vs java or C# results, and they want them to be as close as possible.

For one particular client it was crucial to have correct results during manipulation with numbers with 20-25 digits in total, and with 10 digits after a decimal point.

Clients are definitely right, and we've introduced generation options to control how to represent numbers in java and C# worlds; either as double or BigDecimal (in java), and decimal (in C#).

That was our first implementation. Reasonable and clean. Was it enough? - Not at all!

The client reported that java's results (they use java and BigDecimal for every number with a decimal point) are too precise compared to Mainframe's (MF) COBOL. This rather unusual complaint puzzled us a little, but the client confirmed that they want results no more precise than those MF produces.

The reason for the difference was that both C# and especially java may store many more decimal digits than is defined for the particular result on MF. There, whenever you define a field storing 5 digits after the decimal point, you're sure that exactly 5 digits will be stored. This contrasts very much with the results we had in java and C#, as both multiplication and division can produce many more digits after the decimal point. The solution was to truncate(!) (not to round) the numbers to the specific precision in property setters.

So, has it resolved the problem? - No, still not!

The client reported that the results are now much better (they coincide with MF, in fact), but there are still several instances where they observe differences in the 9th and 10th digits after the decimal point, and again java's results are more accurate.

No astonishment from us this time, just an analysis of the reason for the difference. It turned out that the previous solution was partial. We were doing a final truncation, but there were still intermediate results, like in a/(b * c) or in a * (b/c).

For the intermediate results MF's COBOL has its own, rather nontrivial, formulas (and options) per operation, defining the number of digits to keep after the decimal point. After we added similar options to the generator, several truncations appeared in the generated code to adjust intermediate results. This way we've reached the same accuracy as MF has.
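The effect is easy to reproduce with BigDecimal. The sketch below is not the generated code: CobolLikeMath and its methods are hypothetical names, the 20 digit working scale is an arbitrary assumption, and the intermediate result is simply truncated to the target scale, whereas COBOL's real formulas are more involved.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class CobolLikeMath {
  // Truncates (does not round) to the given number of digits after the point.
  static BigDecimal truncate(BigDecimal value, int scale) {
    return value.setScale(scale, RoundingMode.DOWN);
  }

  // a * (b / c), where only the final result is truncated.
  static BigDecimal finalOnly(BigDecimal a, BigDecimal b, BigDecimal c, int scale) {
    // Assumption: 20 digits stand in for "as many digits as java keeps".
    BigDecimal quotient = b.divide(c, 20, RoundingMode.DOWN);

    return truncate(a.multiply(quotient), scale);
  }

  // a * (b / c), where the intermediate quotient is truncated as well.
  static BigDecimal everyStep(BigDecimal a, BigDecimal b, BigDecimal c, int scale) {
    BigDecimal quotient = b.divide(c, scale, RoundingMode.DOWN);

    return truncate(a.multiply(quotient), scale);
  }
}
```

For a = 300, b = 1, c = 3 and two digits after the point, finalOnly gives 99.99 while everyStep gives 99.00: the first is more accurate, the second is what a system that truncates every intermediate value would produce.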

What have we learned (reiterated)?

  • A simple problem may have far reaching impact.
  • More precise is not always better. A client often prefers compatible rather than more accurate results.
Saturday, December 05, 2009 1:17:42 PM UTC  #    Comments [0] -
Tips and tricks | xslt
# Friday, November 13, 2009

For some reason C# lacks a decimal truncation function limiting the result to a specified number of digits after the decimal point. We don't know the reasoning behind this, but it stimulates thought. The Internet is full of workarounds. A typical answer is like this:

Math.Truncate(2.22977777 * 1000) / 1000; // Returns 2.229

So, we also want to provide our solution to this problem.

public static decimal Truncate(decimal value, byte decimals)
{
  decimal result = decimal.Round(value, decimals);
  int c = decimal.Compare(value, result);
  bool negative = decimal.Compare(value, 0) < 0;

  if (negative ? c <= 0 : c >= 0)
  {
    return result;
  }

  return result - new decimal(1, 0, 0, negative, decimals);
}

Definitely, if the function were implemented by the framework it would be much more efficient. We assume, however, that the above is the best implementation that can be done externally.

Friday, November 13, 2009 2:31:26 PM UTC  #    Comments [0] -
Tips and tricks
# Tuesday, November 03, 2009

A natural curiosity led us to the implementation of connection pooling in Apache Tomcat (org.apache.commons.dbcp).

And what are the results, you ask?

Uneasiness... Uneasiness for all those who use it. Uneasiness due to the difference between our expectations and real implementation.

Briefly, the design is as follows:

  • wrap every jdbc object;
  • cache prepared statements wrappers;
  • lookup prepared statement wrappers in the cache before asking original driver;
  • upon close return wrappers into the cache.

It took us a couple of minutes to see that this is a very problematic design, as it does not address a double close of statements properly (jdbc states that it is safe to call close() on an already closed jdbc object). With Apache's design it is safe only if you never touch the object after the close() call, as it has been returned to the pool and has possibly already been given to another client who requested it.

The correct design would be:

  • wrap every jdbc object;
  • cache original prepared statements;
  • lookup original prepared statement in the cache before asking original driver, and return wrappers;
  • detach wrapper upon close from original object, and put original object into the cache.
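The design above can be sketched in java as follows. To keep it short we use a tiny hypothetical Stmt interface instead of the real java.sql.PreparedStatement, and StmtPool and its members are made-up names, so treat this as an outline of the idea rather than a working jdbc pool.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical stand-in for a jdbc statement.
interface Stmt extends AutoCloseable {
  String execute();

  @Override
  void close();
}

final class StmtPool {
  private final Deque<Stmt> cache = new ArrayDeque<>(); // original statements only
  private final Supplier<Stmt> driver; // stands in for the original driver

  StmtPool(Supplier<Stmt> driver) {
    this.driver = driver;
  }

  // Looks up an original statement in the cache, and hands out a fresh wrapper.
  Stmt borrow() {
    Stmt original = cache.isEmpty() ? driver.get() : cache.pop();

    return new Wrapper(original);
  }

  int cached() {
    return cache.size();
  }

  private final class Wrapper implements Stmt {
    private Stmt target;

    Wrapper(Stmt target) {
      this.target = target;
    }

    public String execute() {
      if (target == null) {
        throw new IllegalStateException("statement is closed");
      }

      return target.execute();
    }

    // Detaches the wrapper and caches the original; a repeated close() is a no-op.
    public void close() {
      if (target != null) {
        cache.push(target);
        target = null;
      }
    }
  }
}
```

Because every borrow() creates a new wrapper, a stale wrapper held by a previous client can neither close nor use the statement that now serves somebody else.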

A bit later: we've found a confirmation of our doubts on the Apache site: see "JNDI Datasource HOW-TO", chapter "Common Problems".

Tuesday, November 03, 2009 11:20:00 AM UTC  #    Comments [0] -
Tips and tricks
# Wednesday, October 07, 2009

Our experience with facelets shows that when you're designing composition components you often want to add a level of customization. E.g. generate an element with or without an id, or define class/style only if a value is specified.

Consider for simplicity that you want to encapsulate a check box and pass several attributes to it. The first version that you will probably think of is something like this:

<html xmlns="http://www.w3.org/1999/xhtml"
  xmlns:ui="http://java.sun.com/jsf/facelets"
  xmlns:c="http://java.sun.com/jstl/core"
  xmlns:h="http://java.sun.com/jsf/html"
  xmlns:ex="http://www.nesterovsky-bros.com/jsf">
  <body>
    <!--
      Attributes:
        id - an optional id;
        value - a data binding;
        class - an optional element class;
        style - an optional element inline style;
        onclick - an optional script event handler for onclick event;
        onchange - an optional script event handler for onchange event.
    -->
    <ui:component>
      <h:selectBooleanCheckbox
        id="#{id}"
        value="#{value}"
        style="#{style}"
        class="#{class}"
        onchange="#{onchange}"
        onclick="#{onclick}"/>
    </ui:component>
  </body>
</html>

Be assured, this is not what you expected. The output will contain all the mentioned attributes, even those that weren't passed into the component (they will have empty values). More than that, if you omit "id", you will get an error like: "empty string is not valid id".

The reason is in the EL! Attributes used in this example are of type String, thus the result of evaluating a value expression is coerced to String. Values of attributes that weren't passed in evaluate to null. EL returns "" when coercing null to String. The interesting thing is that if EL left null unchanged, those omitted attributes would not appear in the output.

The second attempt would probably be:

<h:selectBooleanCheckbox value="#{value}">
  <c:if test="#{!empty id}">
    <f:attribute name="id" value="#{id}"/>
  </c:if>
  <c:if test="#{!empty onclick}">
    <f:attribute name="onclick" value="#{onclick}"/>
  </c:if>
  <c:if test="#{!empty onchange}">
    <f:attribute name="onchange" value="#{onchange}"/>
  </c:if>
  <c:if test="#{!empty class}">
    <f:attribute name="class" value="#{class}"/>
  </c:if>
  <c:if test="#{!empty style}">
    <f:attribute name="style" value="#{style}"/>
  </c:if>
</h:selectBooleanCheckbox>

Be assured, this won't work either (it may work, but not as you would expect). The c:if instruction is evaluated at the stage when the component tree is built, not at the rendering stage.

To work around the problem you should prevent the null to "" conversion in the EL. That's, in fact, rather trivial to achieve: the value expression should evaluate to an object other than String, whose toString() method returns the required value.

The final component may look like this:

<h:selectBooleanCheckbox
  id="#{ex:object(id)}"
  value="#{value}"
  style="#{ex:object(style)}"
  class="#{ex:object(class)}"
  onchange="#{ex:object(onchange)}"
  onclick="#{ex:object(onclick)}"/>

where ex:object() is a function defined like this:

public static Object object(final Object value)
{
  // The wrapper is not a String, so EL does not coerce null to "".
  return new Object()
  {
    @Override
    public String toString()
    {
      return value == null ? null : value.toString();
    }
  };
}

A bit later: not everything works as we expected. This approach doesn't work with the validator attribute, whereas it works with the converter attribute. The difference between them is that the first attribute should be a MethodExpression value, while the second one is a ValueExpression value. Again, we suffer from the ugly JSF implementation of the UIOutput component.

Wednesday, October 07, 2009 9:16:10 AM UTC  #    Comments [0] -
JSF and Facelets | Tips and tricks
# Wednesday, September 09, 2009

Recently we have seen a blog entry: "JSF: IDs and clientIds in Facelets", which provided a wrong implementation of the feature.

I'm not sure how useful it is, but here is our approach to the same problem.

At the core is ScopeComponent. The example uses a couple of utility functions defined in Functions. The example itself is found at window.xhtml:

<html xmlns="http://www.w3.org/1999/xhtml"
  xmlns:ui="http://java.sun.com/jsf/facelets"
  xmlns:c="http://java.sun.com/jstl/core"
  xmlns:h="http://java.sun.com/jsf/html"
  xmlns:f="http://java.sun.com/jsf/core"
  xmlns:fn="http://java.sun.com/jsp/jstl/functions"
  xmlns:ex="http://www.nesterovsky-bros.com/jsf">
  <body>
    <h:form>
      <ui:repeat value="#{ex:sequence(5)}">
        <f:subview id="scope" binding="#{ex:scope().value}">
          #{scope.id}, #{scope.clientId}
        </f:subview>
        <f:subview id="script" uniqueId="my-script"
          binding="#{ex:scope().value}" myValue="#{2 + 2}">
          , #{script.id}, #{script.clientId},
          #{script.bindings.myValue.expressionString},
          #{ex:value(script.bindings.myValue)},
          #{script.attributes.myValue}
        </f:subview>
        <br/>
      </ui:repeat>
    </h:form>
  </body>
</html>

Update: ex:scope() is made to return a simple bean with property "value".

Another useful example:

<f:subview id="group" binding="#{ex:scope().value}">
<h:inputText id="input" value="#{bean.property}"/>
<script type="text/javascript">
var element = document.getElementById('#{group.clientId}:input');
</script>
</f:subview>

Wednesday, September 09, 2009 11:39:14 AM UTC  #    Comments [1] -
JSF and Facelets | Tips and tricks

In the section about AJAX, JSF 2.0 spec (final draft) talks about partial requests...

This sounds rather strange. My perception was that AJAX is about partial responses. What is the sense in sending partial requests? Requests are comparatively small anyway! Besides, a partial request may complicate restoring the component tree on the server and make things fragile, but this largely depends on what they mean by these words.

Wednesday, September 09, 2009 5:54:38 AM UTC  #    Comments [0] -
JSF and Facelets | Tips and tricks
# Saturday, August 29, 2009

Recently we were disputing (Arthur vs Vladimir) about the benefits of ValueExpression references in JSF/Facelets.

Such a dispute in itself presents a rather funny picture: you're defending one position, and after a while you're taking the opposite point and starting to maintain it. But let's go to the problem.

JSF/Facelets uses Unified Expression Language for the data binding, e.g.:

<h:inputText id="name" value="#{customer.name}" />

or

<h:selectBooleanCheckbox id="selected" value="#{customer.selected}" />

In these cases values from the input and the check box are mapped to the properties name and selected of a bean named customer. Everything is fine except for the case when selected is not of boolean type (e.g. int). In this case you will have a hard time thinking of how to adapt the bean property to the jsf component. Basically, you have to provide a bean adapter, or change the type of the property. The latter is unfeasible in our case, thus we're choosing a bean adapter. More than that, we have to create a generic solution for an int to boolean property type adapter. With this target in mind we may create a function receiving a bean and a property name and returning another bean with a single property of boolean type:

<h:selectBooleanCheckbox id="selected"
  value="#{ex:toBoolean(customer, 'selected').value}" />

But thinking further, a question appears: can we pass a ValueExpression by reference into a bean adapter function, and have something like this:

<h:selectBooleanCheckbox id="selected"
  value="#{ex:toBoolean(byref customer.selected).value}" />

It turns out that it's possible to do this kind of thing. Unfortunately it requires a custom facelets tag, like this:

<ex:ref var="selected" value="#{customer.selected}"/>

<h:selectBooleanCheckbox id="selected"
  value="#{ex:toBoolean(selected).value}" />

The implementation of such a tag is really primitive (in fact it mimics the c:set tag handler except for one line), but still it's an extension at a level we're not happy to introduce.

This way we were going in circles considering pros and cons, regretting that el references ain't native in jsf/facelets, and weren't able to classify whether our solution is a hack or a neat extension...

P.S. We know that JSF 2.0 provides solution for h:selectBooleanCheckbox but still there are cases when similar technique is required even there.

Saturday, August 29, 2009 1:11:26 PM UTC  #    Comments [0] -
JSF and Facelets | Tips and tricks
# Friday, August 21, 2009

We always tacitly assumed that the protected modifier in java permits member access from the class the member belongs to, or from an instance of the class's descendant. Very much as C++ defines it, in fact.

In other words no external client of an instance can directly access a protected member of that instance or class the instance belongs to.

It would be very interesting to know how many people live with such a naivete, really!

Well, that's what java states:

The protected modifier specifies that the member can only be accessed within its own package (as with package-private) and, in addition, by a subclass of its class in another package.

If one thinks, just a little, she'll see that this gorgeous definition is so different from C++'s and so meaningless that they would have done better to drop this modifier altogether.

The hole is so huge that I can easily build an example showing how to modify a protected member of some other class in a perfectly valid way. Consider:

MyClass.java

package com.mypackage;

import javax.faces.component.Hack;
import javax.faces.component.UIComponentBase;

import javax.faces.event.FacesListener;

public class MyClass
{
   public void addFacesListener(
     UIComponentBase component,
     FacesListener listener)
   {
     Hack.addFacesListener(component, listener);
   }

   ...
}

Hack.java

package javax.faces.component;

import javax.faces.event.FacesListener;

public class Hack
{
   public static void addFacesListener(
     UIComponentBase component,
     FacesListener listener)
   {
     component.addFacesListener(listener);
   }
}

The example shows how one adds a custom listener to an arbitrary jsf component. Notice that this is not intended by design, as the method addFacesListener() is protected. But see how easily one can hack this dummy "protected" notion.

Update: for a proper implementation of protected please read Manifest file, a part about package sealing.

Friday, August 21, 2009 12:25:59 PM UTC  #    Comments [0] -
JSF and Facelets | Tips and tricks
# Thursday, August 20, 2009

Just in case, if you don't know what JSON stands for - it's JavaScript Object Notation.

You may find plenty of JSON implementations in java, so we shall add one more idea. Briefly, it's about plugging JSON into the xml serialization infrastructure, JAXB. Taking into account that JAXB is now an integral part of the java platform itself, the benefit is that you can transparently use the same beans for xml and JSON serialization.

What you need to do is only to provide JSON reader and writer under the hood of XMLStreamReader and XMLStreamWriter interfaces.

In spare time we shall implement this idea.

Thursday, August 20, 2009 6:28:37 AM UTC  #    Comments [0] -
Tips and tricks
# Wednesday, August 19, 2009

If you by chance see lines like the following in your code:

private transient final Type field;

then know, you're in trouble!

The reason is simple, really (provided you're sane and don't put field modifiers without reason). transient assumes that your class is serializable, and you have a particular field that you don't want to serialize. final states that the field is initialized in the constructor (or in a field initializer), and does not change its value for the rest of the life cycle.

This way if you will serialize an instance of class with such a field, and then deserialize it back, you will have the field initialized with null, and no way to have another value there.

P.S. That's what we have found in our code recently:

private transient final Lock sync = new ReentrantLock();
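The problem is easy to reproduce. Below is a minimal sketch with a hypothetical Guarded class: it serializes an instance into a byte array and reads it back. During deserialization neither the constructor nor the field initializers of a serializable class are run, so the lock is simply gone.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class Guarded implements Serializable {
  private static final long serialVersionUID = 1L;

  final String name;
  // Same combination of modifiers as in the field we have found.
  final transient Lock sync = new ReentrantLock();

  Guarded(String name) {
    this.name = name;
  }

  // Serializes the instance in memory and deserializes a copy.
  static Guarded roundTrip(Guarded source) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();

      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(source);
      }

      try (ObjectInputStream in = new ObjectInputStream(
          new ByteArrayInputStream(bytes.toByteArray()))) {
        return (Guarded) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

After Guarded.roundTrip(new Guarded("a")) the name of the copy is restored, but its sync field is null, and being final it cannot be assigned in a regular way.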

Wednesday, August 19, 2009 4:44:42 AM UTC  #    Comments [0] -
Tips and tricks
# Monday, July 20, 2009

Generics in C# look inferior to templates (especially to concepts) in C++; however, now and then you can build wonderful pieces that a C++ pro would envy.

Consider a generic converter method: T Convert<T>(object value).

In C++ I would create several template specializations for all supported conversions. Well, to make things harder, think of converter provider supporting conversion:

public interface IConverterProvider
{
  Converter<object, T> Get<T>();
}

That begins to be a puzzle in C++, but C# handles it easily!

My first C# implementation was too naive, and spent too many cycles in the provider resolving which converter to use. So I went on and created a sophisticated implementation like this:

  private IConverterProvider provider = ...

  public T Convert<T>(object value)
  {
    var converter = provider.Get<T>();

    return converter(value);
  }

...

public class ConverterProvider: IConverterProvider
{
  public Converter<object, T> Get<T>()
  {
    return Impl<T>.converter;
  }

  private static class Impl<T>
  {
    static Impl()
    {
      // Heavy implementation initializing converters.
      converter = ...
    }

    public static readonly Converter<object, T> converter;
  }
}

Go, and do something close in C++!

Monday, July 20, 2009 7:18:51 AM UTC  #    Comments [0] -
Tips and tricks
# Monday, June 29, 2009

If you have a string variable $value as xs:string, and want to know whether it starts with a digit, then what's the best way to do it in xpath?

Our answer is: ($value ge '0') and ($value lt ':').
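The trick relies on ':' being the character that immediately follows '9', so every string starting with a digit sorts between '0' and ':'. Here is the same comparison sketched in java (DigitCheck is a made-up name). Note that String.compareTo compares code units, while in xpath the result depends on the default collation, so we assume a codepoint-like collation there.

```java
final class DigitCheck {
  // True when the string sorts at or after "0" and strictly before ":",
  // which means that its first character is one of '0'..'9'.
  static boolean startsWithDigit(String value) {
    return value.compareTo("0") >= 0 && value.compareTo(":") < 0;
  }
}
```

An empty string sorts before "0", so it is rejected without any special case.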

Looks a little funny (and disturbing).

Monday, June 29, 2009 6:00:28 AM UTC  #    Comments [0] -
Tips and tricks | xslt
# Wednesday, June 24, 2009

In our project we're generating a lot of xml files, which are subject to manual changes and repeated generations (often with slightly different generation options). The life cycle of such an xml can be described as follows:

  1. generate original xml (version 1)
  2. manual changes (version 2)
  3. next generation (version 3)
  4. manual changes integrated into the new generation (version 4)

If these were regular text files, we could use the diff utility to prepare a patch between versions 1 and 2, and apply it with the patch utility to version 3. Unfortunately xml has additional semantics compared to plain text. What is an invariant or a simple modification in xml is often a drastic change in text. diff/patch does not work well for us. We need an xml diff and patch.

The first guess is to google it! Not so simple. We have failed to find a tool or an API that can be used from ant. There are a lot of GUIs that show xml differences and perform a manual merge, or tools that do something similar to, but different from, what we need (like MS's xmldiffpatch).

Please point us to such a program!

Meanwhile, we need to proceed. We don't believe that such a tool can be knocked together in a hurry, as the task is both heuristic and mathematical, and requires a careful design and good statistics on the use cases. Our idea is to exploit diff/patch. To achieve this, we're going to normalize the xml files before the diff to remove redundant invariants, and normalize them again after the patch to return them to a readable form. This includes:

  • ordering attributes by their names;
  • replacing insignificant whitespace with line breaks;
  • entering line breaks after element names and before attributes, after an attribute name and before its value, and after an attribute value.
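A minimal Java sketch of such a normalization, built on the JDK's DOM parser. It is an illustration under simplifying assumptions, not the tool we use: it trims text nodes, and it silently drops comments, CDATA sections and processing instructions. The class name XmlNormalizer is hypothetical.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class XmlNormalizer {
  /** Rewrites xml so that names, attributes and values sit on their own lines. */
  static String normalize(String xml) {
    try {
      Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new InputSource(new StringReader(xml))).getDocumentElement();
      StringBuilder out = new StringBuilder();

      write(root, out);

      return out.toString();
    } catch (Exception e) {
      // Parser configuration, I/O and well-formedness errors end up here.
      throw new IllegalArgumentException(e);
    }
  }

  private static void write(Element element, StringBuilder out) {
    out.append('<').append(element.getTagName()).append('\n');

    // TreeMap orders the attributes by name.
    Map<String, String> sorted = new TreeMap<>();
    NamedNodeMap attributes = element.getAttributes();

    for (int i = 0; i < attributes.getLength(); i++) {
      Node attribute = attributes.item(i);

      sorted.put(attribute.getNodeName(), attribute.getNodeValue());
    }

    for (Map.Entry<String, String> entry : sorted.entrySet()) {
      out.append(entry.getKey()).append('\n');
      out.append("=\"").append(escape(entry.getValue())).append("\"\n");
    }

    out.append(">\n");

    for (Node child = element.getFirstChild(); child != null;
      child = child.getNextSibling()) {
      if (child instanceof Element) {
        write((Element) child, out);
      } else if (child.getNodeType() == Node.TEXT_NODE
        && !child.getNodeValue().trim().isEmpty()) {
        out.append(escape(child.getNodeValue().trim())).append('\n');
      }
    }

    out.append("</").append(element.getTagName()).append("\n>\n");
  }

  private static String escape(String text) {
    return text.replace("&", "&amp;").replace("<", "&lt;")
      .replace("\"", "&quot;");
  }
}
```

The output is still well-formed xml, since whitespace is allowed inside tags, and two documents that differ only in attribute order or insignificant whitespace normalize to the same text.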

This way we expect to get files that react to modifications similarly to text files.

Wednesday, June 24, 2009 11:40:32 AM UTC  #    Comments [0] -
Tips and tricks | xslt
# Saturday, April 18, 2009

Sunny> Look what I have found! Consider this C#:

public class T
{
  public T free;
}

public void NewTest()
{
  T cache = new T();

  Stopwatch timer = new Stopwatch();

  timer.Reset();
  timer.Start();

  for(int i = 0; i < 10000000; ++i)
  {
    // Get from cache.
    T t;

    if (cache.free == null)
    {
      cache.free = new T();
    }

    t = cache.free;

    // Release
    cache.free = t;
    t = null;
  }

  timer.Stop();

  long cacheTicks = timer.ElapsedTicks;

  timer.Reset();
  timer.Start();

  for(int i = 0; i < 10000000; ++i)
  {
    new T();
  }

  timer.Stop();

  long newTicks = timer.ElapsedTicks;

  Console.WriteLine("cache: {0}, new: {1}", cacheTicks, newTicks);
}

Gloomy> And?

Sunny> Tests show that new T() is almost as fast as caching! The GC's "new" probably has a fast path, where it shifts the free memory border atomically, so an allocation takes just a few cycles.

Gloomy> Well, you're probably right, there is a fast path. I, however, have a different opinion. To track references, a generational garbage collector implements a field assignment as a call rather than a mov. This routine, besides the move itself, marks the touched memory page in a special card table (who said GC is cheap?); thus, I think, a reference field setter is almost as slow as the "new" call.

Saturday, April 18, 2009 7:25:12 AM UTC  #    Comments [0] -
Tips and tricks
# Thursday, April 16, 2009

.NET is known for its array covariance. That means that an array of a reference type can be cast to an array of its base type:

public class T: B
{
}

T[] tlist = ...
B[] blist = tlist;

This feature comes at a cost:

B b = ...
T t = ...

blist[0] = b; // This effectively is: blist[0] = (T)b;
tlist[0] = t; // This is the same: tlist[0] = (T)t;

We pay the cost of an additional cast for nothing. May this dubious design decision weigh on the .NET/Java inventors.

You can eliminate the cast. Just use array of structs:

struct S<T>
{
  public T t;
}

S<T>[] slist = ...

slist[0].t = t; // Works without cast.

Measurements show that S<T>[] is ~35% faster than T[] on writes, and slower (the JIT could do better) on reads.

Well, an ugly workaround for an ugly design.

P.S. In Java there is no relief...
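Java arrays are covariant as well, and a store of a reference into an array is checked against the array's real element type; a failed check surfaces as ArrayStoreException. A small sketch makes the hidden check visible (the class and method names are hypothetical):

```java
final class CovarianceDemo {
  /** Returns true if the runtime store check rejects the value. */
  static boolean storeFails(Object[] array, Object value) {
    try {
      // Compiles for any Object, but is verified against the actual
      // element type of the array at run time.
      array[0] = value;

      return false;
    } catch (ArrayStoreException e) {
      return true;
    }
  }
}
```

Passing a String[] typed as Object[] and storing an Integer into it triggers the exception, while storing a String succeeds.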

Thursday, April 16, 2009 7:29:29 PM UTC  #    Comments [0] -
Tips and tricks
# Sunday, April 05, 2009

There is a method Right() in the RB tree implementation:

public int Right(int node)
{
  return items[node].right;
}

The JIT does not want to inline it, probably because the method may throw:

public int Right(int node)
{
  return items[node].right;
00000000 mov eax,dword ptr [ecx+4]
00000003 cmp edx,dword ptr [eax+4]
00000006 jae 00000013
00000008 shl edx,4
0000000b lea eax,[eax+edx+8]
0000000f mov eax,dword ptr [eax+8]
00000012 ret
00000013 call 74C3A62C
00000018 int 3

Too sad.

Sunday, April 05, 2009 1:16:06 PM UTC  #    Comments [0] -
Incremental Parser | Tips and tricks
# Thursday, April 02, 2009

Back in 2001 we read that .NET's JIT is smart enough to optimize away repeated bounds checks.

In 2009 we can still verify that this is not the case (no matter how hard you try).

C#:

private int CharAt(int offset)
{
  string text = this.text;

  return (uint)offset >= (uint)text.Length ? -1 : text[offset];
}

Disassembly:

private int CharAt(int offset)
{
  string text = this.text;
00000000 push ebp
00000001 mov ebp,esp
00000003 mov ecx,dword ptr [ecx+30h]

  return (uint)offset >= (uint)text.Length ? -1 : text[offset];
00000006 cmp dword ptr [ecx+8],edx
00000009 jbe 00000017
0000000b cmp edx,dword ptr [ecx+8]
0000000e jae 0000001C
00000010 movzx eax,word ptr [ecx+edx*2+0Ch]
00000015 pop ebp
00000016 ret
00000017 or eax,0FFFFFFFFh
0000001a pop ebp
0000001b ret
0000001c call 74C24C6C
00000021 int 3

P.S. This method is not inlined either (IL length is 25 bytes).

Thursday, April 02, 2009 7:56:00 AM UTC  #    Comments [0] -
Incremental Parser | Tips and tricks
# Tuesday, March 31, 2009

Yesterday I installed IE8.

Looks better here and there.

Today, I'm shocked!

I reopened my web mail and it remembered the session. It keeps session cookies after the IE8 instance is closed!

I could not believe my eyes, so I logged into another web application and then opened another IE8 instance. What do you think? It shares the session between instances!

That is a serious security problem.

It prevents me from opening two sessions of a web application on my computer.

P.S. We have found that this problem has already been discussed. See IE8 handles sessions/cookies different than IE7 - big trouble for - ...

Someone needs a brain surgery...

Quick solution: run IE8 with -nomerge command line option.

Tuesday, March 31, 2009 11:45:52 AM UTC  #    Comments [2] -
Tips and tricks
# Wednesday, March 18, 2009

We'd like to return to binary tree algorithms and spell out what you cannot do with generics in C#. Well, you can do many things, but with a generalization penalty.

Consider a binary tree node: Node(Parent, Left, Right). RB, AVL, and other algorithms attach some private information to this node to perform balancing.

You can express this idea mathematically (and in C++), but you cannot implement it efficiently in C#.

A more focused example. Consider an RB tree: Node(Parent, Left, Right, Color). There are a number of ways you may implement the internal structure of the tree. The algorithms themselves stay the same.

Straightforward implementation:

class Node
{
  Node Parent;
  Node Left;
  Node Right;
  bool Color;
}

This implementation allocates nodes on the heap, and each node refers to other nodes.

Node navigator implementation:

class Node
{
  Node Left;
  Node Right;
  bool Color;
}

struct NodeNavigator
{
  Node[] nodes;
  int index;
}

Node does not refer to the parent. This reduces memory consumption and simplifies the object graph, which is good for the GC. The tree is walked using a node navigator, which stores the ancestors of the node.

Node as a structure:

struct Node
{
  int Parent;
  int Left;
  int Right;
  bool Color; // This might be integrated as highest bit of parent.
}

The tree is stored as an array of nodes. This is a compact and GC-efficient implementation.

Node as a structure, and with node navigator:

struct Node
{
  int Left;
  int Right;
  bool Color; // This might be integrated as highest bit of left.
}

struct NodeNavigator
{
  Tree tree;
  int[] nodes;
  int index;
}

The tree is stored as an array of nodes, and a navigator is used to walk it. This is the most compact implementation.
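Java has no arrays of structures, so the closest counterpart of this last layout is a pair of parallel int arrays. The sketch below also packs the color into the highest bit of the left index, as the comment in the structure suggests. It assumes that child indices are non-negative; the class PackedNodes and its methods are hypothetical.

```java
final class PackedNodes {
  // The highest bit of left[node] marks a red node.
  private static final int COLOR_BIT = 0x80000000;

  private final int[] left;  // left child index plus the color bit
  private final int[] right; // right child index

  PackedNodes(int capacity) {
    left = new int[capacity];
    right = new int[capacity];
  }

  int left(int node) {
    return left[node] & ~COLOR_BIT;
  }

  int right(int node) {
    return right[node];
  }

  boolean isRed(int node) {
    return (left[node] & COLOR_BIT) != 0;
  }

  void setLeft(int node, int child) {
    // Keep the color bit, replace the index.
    left[node] = (left[node] & COLOR_BIT) | child;
  }

  void setRight(int node, int child) {
    right[node] = child;
  }

  void setRed(int node, boolean red) {
    left[node] = red ? left[node] | COLOR_BIT : left[node] & ~COLOR_BIT;
  }
}
```

The balancing code would then have to be written against these accessors, which is exactly the kind of storage-specific rewrite the unified C++ approach avoids.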

Each implementation has its virtues. What the implementations have in common is that they share the same balancing and navigation algorithms. Storage differences prevent a single C# implementation. In contrast, C++ allows you to define a "tree" concept and specializations of this concept, allowing unified algorithms; all this is done without a performance penalty.

P.S. Java, in this regard, offers almost no alternatives...

Wednesday, March 18, 2009 6:53:05 AM UTC  #    Comments [0] -
Incremental Parser | Tips and tricks
# Thursday, February 12, 2009

Do you agree that binary trees and algorithms that keep trees reasonably balanced are important?

Our answer is yes!

It's interesting enough, however, that you won't easily find these algorithms publicly available.

Though red-black, AVL and other algorithms described in Wikipedia are defined in terms of tree manipulation, all the implementations we have seen deal with trees annotated with keys and values. These implementations really use tree balancing algorithms behind the scenes, and expose commonplace set or map containers to a client. Even the C++ Standard Library suffers from this disease.

We think that binary trees are valuable independent concepts, and that they are worth implementing separately, at least because there are other algorithms, besides sets and maps, that use trees.

And well, we did it in C#! See RedBlackTree.cs.

Consider an example - a simple scheduler, ScheduleBookmark.cs, with operations:

  • schedule an action;
  • remove an action from the schedule;
  • enumerate actions;
  • find a date, an action is scheduled for;
  • find an action (or at least closest one) for a specified date;
  • postpone actions due to delays;

A balanced binary tree allows an efficient implementation of such a scheduler. Each tree node stores an action and the time span between its parent node and itself. This way:

Operation                               | Steps
schedule an action                      | find place + link node + rebalance tree
remove an action from the schedule      | unlink node + rebalance tree
enumerate actions                       | navigate tree
find a date, an action is scheduled for | find node in tree
find an action for a specified date     | cumulate time spans up to the tree root
postpone actions due to delays          | fixup time spans from a node up to the tree root
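Since each node stores the span between its parent and itself, a node's absolute time is the sum of the spans on the path to the root, and adjusting a single span moves the node together with its whole subtree. A minimal sketch of that representation follows; SpanNode is a hypothetical class, not the one from ScheduleBookmark.cs, and it assumes the root's span is measured from some fixed origin.

```java
final class SpanNode {
  final SpanNode parent;

  // Time between the parent node and this node; may be negative
  // for nodes scheduled earlier than their parent.
  long span;

  SpanNode(SpanNode parent, long span) {
    this.parent = parent;
    this.span = span;
  }

  /** Absolute time: cumulates the spans from this node up to the root. */
  long time() {
    long total = 0;

    for (SpanNode node = this; node != null; node = node.parent) {
      total += node.span;
    }

    return total;
  }
}
```

In a balanced tree the path to the root has O(ln(N)) nodes, which bounds the cost of time().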

Compare operation complexities between tree, array, list and map:
Operation                               | Tree     | Array    | List | Map
schedule an action                      | O(ln(N)) | O(N)     | O(N) | O(ln(N))
remove an action from the schedule      | O(ln(N)) | O(N)     | O(1) | O(ln(N))
enumerate actions                       | O(ln(N)) | O(1)     | O(1) | O(ln(N))
find a date, an action is scheduled for | O(ln(N)) | O(1)     | O(1) | O(1)
find an action for a specified date     | O(ln(N)) | O(ln(N)) | O(N) | O(ln(N))
postpone actions due to delays          | O(ln(N)) | O(N)     | O(N) | O(N*ln(N))

The complexity of each operation for the tree is O(ln(N)). Neither arrays, lists, nor maps achieve a similar worst-case guarantee.

Finally, the test program is Program.cs, and a whole project (VS2008) is Tree.zip

Thursday, February 12, 2009 1:17:36 PM UTC  #    Comments [0] -
Incremental Parser | Tips and tricks
# Wednesday, February 11, 2009

Could you think of a C# method accepting an ancestor, and forbidding a descendant of a class at compile time?

The answer to this is probably: why would you need such a reptile?

Well, I don't. I didn't mean to create such a method, but generics help a lot!

public class BinaryTreeNode<Node>
  where Node: BinaryTreeNode<Node>
{
  public Node parent;
  public Node left;
  public Node right;
}

public class MyNode: BinaryTreeNode<MyNode>
{
  public int key;
}

public class MyRoot: MyNode
{
}

public class Test
{
  public void test()
  {
    MyRoot root = new MyRoot();

    // print((MyNode)root); // This works.
    print(root); // This does not work.
  }

  private static void print<T>(T node)
    where T: BinaryTreeNode<T>
  {
    Console.WriteLine("print me");
  }
}

By the way, BinaryTreeNode is an "abstract" class, as you cannot instantiate it but inherit only.

Wednesday, February 11, 2009 1:59:17 PM UTC  #    Comments [0] -
Incremental Parser | Tips and tricks
# Wednesday, January 14, 2009

Once upon a time, we created a function mimicking the decapitalize() method defined in Java's java.beans.Introspector. Nothing special, indeed. See the source:

/**
 * Utility method to take a string and convert it to normal Java variable
 * name capitalization. This normally means converting the first
 * character from upper case to lower case, but in the (unusual) special
 * case when there is more than one character and both the first and
 * second characters are upper case, we leave it alone.
 * <p>
 * Thus "FooBah" becomes "fooBah" and "X" becomes "x", but "URL" stays
 * as "URL".
 *
 * @param name The string to be decapitalized.
 * @return The decapitalized version of the string.
 */
public static String decapitalize(String name) {
  if (name == null || name.length() == 0) {
    return name;
  }
  if (name.length() > 1 && Character.isUpperCase(name.charAt(1)) &&
    Character.isUpperCase(name.charAt(0))){
    return name;
  }
  char chars[] = name.toCharArray();
  chars[0] = Character.toLowerCase(chars[0]);
  return new String(chars);
}

We typed up an implementation immediately:

<xsl:function name="t:decapitalize" as="xs:string">
  <xsl:param name="value" as="xs:string?"/>

  <xsl:variable name="c" as="xs:string"
    select="substring($value, 2, 1)"/>

  <xsl:sequence select="
    if ($c = upper-case($c)) then
      $value
    else
      concat
      (
        lower-case(substring($value, 1, 1)),
        substring($value, 2)
      )"/>
</xsl:function>

It worked all right until recently, when it failed: the output was different from that of its Java counterpart.

The input was W9Identifier. The function naturally returned the same value, while Java returned w9Identifier. We had fallen for the assumption that $c = upper-case($c) is true only when the character is an upper case letter. That's not correct for digits, which upper-case() leaves unchanged. The correct way is:

<xsl:function name="t:decapitalize" as="xs:string">
  <xsl:param name="value" as="xs:string?"/>

  <xsl:variable name="c" as="xs:string"
    select="substring($value, 2, 1)"/>

  <xsl:sequence select="
    if ($c != lower-case($c)) then
      $value
    else
      concat
      (
        lower-case(substring($value, 1, 1)),
        substring($value, 2)
      )"/>
</xsl:function>
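The difference between the two tests is easy to reproduce in Java. The sketch below mirrors the two XPath conditions and the corrected function; it works on UTF-16 code units, so characters outside the Basic Multilingual Plane are not handled, and the class and method names are hypothetical.

```java
import java.util.Locale;

final class UpperCaseCheck {
  // Flawed test: true for anything that upper-casing leaves unchanged,
  // which includes digits, punctuation and the empty string.
  static boolean naiveIsUpper(String c) {
    return c.equals(c.toUpperCase(Locale.ROOT));
  }

  // Corrected test: true only if lower-casing changes the character.
  static boolean fixedIsUpper(String c) {
    return !c.equals(c.toLowerCase(Locale.ROOT));
  }

  // Follows the corrected t:decapitalize step by step.
  static String decapitalize(String value) {
    String second = value.length() > 1 ? value.substring(1, 2) : "";

    if (value.isEmpty() || fixedIsUpper(second)) {
      return value;
    }

    return value.substring(0, 1).toLowerCase(Locale.ROOT) + value.substring(1);
  }
}
```

With the naive test the digit 9 counts as upper case and W9Identifier is returned unchanged; with the corrected one it becomes w9Identifier, while URL still stays as it is.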

Wednesday, January 14, 2009 3:46:23 PM UTC  #    Comments [0] -
Tips and tricks | xslt
# Thursday, December 04, 2008

Although in our recent projects we're using more Java and XSLT, we always compare Java and .NET features. It's no secret that in most applications we may find cache solutions used to improve performance. Unlike .NET, which provides a robust cache solution, Java doesn't provide anything standard. Of course a Java adept may find a lot of caching frameworks, or just say: "use HashMap (ArrayList etc.) instead", but this is not the same.

Think about options for Java:
1. Caching frameworks (caching systems). Yes, they do their work, and do it perfectly. Some of them are state of the art, but there are drawbacks. The crucial one is that for simple data caching one has to use a whole framework. This option requires too much effort to solve a simple problem.

2. Collection classes (HashMap, ArrayList etc.) for caching data. This is a very straightforward solution, and a very productive one. Everyone knows these classes, and there is nothing to configure. One declares an instance of such a class, takes care of data access synchronization, and everything starts working immediately. An admirable caching solution, but only for "toy applications", since it solves one problem and introduces another. If an application works for hours and there is a lot of data to cache, the amount of data only grows and never shrinks, which is why such caching is very quickly surrounded with all sorts of rules that somehow reduce its size at run time. The solution very quickly loses its shine and becomes non-portable, but it's still applicable for some applications.

3. Using Java reference objects for caching data. The most appropriate class for a cache solution is java.util.WeakHashMap. WeakHashMap works exactly like a hash table but uses weak references internally. In practice, entries in a WeakHashMap may be reclaimed at any time if their keys are not referenced outside of the map. This caching strategy depends on the GC's whims and is not entirely reliable; it may increase the number of cache misses.

We've decided to create our simple cache with sliding expiration of data.

One may create many cache instances but there is only one global service that tracks expired objects among these instances:

private Cache<String, Object> cache = new Cache<String, Object>();

There is a constructor that specifies an expiration interval in milliseconds for all cached objects:

private Cache<String, Object> cache = new Cache<String, Object>(15 * 60 * 1000);

Access is similar to HashMap:

instance = cache.get("key"); and cache.put("key", instance);
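To show what sliding expiration means, here is a stripped-down sketch. It is not the downloadable class: instead of a global tracking service it exposes an explicit purge() that such a service could call, and it takes the current time as a parameter to stay testable. SlidingCache and all its members are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SlidingCache<K, V> {
  private static final class Entry<T> {
    final T value;
    volatile long lastAccess;

    Entry(T value, long now) {
      this.value = value;
      this.lastAccess = now;
    }
  }

  private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
  private final long expirationMillis;

  SlidingCache(long expirationMillis) {
    this.expirationMillis = expirationMillis;
  }

  V get(K key) {
    return get(key, System.currentTimeMillis());
  }

  void put(K key, V value) {
    put(key, value, System.currentTimeMillis());
  }

  V get(K key, long now) {
    Entry<V> entry = entries.get(key);

    if (entry == null) {
      return null;
    }

    if (now - entry.lastAccess >= expirationMillis) {
      entries.remove(key, entry);

      return null;
    }

    entry.lastAccess = now; // sliding: every hit restarts the interval

    return entry.value;
  }

  void put(K key, V value, long now) {
    entries.put(key, new Entry<>(value, now));
  }

  /** Drops every entry that has been idle for the whole interval. */
  void purge(long now) {
    entries.values().removeIf(
      entry -> now - entry.lastAccess >= expirationMillis);
  }

  int size() {
    return entries.size();
  }
}
```

An entry that keeps being read never expires, while an entry left alone for the whole interval disappears on the next read or purge.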

That's all one should know to start using it. Click here to download the Java source of this class. Feel free to use it in your applications.

Thursday, December 04, 2008 12:12:38 PM UTC  #    Comments [2] -
Announce | Tips and tricks
# Thursday, November 20, 2008

Yesterday I read about G1, a new garbage collector implementation. To be honest, I was not impressed.

I think garbage collection is an evil, or at least its present implementations are. I do not believe in algorithms that at their very core assume centralized execution. On the other hand, it's clear that it's not in my power to change the status quo. My lot is to give advice, mostly incompetent and ignorable.

I'm waiting for the time when someone arrives at the idea of bringing some parts of the GC logic out of the runtime scope. This will require more VM intelligence, but will bear fruit.

A JIT or a compiler may prove during static analysis that collecting some object makes some of the objects it refers to unreachable, provided it can prove that those objects are not reachable through other means (e.g. a private field whose value is not stored in other places). This is close to the ideas expressed in Muse on value types in java. It's possible to prepare a garbage graph in advance, before runtime.

In many cases it's also possible to prove that when a method's variable goes out of scope, its value is not reachable through other means and may be collected. This allows implementing a stage of automatic garbage collection in which objects that are proven to be garbage are immediately added to a free memory set.

As an example, I'm thinking of Java's ArrayList, which stores a private array. When an ArrayList is reclaimed or resized, the reference to the private array is lost, and its memory can be added to the free set immediately.

Integrated as the first stage of GC, this mechanism will make it less centralized, as I believe many objects will be collected this way.

Thursday, November 20, 2008 7:54:47 AM UTC  #    Comments [0] -
Tips and tricks
# Tuesday, November 18, 2008

Suppose you have constructed a sequence of attributes.

How do you access a value of attribute "a"?

Simple, isn't it? It took us a couple of minutes to find a solution!

<xsl:variable name="attributes" as="attribute()*">
  <xsl:apply-templates mode="t:generate-attributes" select="."/>
</xsl:variable>

<xsl:variable name="value" as="xs:string?"
  select="$attributes[self::attribute(a)]"/>

Tuesday, November 18, 2008 11:41:41 AM UTC  #    Comments [2] -
Tips and tricks | xslt
# Thursday, November 13, 2008

Problem

Our project, containing many different xslt files, generates many different outputs (e.g. code that uses DB2 SQL, or Oracle SQL, or DAO, or some other flavor of code). This results in the use of indirect calls to handle different generation options; however, to allow xslt to work we had to create a big main xslt that includes the stylesheets for each kind of generation. This hurts compilation time.

Alternatives

  1. A big main xslt including everything.
  2. A big main xslt including everything and using "use-when" attribute.
  3. Compose main xslt on the fly.

We were strongly inclined toward the second alternative. Unfortunately, only a limited set of information is available when "use-when" is evaluated. In particular, neither parameters nor documents are available. Using Saxon's extensions one may reach only static variables, or access System.getProperty(). This isn't flexible.

We've decided to try the third alternative.

Solution

We think we have found a nice solution: to create XsltSource, which receives a list of includes upon construction, and creates an xslt when getReader() is called.

import java.io.Reader;
import java.io.StringReader;

import javax.xml.transform.stream.StreamSource;

/**
 * A source to read generated stylesheet, which includes other stylesheets.
 */
public class XsltSource extends StreamSource
{
  /**
   * Creates an {@link XsltSource} instance.
   */
  public XsltSource()
  {
  }

  /**
   * Creates an {@link XsltSource} instance.
   * @param systemId a system identifier for root xslt.
   */
  public XsltSource(String systemId)
  {
    super(systemId);
  }

  /**
   * Creates an {@link XsltSource} instance.
   * @param systemId a system identifier for root xslt.
   * @param includes a list of includes.
   */
  public XsltSource(String systemId, String[] includes)
  {
    super(systemId);

    this.includes = includes;
  }

  /**
   * Gets stylesheet version.
   * @return a stylesheet version.
   */
  public String getVersion()
  {
    return version;
  }

  /**
   * Sets a stylesheet version.
   * @param value a stylesheet version.
   */
  public void setVersion(String value)
  {
    version = value;
  }

  /**
   * Gets a list of includes.
   * @return a list of includes.
   */
  public String[] getIncludes()
  {
    return includes;
  }

  /**
   * Sets a list of includes.
   * @param value a list of includes.
   */
  public void setIncludes(String[] value)
  {
    includes = value;
  }

  /**
   * Generates an xslt on the fly.
   */
  public Reader getReader()
  {
    String[] includes = getIncludes();

    if (includes == null)
    {
      return super.getReader();
    }

    String version = getVersion();

    if (version == null)
    {
      version = "2.0";
    }

    StringBuilder builder = new StringBuilder(1024);

    builder.append("<stylesheet version=\"");
    builder.append(version);
    builder.append("\" xmlns=\"http://www.w3.org/1999/XSL/Transform\">");

    for(String include: includes)
    {
      builder.append("<include href=\"");
      builder.append(include);
      builder.append("\"/>");
    }

    builder.append("</stylesheet>");

    return new StringReader(builder.toString());
  }

  /**
   * An xslt version. By default 2.0 is used.
   */
  private String version;

  /**
   * A list of includes.
   */
  private String[] includes;
}

To use it one just needs to write:

Source source = new XsltSource(base, stylesheets);
Templates templates = transformerFactory.newTemplates(source);
...

where:

  • base is a base uri for the generated stylesheet; it's used to resolve relative includes;
  • stylesheets is an array of hrefs.
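To see what the transformer factory actually receives, the string building from getReader() can be reproduced standalone. The sketch below adds one thing that XsltSource does not do, namely escaping of the href values, which is our assumption about what a more defensive version would look like. The class name IncludeStylesheet and the file names in the usage note are hypothetical.

```java
final class IncludeStylesheet {
  /** Builds the same wrapper stylesheet text that getReader() generates. */
  static String build(String version, String[] includes) {
    StringBuilder builder = new StringBuilder(1024);

    builder.append("<stylesheet version=\"");
    builder.append(version);
    builder.append("\" xmlns=\"http://www.w3.org/1999/XSL/Transform\">");

    for (String include : includes) {
      builder.append("<include href=\"");
      builder.append(escape(include)); // not present in XsltSource
      builder.append("\"/>");
    }

    builder.append("</stylesheet>");

    return builder.toString();
  }

  // Escapes the characters that would break an attribute value.
  private static String escape(String href) {
    return href.replace("&", "&amp;").replace("<", "&lt;")
      .replace("\"", "&quot;");
  }
}
```

For build("2.0", new String[] { "db2.xslt", "dao.xslt" }) the result is a single stylesheet element containing two include elements, one per href.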

Such an implementation resembles dynamic linking, where separate parts are bound at runtime. We would like to see dynamic modules in the next version of xslt.

Thursday, November 13, 2008 11:26:50 AM UTC  #    Comments [0] -
Tips and tricks | xslt
# Friday, October 17, 2008

We strongly object to persistence frameworks in their contemporary meaning. This includes a long list of names like Hibernate, Java Persistence API, LINQ, and others.

Consider how one of them describes itself:

...high performance object/relational persistence and query service... lets you develop persistent classes following object-oriented idiom - including association, inheritance, polymorphism, composition, and collections... allows you to express queries in its own portable SQL extension...

Sounds good, right?

We think not! The words "own" and "portable", applied to SQL, sound almost like antonyms. When someone creates a unified language (a noble impulse, as opposed to a proprietary one (?)), she inevitably adds one more peer, increasing the plurality in the family of languages.

Attempts to create similar layers between data and business logic are not new. They have happened throughout computer history. IDMS, NATURAL, COOL:GEN are examples that are 20-30 years old.

Our reasoning (nothing new).

One needs to approach design (development and maintenance) from different perspectives; this way she will understand the problem under design better, and will estimate the skills required to solve it. This leads to modularization, e.g. business layer, data layer, appearance; and to development (maintenance) roles: program developer, database specialist, appearance specialist. On a small scale several roles are often filled by one person; this does not mean, however, that these roles are redundant, one just needs to try on different roles.

Why does one separate business layer and data layer?

Pragmatic perspective. There are databases, which accomplish most data storage tasks more efficiently than one can without a database. There are two worlds: database specialists and program developers. These two layers and roles are facts of reality.

A designer's goal is to keep these roles separate:

  • do not force a database specialist to know the business logic details;
  • do not force a program developer to know the details of how to organize storage more efficiently, or how to optimize a particular query.

Modularity helps here. Databases are well equipped to solve these tasks: the data layer should expose a database API through stored procedures, functions, and views, while the business layer should use this API to access the database.

With persistence frameworks there are two alternatives:

  1. still use data layer API;
  2. rely on a persistence framework.

When the first option is selected, the framework provides almost no additional value compared to traditional database access (jdbc, ado.net, and so on).

When one relies on a framework, the data layer interface virtually disappears (in fact, the framework substitutes for this interface). The database specialist has very little control over tuning the data structure and optimizing queries, unless she starts digging in the business code, and even then she cannot always control the queries sent to the database. Moreover, the database specialist must learn a proprietary query language.

The result is that a persistence framework erodes the division of responsibilities, complicating development and maintenance.

We often hear the following explanation of why one should use persistence frameworks: "It eases a database vendor switch". This is the most stupid reason to use persistence frameworks! It looks as if they plan to switch vendors once a day.

A design needs to focus on modularity. This makes code more robust, faster, and more maintainable. It also eases a potential migration, as only the data layer has to be migrated, with minimal (mostly configuration) changes in the business layer.

Friday, October 17, 2008 7:57:28 PM UTC  #    Comments [2] -
Tips and tricks
# Saturday, September 27, 2008

We are certain xslt/xquery are the best for web application frameworks from the design perspective; or, in other words, pipeline frameworks that allow the use of xslt/xquery are the preferable way to create web applications.

Advantages are obvious:

  • clear separation of business logic, data, and presentation;

  • richness of the languages, which allows implementing simple presentation, complex components, and sophisticated data binding;

  • built-in extensibility, which allows communication with business logic written in other languages and/or located at a different site.

It seems that agitating for such technologies is like forcing an open door. There are such frameworks out there: Orbeon Forms, Cocoon, and others. We're not qualified to judge their virtues, however...

Look at the current state of affairs. The main players in this area (well, I have a rather limited vision) push other technologies: JSP/JSF/Facelets and the like in the Java world, and ASP.NET in the .NET world. The closest thing they provide is an xslt servlet/component that allows generating an output.

Their variants of syntax and their data binding techniques allude to similar paradigms in xslt/xquery:

<select>
  <c:forEach var="option" items="#{bean.options}">
    <option value="#{option.key}">#{option.value}</option>
  </c:forEach>
</select>

On the surface, however, we see much more limited (in design and in the application) frameworks.

And here is a contradiction: how can it be that at present such a good design is not at least as popular as its competitors?

Someone may say there is no such problem. You can use whatever you want. You have a choice! Well, he's lucky. From our perspective it's not that simple.

We're creating rather complex web applications. Their nature isn't important in this context, but what is important is that there are customers. They are not thoroughly versed in the question, and exactly because of this they prefer technologies proposed by the leaders. It seems everything convinces them: the mainstream, good support, many developers who know the technology.

There is not a single chance to promote anything else.

We believe that the future may change this state, but we're creating in the present, and cannot wait...

Saturday, September 27, 2008 10:36:06 AM UTC  #    Comments [3] -
Tips and tricks | xslt
# Monday, September 08, 2008

Java has no value types: objects allocated in place, in contrast to objects referred to by a pointer into the heap. This, in my opinion, has a negative impact on program design and on performance.

Incidentally, I've thought of a use case that JVM implementations could treat as a value type. Consider an example:

class A
{
  private final B b = new B();
}

An implementation may lay out class A in such a way that field b holds the content of an instance of class B itself rather than a pointer to an instance of class B. This way we save a pointer and a heap allocation of the B instance. Another example:

class C
{
  C(int size)
  {
    values = new D[size];

    for(int i = 0; i < values.length; i++)
    {
      values[i] = new D();
    }
  }

  private final D[] values;
}

Here the field values is never null, and each item of the array contains a non-null value. Assuming these conditions hold for the whole life cycle, and values are not passed by reference, we can consider values to be an array of value types.

The conditions of the use case are the following:

  • a field contains a non-null value;
  • the field value is an instance of the field type and not of a descendant type;
  • if the field is an array, then all elements of the array are initialized with instances of the element type, and not of a descendant type;
  • the field or an element of the array can be assigned through the operator new only (field = new T(), array[i] = new T());
  • the array field is not passed by reference (Arrays.sort(array) never happens).

The JIT is allowed to interpret a field as a value type provided it proves these conditions.

Later...

There is another use case to detect value types:

  • a method variable never contains null, and
  • that variable is never stored in any field, and
  • no synchronization is used on the instance held in the variable, and
  • a value is assigned to the variable through the operator new only.

A variable can be laid out directly on the stack, provided the preceding conditions are satisfied.
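A small example of a local variable that meets all four conditions. HotSpot's escape analysis can perform a similar optimization (scalar replacement) for such code, though whether it actually fires is up to the JIT. The class and method names are hypothetical.

```java
final class StackCandidate {
  private static final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
      this.x = x;
      this.y = y;
    }
  }

  static int manhattan(int x, int y) {
    // Assigned through the operator new only, so it is never null.
    Point p = new Point(x, y);

    // p is not stored in any field, not passed to other code,
    // and nothing synchronizes on it.
    return Math.abs(p.x) + Math.abs(p.y);
  }
}
```

Nothing outside manhattan() can ever observe p, so its two fields could live in registers or on the stack without changing the behavior of the program.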

P.S. Even though .NET has built-in value types, it may use the very same technique to optimize reference types.

Monday, September 08, 2008 8:01:51 AM UTC  #    Comments [0] -
Tips and tricks
# Thursday, July 31, 2008

Yesterday, incidentally, I ran into the problem of a dynamic error during the evaluation of a template's match. This reminded me of SFINAE in C++. There the principle is applied at compile time to find a matching template.

I think people underestimate the meaning of this behaviour. The effect of dynamic errors occurring during pattern evaluation is described in the specification:

Any dynamic error or type error that occurs during the evaluation of a pattern against a particular node is treated as a recoverable error even if the error would not be recoverable under other circumstances. The optional recovery action is to treat the pattern as not matching that node.

This has far-reaching consequences, such as error recovery. To illustrate what I'm talking about, please look at this simple stylesheet, which recovers from "Division by zero.":

<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xsl:template match="/">
  <xsl:variable name="operator" as="element()+">
    <div divident="10" divisor="0"/>
    <div divident="10" divisor="2"/>
  </xsl:variable>

  <xsl:apply-templates select="$operator"/>
</xsl:template>

<xsl:param name="NaN" as="xs:double" select="xs:double('NaN')"/>

<xsl:template
  match="div[(xs:integer(@divident) div xs:integer(@divisor)) ne $NaN]">
  <xsl:message select="xs:integer(@divident) div xs:integer(@divisor)"/>
</xsl:template>

<xsl:template match="div">
  <xsl:message select="'Division by zero.'"/>
</xsl:template>

</xsl:stylesheet>

Here, if there is a division by zero, the first template is not matched and the other template is selected; thus the second template serves as an error handler for the first one. Of course, one may define a much more complex construct to be handled this way.
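For readers who think in java, the same rule ("an error during matching means no match") can be mimicked as follows. This is only an analogy under stated assumptions: Rule and Dispatcher are hypothetical names, and the code is unrelated to how an xslt processor is actually implemented.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// A hypothetical "template": a match condition plus an action.
final class Rule<T, R>
{
  final Predicate<T> match;
  final Function<T, R> action;

  Rule(Predicate<T> match, Function<T, R> action)
  {
    this.match = match;
    this.action = action;
  }
}

final class Dispatcher
{
  // Applies the first rule whose match succeeds. An exception thrown
  // while matching is treated as "does not match".
  static <T, R> R apply(List<Rule<T, R>> rules, T value)
  {
    for(Rule<T, R> rule: rules)
    {
      boolean matched;

      try
      {
        matched = rule.match.test(value);
      }
      catch(RuntimeException e)
      {
        matched = false;
      }

      if (matched)
      {
        return rule.action.apply(value);
      }
    }

    throw new IllegalArgumentException("No rule matches.");
  }

  // Mirrors the two templates above; v is { dividend, divisor }.
  static String divide(int dividend, int divisor)
  {
    List<Rule<int[], String>> rules = List.of(
      // Always true unless the division itself fails.
      new Rule<int[], String>(
        v -> v[0] / v[1] >= Integer.MIN_VALUE,
        v -> String.valueOf(v[0] / v[1])),
      // Fallback, which plays the role of the error handler.
      new Rule<int[], String>(
        v -> true,
        v -> "Division by zero."));

    return apply(rules, new int[] { dividend, divisor });
  }
}
```

The first rule plays the role of the template with the predicate, and the second one the role of the plain match="div" template.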

I have never been a purist (meaning doing everything in xslt); however, this example, along with indirect function calls, shows that xslt is a rather well equipped language. One just needs to be smart enough to understand how to do things.

See also: Try/catch block in xslt 2.0 for Saxon 9.

Thursday, July 31, 2008 11:52:21 AM UTC  #    Comments [0] -
Tips and tricks | xslt
# Monday, July 28, 2008

Among other job activities, we're asked from time to time to check the technical skills of job applicants.

Several times we have interviewed people whose professional skills were far below an acceptable level. It's a torment for both sides, I should say.

To ease things we have designed a small questionnaire (specific to our projects) for job applicants. It's sent to an applicant before the meeting. Even partially answered, this questionnaire is a good filter for screening out unqualified candidates:

<questionnaire>
  <item>
    <question>
      Please estimate your knowledge in XML Schema (xsd) as lacking, bad, good, or perfect.
    </question>
    <answer/>
  </item>
  <item>
    <question>
      Please estimate your knowledge in xslt 2.0/xquery 1.0 as lacking, bad, good, or perfect.
    </question>
    <answer/>
  </item>
  <item>
    <question>
      Please estimate your knowledge in xslt 1.0 as lacking, bad, good, or perfect.
    </question>
    <answer/>
  </item>
  <item>
    <question>
      Please estimate your knowledge in java as lacking, bad, good, or perfect.
    </question>
    <answer/>
  </item>
  <item>
    <question>
      Please estimate your knowledge in c# as lacking, bad, good, or perfect.
    </question>
    <answer/>
  </item>
  <item>
    <question>
      Please estimate your knowledge in sql as lacking, bad, good, or perfect.
    </question>
    <answer/>
  </item>
  <item>
    <question>
      For logical values A, B, please rewrite logical expression "A and B" using operator "or".
    </question>
    <answer/>
  </item>
  <item>
    <question>
      For logical values A, B, please rewrite logical expression "A = B" using operators "and" and "or".
    </question>
    <answer/>
  </item>
  <item>
    <question>
      There are eight balls, only one of which is heavier than the others.
      What is the minimum number of weighings that reveals the heavier ball?
      Please be suspicious about the "trivial" solution.
    </question>
    <answer/>
  </item>
  <item>
    <question>
      If A results in B, what can one say about the cause of B?
    </question>
    <answer/>
  </item>
  <item>
    <question>
      If only A or B results in C, what can one say about the cause of C?
    </question>
    <answer/>
  </item>
  <item>
    <question>
      Please define an xml schema for this questionnaire.
    </question>
    <answer/>
  </item>
  <item>
    <question>
      Please create a simple stylesheet creating an html table based on this questionnaire.
    </question>
    <answer/>
  </item>
  <item>
    <question>
      For a table A with columns B, C, and D, please create an sql query selecting B grouped by C and ordered by D.
    </question>
    <answer/>
  </item>
  <item>
    <question>
      For a sequence of xml elements A with attribute B, please write a stylesheet excerpt creating a sequence of elements D, grouping elements A with the same string value of attribute B, sorted in ascending order of B.
    </question>
    <answer/>
  </item>
  <item>
    <question>
      Given a java class A with properties B and C, please sort a collection of A by B in ascending, and by C in descending order.
    </question>
    <answer/>
  </item>
  <item>
    <question>
      What does the following line mean in c#?
      int? x;
    </question>
    <answer/>
  </item>
  <item>
    <question>
      What is a parser?
    </question>
    <answer/>
  </item>
  <item>
    <question>
      How to issue an error in the xml stylesheet?
    </question>
    <answer/>
  </item>
  <item>
    <question>
      What is a lazy evaluation?
    </question>
    <answer/>
  </item>
  <item>
    <question>
      How do you understand the following sentence?
      For each line of code there should be a comment.
    </question>
    <answer/>
  </item>
  <item>
    <question>
      Have you used any supplemental information to answer these questions?
    </question>
    <answer/>
  </item>
  <item>
    <question>
      Have you independently answered these questions?
    </question>
    <answer/>
  </item>
</questionnaire>

Monday, July 28, 2008 10:54:54 AM UTC  #    Comments [0] -
Tips and tricks | xslt
# Thursday, June 26, 2008

We are designing a rather complex xslt 2.0 application dealing with semistructured data. We must tolerate errors during processing, as there are cases where the input is not perfectly valid (or the program is not designed or ready to handle such input).

The most typical error is an unsatisfied expectation about the tree structure, like:
  <xsl:variable name="element" as="element()" select="some-element"/>

Obviously, a dynamic error occurs if the specified element is not present. To concentrate on the primary logic, and to avoid the burden of recovering from illegal (unexpected) cases, we have created a try/catch API. The goals of such an API are:

  • to be able to continue processing in case of an error;
  • to report as much useful information about an error as possible.

Alternatives:

Do not think it was arrogance that led us to create a custom API. No, we were looking for alternatives! Please see the [xsl] saxon:try() discussion:

  • saxon:try() function - a kind of pseudo function, which explicitly relies on lazy evaluation of its arguments, and ... it's not available in SaxonB;
  • ex:error-safe extension instruction - far from perfect in its implementation quality, and provides no error location.

We had no other way but to design this feature ourselves. In our defence one can say that we use an innovative approach that encapsulates the details of the implementation behind a template and calls handlers indirectly.

Use:

The try/catch API is designed as a template <xsl:template name="t:try-block"/> that calls a "try" handler and, if required, a "catch" handler using the <xsl:apply-templates mode="t:call"/> instruction. The caller passes any information to these handlers by means of tunnel parameters.

Handlers must be in the "t:call" mode. The "catch" handler may receive the following error info parameters:

<xsl:param name="error" as="xs:QName"/>
<xsl:param name="error-description" as="xs:string"/>
<xsl:param name="error-location" as="item()*"/>

where $error-location is a sequence of pairs (location as xs:string, context as item())*.

A sample:

<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:t="http://www.nesterovsky-bros.com/xslt/public/"
  exclude-result-prefixes="xs t">

<xsl:include href="try-block.xslt"/>

<xsl:template match="/">
  <result>
    <xsl:for-each select="1 to 10">
      <xsl:call-template name="t:try-block">
        <xsl:with-param name="value" tunnel="yes" select=". - 5"/>
        <xsl:with-param name="try" as="element()">
          <try/>
        </xsl:with-param>
        <xsl:with-param name="catch" as="element()">
          <t:error-handler/>
        </xsl:with-param>
      </xsl:call-template>
    </xsl:for-each>
  </result>
</xsl:template>

<xsl:template mode="t:call" match="try">
  <xsl:param name="value" tunnel="yes" as="xs:decimal"/>

  <value>
    <xsl:sequence select="1 div $value"/>
  </value>
</xsl:template>

</xsl:stylesheet>

The sample prints values according to the formula "1/(i - 5)", where "i" is a variable varying from 1 to 10. Clearly, division by zero occurs when "i" is equal to 5.

Please notice how the try/catch API is accessed through <xsl:include href="try-block.xslt"/>. The main logic is executed in <xsl:template mode="t:call" match="try"/>, which receives parameters using tunneling. A default error handler <t:error-handler/> is used to report errors.

Error report:

Error: FOAR0001
Description:
Decimal divide by zero

Location:
1. systemID: "file:///D:/style/try-block-test.xslt", line: 34
2. template mode="t:call" match="element(try, xs:anyType)"
  systemID: "file:///D:/style/try-block-test.xslt", line: 30
  context node:
    /*[1][local-name() = 'try']
3. template mode="t:call"
  match="element({http://www.nesterovsky-bros.com/xslt/private/try-block}try, xs:anyType)"
  systemID: "file:///D:/style/try-block.xslt", line: 53
  context node:
    /*[1][local-name() = 'try']
4. systemID: "file:///D:/style/try-block.xslt", line: 40
5. call-template name="t:try-block"
  systemID: "file:///D:/style/try-block-test.xslt", line: 17
6. for-each
  systemID: "file:///D:/style/try-block-test.xslt", line: 16
  context item: 5
7. template mode="saxon:_defaultMode" match="document-node()"
  systemID: "file:///D:/style/try-block-test.xslt", line: 14
  context node:
    /

Implementation details:

You were not expecting this API to be pure xslt, were you? :-)

Well, you're right, there is an extension function. Its pseudo code is like this:

function tryBlock(tryItems, catchItems)
{
  try
  {
    execute xsl:apply-templates for tryItems.
  }
  catch
  {
    execute xsl:apply-templates for catchItems.
  }
}
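To make the pseudo code more tangible, here is a minimal, self-contained java sketch of the same control flow. It is not the source of the extension function (that one is in the archive mentioned below and works with xslt items); the class name TryBlock, the use of Supplier and Function as handlers, and the scale of 4 digits are assumptions made for illustration only. The loop repeats the formula 1/(i - 5) from the sample.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

final class TryBlock
{
  // Runs the "try" handler; if it fails, runs the "catch" handler
  // with the error, so the caller can continue processing.
  static <T> T tryBlock(
    Supplier<T> tryHandler,
    Function<RuntimeException, T> catchHandler)
  {
    try
    {
      return tryHandler.get();
    }
    catch(RuntimeException e)
    {
      return catchHandler.apply(e);
    }
  }

  // Evaluates 1/(i - 5) for i from 1 to 10, as the sample does.
  static List<String> run()
  {
    List<String> result = new ArrayList<>();

    for(int i = 1; i <= 10; i++)
    {
      BigDecimal value = BigDecimal.valueOf(i - 5);

      result.add(
        tryBlock(
          () -> BigDecimal.ONE.
            divide(value, 4, RoundingMode.HALF_UP).toPlainString(),
          e -> "error: " + e.getClass().getSimpleName()));
    }

    return result;
  }
}
```

For i equal to 5 the division fails and the "catch" handler supplies the item; the other nine items are computed by the "try" handler.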

 

One last thing. Please get the implementation: saxon.extensions.zip. There you will find the sources of the try/catch and tuples/maps APIs.

Thursday, June 26, 2008 9:18:50 AM UTC  #    Comments [0] -
Announce | Tips and tricks | xslt
# Tuesday, June 17, 2008

Right now we live in the java world, thus all our tasks are (in)directly related to this environment.

We want to store stylesheets as resources of a java application, and at the same time to point to these stylesheets without jar qualification. In .NET this idea would not arise at all, as there are well defined boundaries between assemblies, but java uses a rather different approach. Whenever you have a resource name, it's up to the ClassLoader to find the resource. To exploit this feature we've created a uri resolver for the stylesheet transformation. The protocol we use has the following format: "resource:/resource-path".

For example, to store stylesheets in the META-INF/stylesheets folder we use the uri "resource:/META-INF/stylesheets/java/main.xslt". Relative paths are resolved naturally: the path "../jxom/java-serializer.xslt" in the previously mentioned stylesheet is resolved to "resource:/META-INF/stylesheets/jxom/java-serializer.xslt".
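The relative resolution relies on nothing but java.net.URI. A small sketch (the class name ResourceUriDemo is made up for this illustration) shows the resolution of the uris mentioned above:

```java
import java.net.URI;

final class ResourceUriDemo
{
  // Resolves a relative href against a "resource:" base uri.
  static URI resolve(String base, String href)
  {
    return URI.create(base).resolve(href);
  }

  public static void main(String[] args)
  {
    URI uri = resolve(
      "resource:/META-INF/stylesheets/java/main.xslt",
      "../jxom/java-serializer.xslt");

    // Prints: resource:/META-INF/stylesheets/jxom/java-serializer.xslt
    System.out.println(uri);

    // The resource name for the ClassLoader is the path without the
    // leading slash: META-INF/stylesheets/jxom/java-serializer.xslt
    System.out.println(uri.getPath().substring(1));
  }
}
```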

We've created a small class ResourceURIResolver. You need to supply this resolver to an instance of TransformerFactory:
  transformerFactory.setURIResolver(new ResourceURIResolver());

The class itself is so small that we quote it here:

import java.io.InputStream;

import java.net.URI;
import java.net.URISyntaxException;

import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;

import javax.xml.transform.stream.StreamSource;

/**
 * This class implements an interface that can be called by the processor
 * to turn a URI used in document(), xsl:import, or xsl:include into a
 * Source object.
 */
public class ResourceURIResolver implements URIResolver
{
  /**
   * Called by the processor when it encounters
   * an xsl:include, xsl:import, or document() function.
   *
   * This resolver supports protocol "resource:".
   * Format of uri is: "resource:/resource-path", where "resource-path" is an
   * argument of a {@link ClassLoader#getResourceAsStream(String)} call.
   * @param href - an href attribute, which may be relative or absolute.
   * @param base - a base URI against which the first argument will be made
   *   absolute if the absolute URI is required.
   * @return a Source object, or null if the href cannot be resolved, and
   *   the processor should try to resolve the URI itself.
   */
  public Source resolve(String href, String base)
    throws TransformerException
  {
    if (href == null)
    {
      return null;
    }

    URI uri;

    try
    {
      if (base == null)
      {
        uri = new URI(href);
      }
      else
      {
        uri = new URI(base).resolve(href);
      }
    }
    catch(URISyntaxException e)
    {
      // Unsupported uri.
      return null;
    }

    if (!"resource".equals(uri.getScheme()))
    {
      return null;
    }

    String resourceName = uri.getPath();

    if ((resourceName == null) || (resourceName.length() == 0))
    {
      return null;
    }

    if (resourceName.charAt(0) == '/')
    {
      resourceName = resourceName.substring(1);
    }

    ClassLoader classLoader = Thread.currentThread().getContextClassLoader();
    InputStream stream = classLoader.getResourceAsStream(resourceName);

    if (stream == null)
    {
      return null;
    }

    return new StreamSource(stream, uri.toString());
  }
}

Tuesday, June 17, 2008 7:57:52 AM UTC  #    Comments [0] -
Tips and tricks | xslt
# Friday, May 30, 2008

The project we're working on requires us to generate a java web application from an ancient language. The code being converted is transformed into java classes (thanks to jxom), and the presentation is converted into JSF (facelets) pages.

By the way, long before the java (.net) platform was conceived, there were languages and environments worked out so well that contemporary client-server paradigms (like JSF, ASP.NET, and so on) are just their isomorphisms.

The problem we were dealing with recently is JSF databinding for bean properties of types java.sql.Date, java.sql.Time, and java.sql.Timestamp.

At some point in the design we decided that these types are the most natural representation of data in the original language, as the program's activity is tightly connected to the database. Later on it became clear that JSF databinding does not like these types at all. We had to decide either to fall back and use java.util.Date for bean property types, or to do something with databinding.

It was not clear what the best way was until we found an elegant solution, namely to create an ELResolver to handle bean properties of these types. The solution works because custom el resolvers are applied before the standard resolvers (except the implicit one).

The class DateELResolver is a rather simple extension of the BeanELResolver. To use it you only need to register it in the faces-config.xml:

<faces-config version="1.2"
  xmlns="http://java.sun.com/xml/ns/javaee"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
    http://java.sun.com/xml/ns/javaee/web-facesconfig_1_2.xsd">
  <application>
    <el-resolver>com.nesterovskyBros.jsf.DateELResolver</el-resolver>
  </application>
</faces-config>
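The source of DateELResolver is not quoted here, so the following is only a guess at the kind of conversion such a resolver presumably has to perform before calling a property setter: turning a java.util.Date into the SQL type of the property. The class SqlDates and its method coerce are hypothetical names and are not part of JSF or of the resolver itself.

```java
import java.sql.Time;
import java.sql.Timestamp;
import java.util.Date;

final class SqlDates
{
  // Converts a java.util.Date to the requested java.sql type;
  // returns the value unchanged for any other target type.
  static Object coerce(Date value, Class<?> type)
  {
    if (value == null)
    {
      return null;
    }

    if (type == java.sql.Date.class)
    {
      return new java.sql.Date(value.getTime());
    }

    if (type == Time.class)
    {
      return new Time(value.getTime());
    }

    if (type == Timestamp.class)
    {
      return new Timestamp(value.getTime());
    }

    return value;
  }
}
```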

Friday, May 30, 2008 12:49:50 PM UTC  #    Comments [0] -
Tips and tricks
# Saturday, April 05, 2008

Does the WebSphere MQ library for .NET support a connection pool? This is a question asked by many .NET developers who deal with IBM WebSphere MQ and write multithreaded applications. Unfortunately, the answer is NO… The .NET version supports only individual connection types.

I have compared the two MQ libraries, the one for Java and the one for .NET, and I’ve found that most of the classes have the same declarations, except for one difference that is crucial for me. Unlike the .NET one, the Java MQ library provides several classes implementing MQ connection pooling. There is nothing similar in the .NET library.

There are a few common workarounds for this annoying restriction. One of them (recommended by IBM in their “MQ using .NET”) is to keep one open MQ connection per thread. Unfortunately, this approach does not work for ASP.NET applications (including web services).

The good news is that starting from service pack 5 for MQ 5.3, and of course in MQ 6.xx, sharing MQ connections in blocking mode is supported:

“The implementation of WebSphere MQ .NET ensures that, for a given connection (MQQueueManager object instance), all access to the target WebSphere MQ queue manager is synchronized. The default behavior is that a thread that wants to issue a call to a queue manager is blocked until all other calls in progress for that connection are complete.”

This allows creating an MQ connection (note that an MQQueueManager object is a wrapper for an MQ connection) in one thread and using it exclusively in another thread without side effects caused by multithreading.

Taking this feature into account, I’ve created a simple MQ connection pool. It’s easy to use. The main class MQPoolManager has only two static methods:

public static MQQueueManager Get(string QueueManagerName, string ChannelName, string ConnectionName);

and

public static void Release(ref MQQueueManager queueManager);

The method Get returns an MQ queue manager (either an existing one from the pool or a newly created one), and Release returns it to the connection pool. Internally, MQPoolManager tracks expired connections and performs finalization if needed.

So, you may use one MQ connection pool per application domain without additional effort or big changes in existing applications.

By the way, this approach has allowed us to considerably optimize the performance of the MQ part in one of our projects.

Later on...

To clarify the use of MQPoolManager, I've decided to show the following code snippet here:

  MQQueueManager queueManager =
    MQPoolManager.Get(QueueManagerName, ChannelName, ConnectionName);

  try
  {
    // TODO: some work with MQ here
  }
  finally
  {
    MQPoolManager.Release(ref queueManager);
  }

  // at this point the queueManager is null
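The pooling idea itself is not tied to MQ or to .NET. Below is a minimal java sketch of a get/release pool with expiration. It is not the MQPoolManager source: SimplePool, its string key and the maxIdleMillis parameter are assumptions made for illustration, and a real pool would also have to disconnect expired connections.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

final class SimplePool<C>
{
  // An idle connection along with the time it was released.
  private static final class Idle<C>
  {
    final C connection;
    final long releasedAt;

    Idle(C connection, long releasedAt)
    {
      this.connection = connection;
      this.releasedAt = releasedAt;
    }
  }

  private final Map<String, Deque<Idle<C>>> idle = new HashMap<>();
  private final long maxIdleMillis;

  SimplePool(long maxIdleMillis)
  {
    this.maxIdleMillis = maxIdleMillis;
  }

  // Returns an idle, not yet expired connection for the key,
  // or creates a new one using the factory.
  synchronized C get(String key, Supplier<C> factory)
  {
    Deque<Idle<C>> queue = idle.get(key);
    long now = System.currentTimeMillis();

    while((queue != null) && !queue.isEmpty())
    {
      Idle<C> item = queue.pop();

      if (now - item.releasedAt <= maxIdleMillis)
      {
        return item.connection;
      }

      // Expired: a real pool would disconnect the connection here.
    }

    return factory.get();
  }

  // Returns a connection to the pool.
  synchronized void release(String key, C connection)
  {
    idle.computeIfAbsent(key, k -> new ArrayDeque<>()).
      push(new Idle<>(connection, System.currentTimeMillis()));
  }
}
```

The key stands for the triple (queue manager name, channel name, connection name) passed to Get.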

Saturday, April 05, 2008 8:55:57 PM UTC  #    Comments [6] -
Tips and tricks
# Tuesday, March 25, 2008

In the xslt world, in contrast to programming languages like C++/java/c# where access modifiers are essential, there is no widespread custom of thinking of stylesheet members as public or private. The reason lies in the complexity of stylesheets: the smaller the code, the easier it is for a developer to keep all the details in mind. Whenever an xslt program grows, you should modularize it to keep it manageable.

At the point where modules are introduced, one starts thinking of the public interface of a module and of its implementation details. This separation is especially important for template matching, as you probably won't want a private template to match just because you've forgotten about some template in the implementation of some module.

To distinguish public members from private ones you can introduce two namespaces in your stylesheet, like:

  xmlns:t="http://www.nesterovsky-bros.com/public"
  xmlns:p="http://www.nesterovsky-bros.com/private/expression.xslt"

For the private namespace you can use a unique name, e.g. the stylesheet name as part of the uri.

The following example is based on jxom. This stylesheet builds an expression from an expression tree. The public part consists only of the t:get-expression function; the other members are private:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:t="http://www.nesterovsky-bros.com/public"
  xmlns:p="http://www.nesterovsky-bros.com/private/expression.xslt"
  xmlns="http://www.nesterovsky-bros.com/download/jxom.zip"
  xpath-default-namespace="http://www.nesterovsky-bros.com/download/jxom.zip"
  exclude-result-prefixes="xs t p">

  <xsl:output method="text" indent="yes"/>

  <!-- Entry point. -->
  <xsl:template match="/">
    <xsl:variable name="expression" as="element()">
      <lt>
        <sub>
          <mul>
            <var name="b"/>
            <var name="b"/>
          </mul>
          <mul>
            <mul>
              <int>4</int>
              <var name="a"/>
            </mul>
            <var name="c"/>
          </mul>
        </sub>
        <double>0</double>
      </lt>
    </xsl:variable>

    <xsl:value-of select="t:get-expression($expression)" separator=""/>
  </xsl:template>

  <!--
    Gets expression.
      $element - expression element.
      Returns expression tokens.
  -->
  <xsl:function name="t:get-expression" as="item()*">
    <xsl:param name="element" as="element()"/>

    <xsl:apply-templates mode="p:expression" select="$element"/>
  </xsl:function>

  <!--
    Gets binary expression.
      $element - assignment expression.
      $type - expression type.
      Returns expression token sequence.
  -->
  <xsl:function name="p:get-binary-expression" as="item()*">
    <xsl:param name="element" as="element()"/>
    <xsl:param name="type" as="xs:string"/>

    <xsl:sequence select="t:get-expression($element/*[1])"/>
    <xsl:sequence select="' '"/>
    <xsl:sequence select="$type"/>
    <xsl:sequence select="' '"/>
    <xsl:sequence select="t:get-expression($element/*[2])"/>
  </xsl:function>

  <!-- Mode "expression". Empty match. -->
  <xsl:template mode="p:expression" match="@*|node()">
    <xsl:sequence select="error(xs:QName('invalid-expression'), name())"/>
  </xsl:template>

  <!-- Mode "expression". or. -->
  <xsl:template mode="p:expression" match="or">
    <xsl:sequence select="p:get-binary-expression(., '||')"/>
  </xsl:template>

  <!-- Mode "expression". and. -->
  <xsl:template mode="p:expression" match="and">
    <xsl:sequence select="p:get-binary-expression(., '&amp;&amp;')"/>
  </xsl:template>

  <!-- Mode "expression". eq. -->
  <xsl:template mode="p:expression" match="eq">
    <xsl:sequence select="p:get-binary-expression(., '==')"/>
  </xsl:template>

  <!-- Mode "expression". ne. -->
  <xsl:template mode="p:expression" match="ne">
    <xsl:sequence select="p:get-binary-expression(., '!=')"/>
  </xsl:template>

  <!-- Mode "expression". le. -->
  <xsl:template mode="p:expression" match="le">
    <xsl:sequence select="p:get-binary-expression(., '&lt;=')"/>
  </xsl:template>

  <!-- Mode "expression". ge. -->
  <xsl:template mode="p:expression" match="ge">
    <xsl:sequence select="p:get-binary-expression(., '>=')"/>
  </xsl:template>

  <!-- Mode "expression". lt. -->
  <xsl:template mode="p:expression" match="lt">
    <xsl:sequence select="p:get-binary-expression(., '&lt;')"/>
  </xsl:template>

  <!-- Mode "expression". gt. -->
  <xsl:template mode="p:expression" match="gt">
    <xsl:sequence select="p:get-binary-expression(., '>')"/>
  </xsl:template>

  <!-- Mode "expression". add. -->
  <xsl:template mode="p:expression" match="add">
    <xsl:sequence select="p:get-binary-expression(., '+')"/>
  </xsl:template>

  <!-- Mode "expression". sub. -->
  <xsl:template mode="p:expression" match="sub">
    <xsl:sequence select="p:get-binary-expression(., '-')"/>
  </xsl:template>

  <!-- Mode "expression". mul. -->
  <xsl:template mode="p:expression" match="mul">
    <xsl:sequence select="p:get-binary-expression(., '*')"/>
  </xsl:template>

  <!-- Mode "expression". div. -->
  <xsl:template mode="p:expression" match="div">
    <xsl:sequence select="p:get-binary-expression(., '/')"/>
  </xsl:template>

  <!-- Mode "expression". neg. -->
  <xsl:template mode="p:expression" match="neg">
    <xsl:sequence select="'-'"/>
    <xsl:sequence select="t:get-expression(*[1])"/>
  </xsl:template>

  <!-- Mode "expression". not. -->
  <xsl:template mode="p:expression" match="not">
    <xsl:sequence select="'!'"/>
    <xsl:sequence select="t:get-expression(*[1])"/>
  </xsl:template>

  <!-- Mode "expression". parens. -->
  <xsl:template mode="p:expression" match="parens">
    <xsl:sequence select="'('"/>
    <xsl:sequence select="t:get-expression(*[1])"/>
    <xsl:sequence select="')'"/>
  </xsl:template>

  <!-- Mode "expression". var. -->
  <xsl:template mode="p:expression" match="var">
    <xsl:sequence select="@name"/>
  </xsl:template>

  <!-- Mode "expression". int, short, byte, long, float, double. -->
  <xsl:template mode="p:expression"
    match="int | short | byte | long | float | double">
    <xsl:sequence select="."/>
  </xsl:template>

 </xsl:stylesheet>

Tuesday, March 25, 2008 6:23:30 AM UTC  #    Comments [0] -
Tips and tricks | xslt
# Saturday, February 16, 2008

Hello again!

To see the first part about jxom, please read the earlier post.

I'm back with jxom (Java xml object model). I've finally managed to create an xslt that generates java code from a jxom document.

You may ask why it took as long as a week to produce it.

There are two answers:
1. My poor talents.
2. I've virtually created two implementations.

My first approach was to generate java text directly from xml. I truly believed that this was the way. I screwed things up on that path: once you start dealing with indentation, formatting and reformatting of the text you're generating, you see that things are not that simple. Well, it was a naive approach.

I could have finished it, however at some point I realized that its complexity does not compose from the complexity of its parts, but grows more and more. This is not acceptable for such a simple task. The approach is bad. Period.

The alternative I've devised is simple, and in fact more natural than the naive approach. This is a two stage generation:
  a) generate sequence of tokens - serializer;
  b) generate and then print a sequence of lines - streamer.

Tokens (item()*) are either control words (xs:QName), or literals (xs:string).

I've defined the following control tokens:

Token           Description
t:indent        indents following content.
t:unindent      unindents following content.
t:line-indent   resets indentation for one line.
t:new-line      new line token.
t:terminator    separates token sequences.
t:code          marks line as code (default line type).
t:doc           marks line as documentation comment.
t:begin-doc     marks line as begin of documentation comment.
t:end-doc       marks line as end of documentation comment.
t:comment       marks line as comment.

Thus an input for the streamer looks like:

<xsl:sequence select="'public'"/>
<xsl:sequence select="' '"/>
<xsl:sequence select="'class'"/>
<xsl:sequence select="' '"/>
<xsl:sequence select="'A'"/>
<xsl:sequence select="$t:new-line"/>
<xsl:sequence select="'{'"/>
<xsl:sequence select="$t:new-line"/>
<xsl:sequence select="$t:indent"/>
<xsl:sequence select="'public'"/>
<xsl:sequence select="' '"/>
<xsl:sequence select="'int'"/>
<xsl:sequence select="' '"/>
<xsl:sequence select="'a'"/>
<xsl:sequence select="';'"/>
<xsl:sequence select="$t:unindent"/>
<xsl:sequence select="$t:new-line"/>
<xsl:sequence select="'}'"/>
<xsl:sequence select="$t:new-line"/>

The streamer receives a sequence of tokens and transforms it into a sequence of lines.

One beautiful thing about tokens is that the streamer can easily perform line breaks in order to keep the page width; another convenient thing is that the code generating tokens does not have to track the indentation level, as it just uses the t:indent and t:unindent control tokens to increase and decrease the current indentation.
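To show how little the streamer has to know, here is a minimal java sketch of the idea. The real streamer is java-streamer.xslt; this sketch is not a port of it. It models only three control tokens as the hypothetical enum Control, assumes two spaces per indentation level, and assumes that the indentation of a line is fixed at the moment the line receives its first literal.

```java
import java.util.ArrayList;
import java.util.List;

final class Streamer
{
  // A subset of control tokens; literals are plain strings.
  enum Control { INDENT, UNINDENT, NEW_LINE }

  // Converts a sequence of tokens into a sequence of lines.
  static List<String> toLines(List<Object> tokens)
  {
    List<String> lines = new ArrayList<>();
    StringBuilder line = new StringBuilder();
    int level = 0;

    for(Object token: tokens)
    {
      if (token == Control.INDENT)
      {
        ++level;
      }
      else if (token == Control.UNINDENT)
      {
        --level;
      }
      else if (token == Control.NEW_LINE)
      {
        lines.add(line.toString());
        line.setLength(0);
      }
      else
      {
        if (line.length() == 0)
        {
          // Indentation is applied when a line gets its first literal.
          for(int i = 0; i < level; i++)
          {
            line.append("  ");
          }
        }

        line.append(token);
      }
    }

    if (line.length() > 0)
    {
      lines.add(line.toString());
    }

    return lines;
  }
}
```

Fed with the token sequence shown above, this sketch produces four lines: "public class A", "{", "  public int a;" and "}".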

The way the code is built allows it to mimic any code style. I've followed my favorite one. In the future I'll probably add options controlling the code style. My todo list still has several features I want to implement, such as a line breaker to preserve the page width, and a type qualification optimizer (an optional feature) to reduce unnecessary type qualifications.

The current implementation can be found in jxom.zip. It contains:

File                              Description
java.xsd                          jxom xml schema.
java-serializer-main.xslt         transformation entry point.
java-serializer.xslt              generates tokens for top level constructs.
java-serializer-statements.xslt   generates tokens for statements.
java-serializer-expressions.xslt  generates tokens for expressions.
java-streamer.xslt                converts tokens into lines.
DataAdapter.xml                   sample jxom document.

This was my first experience with xslt 2.0. I feel very pleased with what it can do. The only feature I miss is an indirect function call (which I do not want to model with the dull template matching approach).

Note that although the xslt I've built is platform independent, I was experimenting with saxon 9. Several times I've relied on an efficient tail call implementation (see t:cumulative-integer-sum); without it the xslt would run into a stack overflow.

I'd be pleased to get your feedback on the subject.

Saturday, February 16, 2008 10:42:16 AM UTC  #    Comments [6] -
Tips and tricks | xslt
# Saturday, February 09, 2008

Hello,

I have not written for a long time. IMHO: nothing to say? - don't make noise!

Nowadays I'm busy with xslt.

Should I be pleased that the w3c committee has finally delivered xpath 2.0/xslt 2.0/xquery? There were possibly people who did not live to see this happen. Be grateful to fate that we have survived!

I'm working now with saxon 9. It's a good implementation, however too interpreter-like in my opinion. I think these languages could be compiled down to machine/vm code the same way c++/java/c# are.

To the point.
I need to generate java code in xslt. I've done this before; that time I dealt with relatively simple templates like beans or interfaces. Now I need to generate beans, interfaces, and classes with logic. In fact I should cover almost all java 6 features.

Immediately I started thinking in terms of a java xml object model (jxom). Thus there will be an xml schema of jxom (Am I reinventing the wheel? Please point me to an existing schema!) - java grammar as xml. There will be xslts that generate code according to this schema, and an xslt that will serialize jxom documents directly into java.

This two stage generation is important, as there are essentially two different tasks: generating java code, and serializing it down to a text format. Moreover, whenever I have a jxom document I can manipulate it! And finally, this will allow our team to concentrate its efforts, as one only needs to generate a jxom document.

Yesterday I found a java ANTLR grammar and converted it into an xml schema: java.xsd. It is important to have this xml schema defined, even if no one uses it except in an editor, as it makes jxom generation more formal.

The next step is to create the xslt serializer, which is in the todo list.

To get a feel for how jxom looks, I've created it manually for a simple java file:

// $Id: DataAdapter.java 1122 2007-12-31 12:43:47Z arthurn $
package com.bphx.coolgen.data;

import java.util.List;

/**
 * Encapsulates encyclopedia database access.
 */
public interface DataAdapter
{
  /**
   * Starts data access session for a specified model.
   * @param modelId - a model to open.
   */
  void open(int modelId)
    throws Exception;

  /**
   * Ends data access session.
   */
  void close()
    throws Exception;

  /**
   * Gets current model id.
   * @return current model id.
   */
  int getModelId();

  /**
   * Gets data objects for a specified object type for the current model.
   * @param type - an object type to get data objects for.
   * @return list of data objects.
   */
  List<DataObject> getObjectsForType(short type)
    throws Exception;

  /**
   * Gets a list of data associations for an object id.
   * @param id - object id.
   * @return list of data associations.
   */
  List<DataAssociation> getAssociations(int id)
    throws Exception;

  /**
   * Gets a list of data properties for an object id.
   * @param id - object id.
   * @return list of data properties.
   */
  List<DataProperty> getProperties(int id)
    throws Exception;
}

jxom:

<unit xmlns="http://www.bphx.com/java-1.5/2008-02-07" package="com.bphx.coolgen.data">
  <comment>$Id: DataAdapter.java 1122 2007-12-31 12:43:47Z arthurn $</comment>
  <import package="java.util.List"/>
  <interface access="public" name="DataAdapter">
    <comment doc="true">Encapsulates encyclopedia database access.</comment>
    <method name="open">
      <comment doc="true">
        Starts data access session for a specified model.
        <para type="param" name="modelId">a model to open.</para>
      </comment>
      <parameters>
        <parameter name="modelId"><type name="int"/></parameter>
      </parameters>
      <throws><type name="Exception"/></throws>
    </method>
    <method name="close">
      <comment doc="true">Ends data access session.</comment>
      <throws><type name="Exception"/></throws>
    </method>
    <method name="getModelId">
      <comment doc="true">
        Gets current model id.
        <para type="return">current model id.</para>
      </comment>
      <returns><type name="int"/></returns>
    </method>
    <method name="getObjectsForType">
      <comment doc="true">
        Gets data objects for a specified object type for the current model.
        <para type="param" name="type">
          an object type to get data objects for.
        </para>
        <para type="return">list of data objects.</para>
      </comment>
      <returns>
        <type>
          <part name="List">
            <typeArgument><type name="DataObject"/></typeArgument>
          </part>
        </type>
      </returns>
      <parameters>
        <parameter name="type"><type name="short"/></parameter>
      </parameters>
      <throws><type name="Exception"/></throws>
    </method>
    <method name="getAssociations">
      <comment doc="true">
        Gets a list of data associations for an object id.
        <para type="param" name="id">object id.</para>
        <para type="return">list of data associations.</para>
      </comment>
      <returns>
        <type>
          <part name="List">
            <typeArgument><type name="DataAssociation"/></typeArgument>
          </part>
        </type>
      </returns>
      <parameters>
        <parameter name="id"><type name="int"/></parameter>
      </parameters>
      <throws><type name="Exception"/></throws>
    </method>
    <method name="getProperties">
      <comment doc="true">
        Gets a list of data properties for an object id.
        <para type="param" name="id">object id.</para>
        <para type="return">list of data properties.</para>
      </comment>
      <returns>
        <!-- Compact form of generic type. -->
        <type name="List&lt;DataProperty&gt;"/>
      </returns>
      <parameters>
        <parameter name="id"><type name="int"/></parameter>
      </parameters>
      <throws><type name="Exception"/></throws>
    </method>
  </interface>
</unit>

To read about xslt for jxom please follow this link.

Saturday, February 09, 2008 5:56:45 PM UTC  #    Comments [0] -
Tips and tricks | xslt
# Monday, March 12, 2007
C++ Standard Library Issues List, Issue 254

I have been tracking this issue for several years, and have my own humble opinion. To make my arguments clear, I'll quote the issue description here.

254. Exception types in clause 19 are constructed from std::string

Section: 19.1 [std.exceptions], 27.4.2.1.1 [ios::failure] Status: Tentatively Ready Submitter: Dave Abrahams Date: 2000-08-01

Discussion:

Many of the standard exception types which implementations are required to throw are constructed with a const std::string& parameter. For example:

     19.1.5  Class out_of_range                          [lib.out.of.range]
     namespace std {
       class out_of_range : public logic_error {
       public:
         explicit out_of_range(const string& what_arg);
       };
     }

   1 The class out_of_range defines the type of objects  thrown  as  excep-
     tions to report an argument value not in its expected range.

     out_of_range(const string& what_arg);

     Effects:
       Constructs an object of class out_of_range.
     Postcondition:
       strcmp(what(), what_arg.c_str()) == 0.

There are at least two problems with this:

  1. A program which is low on memory may end up throwing std::bad_alloc instead of out_of_range because memory runs out while constructing the exception object.
  2. An obvious implementation which stores a std::string data member may end up invoking terminate() during exception unwinding because the exception object allocates memory (or rather fails to) as it is being copied.

There may be no cure for (1) other than changing the interface to out_of_range, though one could reasonably argue that (1) is not a defect. Personally I don't care that much if out-of-memory is reported when I only have 20 bytes left, in the case when out_of_range would have been reported. People who use exception-specifications might care a lot, though.

There is a cure for (2), but it isn't completely obvious. I think a note for implementors should be made in the standard. Avoiding possible termination in this case shouldn't be left up to chance. The cure is to use a reference-counted "string" implementation in the exception object. I am not necessarily referring to a std::string here; any simple reference-counting scheme for a NTBS would do.

Further discussion, in email:

...I'm not so concerned about (1). After all, a library implementation can add const char* constructors as an extension, and users don't need to avail themselves of the standard exceptions, though this is a lame position to be forced into. FWIW, std::exception and std::bad_alloc don't require a temporary basic_string.

...I don't think the fixed-size buffer is a solution to the problem, strictly speaking, because you can't satisfy the postcondition
  strcmp(what(), what_arg.c_str()) == 0
for all values of what_arg (i.e. very long values). That means that the only truly conforming solution requires a dynamic allocation.

Further discussion, from Redmond:

The most important progress we made at the Redmond meeting was realizing that there are two separable issues here: the const string& constructor, and the copy constructor. If a user writes something like throw std::out_of_range("foo"), the const string& constructor is invoked before anything gets thrown. The copy constructor is potentially invoked during stack unwinding.

The copy constructor is a more serious problem, because failure during stack unwinding invokes terminate. The copy constructor must be nothrow. Curaçao: Howard thinks this requirement may already be present.

The fundamental problem is that it's difficult to get the nothrow requirement to work well with the requirement that the exception objects store a string of unbounded size, particularly if you also try to make the const string& constructor nothrow. Options discussed include:

  • Limit the size of a string that exception objects are required to throw: change the postconditions of 19.1.2 [domain.error] paragraph 3 and 19.1.6 [runtime.error] paragraph 3 to something like this: "strncmp(what(), what_arg.c_str(), N) == 0, where N is an implementation defined constant no smaller than 256".
  • Allow the const string& constructor to throw, but not the copy constructor. It's the implementor's responsibility to get it right. (An implementor might use a simple refcount class.)
  • Compromise between the two: an implementation is not allowed to throw if the string's length is less than some N, but, if it doesn't throw, the string must compare equal to the argument.
  • Add a new constructor that takes a const char*

(Not all of these options are mutually exclusive.)

...

To be honest, I do not understand their (the committee members') decisions. It seems they are hiding from the problem, in effect proposing to store a character buffer in the exception object. In fact the problem is more general: it concerns any exception type that stores some data and whose copy constructor can throw. How can we avoid problems during copy construction? Well, do not perform any activity that can lead to an exception. If copying data can throw, then do not copy it! Thus we have to share data between exception objects.

This logic brought me to a safe exception type design: an exception object should keep a refcounted handle to a data object that is shared between all copies of the exception.
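
A minimal sketch of such a design, assuming C++11 and std::shared_ptr as the refcounted handle. The names shared_message_error and shared_message_error_demo are hypothetical and are not part of the standard library or of the issue text. The only allocation happens in the const string& constructor, before anything is thrown; copying merely bumps a reference count.

```cpp
#include <cassert>
#include <exception>
#include <memory>
#include <string>
#include <type_traits>

// Hypothetical exception type: the message is allocated once
// and shared by all copies of the exception object.
class shared_message_error : public std::exception
{
public:
  // May throw std::bad_alloc, but this runs before the throw itself.
  explicit shared_message_error(const std::string& what_arg)
    : message_(std::make_shared<std::string>(what_arg))
  {
  }

  // Copying shares the message and only increments a counter.
  shared_message_error(const shared_message_error&) noexcept = default;
  shared_message_error& operator=(const shared_message_error&) noexcept = default;

  const char* what() const noexcept override
  {
    return message_->c_str();
  }

private:
  std::shared_ptr<const std::string> message_;
};

// The compiler confirms that the copy constructor cannot throw.
static_assert(
  std::is_nothrow_copy_constructible<shared_message_error>::value,
  "copying must not throw");

// Small check: a copy points at the very same character buffer.
inline void shared_message_error_demo()
{
  shared_message_error original("index out of range");
  shared_message_error copy(original);
  assert(copy.what() == original.what());
}
```

With this layout the postcondition strcmp(what(), what_arg.c_str()) == 0 holds for a message of any length, and problem (2) disappears, while problem (1) stays where the issue text leaves it: in the constructor.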

The only question is: why didn't they even consider this approach?

Monday, March 12, 2007 9:52:09 AM UTC  #    Comments [0] -
Tips and tricks
# Thursday, November 23, 2006

In one of our latest projects (GUI on .NET 2.0) we've felt all the power of .NET globalization, but an annoying thing happened too...

In our case the annoying thing was that UI culture info is not shared between the main (UI) thread and auxiliary threads (threads from ThreadPool, manually created threads etc.). It seems we've fallen into a .NET globalization pitfall.

We assumed that, at least for all asynchronous delegate calls, the same UI culture info is used as in the main thread. This is a common mistake, and what's more annoying, there is not a single line in the MSDN documentation about this issue. :-S

Let's look closer at this issue. Our application starts on a computer with English regional settings ("en-En"), and during startup we change the UI culture info to the one specified in the configuration file:

// set the culture from the config file
try
{
  Thread.CurrentThread.CurrentUICulture =
    new CultureInfo(Settings.Default.CultureName);
}
catch
{
  // use the default UI culture info
}

Thus, all the screens of this GUI application are displayed according to the specified culture. There are also localized strings stored in resource files that are used as log messages, exception messages etc., and these can be produced from within different threads (e.g. asynchronous delegate calls).

So, while the application is running and all screens are displayed according to the specified culture, all the exceptions from auxiliary threads are still in English. :'( This happens because threads for asynchronous calls are taken from ThreadPool, and all these threads were created with the default culture.

Conclusion
Take care of CurrentUICulture in different threads yourself, and be careful - there are still pitfalls along the way...

Thursday, November 23, 2006 10:55:10 AM UTC  #    Comments [0] -
Tips and tricks
# Monday, October 02, 2006

Return a table of numbers from 0 up to some value. I face this recurring task once every several years. Such periodicity prompts me to invent a solution once again, but using contemporary features.

November 18:

This time I have managed to solve the task in a single select:

declare @count int;

set @count = 1000;

with numbers(value) as
(
  select 0
  union all
  select value * 2 + 1 from numbers where value < @count / 2
  union all
  select value * 2 + 2 from numbers where value < (@count - 1) / 2
)
select
  row_number() over(order by U.V) value
from
  numbers cross apply (select 1 V) U;
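
To see why the recursion covers every number exactly once, here is a small C++ sketch that mirrors the CTE. The function name numbers is borrowed from the query, and the rest is an illustration, not something SQL Server executes. Each value v produces 2v + 1 and 2v + 2, like child indices in a binary heap, so the depth of the recursion grows roughly as log2(count) rather than linearly and stays far below SQL Server's default limit of 100 recursion levels.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mirrors the recursive CTE: start from 0, then derive 2v + 1 and 2v + 2
// from every value v already produced, while the same guards hold.
std::vector<int> numbers(int count)
{
  assert(count > 0);

  std::vector<int> result{0};

  // Index-based loop: the vector grows while we walk over it.
  for (std::size_t i = 0; i < result.size(); ++i)
  {
    const int value = result[i];

    if (value < count / 2)
      result.push_back(value * 2 + 1);

    if (value < (count - 1) / 2)
      result.push_back(value * 2 + 2);
  }

  return result;
}
```

For a positive count the sketch yields the values 0 .. count - 1, each exactly once. The outer select in the query then renumbers the rows with row_number(), which starts at 1.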

Do you have a better solution?

Monday, October 02, 2006 7:27:51 AM UTC  #    Comments [0] -
SQL Server puzzle | Tips and tricks
# Tuesday, May 30, 2006

We're building a .NET 2.0 GUI application. Part of the project is localization. Following MSDN's advice, we created *.resx files and sent them to a foreign team that performs localization using the WinRes tool.

Several of our user controls contain a SplitContainer control. We never thought this could present a problem. Unfortunately it does!

When you try to open the resx for such a user control you get:

Error - Failed to load the resource due to the following error:
System.MissingMethodException: Constructor on type 'System.Windows.Forms.SplitterPanel' not found.

We started digging into WinRes.exe (thanks to .NET Reflector) and found the solution: the split container has to be named so that its parent's name appears before (in ascending sort order) the name of the split container itself.

Say you have a form "MyForm" and a split container "ASplitContainer"; then you should rename the split container to, say, "_ASplitContainer". In this case resources are stored as:

Name                     Parent Name
MyForm
_ASplitContainer         MyForm
_ASplitContainer.Panel1  _ASplitContainer
_ASplitContainer.Panel2  _ASplitContainer

This makes WinRes happy. :-)

Tuesday, May 30, 2006 10:13:18 AM UTC  #    Comments [0] -
Tips and tricks
# Tuesday, November 01, 2005

Today we spent some time looking for samples of web services in RPC/encoded style, and we found a great site: http://www.xmethods.com/. This site contains a lot of web service samples in Document/literal and RPC/encoded styles. We think this link will be useful for both developers and testers.

Tuesday, November 01, 2005 10:51:37 AM UTC  #    Comments [0] -
Tips and tricks
# Tuesday, May 31, 2005

Yesterday we ran into the following problem: how to retrieve a session object from within a Java web service? The crucial point of the problem was that we generate our web service automatically from a Java bean, and this web service works under WebSphere v5.1.1.

After spending some time looking for an acceptable solution, we found that it's possible either to implement “session substitution” using an EJB SessionBean or to somehow retrieve the HttpSession instance.

The first approach has a lot of advantages over the second one, but it requires implementing a bunch of EJB objects (the session bean itself, the home object etc.). The second approach solves our problem only for a web service accessed via HTTP, and no more, but... it requires changing only a few lines in the Java bean code. This second approach is based on implementing the javax.xml.rpc.server.ServiceLifecycle interface in our Java bean. For details take a look at the following article: “Web services programming tips and tricks: Build stateful sessions in JAX-RPC applications”.

Actually, only two additional methods, init() and destroy(), were implemented. The init() method receives (during initialization) a ServletEndpointContext instance, which is stored in a private field of the bean. Later, ServletEndpointContext.getHttpSession() is called in order to get the HttpSession. So easy, so quick - we were just pleased.

Tuesday, May 31, 2005 12:03:33 PM UTC  #    Comments [0] -
Tips and tricks
# Tuesday, May 03, 2005

In a class description you often read:

Thread Safety

Any public static (Shared in Visual Basic) members of this type are safe for multithreaded operations. Any instance members are not guaranteed to be thread safe.

But do not rely too much on these assurances. Sometimes you read this only because the help template contained this text.

Let's look at the Encoding.GetEncoding static method:

public static Encoding GetEncoding(string name)
{
  return Encoding.GetEncoding(EncodingTable.GetCodePageFromName(name));
}

and

internal static int GetCodePageFromName(string name)
{
  if (name == null)
    throw new ArgumentNullException("name");

  object obj1 = EncodingTable.hashByName[name];

  if (obj1 != null)
    return (int) obj1;

  name = name.ToLower(CultureInfo.InvariantCulture);
  obj1 = EncodingTable.hashByName[name];

  if (obj1 != null)
    return (int) obj1;

  int num1 = EncodingTable.internalGetCodePageFromName(name);

  EncodingTable.hashByName[name] = num1;

  return num1;
}

As you can see, in the worst case, when our encoding isn't cached yet, it gets cached in EncodingTable.hashByName.

I just want to point out that a Hashtable write isn't thread safe when several threads may write at once.
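
The usual cure is to guard the shared table with a lock. Below is a minimal sketch of that pattern, written in C++ rather than C# and not taken from the .NET sources. The names code_page_cache and slow_code_page_lookup are hypothetical, and the lookup is a stand-in for internalGetCodePageFromName that returns illustrative values only.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in for the slow lookup; the returned values
// are for illustration only.
inline int slow_code_page_lookup(const std::string& name)
{
  return name == "utf-8" ? 65001 : 0;
}

// A cache that is safe to call from several threads at once.
class code_page_cache
{
public:
  int get(const std::string& name)
  {
    assert(!name.empty());

    // One lock guards both the read and the write of the shared map.
    std::lock_guard<std::mutex> lock(mutex_);

    const auto it = by_name_.find(name);

    if (it != by_name_.end())
      return it->second;

    const int code_page = slow_code_page_lookup(name);

    by_name_[name] = code_page;

    return code_page;
  }

private:
  std::mutex mutex_;
  std::map<std::string, int> by_name_;
};
```

Taking the lock on every call is the simplest correct choice; whether the cost matters depends on how hot the lookup is.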

Tuesday, May 03, 2005 1:53:58 PM UTC  #    Comments [0] -
Tips and tricks
Disclaimer
The opinions expressed herein are our own personal opinions and do not represent our employer's view in any way.

© 2010, Nesterovsky bros