# Sunday, 05 April 2009

Praises: I dare not think how we could live without AnkhSVN.

At present we have:

  • a generic parser;
  • a fully functional XQuery parser;
  • detailed error reports and syntax suggestions;
  • high performance.

The idea of a runtime grammar tree and a reader-like parser results in high performance, as we are able to build lookup tables to probe tokens. This allows us to start parsing immediately from the most specific grammar chain. For example, consider the XQuery grammar:

[1] Module ::= VersionDecl? (LibraryModule | MainModule)

[2] VersionDecl ::= "xquery" "version" StringLiteral ("encoding" StringLiteral)? Separator

[3] MainModule ::= Prolog QueryBody

[4] LibraryModule ::= ModuleDecl Prolog

[5] ModuleDecl ::= "module" "namespace" NCName "=" URILiteral Separator

[6] Prolog ::=
  ((DefaultNamespaceDecl | Setter | NamespaceDecl | Import) Separator)*
  ((VarDecl | FunctionDecl | OptionDecl) Separator)*
...
[87] VarRef ::= "$" VarName

Formally, to parse the XQuery "$v" one needs to descend deep into the grammar hierarchy, and that is what is usually done. In contrast, a lookup table for the grammar "Module", containing 80 different token runs, allows us to identify the grammar chain with just a couple of probes:

[0] "xquery" "version"
[1] "module" "namespace"
[2] "declare" "default" "element" "namespace"
[3] "declare" "default" "function" "namespace"
[4] "declare" "boundary-space"
[5] "declare" "default" "collation"
[6] "declare" "base-uri"
[7] "declare" "construction"
[8] "declare" "ordering"
[9] "declare" "default" "order" "empty"
[10] "declare" "copy-namespaces"
[11] "declare" "namespace"
[12] "declare" "schema"
[13] "import" "module"
[14] "declare" "variable" "$"
[15] "declare" "function"
[16] "declare" "option"
[17] "for" "$"
[18] "let" "$"
[19] "some" "$"
[20] "every" "$"
[21] "typeswitch" "("
[22] "if" "("
[23] "-"
[24] "+"
[25] "validate" "{"
[26] "validate" "lax"
[27] "validate" "strict"
[28] "/"
[29] "//"
[30] <integer>
[31] <decimal>
[32] <double>
[33] <string>
[34] "$"
[35] "("
[36] "."
[37] <functionname> "("
[38] "ordered" "{"
[39] "unordered" "{"
[40] "<" <qname>
[41] <!--literal-->
[42] <?pi literal?>
[43] "document" "{"
[44] "element" <qname>
[45] "element" "{"
[46] "attribute" <qname>
[47] "attribute" "{"
[48] "text" "{"
[49] "comment" "{"
[50] "processing-instruction" <ncname>
[51] "processing-instruction" "{"
[52] "parent" "::"
[53] "ancestor" "::"
[54] "preceding-sibling" "::"
[55] "preceding" "::"
[56] "ancestor-or-self" "::"
[57] ".."
[58] "child" "::"
[59] "descendant" "::"
[60] "attribute" "::"
[61] "self" "::"
[62] "descendant-or-self" "::"
[63] "following-sibling" "::"
[64] "following" "::"
[65] "@"
[66] "document-node" "("
[67] "element" "("
[68] "attribute" "("
[69] "schema-element" "("
[70] "schema-attribute" "("
[71] "processing-instruction" "("
[72] "comment" "("
[73] "text" "("
[74] "node" "("
[75] <qname>
[76] "*"
[77] <ncname:*>
[78] <*:ncname>
[79] "(#"

This way, algorithmically, we outperform most conventional parsers.
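To make the probing idea concrete, here is a minimal sketch in Python. It is not the parser's actual code: the trie layout, the names `TrieNode`, `build_lookup` and `probe`, and the way a chain is labeled by a single grammar name are all assumptions made for illustration. Only a handful of the 80 token runs are included.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    """One step in a token run (hypothetical structure, for illustration)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # Set when a complete token run ends at this node.
        self.chain: Optional[str] = None


def build_lookup(runs: List[Tuple[List[str], str]]) -> TrieNode:
    """Build a trie from (token run, grammar chain label) pairs."""
    root = TrieNode()
    for tokens, chain in runs:
        node = root
        for token in tokens:
            node = node.children.setdefault(token, TrieNode())
        node.chain = chain
    return root


def probe(root: TrieNode, tokens: List[str]) -> Optional[str]:
    """Return the chain of the longest token run that prefixes `tokens`."""
    node = root
    best = None
    for token in tokens:
        next_node = node.children.get(token)
        if next_node is None:
            break
        node = next_node
        if node.chain is not None:
            best = node.chain
    return best


# A few runs from the table above; the labels are simplified to the
# name of the most specific grammar, which is an assumption.
LOOKUP = build_lookup([
    (["xquery", "version"], "VersionDecl"),
    (["module", "namespace"], "ModuleDecl"),
    (["declare", "variable", "$"], "VarDecl"),
    (["declare", "function"], "FunctionDecl"),
    (["declare", "option"], "OptionDecl"),
    (["$"], "VarRef"),
])

var_ref = probe(LOOKUP, ["$", "v"])
var_decl = probe(LOOKUP, ["declare", "variable", "$", "x"])
unknown = probe(LOOKUP, ["unknown"])
```

With this layout "$v" is resolved to VarRef after a single probe, without walking Module, MainModule and the grammars below them.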

On the other hand, the parse tree we build has a compact representation. Each tree node is defined by two text bookmarks, a grammar chain, and grammar-specific data. What's important is that very little garbage memory is produced, as the parser rarely makes failed assumptions.
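A node of this kind might look roughly as follows. This is a sketch under assumptions: the names `Bookmark` and `Node` are hypothetical, a bookmark is modeled as a plain character offset, and the grammar chain in the example is abbreviated to grammars named in this post.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Bookmark:
    """A position in the source text (modeled here as a character offset)."""
    offset: int


@dataclass
class Node:
    """A parsed tree node: two bookmarks, a grammar chain, and extra data."""
    start: Bookmark
    end: Bookmark
    chain: Tuple[str, ...]
    data: Any = None

    def text(self, source: str) -> str:
        # The node keeps no copy of its text; it is recovered from the range.
        return source[self.start.offset:self.end.offset]


source = "$v"
# Intermediate grammars between MainModule and VarRef are omitted here.
node = Node(Bookmark(0), Bookmark(2), ("Module", "MainModule", "VarRef"), data="v")
node_text = node.text(source)
```

Since the node stores only a range and a chain, a single node can stand for a whole descent through the grammar hierarchy.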

What should be done:

  • Attach events to the XQuery grammar to collect program constructs: variables, functions, namespaces in scope. This will provide auto-completion info.

  • Release inactive parsed subtrees. E.g. we can free the tree of a function body and preserve only its text range (two bookmarks).
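Releasing a subtree could be sketched like this. Nothing here is implemented in the parser yet; the `Subtree` class, its `release` and `ensure_parsed` methods, and the `toy_parse` stand-in (which just splits text on whitespace) are assumptions used only to show that the text range is enough to rebuild the subtree on demand.

```python
from typing import Callable, List, Optional


class Subtree:
    """A node that can drop its children and keep only its text range."""

    def __init__(self, start: int, end: int,
                 children: Optional[List["Subtree"]]) -> None:
        self.start = start        # first bookmark (character offset)
        self.end = end            # second bookmark (character offset)
        self.children = children  # None means "released"

    def release(self) -> None:
        """Free the parsed children; the two bookmarks are preserved."""
        self.children = None

    def ensure_parsed(
        self, source: str, parse: Callable[[str, int], List["Subtree"]]
    ) -> List["Subtree"]:
        """Reparse the preserved range if the children were released."""
        if self.children is None:
            self.children = parse(source[self.start:self.end], self.start)
        return self.children


def toy_parse(text: str, base: int) -> List[Subtree]:
    """Stand-in for the real parser: one leaf per whitespace-separated word."""
    leaves = []
    pos = 0
    for word in text.split():
        pos = text.index(word, pos)
        leaves.append(Subtree(base + pos, base + pos + len(word), []))
        pos += len(word)
    return leaves


source = "declare function f() { 1 + 2 }"
body_start = source.index("{")
body = Subtree(body_start, len(source),
               toy_parse(source[body_start:], body_start))

body.release()
released = body.children is None
restored = body.ensure_parsed(source, toy_parse)
```

The trade-off is the usual one: memory is saved while the function body is inactive, at the cost of a reparse when the editor returns to it.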

Well, I'd like to think someone could understand something in all this mumbling. All sources are at the "Incremental parser" home.

Sunday, 05 April 2009 15:50:49 UTC
Incremental Parser

© 2024, Nesterovsky bros