Sunday, April 19, 2009

Implementing a table look-up in XQuery

Handling temporary XML fragments in the eXist XML db has improved markedly in version 1.3. I have been looking again at an example of processing MusicXML documents which I first wrote up in the XQuery wikibook.

The code requires a translation from the note name (A, B) to the midi note value for each note. The pitch of a note is defined by a structure like:

<pitch>
<step>C</step>
<alter>1</alter>
<octave>3</octave>
</pitch>



One approach is to use an if -then -else construct:

declare function local:MidiNote($thispitch as element(pitch) ) as xs:integer
{
let $step := $thispitch/step
let $alter :=
if (empty($thispitch/alter)) then 0
else xs:integer($thispitch/alter)
let $octave := xs:integer($thispitch/octave)
let $pitchstep :=
if ($step = "C") then 0
else if ($step = "D") then 2
else if ($step = "E") then 4
else if ($step = "F") then 5
else if ($step = "G") then 7
else if ($step = "A") then 9
else if ($step = "B") then 11
else 0
return 12 * ($octave + 1) + $pitchstep + $alter
} ;

but this cries out for a table lookup as a sequence:

declare variable $noteStep :=
(
<note name="C" step="0"/>,
<note name="D" step="2"/>,
<note name="E" step="4"/>,
<note name="F" step="5"/>,
<note name="G" step="7"/>,
<note name="A" step="9"/>,
<note name="B" step="11"/>
);

declare function local:MidiNote($thispitch as element(pitch) ) as xs:integer
{
let $alter := xs:integer(($thispitch/alter,0)[1])
let $octave := xs:integer($thispitch/octave)
let $pitchstep := xs:integer($noteStep[@name = $thispitch/step]/@step)
return 12 * ($octave + 1) + $pitchstep + $alter
} ;


or an XML element:

declare variable $noteStep :=
<steps>
<note name="C" step="0"/>
<note name="D" step="2"/>
<note name="E" step="4"/>
<note name="F" step="5"/>
<note name="G" step="7"/>
<note name="A" step="9"/>
<note name="B" step="11"/>
</steps>;

declare function local:MidiNote($thispitch as element(pitch) ) as xs:integer
{
let $alter := xs:integer(($thispitch/alter,0)[1])
let $octave := xs:integer($thispitch/octave)
let $pitchstep := xs:integer($noteStep/note[@name = $thispitch/step]/@step)
return 12 * ($octave + 1) + $pitchstep + $alter
} ;


We could also store the table in the database since it is constant.

eXist does some optimisation of XPath expressions, but it does not factor out the invariant expression $thispitch/step
in the XPath predicate.

I wrote a test suite to time these various implementations. Typically this shows that factoring the sub-expression reduces the execution time by 25%. However, even with this optimisation, the structure lookup is disappointingly slow. It is about 50% slower than the if/then expression when stored on disk, and 100% slower when in memory.

This aspect of XQuery performance is important if XQuery is to be used for general application development since data structures such as indexed and associative arrays have to be represented as sequences of atomic values or elements. This performance is not really surprising and there may be more performance to be gained by indexing the data base element.

2 comments:

Wolfgang said...

Interesting test. Surprising: for a small XML fragment, I would have expected the in memory lookup to be faster than the same lookup on a stored document. I'm quite sure we can speed this up internally. I will run the profiler on your test.

Also, if you run $noteStep/note[@name = $thispitch/step] on an fragment stored in the db, the invariant expression will indeed only be evaluated once. However, this optimization doesn't work for in memory documents. I think I need a little redesign here. Detecting invariants should be done in a separate optimization step.

chris wallace said...

I revised the test suite and added the unfactored db test - which seem to indicate that the loop invariant is not extracted.