Lots of little trees, part 2

I just noticed that the Lawrence Berkeley National Laboratory’s Nux library provides streaming XQuery functionality that makes it very easy to do the kind of XML processing that I described in this post last week.

Using Scala, for example, we can start with some imports:
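A minimal sketch of those imports, assuming the XOM and Nux jars are on the classpath. The class names come from XOM's `nu.xom` package and Nux's `nux.xom.io` package; `java.io.File` is included on the assumption that the input is a local file.

```scala
import java.io.File

// XOM: the parser entry point plus the node types the transformer touches.
import nu.xom.{Builder, Element, Nodes}

// Nux: the streaming filter and the callback interface it drives.
import nux.xom.io.{StreamingPathFilter, StreamingTransform}
```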

Next we write the “transformer” that we want to apply to every record element:
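Something along these lines should work as a sketch. Printing the serialized element is only a placeholder side effect, and the name `transformer` is chosen for illustration. Returning an empty `Nodes` tells Nux to discard the element once it has been handled, which is what keeps memory use flat.

```scala
// Repeated here so the snippet stands alone.
import nu.xom.{Element, Nodes}
import nux.xom.io.StreamingTransform

val transformer = new StreamingTransform {
  // Called once for each fully parsed element that matches the filter path.
  def transform(record: Element): Nodes = {
    // Placeholder side effect: print the record's serialized XML.
    println(record.toXML())
    // An empty result means nothing is added to the document being built,
    // so the record can be garbage collected.
    new Nodes()
  }
}
```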

We’re not really transforming anything here, of course—just performing a side effect as we iterate through the records. We could just as easily be adding some representation of the record to a mutable collection, sending a message to an actor, etc.

Now we create and run our query:
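A hedged sketch of that step, wrapped in a hypothetical helper named `processRecords` so that it compiles on its own. The location path `/records/record` is an assumption about the document's shape, not something taken from the original data. The `null` arguments mean "no namespace prefixes" and "use the default child node factory" respectively.

```scala
import java.io.File
import nu.xom.Builder
import nux.xom.io.{StreamingPathFilter, StreamingTransform}

// Hypothetical helper: stream through `file`, handing every element that
// matches the path to `transformer` as soon as it has been parsed.
def processRecords(file: File, transformer: StreamingTransform): Unit = {
  // Assumed layout: a <records> root containing many <record> children.
  val filter = new StreamingPathFilter("/records/record", null)
  // The filter supplies a NodeFactory that invokes the transformer on matches.
  val builder = new Builder(filter.createNodeFactory(null, transformer))
  // Parsing drives the whole process; the returned document is ignored.
  builder.build(file)
}
```

Calling `processRecords(new File("records.xml"), transformer)` with the transformer from the previous step then runs the whole thing, where `records.xml` is a stand-in file name.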

And we’re done. Like my conduit-based implementation, this will iterate through the records in a constant amount of memory. It’s less elegant than that solution, but it works, it’s easy, and it seems to be significantly faster.