Maybe it's worth adding a couple of details about what is missing from the current pattern matching system to make it a good XML handling tool.

The current system works great for manipulating trees that sit entirely in RAM and whose structure is known precisely. The most common use case is manipulating metalua Abstract Syntax Trees, although it's perfectly suitable for any other tree-like structure. When you manipulate XML, there are two key differences:

- the data you're working with are potentially huge, or even infinite, streams. You want to load data into RAM lazily, only when it's actually required.

- XML data are often structured more loosely, so you'll want some more forgiving description idioms for the patterns. For instance, it's currently easy to describe a tree node X with a subtree Y as its first child:

  | `X{ `Y{ content_of_y } } -> -- handling code

You'll want to be able to describe things like "a Y subtree as any child of a node X", or "the set of all children of a node X that satisfy predicate 'p'", or "nodes Y that are nested, directly or indirectly, inside a node X". Since these are clear and simple concepts in the developer's mind, they ought to be represented in a clear and simple way in the code. Whether this simple syntax is eventually compiled to hairy code shouldn't be the developer's concern, as long as the generated code is reliable.
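To make these three queries and the lazy-loading requirement concrete, here is a rough sketch in plain Lua, not in metalua pattern syntax. Everything in it is an assumption made for illustration: the helper names (children, child_with_tag, children_where, descendants_with_tag) are hypothetical, and so is the convention that a node may carry a "children" function that builds its subtree on first access. It only relies on the fact that a node `X{ ... } is an ordinary table with a "tag" field.

```lua
-- Hypothetical helpers, not part of metalua.
-- A node is a plain table { tag = "X", child1, child2, ... }.

-- Return a node with its children loaded. If node.children is a function
-- (an assumed convention), call it once and store the result in the node,
-- so a subtree is only built when somebody actually looks at it.
local function children(node)
  if type(node.children) == "function" then
    local loaded = node.children()
    for i, child in ipairs(loaded) do node[i] = child end
    node.children = nil
  end
  return node
end

-- "A Y subtree as any child of a node X": first direct child with that tag.
local function child_with_tag(node, tag)
  for _, child in ipairs(children(node)) do
    if type(child) == "table" and child.tag == tag then return child end
  end
  return nil
end

-- "The set of all children of a node X that satisfy predicate p".
local function children_where(node, p)
  local result = {}
  for _, child in ipairs(children(node)) do
    if p(child) then result[#result + 1] = child end
  end
  return result
end

-- "Nodes Y nested, directly or indirectly, inside X": depth-first iterator.
-- Matches are yielded one at a time, so lazy subtrees that have not been
-- reached yet stay unloaded if the caller stops early.
local function descendants_with_tag(node, tag)
  local function walk(n)
    for _, child in ipairs(children(n)) do
      if type(child) == "table" then
        if child.tag == tag then coroutine.yield(child) end
        walk(child)
      end
    end
  end
  return coroutine.wrap(function() walk(node) end)
end

-- Usage: the W subtree is only built when the traversal reaches it.
local doc = { tag = "X",
  { tag = "Z", "text" },
  { tag = "Y", "first" },
  { tag = "W", children = function() return { { tag = "Y", "nested" } } end },
}
print(child_with_tag(doc, "Y")[1])                            -- first
for y in descendants_with_tag(doc, "Y") do print(y[1]) end    -- first, nested
```

This is exactly the kind of hairy code a developer should not have to write by hand; the point of a dedicated pattern syntax would be to generate something equivalent from a description that still looks like the XML it matches.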

Pattern matching is a good tool for AST manipulation, because a pattern looks pretty much like the terms it can capture, so it's easy to write, easy to read, and it easily expresses the kind of things you often want to do. You want the same to be true for XML pattern matching.
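The "pattern looks like the term" property can be illustrated with a toy matcher in plain Lua. This is only a sketch under stated assumptions: the names var and match are hypothetical, and the function is a naive interpreter, not what the metalua match extension compiles to. The pattern is written as a table with the same shape as the tree it should capture, with var "name" marking the holes.

```lua
-- Hypothetical mini-matcher, far simpler than metalua's `match` extension.
local VAR = {}  -- unique key marking a table as a capture variable

local function var(name) return { [VAR] = name } end

-- Return a table of captures if term has the same shape as pattern, else nil.
local function match(pattern, term, captures)
  captures = captures or {}
  if type(pattern) == "table" and pattern[VAR] then
    captures[pattern[VAR]] = term     -- a variable matches any subterm
    return captures
  end
  if type(pattern) ~= "table" then    -- literals must be equal
    return pattern == term and captures or nil
  end
  if type(term) ~= "table" or pattern.tag ~= term.tag or #pattern ~= #term then
    return nil
  end
  for i = 1, #pattern do
    if not match(pattern[i], term[i], captures) then return nil end
  end
  return captures
end

-- The pattern below mirrors `X{ `Y{ content_of_y } } from the example above.
local pattern = { tag = "X", { tag = "Y", var "content_of_y" } }
local term    = { tag = "X", { tag = "Y", "hello" } }

local c = match(pattern, term)
print(c and c.content_of_y)  -- hello
```

Note how rigid this is: the pattern only matches when Y is the single child of X, which is precisely the limitation that a more forgiving XML-oriented syntax would need to lift.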

To shamelessly quote what I wrote this morning:
(In reality, it's compiled into something equivalent, slightly faster and much less readable, but you're but supposed to look at compiled code. There's a "-a" option in the metalua compiler that prints the compiled code as an AST, if you really want to know).

There was a typo here: it should have read "but you're *NOT* supposed to look at compiled code". Option -a prints an AST, which is often enough for extension writers, since its main purpose is to debug complex macros. Eventually, that printer should be improved to translate the AST back into concrete Lua syntax whenever possible...
 
-- Fabien.