I'm proposing to express complex XML data manipulation programs through pattern matching. That's what XSLT pretends to do, in an unmaintainable, unreadable, contrived, crippled way.
Scala does a very decent job of using such approaches for XML, and of integrating them into a 'normal' (i.e. procedural, OO-flavored) language. You can get the whole picture of how Scala manipulates XML here:
http://lamp.epfl.ch/~emir/projects/scalaxbook/scalaxbook.docbk.html
Except for its Java-friendly twist, it's pretty similar to the other projects mentioned above. In the doc above, the part about pattern matching starts at the section "Matching XML" (
http://lamp.epfl.ch/~emir/projects/scalaxbook/scalaxbook.docbk.html#id2494672).
This kind of system only makes sense for highly structured XML data, and only when you plan to transform it in non-trivial ways. Mind you, if you're just going to scrape titles from an RSS stream,
string.gmatch() is all you need :) People tend to stick to rather simple XML schemas, precisely because the mainstream tools make it a pain in the *ss to do anything non-trivial.
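To make the "simple case" concrete, here is a minimal sketch of title scraping with nothing but a string pattern. It is written in Python only for illustration (in Lua, string.gmatch() would play the same role), and the feed content is made up.

```python
import re

# Hypothetical RSS fragment, invented for this example.
rss = """<rss><channel>
<title>My library</title>
<item><title>First book</title></item>
<item><title>Second book</title></item>
</channel></rss>"""

# Non-greedy capture of whatever sits between <title> and </title>.
# No XML parsing at all: this only works because the task is trivial.
titles = re.findall(r"<title>(.*?)</title>", rss)
print(titles)
```

This breaks down as soon as the structure matters (nesting, attributes, alternatives), which is where the pattern-matching approach is meant to take over.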
WRT LuaExpat and tighter syntax: LuaExpat's job is to extract Lua-friendly structures from a possibly huge XML stream, and there's no point in redoing that. The idea is just to build a more powerful manipulation API on top of it.
I guess that pattern matching could be described as a "tighter syntax", the same way as C++ could be described as a tighter syntax for ASM code: both can roughly express the same things, but C++ [allegedly] maps a sane mental representation of programming problems directly into concrete syntax. Likewise, you can describe regular expressions as a tighter syntax for finite state automata. The result is tighter, more readable, more maintainable, and more enjoyable to write.
Similarly, many XML manipulation problems can be summed up as:
- finding pieces of XML data which possess certain structural properties, e.g.
"a <book> element directly under a <bookshelf> node, with either an <author> or a <translator> subnode in it".
- referring to subtrees of such pieces of data once they are found, e.g.
"let the variable 'writer' hold the text content of the <author> or <translator> node above".
- combining these pattern descriptions into bigger switch-like statements, which choose the pattern that best fits the data at hand and apply the corresponding block of processing code to it,
e.g. "insert all the 'writer' variables collected above into an RSS stream of all collaborators of my library".
This is pretty much what structural pattern matching has been offering in ML-like languages since the 70s, although it has been more focused on tree representations of programs/languages. The newer research projects mentioned above mainly aim at letting you express patterns in a more relaxed way, which corresponds better to typical XML usage.