I'm proposing to let people express complex XML data manipulation programs through pattern matching. That's what XSLT claims to do, in an unmaintainable, unreadable, contrived, crippled way.
Scala does a very decent job of using such approaches for XML, and of integrating them into a 'normal' (i.e. procedural, OO-flavored) language. You can get the whole picture of Scala's XML manipulation here:
Except for its Java-friendly twist, it's pretty similar to the other projects mentioned above. In the doc above, the part about pattern matching starts at the section "Matching XML" (
This kind of system only makes sense for highly structured XML data, and only when you plan to transform it in non-trivial ways. Mind you, if you're just going to scrape titles from an RSS stream, string.gmatch() is all you need :) People tend to stick to rather simple XML schemas, precisely because the mainstream tools make it a pain in the *ss to do anything non-trivial.
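To show how little the trivial case needs, here is a minimal sketch in Python, where re.findall stands in for what string.gmatch() would do in Lua. The feed fragment and the name scrape_titles are made up for the example, and the regex assumes titles contain no nested markup or CDATA.

```python
import re

# Made-up RSS fragment, for illustration only.
FEED = """<rss><channel>
<title>My library</title>
<item><title>First post</title></item>
<item><title>Second post</title></item>
</channel></rss>"""

def scrape_titles(xml_text):
    # Non-greedy capture of whatever sits between <title> and </title>.
    return re.findall(r"<title>(.*?)</title>", xml_text)

print(scrape_titles(FEED))
```

Here every title comes back, the channel's as well as the items'. Telling them apart is already a structural question, and that is where plain string matching starts to hurt.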
WRT LuaExpat and tighter syntax: LuaExpat's job is to extract Lua-friendly structures from a possibly huge XML stream, and there's no point in redoing that. What I'm proposing is just about building a more powerful manipulation API on top of it.
I guess that pattern matching could be described as a "tighter syntax", the same way C++ could be described as a tighter syntax for ASM code: both can roughly express the same things, but C++ [allegedly] maps sane mental representations of programming problems directly into concrete syntax. Similarly, you can describe regular expressions as a tighter syntax for finite-state automata. The result is tighter, more readable, more maintainable, and more enjoyable to write.
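The regex/automaton comparison can be made concrete with a small sketch in Python. Both functions below recognize the same language, an optionally signed run of decimal digits; the function names and the choice of language are assumptions picked for the example.

```python
import re

DIGITS = "0123456789"

def matches_by_hand(text):
    # Hand-written automaton: "start" -> "signed" -> "digits".
    # "digits" is the only accepting state.
    state = "start"
    for ch in text:
        if state == "start" and ch in "+-":
            state = "signed"
        elif ch in DIGITS:
            state = "digits"
        else:
            return False
    return state == "digits"

# The same automaton, written as a regular expression.
INTEGER = re.compile(r"[+-]?[0-9]+")

def matches_by_regex(text):
    return INTEGER.fullmatch(text) is not None
```

The two accept and reject exactly the same strings, but only the second can be read at a glance.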
In the same way, many XML manipulation problems can be summed up as:
- finding pieces of XML data which possess certain structural properties, e.g. "a <book> element directly under a <bookshelf> node, with either an <author> or a <translator> subnode in it".
- referring to subtrees of such pieces of data, once found, e.g. "let the variable 'writer' hold the text content of the <author> or <translator> node above".
- combining these pattern descriptions into bigger switch-like statements, which choose the pattern that best fits the data at hand and apply the corresponding block of processing code to it, e.g. "insert all the 'writer' variables collected above into an RSS stream of all collaborators of my library".
This is pretty much what structural pattern matching has been offering in ML-like languages since the '70s, although there it has been more focused on tree representations of programs/languages. The newer research projects mentioned above mainly aim at letting you express patterns in a more relaxed way, which better fits XML's typical usage.