- Subject: Re: [ANN] SLAXML - pure Lua, robust-ish, SAX-like streaming XML processor
- From: Andrew Starks <andrew.starks@...>
- Date: Tue, 19 Feb 2013 13:27:25 -0600
On Feb 19, 2013, at 12:15, "hgualandi@inf.puc-rio.br"
<hgualandi@inf.puc-rio.br> wrote:
> Interesting. I was just doing something very similar yesterday...
>
> Anyway, during my research I found a couple of papers describing a
> different API they used when implementing an XML parser in Scheme.
>
> http://www.okmij.org/ftp/Scheme/xml.html
> http://www.okmij.org/ftp/Scheme/xml.html#Papers
>
> Basically, they said that when using a SAX parser you almost always want
> to maintain some sort of stack of elements as well as check if the open
> and close tags matched. So what they did was keep an internal stack (like
> the nsStack you have but with more stuff) and expose it to the user via
> extra arguments passed to the handlers and by assigning a meaning to their
> return values.
>
> parser = SLAXML:parser{
>   startElement = function(name, nsURI, parentNode) return newChildNode end,
>   closeElement = function(name, nsURI, parentNode, currNode) return parentNode end,
>   text = function(text, currNode) return currNode end,
>   -- and so on...
> }
>
> (The reason they gave for using the return values is that it lets you
> choose what sort of value gets put on the stack. They also wanted to be
> able to put immutable values such as strings and numbers on the stack.)
>
> Has anyone here seen something similar? It seemed like a good idea when I
> read it.
>
With LuaExpat, I use a stack by making sure that my callbacks have
access to it as an upvalue. At EndElement, I pop the last value and
check that it matches the start element, which means the tags are
balanced, of course. It's pretty trivial.
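
A minimal sketch of that stack-as-upvalue pattern, in plain Lua. The
handler signatures follow LuaExpat's StartElement(parser, name, attrs)
and EndElement(parser, name) callbacks; the make_handlers helper and the
hand-fed events are assumptions for illustration, so the snippet runs
without lxp installed.

```lua
-- Hypothetical helper: builds handlers that share `stack` as an upvalue.
local function make_handlers()
  local stack = {}            -- open element names, innermost last
  local handlers = {}

  function handlers.StartElement(parser, name, attrs)
    stack[#stack + 1] = name  -- push on open
  end

  function handlers.EndElement(parser, name)
    local open = table.remove(stack)  -- pop the most recent open element
    if open ~= name then
      error(("unbalanced: <%s> closed by </%s>"):format(tostring(open), name))
    end
  end

  return handlers, stack
end

-- Feed events by hand; with LuaExpat the table would go to lxp.new(handlers).
local handlers, stack = make_handlers()
handlers.StartElement(nil, "root", {})
handlers.StartElement(nil, "child", {})
handlers.EndElement(nil, "child")
handlers.EndElement(nil, "root")
assert(#stack == 0)  -- everything that was opened has been closed
```

The stack is returned here only so it can be inspected; in real use it
can stay private to the closures.
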
> And a minor thing: is that "attribute" callback really needed? Most of the
> SAX stuff I saw just reads the attributes in a list and then passes them
> to the startElement handler.
>
I found that basically everything is done in StartElement and
everything is validated in EndElement.
Also, I have a weak-valued flat list of the elements, which point to
the DOM-like object, as well.
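
Roughly, assuming a weak-valued table (__mode = "v") kept alongside a
DOM-like tree that is built in StartElement and checked in EndElement.
The names index, root, and the node fields are hypothetical, and the
events are fed by hand rather than through lxp.

```lua
-- Weak values: an entry here does not keep its node alive on its own.
local index = setmetatable({}, { __mode = "v" })
local stack = {}   -- currently open nodes, innermost last
local root         -- top of the DOM-like tree

local function StartElement(parser, name, attrs)
  local node = { name = name, attrs = attrs, children = {} }
  local parent = stack[#stack]
  if parent then
    parent.children[#parent.children + 1] = node  -- strong ref from the tree
  else
    root = node
  end
  stack[#stack + 1] = node
  index[#index + 1] = node  -- flat, document-order view of the same node
end

local function EndElement(parser, name)
  local node = table.remove(stack)
  assert(node and node.name == name, "unbalanced element: " .. name)
end

StartElement(nil, "root", { id = "r" })
StartElement(nil, "item", {})
EndElement(nil, "item")
EndElement(nil, "root")

assert(index[1] == root)
assert(index[2] == root.children[1])
```

The tree holds the only strong references, so the flat list never keeps
a node alive after the tree that owns it has been dropped.
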