Re: [ANN] SLAXML - pure Lua, robust-ish, SAX-like streaming XML processor

lua-l archive

Subject: Re: [ANN] SLAXML - pure Lua, robust-ish, SAX-like streaming XML processor
From: Andrew Starks <andrew.starks@...>
Date: Tue, 19 Feb 2013 13:27:25 -0600

On Feb 19, 2013, at 12:15, "hgualandi@inf.puc-rio.br"
<hgualandi@inf.puc-rio.br> wrote:

> Interesting. I was just doing something very similar yesterday...
>
> Anyway, during my research I found a couple of papers describing a
> different API they used when implementing an XML parser in Scheme.
>
> http://www.okmij.org/ftp/Scheme/xml.html
> http://www.okmij.org/ftp/Scheme/xml.html#Papers
>
> Basically, they said that when using a SAX parser you almost always want
> to maintain some sort of stack of elements as well as check if the open
> and close tags matched. So what they did was keep an internal stack (like
> the nsStack you have but with more stuff) and expose it to the user via
> extra arguments passed to the handlers and by assigning a meaning to their
> return values.
>
>    parser = SLAXML:parser{
>      startElement = function(name, nsURI, parentNode)           return newChildNode end,
>      closeElement = function(name, nsURI, parentNode, currNode) return parentNode   end,
>      text         = function(text, currNode)                   return currNode     end,
>      -- and so on...
>    }
>
> (The reason they gave for using the return values is that it lets you
> choose what sort of value gets put on the stack. They also wanted to be
> able to put immutable values such as strings and numbers on the stack.)
>
> Has anyone here seen something similar? It seemed like a good idea when I
> read it.
>
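
For reference, the handler-return-value idea quoted above can be sketched in plain Lua. This is only an illustration: the `run` driver, the event list format, and the node layout are hypothetical and are not part of SLAXML, and the nsURI arguments from the quoted signatures are left out for brevity.

```lua
-- Hypothetical driver: it owns the element stack and threads each
-- handler's return value through it. Not SLAXML's actual API.
local function run(events, handlers, root)
  local stack = { root }   -- stack[#stack] is the current node
  local names = {}         -- open element names, for the balance check
  for _, ev in ipairs(events) do
    local kind, value = ev[1], ev[2]
    if kind == "start" then
      local parent = stack[#stack]
      names[#names + 1] = value
      stack[#stack + 1] = handlers.startElement(value, parent)
    elseif kind == "text" then
      stack[#stack] = handlers.text(value, stack[#stack])
    elseif kind == "close" then
      local open = table.remove(names)
      if open ~= value then
        error(("mismatched tags: <%s> closed by </%s>"):format(tostring(open), value))
      end
      local curr = table.remove(stack)
      -- whatever closeElement returns replaces the parent on the stack
      stack[#stack] = handlers.closeElement(value, stack[#stack], curr)
    end
  end
  return stack[1]
end

-- Example handlers that build a small tree out of the return values.
local handlers = {
  startElement = function(name, parentNode)
    return { name = name, kids = {} }
  end,
  text = function(text, currNode)
    currNode.kids[#currNode.kids + 1] = text
    return currNode
  end,
  closeElement = function(name, parentNode, currNode)
    parentNode.kids[#parentNode.kids + 1] = currNode
    return parentNode
  end,
}

-- Hand-written events standing in for what a real parser would emit.
local events = {
  { "start", "a" }, { "text", "hi" },
  { "start", "b" }, { "close", "b" },
  { "close", "a" },
}
local doc = run(events, handlers, { name = "#root", kids = {} })
```

Because the stack only ever holds what the handlers return, a handler could just as well return a string or a number instead of a table.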

With LuaExpat, I use a stack by making sure that my callbacks have
access to it as an upvalue. At EndElement, I pop the last value and
check that it matches the start element, which means the tags are
balanced, of course. It's pretty trivial.
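
A minimal sketch of that pattern, assuming LuaExpat's callback signatures (StartElement receives the parser, the element name, and the attributes; EndElement receives the parser and the element name). The `make_callbacks` helper is a hypothetical name, and the callbacks are driven by hand here so the example runs without LuaExpat installed.

```lua
-- Build LuaExpat-style callbacks that share one stack as an upvalue.
local function make_callbacks()
  local stack = {}   -- upvalue visible to both callbacks
  local callbacks = {
    StartElement = function(parser, name, attrs)
      stack[#stack + 1] = name
    end,
    EndElement = function(parser, name)
      -- pop the last open element and check that it matches
      local open = table.remove(stack)
      if open ~= name then
        error(("unbalanced: expected </%s>, got </%s>"):format(tostring(open), name))
      end
    end,
  }
  return callbacks, stack
end

-- With LuaExpat available, the table would be passed to lxp.new(callbacks).
-- Here the events are simulated by calling the callbacks directly.
local callbacks, stack = make_callbacks()
callbacks.StartElement(nil, "a", {})
callbacks.StartElement(nil, "b", {})
callbacks.EndElement(nil, "b")
callbacks.EndElement(nil, "a")
```

After the last call the stack is empty again, which is the "balanced" condition described above.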

> And a minor thing: is that "attribute" callback really needed? Most of the
> SAX stuff I saw just reads the attributes in a list and then passes them
> to the startElement handler.
>

I found that basically everything is done in StartElement and
everything is validated in EndElement.

Also, I have a weak-valued flat list of the elements, which point to
the DOM-like objects, as well.
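
A rough sketch of what such a weak-valued list might look like. The table names and the node layout are made up for illustration; the only real mechanism used is Lua's `__mode = "v"` metatable field.

```lua
-- Flat index whose values are weak: an entry does not keep its node
-- alive once the tree itself no longer references that node.
local index = setmetatable({}, { __mode = "v" })

-- Hypothetical DOM-like tree with two elements.
local tree = { root = { name = "a", kids = {} } }
local child = { name = "b", kids = {} }
tree.root.kids[1] = child

-- Register every element in the flat list as it is created.
index[1] = tree.root
index[2] = child
child = nil   -- from here on, only the tree holds a strong reference
```

Dropping the tree (for example `tree.root = nil`) leaves the nodes reachable only through `index`, so the collector is free to clear those entries. Note that the length operator is not reliable on a table whose entries can vanish this way.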
