On Feb 7, 2014 12:57 AM, "steve donovan" <steve.j.donovan@gmail.com> wrote:
> On Fri, Feb 7, 2014 at 7:05 AM, Andrew Starks <andrew.starks@trms.com> wrote:
> > T has five children: hello, a tag, world, b tag, and how are you?
> > So storing text data in negative keys would seem to make less sense. What am
> > I missing?
>
> It's fine (I think) to have both 'tags' and text as part of the array,
> since they're so easy to distinguish using type().
Python's ElementTree API keeps text inside a tag in an "e.text" attribute. However, it only stores the text up to the first child tag. Text between elements--that is, text trailing a child element--is stored in "echild.tail". If you are doing mixed content, this seems like more abuse from the SOAPheads, since it is so clearly designed to let them ignore .text most of the time and ignore .tail all of the time.
It also forces a canonical form onto text; there is never an issue where <a>bb</a> can be represented as either {[0]="a", "bb"} or {[0]="a", "b", "b"}.
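
For anyone who hasn't poked at it, here is a minimal sketch of the .text/.tail split using the standard library's xml.etree.ElementTree. The document string is my guess at Andrew's T example (I am assuming the two tags are literally named "a" and "b" and are empty), so treat it as an illustration rather than his exact data.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction of the T example: text, tag, text, tag, text.
t = ET.fromstring("<T>hello<a/>world<b/>how are you?</T>")

# .text holds only the text up to the first child tag.
print(repr(t.text))

# Text after each child lives on that child as .tail, not on the parent.
for child in t:
    print(child.tag, repr(child.tail))

# Canonical form: the character data of <a>bb</a> is one string, never split.
a = ET.fromstring("<a>bb</a>")
print(repr(a.text), len(a))
```

Note that T ends up with only two children here, not five: the three text runs are attached to T itself (.text) and to the two child elements (.tail).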
Does anybody *like* the numeric-keyed attribute entries coming out of lxp start tags? I end up nil'ing them out.
> Clearly XML is one of those things in the Lua-niverse that are going
> to be continuously reinvented; one of the factors behind this is
> precisely that a fully general representation is awkward for
> hierarchical data.
lhf's format does have a disadvantage: by using t["string"] for attributes, the element table's string keys can't also be used for methods or other string-keyed user annotations. Oh well.
Jay