On Jul 4, 2012, at 12:29 PM, Dirk Laurie wrote:

> 2012/7/4 Matthew Wild <mwild1@gmail.com>:
>> On 4 July 2012 16:46, Dirk Laurie <dirk.laurie@gmail.com> wrote:
>>>> bad input …. the result is probably not what the programmer intended.
>>> 
>>> You're right, but is that a defect of my code or just an inconvenient
>>> truth also known as GIGO?
>> 
>> It's a defect in your code, if your code is meant to be taking
>> arbitrary input and generating valid XML.
> 
> It isn't.  Lua and XML are both too rich to make that a design option.

I think Matthew means "arbitrary string input"; I think the two of us assumed that h1{s} is an intended use. This should result in an h1 tag with text contents of s--which means s must be quoted like "<h1>"..quoted(s).."</h1>".

The alternative is to write h1{quoted(s)}, but this usually ends up under-escaping, over-escaping, or both. In my opinion, it is dangerous to let strings sometimes represent text and sometimes represent XML. Better to always have them be text, which is escaped at the edges.
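
To make "escaped at the edges" concrete, here is a minimal sketch. Both quoted and h1 are hypothetical stand-ins written for this example, not the code under discussion, and quoted only handles the three characters that matter in text content (attribute values would also need quotes escaped).

```lua
-- Hypothetical helper: escape the characters that are significant in XML text.
local function quoted(s)
  -- The extra parentheses drop gsub's second return value (the match count).
  return (s:gsub("[&<>]", { ["&"] = "&amp;", ["<"] = "&lt;", [">"] = "&gt;" }))
end

-- Hypothetical element constructor: every string child is treated as text,
-- so it is escaped here, at the point where the tree becomes a string.
local function h1(children)
  local parts = {}
  for i, child in ipairs(children) do
    parts[i] = quoted(child)
  end
  return "<h1>" .. table.concat(parts) .. "</h1>"
end

print(h1{"a < b & c"})  --> <h1>a &lt; b &amp; c</h1>
```

The caller never calls quoted directly, so there is no opportunity to forget it or to apply it twice.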

It's OK to generate only a subset of well-formed XML. DTD-less, PI-less XML is a very common constraint. If you wanted, your notation can do mixed content easily:

  blockquote{"Games ", b{"without"}, " frontiers."}

But not everybody needs or wants to do mixed content.
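
One way the mixed-content notation above could be made to work, as a rough sketch: element constructors return tables rather than strings, so the serializer can tell text (plain strings, escaped) from child elements (tables, serialized recursively). The names tag, Element and quoted are my own assumptions for illustration.

```lua
-- Hypothetical helper: escape the characters that are significant in XML text.
local function quoted(s)
  return (s:gsub("[&<>]", { ["&"] = "&amp;", ["<"] = "&lt;", [">"] = "&gt;" }))
end

local Element = {}
Element.__index = Element

-- Serialize an element: string children are text and get escaped,
-- anything else is assumed to be a nested element and is serialized recursively.
function Element:__tostring()
  local parts = {}
  for i, child in ipairs(self.children) do
    if type(child) == "string" then
      parts[i] = quoted(child)
    else
      parts[i] = tostring(child)
    end
  end
  return "<" .. self.name .. ">" .. table.concat(parts) .. "</" .. self.name .. ">"
end

-- Hypothetical factory: tag("b") returns a constructor usable as b{...}.
local function tag(name)
  return function(children)
    return setmetatable({ name = name, children = children }, Element)
  end
end

local blockquote, b = tag("blockquote"), tag("b")
print(blockquote{"Games ", b{"without"}, " frontiers."})
--> <blockquote>Games <b>without</b> frontiers.</blockquote>
```

Because nodes are tables until the very end, a string in the tree always means text, never markup.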

There may be cases where somebody has already handed you stringified XML content to be attached to your tree. I'd prefer that not to happen, but it can be incorporated in this notation with something like:

  function bq(s)
      return blockquote{_rawxml(s)}
  end
  print(bq("Games <b>without</b> frontiers."))

but this is hopefully not that common because it's unsafe. "Uncommon and/or unsafe" should keep it from being the default choice.
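
A sketch of how such an escape hatch might look, assuming _rawxml simply wraps the string in a marker table that the serializer passes through untouched. The blockquote and quoted functions here are hypothetical stand-ins, and nothing checks that the raw string is well-formed, which is exactly why it is unsafe.

```lua
-- Hypothetical helper: escape the characters that are significant in XML text.
local function quoted(s)
  return (s:gsub("[&<>]", { ["&"] = "&amp;", ["<"] = "&lt;", [">"] = "&gt;" }))
end

-- Marker type for strings that already contain serialized XML.
local Raw = {}
Raw.__tostring = function(self) return self.xml end

-- Hypothetical _rawxml: tag a string as pre-serialized XML. No validation is done.
local function _rawxml(s)
  return setmetatable({ xml = s }, Raw)
end

-- Hypothetical element constructor: plain strings are escaped,
-- raw-XML markers are emitted as they are.
local function blockquote(children)
  local parts = {}
  for i, child in ipairs(children) do
    parts[i] = type(child) == "string" and quoted(child) or tostring(child)
  end
  return "<blockquote>" .. table.concat(parts) .. "</blockquote>"
end

print(blockquote{"Games <b>without</b> frontiers."})
--> <blockquote>Games &lt;b&gt;without&lt;/b&gt; frontiers.</blockquote>
print(blockquote{_rawxml("Games <b>without</b> frontiers.")})
--> <blockquote>Games <b>without</b> frontiers.</blockquote>
```

The safe behaviour is what you get by default; the unsafe one requires typing _rawxml, which makes it easy to spot in review.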

There is the separate issue of only emitting characters allowed in XML (and not mixing up your encodings accidentally) but the consequence of including a \0 in output is that a reader will abort, not misinterpret. Text encoding is a problem in Lua in general, and trying to deal with it at the pure-Lua level is a pain. I'd put it off until later.
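
If someone does want a cheap guard before tackling encodings properly, a byte-level check along these lines would catch the \0 case. This is only a sketch: find_forbidden_control is a made-up name, and it looks only at the ASCII control characters that XML 1.0 disallows (everything below 0x20 except tab, LF and CR). It says nothing about whether the rest of the string is valid UTF-8.

```lua
-- Hypothetical check: return the position of the first byte that XML 1.0
-- forbids among the ASCII control characters, or nil if there is none.
local function find_forbidden_control(s)
  for i = 1, #s do
    local c = s:byte(i)
    -- Allowed below 0x20: tab (9), line feed (10), carriage return (13).
    if c < 32 and c ~= 9 and c ~= 10 and c ~= 13 then
      return i
    end
  end
  return nil
end

print(find_forbidden_control("line one\nline two"))  --> nil
print(find_forbidden_control("bad\0byte"))           --> 4
```

Looping over bytes rather than using a pattern sidesteps the differences between Lua versions in how patterns treat an embedded \0.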

Jay