On Tue, May 12, 2015 at 1:27 AM, Gaspard Bucher <gaspard@teti.ch> wrote:
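local validate = require 'utf8validator'
local orig_load = xml.load
-- Wrap xml.load so the input is validated before it is parsed.
-- The parameter is named "str" to avoid shadowing the standard "string" library.
function xml.load(str)
  return orig_load(validate(str))
end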

The "validate" function could take an optional "invalid char handler" function as an argument, letting end users decide what to do with invalid characters instead of blowing up.

my 2c...
 
Is there a reasonable default "sanitize" step that could be mechanically applied to common but technically invalid sequences, or perhaps a small core set of transformations to choose from? Something like how %q escapes special characters. Maybe a parameter that lets you choose among throwing an error, fixing up/re-encoding, deleting the offending chars, or calling a custom handler?

Would this cause all kinds of havoc? I admit shallow knowledge of the subject.
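
To make the question concrete, here is a rough sketch of what such a parameter might look like. Everything in it is hypothetical: the names "sanitize" and "sequence_length" and the mode strings "throw", "replace" and "delete" are made up for illustration and are not the API of the 'utf8validator' module mentioned above. It treats each offending byte on its own, and "replace" substitutes U+FFFD rather than trying to guess the intended character.

```lua
-- Hypothetical sketch only; not the API of any existing module.
local REPLACEMENT = "\239\191\189" -- U+FFFD encoded as UTF-8

-- Return the length of the well-formed UTF-8 sequence starting at byte i,
-- or nil if the bytes there are not valid UTF-8.
local function sequence_length(s, i)
  local b1 = s:byte(i)
  if b1 < 0x80 then return 1 end
  local n, lo, hi
  -- lo/hi bound the second byte; this rejects overlong forms,
  -- UTF-16 surrogates, and code points above U+10FFFF.
  if b1 >= 0xC2 and b1 <= 0xDF then n, lo, hi = 2, 0x80, 0xBF
  elseif b1 == 0xE0 then n, lo, hi = 3, 0xA0, 0xBF
  elseif b1 == 0xED then n, lo, hi = 3, 0x80, 0x9F
  elseif b1 >= 0xE1 and b1 <= 0xEF then n, lo, hi = 3, 0x80, 0xBF
  elseif b1 == 0xF0 then n, lo, hi = 4, 0x90, 0xBF
  elseif b1 >= 0xF1 and b1 <= 0xF3 then n, lo, hi = 4, 0x80, 0xBF
  elseif b1 == 0xF4 then n, lo, hi = 4, 0x80, 0x8F
  else return nil end
  for k = 1, n - 1 do
    local b = s:byte(i + k)
    if not b or b < lo or b > hi then return nil end
    lo, hi = 0x80, 0xBF -- later continuation bytes use the full range
  end
  return n
end

-- mode: "throw" (default), "replace", "delete", or a handler function
-- called as handler(byte, position) that returns the replacement text.
local function sanitize(s, mode)
  mode = mode or "throw"
  local out, i, len = {}, 1, #s
  while i <= len do
    local n = sequence_length(s, i)
    if n then
      out[#out + 1] = s:sub(i, i + n - 1)
      i = i + n
    else
      if mode == "throw" then
        error(("invalid UTF-8 byte 0x%02X at position %d"):format(s:byte(i), i), 2)
      elseif mode == "replace" then
        out[#out + 1] = REPLACEMENT
      elseif type(mode) == "function" then
        out[#out + 1] = mode(s:byte(i), i) or ""
      elseif mode ~= "delete" then
        error("unknown mode: " .. tostring(mode), 2)
      end
      i = i + 1 -- skip one offending byte and resynchronise
    end
  end
  return table.concat(out)
end
```

With something like this, the wrapper above could call orig_load(sanitize(str, "replace")) instead of failing outright. Whether "delete" is a safe option is less clear to me, since silently dropping bytes can join text that was meant to be separate.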


--
Brigham Toskin