lua-users home
lua-l archive


Thanks Coda and Tim, that helps me understand how I should deal with it. I am using the XML parser mentioned here: http://lua-users.org/wiki/LuaXml specifically this one: http://manoelcampos.com/files/LuaXML-0.0.1-(lua5).tar.gz

This parser does not understand a BOM at the beginning of the input and fails with an error. I think the simplest fix is to check whether the stream starts with a UTF-8 BOM (the bytes 0xEF 0xBB 0xBF), remove it if so, and then parse as normal.
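
Something along these lines should do it. This is only a rough sketch: strip_utf8_bom is a name I made up, it is not part of the parser, and it assumes the whole document has already been read into a Lua string before it is handed to the parser.

```lua
-- UTF-8 encoding of U+FEFF (the byte order mark): 0xEF 0xBB 0xBF.
-- Decimal escapes are used so this works on Lua 5.1 as well.
local UTF8_BOM = "\239\187\191"

-- Hypothetical helper (not part of any XML library): returns the input
-- string without a leading UTF-8 BOM, or unchanged if there is none.
local function strip_utf8_bom(s)
  if s:sub(1, 3) == UTF8_BOM then
    return s:sub(4)
  end
  return s
end

-- Example: the BOM is dropped, the XML itself is left alone.
local raw = UTF8_BOM .. "<?xml version=\"1.0\"?><root/>"
local cleaned = strip_utf8_bom(raw)
```

The cleaned string can then be passed to the parser as usual. Only the first three bytes are ever inspected, so input without a BOM goes through untouched.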

Thanks,
Milind



On Tue, Sep 2, 2014 at 6:02 PM, Tim Hill <drtimhill@gmail.com> wrote:

On Sep 2, 2014, at 1:02 PM, Coda Highland <chighland@gmail.com> wrote:

> The XML spec says that a UTF-8 BOM is definitely SUPPOSED to be legal.
> If your XML parser can't handle it, then either the device is
> generating the BOM incorrectly (it should be 0xEF 0xBB 0xBF), or
> you're munging it somewhere between receiving it from the socket and
> writing it to a file (possibly implicitly, for example if something
> you're using is trying to be too clever and doing a charset conversion
> when it shouldn't), or your XML parser is in violation of the spec. I
> suspect the first one is the most likely.
>
> /s/ Adam
>

Technically a BOM is never legal in UTF-8, XML spec notwithstanding.

—Tim