On Tue, Aug 9, 2016 at 2:51 PM, Roberto Ierusalimschy
<roberto@inf.puc-rio.br> wrote:
>> As such, since text files on disk in most current filesystems do not
>> carry encoding metadata, it isn't a bug to put a BOM at the beginning
>> of a UTF-8 text file.
>
> It is a bug in the spec, in the sense that it breaks the main raison
> d'être of utf-8 (being compatible with ascii).

I disagree, on the grounds that the BOM is neither required nor
recommended, and the spec is playing "be liberal in what you accept" --
that is, the spec is saying "because this can happen, a compliant
implementation should be prepared to handle it."
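
As a rough illustration of what "prepared to handle it" can mean in practice, here is a minimal Python sketch that drops a leading UTF-8 BOM (the bytes EF BB BF) before further processing. The function name strip_utf8_bom and the sample byte strings are hypothetical, chosen only for this example; codecs.BOM_UTF8 and the "utf-8-sig" codec are part of Python's standard library.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical sample inputs: the same text with and without a BOM.
without_bom = b"print('hi')\n"
with_bom = codecs.BOM_UTF8 + without_bom

# A tolerant reader treats both inputs identically.
cleaned = strip_utf8_bom(with_bom)

# The standard "utf-8-sig" codec does the same while decoding.
decoded = with_bom.decode("utf-8-sig")
```

A reader written this way accepts files with or without the BOM, and never has to emit one itself.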

Being forwards-compatible with ASCII is also a non-goal. UTF-8 is
backwards-compatible with it (all valid 7-bit ASCII documents are
valid UTF-8), but the converse is obviously false (not all valid UTF-8
documents are valid 7-bit ASCII documents, independent of the BOM
issue). You can't expect tools that process ASCII data to correctly
handle arbitrary UTF-8 documents.
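
The asymmetry can be checked with a small Python sketch. The helper is_valid and the two sample documents are hypothetical, used only for illustration; the second sample contains no BOM, only one non-ASCII character.

```python
def is_valid(data: bytes, encoding: str) -> bool:
    """Report whether data decodes cleanly under the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample documents.
ascii_doc = b"plain 7-bit ASCII text\n"
utf8_doc = "raison d'\u00eatre\n".encode("utf-8")  # contains U+00EA, no BOM

# Backwards-compatible: every ASCII document is also valid UTF-8,
# and it decodes to the same text either way.
ascii_is_utf8 = is_valid(ascii_doc, "utf-8")
same_text = ascii_doc.decode("ascii") == ascii_doc.decode("utf-8")

# Not forwards-compatible: a UTF-8 document need not be valid ASCII.
utf8_is_ascii = is_valid(utf8_doc, "ascii")
```

The second document is rejected by the ASCII decoder even though it carries no BOM, which is the sense in which the failure is independent of the BOM issue.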

/s/ Adam