Daniel Silverstone wrote:
> All in all, Microsoft should not be encouraged to let this
> abomination stand.

I don't think it's just Microsoft:

> Q: Can a UTF-8 data stream contain the BOM character (in UTF-8 form)?
> If yes, then can I still assume the remaining UTF-8 bytes are in
> big-endian order?
> A: Yes, UTF-8 can contain a BOM. However, it makes no difference as
> to the endianness of the byte stream. UTF-8 always has the same byte
> order. An initial BOM is only used as a signature -- an indication
> that an otherwise unmarked text file is in UTF-8. Note that some
> recipients of UTF-8 encoded data do not expect a BOM. Where UTF-8 is
> used transparently in 8-bit environments, the use of a BOM will
> interfere with any protocol or file format that expects specific
> ASCII characters at the beginning, such as the use of "#!" at the
> beginning of Unix shell scripts.
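
To make the "#!" point concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular loader handles this. The helper names strip_utf8_bom and has_shebang are hypothetical; codecs.BOM_UTF8 is the standard library constant for the three signature bytes EF BB BF.

```python
import codecs

# Hypothetical helper: drop a leading UTF-8 signature (EF BB BF) if present.
def strip_utf8_bom(data: bytes) -> bytes:
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# Hypothetical helper: a byte-oriented check of the kind a script loader
# might perform, looking for "#!" as the very first two bytes.
def has_shebang(data: bytes) -> bool:
    return data.startswith(b"#!")

# A shell script saved by an editor that prepends the UTF-8 signature.
script = codecs.BOM_UTF8 + b"#!/bin/sh\necho hello\n"

# The raw bytes no longer begin with "#!", so the naive check fails;
# after stripping the signature the check succeeds again.
bom_hides_shebang = not has_shebang(script)
shebang_after_strip = has_shebang(strip_utf8_bom(script))
```

The signature never changes byte order in UTF-8. What it does is shift every other byte three positions to the right, and that shift is what breaks a consumer expecting specific ASCII characters at offset zero.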