- Subject: Re: ignoring BOM
- From: Robert Raschke <rtrlists@...>
- Date: Sun, 31 May 2009 23:11:40 +0100
Sure, but do any programs other than Notepad, WordPad and some (very
definitely not all) others on Windows actually deal with BOMs (note
the plural, cf.
http://en.wikipedia.org/wiki/Byte-order_mark#Representations_of_byte_order_marks_by_encoding
)? And how many programs outside of Windows do anything appropriate
with a BOM at all?
It may have been a reasonable idea before UTF-8 came along, but even
that must have been a very short period of time.
But anyway, Lua does not actually state that it can dofile() or
require() UTF-8 files anywhere that I can see. So, no problem, right?
Robby
On 5/31/09, Javier Bezos <noreply@tex-tipografia.com> wrote:
> Robert Raschke wrote:
>
> > I've gone and hacked the bit in the lua code that is used to load code
> > to ignore the silly BOM that M$ insists on introducing into text
> > files.
>
> Silly or not, it's a valid option in a UTF-8 file, according to
> the Unicode standard:
>
> In UTF-8, the BOM corresponds to the byte sequence <EF BB BF> (hexadecimal).
> Although there are never any questions of byte order with UTF-8 text,
> this sequence can serve as signature for UTF-8 encoded text where the
> character set is unmarked.
>
> So, a system claiming it can understand UTF-8 files must be able to
> handle the BOM somehow (like ignoring it).
>
> Javier
> -----------------------------
> http://www.tex-tipografia.com
>
>
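
For reference, "ignoring it" can be as small as the following C sketch. This
is not the patch mentioned above and not anything taken from the Lua sources;
the function names utf8_bom_length and skip_utf8_bom are made up for
illustration. It assumes the stream was just opened in binary mode ("rb"), is
seekable, and is still at offset 0.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: return 3 if buf starts with the UTF-8 signature
   EF BB BF, otherwise 0. */
size_t utf8_bom_length(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return 3;
    return 0;
}

/* Hypothetical helper: leave a freshly opened binary stream positioned
   just past a UTF-8 signature, or at offset 0 if there is none.
   Returns 0 on success, -1 if the stream could not be repositioned. */
int skip_utf8_bom(FILE *f)
{
    unsigned char head[3];
    size_t n = fread(head, 1, sizeof head, f);   /* may read fewer than 3 */
    long offset = (long)utf8_bom_length(head, n);

    clearerr(f);   /* a short read on a tiny file sets EOF; reset it */
    return fseek(f, offset, SEEK_SET) == 0 ? 0 : -1;
}
```

A loader would call skip_utf8_bom(f) right after fopen(name, "rb") and before
handing the stream to its reader. Only the UTF-8 signature is handled here;
the UTF-16 and UTF-32 marks listed on the Wikipedia page are left alone.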