lua-l archive

On Wed, Jan 5, 2011 at 6:57 PM, Lorenzo Donati
<lorenzodonatibz@interfree.it> wrote:
> I know Lua can _store_ any octet sequence in a string. The doubt is with the
> interpreter executable: can it read and always parse a utf8 file with
> non-ASCII chars in some literals/comments?

i believe (but could be wrong) that when the Lua compiler scans a
string, it grabs everything up to the closing delimiter, which can be
', " or ]==..==].  all of these are ASCII characters, whose octets have
the eighth bit clear.

UTF-8 was specifically designed so that no part of a multi-octet
character can be mistaken for an ASCII character.  to ensure that,
every octet of a multi-octet character has the eighth bit set.
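
if you want to convince yourself of that property, here is a rough
sketch in Python (not Lua, and not anything taken from the Lua
sources) that encodes every non-ASCII code point and checks that none
of the resulting octets falls in the ASCII range.  the function name
is made up for this illustration.

```python
def multibyte_octets_all_high() -> bool:
    """Return True if every octet of every non-ASCII UTF-8 encoding is >= 0x80."""
    for cp in range(0x80, 0x110000):
        # Surrogate code points are not valid scalar values and cannot be encoded.
        if 0xD800 <= cp <= 0xDFFF:
            continue
        encoded = chr(cp).encode("utf-8")
        # An octet below 0x80 has the eighth bit clear and would look like ASCII.
        if any(octet < 0x80 for octet in encoded):
            return False
    return True


if __name__ == "__main__":
    print(multibyte_octets_all_high())
```

the loop covers a bit over a million code points, so it may take a
second or so to run.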

therefore, when a string starts with ", Lua will grab everything until
the next ", which can't fall in the middle of a non-ASCII character.
the same holds for single quotes and long strings.
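
to make the argument concrete, here is a toy byte-level scanner in
Python.  it is only a sketch of the idea described above, under the
assumption that the scanner looks for the closing quote one octet at a
time; it is not Lua's actual lexer (it ignores newlines and does not
interpret escapes, it only skips the octet after a backslash), and
scan_short_string is a hypothetical name.

```python
def scan_short_string(src: bytes, start: int) -> bytes:
    """Return the raw octets between the quote at src[start] and its match."""
    quote = src[start]
    if quote not in (0x22, 0x27):  # ASCII double quote or single quote
        raise ValueError("scan must start at a quote character")
    i = start + 1
    while i < len(src):
        octet = src[i]
        if octet == 0x5C:  # backslash: skip the escaped octet as well
            i += 2
            continue
        if octet == quote:
            # The closing quote is an ASCII octet, so it can never be
            # part of a multi-octet UTF-8 sequence.
            return src[start + 1:i]
        i += 1
    raise ValueError("unfinished string")


if __name__ == "__main__":
    source = 'x = "na\u00efve \u00fc"'.encode("utf-8")
    body = scan_short_string(source, source.index(b'"'))
    print(body.decode("utf-8"))
```

the scanner never decodes anything: it just compares octets against
the quote, and the non-ASCII octets pass through untouched.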

so, i think you're safe.

-- 
Javier