> I think that the statement "storing data on file with anything other than
> UTF-8 would IMHO be a mistake" only holds if you're thinking of storing
> text in English or Latin languages. If you go to Japanese & Chinese, then
> you're talking about 3 bytes per character, which is more than UTF-16 and
> UCS-2.

While this is indeed true, and often annoying for people who use non-Latin
character sets, a good part of the strings that Lua handles are program code,
which is usually made up of Latin characters.
This is blurred somewhat by the fact that Lua can accept identifiers using
non-ASCII characters, depending on the locale in use. Still, the proportion of
Latin characters is likely to be high enough for the space saved to be a
worthwhile tradeoff.
I agree that string-intensive programs using non-Latin characters will
reduce this advantage, though.
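
To put rough numbers on this, here is a minimal sketch in Python (chosen only
for convenience, it is not something Lua provides) that compares encoded
sizes. The sample strings and the helper name encoded_sizes are made up for
illustration, and the UTF-16 figure excludes any byte order mark.

```python
def encoded_sizes(text):
    """Return the byte length of `text` in UTF-8 and in UTF-16 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }

# Hypothetical samples: a short Lua chunk containing a Japanese string
# literal, and a piece of pure Japanese text.
lua_source = 'local greeting = "こんにちは"\nprint(greeting)\n'
japanese_text = "こんにちは世界"

for label, sample in (("Lua source", lua_source), ("Japanese text", japanese_text)):
    sizes = encoded_sizes(sample)
    print(label, "UTF-8:", sizes["utf-8"], "UTF-16:", sizes["utf-16"])
```

The pure Japanese sample is larger in UTF-8 (3 bytes per character against 2),
while the mostly-ASCII source chunk is smaller in UTF-8, which is the
tradeoff I have in mind.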

> I think it would be better for Lua to support Unicode via wchar_t if all the
> target underlying systems could support this. Because this is not the case,
> the use of UTF-8 sounds like a reasonable approach.

From what I've read on the list, one of the key advantages of Lua over other
languages is its small size. Indiscriminate use of wchar_t for all strings
would be a waste of space. I don't deny the fact that wchar_t has advantages,
but I thought I would underline this.
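
As a rough illustration of the space cost, the following Python sketch
compares the storage for the same text as plain bytes and as an array of
wchar_t. It assumes the width that ctypes reports for wchar_t on the machine
running it (commonly 2 or 4 bytes), and storage_bytes is a hypothetical
helper, not part of Lua or of any library.

```python
import ctypes

# Width of one wchar_t on this platform, as reported by ctypes.
WCHAR_SIZE = ctypes.sizeof(ctypes.c_wchar)

def storage_bytes(text):
    """Return (bytes as UTF-8, bytes as a wchar_t array), without terminators.

    Assumes every character fits in a single wchar_t, which holds for the
    ASCII sample used below.
    """
    narrow = len(text.encode("utf-8"))
    wide = len(text) * WCHAR_SIZE
    return narrow, wide

# Hypothetical sample: a typical ASCII-only line of Lua code.
narrow, wide = storage_bytes("for i = 1, 10 do print(i) end")
print("char storage:", narrow, "bytes; wchar_t storage:", wide, "bytes")
```

For ASCII-only program text the wide form is WCHAR_SIZE times larger, so the
cost depends directly on how the platform defines wchar_t.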

Thanks

--
Vincent Penquerc'h