> In fact, UTF-8 also uses a maximum of 4 bytes to represent
> any code point, but requires 3 bytes to represent code points
> in Asian languages, so in general terms it is less compact
> than UTF-16, but in some applications ("mostly ASCII") it will
> turn out to be better.

If I understand correctly, even Asian languages use ASCII punctuation
(dots, spaces, newlines, commas, etc.), which takes 1 byte in UTF-8 but 2
in UTF-16. So, even for these languages, UTF-8 is not as much less
compact as it seems.
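
A rough way to check this is to encode the same text both ways and count
the bytes. The sketch below is only an illustration: the helper name
encoded_sizes and the sample string are made up for this example, and the
sample assumes text that mixes CJK characters with ASCII punctuation.

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, with no byte-order mark."""
    # "utf-16-le" is used so that Python does not prepend a 2-byte BOM.
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# Hypothetical sample: 3 CJK characters, with no ASCII at all.
pure_cjk = "\u65e5\u672c\u8a9e"

# Hypothetical sample: 6 CJK characters plus 4 ASCII characters
# (comma, space, dot, newline).
mixed = "\u65e5\u672c\u8a9e, \u65e5\u672c\u8a9e.\n"

for label, text in (("pure CJK", pure_cjk), ("CJK + ASCII", mixed)):
    utf8, utf16 = encoded_sizes(text)
    print(f"{label}: UTF-8 = {utf8} bytes, UTF-16 = {utf16} bytes")
```

For the pure CJK sample, UTF-8 needs 9 bytes against 6 for UTF-16. For
the mixed sample it is 22 against 20, since each ASCII character costs 1
byte in UTF-8 but 2 in UTF-16. How far the gap closes depends on how much
ASCII the real text contains.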

-- Roberto