- Subject: Re: question about Unicode
- From: roberto@... (Roberto Ierusalimschy)
- Date: Thu, 7 Dec 2006 13:02:09 -0200
> In fact, UTF-8 also uses a maximum of 4 bytes to represent
> any code point, but requires 3 bytes to represent code points
> in Asian languages, so in general terms it is less compact
> than UTF-16, but in some applications ("mostly ASCII") it will
> turn out to be better.
If I understand correctly, even Asian languages use ASCII punctuation
(dots, spaces, newlines, commas, etc.), which takes 1 byte in UTF-8 but 2
in UTF-16. So, even for these languages, UTF-8 is not as much less compact
as it seems.
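
A quick way to check this is to encode a sample both ways and count the
bytes. The following is only a minimal sketch in Python; the helper name
encoded_sizes and the sample strings are made up for illustration, and
the result depends entirely on how much ASCII the text really contains.

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, with no byte-order mark."""
    # "utf-16-le" is used instead of "utf-16" so that no 2-byte BOM is added.
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# Hypothetical samples: CJK characters only, then the same kind of text
# with ASCII punctuation, a space, and a newline mixed in.
cjk_only = "日本語"
mixed = "日本語, 中文.\n"

for label, sample in (("CJK only", cjk_only), ("mixed", mixed)):
    utf8, utf16 = encoded_sizes(sample)
    print(f"{label}: {utf8} bytes in UTF-8, {utf16} bytes in UTF-16")
```

With these samples, the CJK-only string takes 9 bytes in UTF-8 against 6
in UTF-16, while the mixed one takes 19 against 18: each ASCII character
saves one byte in UTF-8, offsetting the extra byte per CJK character.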