On Mon, Dec 04, 2006 at 04:49:11PM -0200, Roberto Ierusalimschy wrote:
> > The way slnunicode does it is optimized for size,
> > using a highly compressed Unicode character class table (from Tcl)
> > and never requiring space for a UTF-16 version (unlike Tcl).
> What is the license? Where can I find documentation?

As an aside, the notion that UTF-8 is smaller than UTF-16 (or UCS-2)
is a very Western-centric idea.  UTF-8 is variable-width, so it is only
smaller for languages where most letters take one byte.  Most CJK
characters take three bytes in UTF-8 versus two in UTF-16, so UTF-8
is 50% larger for that text.  (No prejudice in UTF-8's design here; CJK
just has too many characters!)  Arabic breaks even, I think, at two
bytes per letter in either encoding.
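
The size comparison above can be checked with a short Python sketch. The
sample strings and the helper name `encoded_sizes` are made up for
illustration and are not from the original post; UTF-16 is encoded
without a byte-order mark so that only the text itself is counted.

```python
# Hypothetical sample strings, written as escapes so the code points are explicit.
SAMPLES = {
    "English": "hello world",
    # "konnichiwa sekai": 5 hiragana + 2 kanji, all in the BMP
    "Japanese": "\u3053\u3093\u306b\u3061\u306f\u4e16\u754c",
    # "marhaban": 5 letters from the Arabic block
    "Arabic": "\u0645\u0631\u062d\u0628\u0627",
}


def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, with no byte-order mark."""
    utf8 = len(text.encode("utf-8"))
    utf16 = len(text.encode("utf-16-le"))  # "-le" avoids the 2-byte BOM
    return utf8, utf16


for name, text in SAMPLES.items():
    utf8, utf16 = encoded_sizes(text)
    print(f"{name}: UTF-8 {utf8} bytes, UTF-16 {utf16} bytes")
```

Real text mixes in spaces, digits, and ASCII punctuation, which take one
byte in UTF-8, so actual ratios will differ somewhat from these
pure-script samples.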

This is just a note, not an argument against UTF-8.  A lost byte cannot
desynchronize the rest of the stream, the encoding is independent of
endianness, and it scales cleanly to the high Unicode ranges; those are
all good reasons to use it, too.
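
The point about losing a byte can be illustrated with a small Python
sketch. The sample text and the helper name `drop_first_byte` are
assumptions made for this example; it simulates the loss by cutting the
first byte and decoding the remainder with replacement characters for
anything invalid.

```python
# Hypothetical sample text; every character is in the BMP (one UTF-16 unit each).
TEXT = "\u3053\u3093\u306b\u3061\u306f\u4e16\u754c"


def drop_first_byte(text, encoding):
    """Encode text, lose the first byte, and decode what is left."""
    damaged = text.encode(encoding)[1:]
    # errors="replace" substitutes U+FFFD for undecodable bytes instead of raising.
    return damaged.decode(encoding, errors="replace")


utf8_result = drop_first_byte(TEXT, "utf-8")
utf16_result = drop_first_byte(TEXT, "utf-16-le")

# UTF-8: only the damaged first character is lost; the rest is intact.
print("UTF-8 tail intact: ", utf8_result.endswith(TEXT[1:]))

# UTF-16: every later code unit is read off by one byte, so no original
# character survives in this sample.
print("UTF-16 any survivor:", any(ch in TEXT for ch in utf16_result))
```

UTF-8 recovers because lead bytes and continuation bytes use disjoint
bit patterns, so a decoder can find the next character boundary on its
own; UTF-16 has no such marker at the byte level.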

Glenn Maynard