- Subject: Re: Unicode?
- From: Mark Hamburg <mhamburg@...>
- Date: Wed, 11 Jun 2003 10:28:13 -0700
I haven't pounded on it extensively, but I've wired my simple Lua
environment (built in Cocoa on Mac OS X) to work with UTF-8 encoded strings
for input and output. I expect this to be fine so long as I don't:
* Disassemble strings into characters
* Use regular expressions that match on anything other than low-ASCII
* Perform comparisons on strings other than for equality
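
To make the first two caveats concrete, here is a minimal sketch (not from the original post) using only the standard string library. The sample string is an assumption chosen for illustration: "café" with the é written as its two UTF-8 bytes, 0xC3 0xA9.

```lua
-- Illustrative sample: "cafe" with an accented e, spelled out as UTF-8 bytes.
-- U+00E9 encodes as the two bytes 0xC3 0xA9 (decimal 195 169).
local s = "caf\195\169"

-- string.len counts bytes, not characters: 4 characters, 5 bytes.
print(string.len(s))              --> 5

-- string.sub slices by byte, so it can cut a character in half.
local cut = string.sub(s, 1, 4)   -- "caf" plus a lone 0xC3 lead byte
print(string.len(cut))            --> 4

-- In a pattern, "." matches a single byte, so this captures only the
-- trailing byte of the two-byte character.
local _, _, last = string.find(s, "(.)$")
print(string.byte(last))          --> 169
```

Ordering comparisons such as `<` are left out of the sketch because Lua delegates them to the C library's locale-dependent collation, so the result is not something a short example can pin down.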
What this relies on is that:
* Lua supports essentially any 8-bit character set, but from a parsing
standpoint it only cares about characters in the 7-bit ASCII set
* UTF-8 encodes every non-ASCII character using only bytes with the high bit
set, i.e., the bytes of a multibyte character can never be mistaken for ASCII
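
The second property can be checked directly. The following is a small sketch, not part of the original post: `has_ascii_byte` is a hypothetical helper and the sample characters are arbitrary picks of two-, three-, and four-byte encodings. It also shows why a plain search for an ASCII delimiter is safe on UTF-8 data.

```lua
-- Hypothetical helper: true if any byte of s is in the 7-bit ASCII range.
local function has_ascii_byte(s)
  for i = 1, string.len(s) do
    if string.byte(s, i) < 128 then return true end
  end
  return false
end

-- Arbitrary multibyte characters, written out as their UTF-8 bytes.
local samples = {
  "\195\169",          -- U+00E9, 2 bytes
  "\226\130\172",      -- U+20AC, 3 bytes
  "\240\159\152\128",  -- U+1F600, 4 bytes
}
for _, ch in ipairs(samples) do
  assert(not has_ascii_byte(ch))  -- every byte has the high bit set
end

-- Because no multibyte character contains an ASCII byte, a plain-text
-- search for "=" can only ever land on a real "=".
local line = "na\195\175ve=caf\195\169"
local eq = string.find(line, "=", 1, true)
print(eq)                           --> 7
print(string.sub(line, eq + 1))     -- the value part, bytes intact
```

The position returned is a byte offset (7), not a character count (6), which is fine as long as it is only fed back into other byte-based functions such as string.sub.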
Mark