- Subject: Re: Native unicode support?
- From: Björn De Meyer <bjorn.demeyer@...>
- Date: Wed, 26 Jun 2002 19:07:13 +0200
David Burgess wrote:
> Methinks UTF-8 would be an ideal solution. Does anyone know what's
> required to implement UTF-8 in Lua?
It depends on how far you want to go. UTF-8 is a
multibyte encoding that has many benefits, like
ASCII compatibility and relative simplicity
of encoding. To support UTF-8 only in character
strings, you need to rewrite most of the Lua
standard string library. The regular-expression
engine in particular looks like a tough nut.
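To illustrate why byte-oriented string functions need rework, here is a
minimal sketch in C of a character count for UTF-8 strings. The name
utf8_length is hypothetical and not part of Lua, and the sketch assumes
the input is already valid UTF-8 (it does no validation).

```c
#include <assert.h>
#include <stddef.h>

/* Count code points in a NUL-terminated UTF-8 string.
   Hypothetical helper: assumes valid UTF-8, performs no validation. */
size_t utf8_length(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        /* Continuation bytes have the form 10xxxxxx; every other byte
           starts a new character, so count only those. */
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

A byte-counting function such as strlen() reports 6 for "Björn" encoded
as UTF-8, while this sketch reports 5. Every string function that works
with positions or lengths would need the same kind of adjustment.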
If you want to make UTF-8 the fixed, default encoding for
the .lua scripts themselves, then you will need to adapt
llex.c and supply your own replacements for
isalpha() and isalnum(). Fortunately, with UTF-8,
you can tell from a single byte whether it is part
of a multibyte sequence (its high bit is set), so such
bytes can be treated as "alphabetical". Check the UTF-8 specs
for more info. Or contact me at my e-mail address below.
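As a rough sketch of such replacements, assuming the lexer hands over
one byte at a time as an int: the names lex_isalpha and lex_isalnum are
hypothetical and do not appear in llex.c. Treating every high-bit byte
as a letter is a simplification, since it also admits non-ASCII
punctuation and symbols in identifiers.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical replacement for isalpha(): ASCII letters, plus any byte
   with the high bit set (part of a UTF-8 multibyte sequence). */
int lex_isalpha(int c)
{
    unsigned char b;
    if (c == EOF)
        return 0;               /* end of input is never a letter */
    b = (unsigned char)c;       /* also handles a negative (signed char) byte */
    return b >= 0x80 || isalpha(b) != 0;
}

/* Hypothetical replacement for isalnum(): the above, plus ASCII digits. */
int lex_isalnum(int c)
{
    if (c == EOF)
        return 0;
    return lex_isalpha(c) || isdigit((unsigned char)c) != 0;
}
```

Since only bytes below 0x80 reach isalpha() and isdigit(), the result
does not depend on how the current locale classifies high bytes.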
"No one knows true heroes, for they speak not of their greatness." --
Björn De Meyer