David Burgess wrote:
> 
> Methinks UTF-8 would be an ideal solution. Does anyone know what's
> required to implement UTF-8 in Lua?



It depends on how far you want to go. UTF-8 is a
multibyte encoding with many benefits, such as
ASCII compatibility and a relatively simple
encoding scheme. To support UTF-8 only in character
strings, you would need to rewrite most of the Lua
standard string library. The regular-expression
engine in particular looks like a tough nut to crack.
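
To illustrate why the string functions need attention, here is a
minimal sketch in C of a character count that differs from the byte
count. The name utf8_length is hypothetical and not part of Lua, and
the sketch assumes the input is already valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Lua: counts the UTF-8 characters
   (code points) in a buffer of n bytes. Continuation bytes have the
   bit pattern 10xxxxxx and are skipped; every other byte starts a
   new character. Assumes the input is valid UTF-8. */
size_t utf8_length(const char *s, size_t n)
{
    size_t count = 0;
    size_t i;

    assert(s != NULL);
    for (i = 0; i < n; i++) {
        unsigned char b = (unsigned char)s[i];
        if ((b & 0xC0) != 0x80)   /* not a continuation byte */
            count++;
    }
    return count;
}
```

A byte-oriented length function reports 6 for "Björn" encoded as
UTF-8, while this sketch reports 5. Indexing, substring and pattern
functions would each need a similar change.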

If you want to make UTF-8 the fixed, default encoding for
the .lua scripts themselves, then you will also need to adapt
llex.c and supply your own replacements for
isalpha() and isalnum(). Fortunately, with UTF-8,
you can tell from a single byte whether it belongs to a
multibyte (non-ASCII) sequence, since every such byte has its
high bit set, and the lexer can simply treat those bytes as
"alphabetical". Check the UTF-8 specs
for more info, or contact me at my e-mail address below.
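
The replacements could look roughly like the sketch below. The names
lex_isalpha and lex_isalnum are hypothetical and do not appear in
llex.c. The sketch assumes the caller passes byte values in the range
0..255 (cast from unsigned char, as with the ctype functions), and it
treats every byte with the high bit set as a letter without validating
the sequence.

```c
#include <ctype.h>

/* Hypothetical replacement for isalpha(): ASCII letters are classified
   as usual, and any byte >= 0x80 (part of a UTF-8 multibyte sequence)
   is also accepted as "alphabetical". EOF and other out-of-range
   values are rejected. */
int lex_isalpha(int c)
{
    if (c < 0 || c > 0xFF)
        return 0;
    return c >= 0x80 || isalpha(c) != 0;
}

/* Hypothetical replacement for isalnum(), built the same way. */
int lex_isalnum(int c)
{
    if (c < 0 || c > 0xFF)
        return 0;
    return c >= 0x80 || isalnum(c) != 0;
}
```

With these in place, an identifier such as "Björn" would be scanned as
one name, because both bytes of the "ö" (0xC3 0xB6) count as letters.
The cost is that the lexer also accepts non-letter characters outside
ASCII, such as typographic punctuation, inside identifiers.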
 
-- 
"No one knows true heroes, for they speak not of their greatness." -- 
Daniel Remar.
Björn De Meyer 
bjorn.demeyer@pandora.be