I tested it, and apparently I was mistaken.
Lua 5.0 (work) and Lua 4 do not accept
UTF-8 or Korean identifiers, because the lexer
uses isalnum() and isalpha(), which on
most systems reject bytes with the high
(eighth) bit set.

However, with a rather trivial hack,
you can make identifiers with the high bit
set work. Here's the output from a plain
GNU 'diff' between the lua-5.0-work llex.c
and my llex.utf8.c. I have tested the patch
with the UTF-8 and Latin-9 encodings, and it seems
to work fine.

156c156,159
<   } while (isalnum(LS->current) || LS->current == '_');
---
>   } while ( isalnum(LS->current) || ( LS->current == '_')
>   || ( LS->current > 127) );
> /* Allow alphanumerical characters, but also 
>   characters in the upper 8 bit range. */
386c389,390
<         else if (isalpha(LS->current) || LS->current == '_') {
---
>         else if 
>         (isalpha(LS->current) || LS->current == '_' || (LS->current > 127)) {
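
To make the effect of the two changed conditions easier to try outside the lexer, here is a minimal standalone sketch of the same tests. The function names (is_ident_start, is_ident_part, is_identifier) are hypothetical and do not appear in llex.c. The sketch assumes each byte is examined as an unsigned char value (0..255), since the "> 127" comparison can only succeed in that case.

```c
#include <ctype.h>

/* Hypothetical helper mirroring the patched test at line 386:
   may byte c start an identifier?  c must be an unsigned char
   value (0..255), otherwise "c > 127" can never be true. */
static int is_ident_start(int c) {
  return isalpha(c) || c == '_' || c > 127;
}

/* Hypothetical helper mirroring the patched loop test at line 156:
   may byte c continue an identifier? */
static int is_ident_part(int c) {
  return isalnum(c) || c == '_' || c > 127;
}

/* Returns 1 if the NUL-terminated string s would be scanned as one
   identifier under the patched rules, 0 otherwise.  The empty string
   is rejected because its first byte is the terminator. */
static int is_identifier(const char *s) {
  const unsigned char *p = (const unsigned char *)s;  /* avoid negative chars */
  if (!is_ident_start(*p)) return 0;
  for (p++; *p != '\0'; p++)
    if (!is_ident_part(*p)) return 0;
  return 1;
}
```

For example, is_identifier("caf\xc3\xa9") accepts the UTF-8 bytes of "café", while is_identifier("9abc") is still rejected. Note that the rule looks only at the high bit, so it accepts any byte above 127, including byte sequences that are not valid UTF-8 and high-bit characters that are not letters.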

Could people tell me if it works in other encodings?
If so, could this patch be reviewed and added to
Lua proper? I think 8-bit identifier names are a nice
thing to have, and they seem cheap to support.

-- 
"No one knows true heroes, for they speak not of their greatness." -- 
Daniel Remar.
Björn De Meyer 
bjorn.demeyer@pandora.be