- Subject: Re: Could Lua itself become UTF8-aware?
- From: Dirk Laurie <dirk.laurie@...>
- Date: Sat, 29 Apr 2017 15:41:50 +0200
2017-04-29 15:21 GMT+02:00 Roberto Ierusalimschy <email@example.com>:
>> At present all the entries from 0x80 to 0xFF in the constant array
>> luai_ctype in lctype.c are zero: no bit set.
>> There are three unused bits. Couldn't two of them be used to mean
>> UTF8_FIRST and UTF8_CONT?
>> This is only the first step, but if the idea is shot down here already,
>> the others need not be mentioned.
> This particular idea has very low cost, so I don't see why to shoot it
> down before knowing the rest of the story. What does it mean for Lua
> to be "UTF-8 aware"?
> -- Roberto
The next step would be a compiler option under which the lexer
accepts a UTF-8 first character followed by the correct number
of UTF-8 continuation characters as being alphabetic for the
purpose of being an identifier or part of one.
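
To make the idea concrete, here is a rough sketch in C of the check the
lexer would need. It is not a patch against the real sources: the names
UTF8FIRSTBIT, UTF8CONTBIT, utf8_class, utf8_ncont and utf8_ident_seq are
made up for illustration, the bit positions 5 and 6 are an assumption
about which bits are free, and utf8_class is a stand-in for what would
really be stored in the upper half of luai_ctype. It only counts bytes,
as described above; it does not reject overlong encodings or surrogates.

```c
#include <stddef.h>

/* Hypothetical bit positions, assumed to be among the unused bits. */
#define UTF8FIRSTBIT 5
#define UTF8CONTBIT  6
#define MASK(B) (1 << (B))

/* Stand-in for the entries 0x80..0xFF of luai_ctype.
   A real change would put these bits in the table itself. */
static unsigned char utf8_class(unsigned char c) {
  if (c >= 0x80 && c <= 0xBF) return MASK(UTF8CONTBIT);   /* 10xxxxxx */
  if (c >= 0xC2 && c <= 0xF4) return MASK(UTF8FIRSTBIT);  /* lead byte */
  return 0;  /* ASCII, or a byte that never appears in valid UTF-8 */
}

/* Number of continuation bytes announced by a lead byte. */
static int utf8_ncont(unsigned char c) {
  if (c < 0xE0) return 1;  /* 110xxxxx */
  if (c < 0xF0) return 2;  /* 1110xxxx */
  return 3;                /* 11110xxx */
}

/* If the NUL-terminated string s starts with a lead byte followed by
   the correct number of continuation bytes, return the length of that
   sequence in bytes; otherwise return 0. The terminating NUL is not a
   continuation byte, so the loop never reads past the end of s. */
static size_t utf8_ident_seq(const char *s) {
  unsigned char c = (unsigned char)s[0];
  int i, n;
  if (!(utf8_class(c) & MASK(UTF8FIRSTBIT))) return 0;
  n = utf8_ncont(c);
  for (i = 1; i <= n; i++) {
    if (!(utf8_class((unsigned char)s[i]) & MASK(UTF8CONTBIT))) return 0;
  }
  return (size_t)n + 1;
}
```

Under the compiler option, the identifier loop in the lexer would treat
a nonzero result from something like utf8_ident_seq as that many
alphabetic bytes and skip over them; a zero result would fall through
to the existing handling.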