Jay Carlson wrote:

>> On Apr 30, 2017, at 10:44 AM, Soni L. wrote:
>> On 2017-04-30 07:19 AM, Shmuel Zeigerman wrote:
>>> On 29/04/2017 16:41, Dirk Laurie wrote:
>>>> The next step would be a compiler option under which the lexer
>>>> accepts a UTF-8 first character followed by the correct number
>>>> of UTF-8 continuation characters as being alphabetic for the
>>>> purpose of being an identifier or part of one.
>>> BTW, LuaJIT 2 has it for years already (it allows UTF-8 in identifiers). But it seems nobody needs it.
>> Well, I'd argue mostly nobody needs it because it doesn't allow emoji in identifiers.
> Does it prohibit all astral characters, or are emoji singled out?
> Either way, that's too bad. You can write this Ruby:
> def 🚫(📖)
> 	🐦 = 💻(📖)
> 	📝🚫🐊 =
> 	begin
> 		🚫📥 = 🐦.blocked_ids
>, 'block_ids.current'), 'w+') do |💾|
> 			🚫📥.sort.each do |😭🐊|
> 				💾.puts(😭🐊)
> 				📝🚫🐊 << 😭🐊.to_s

I have been following the conversation about making Lua UTF-8 aware for
source code identifiers; it has been quite a lively discussion!

Roberto's comment about it making Lua twice as large seemed crazy...
that is, until I spent some time looking into UTF-8 libraries and all the
intricacies of implementing a UTF-8 parser. Wow!
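
To give a feel for the easy part, the rule Dirk describes above (a UTF-8
first byte followed by the correct number of continuation bytes) can be
sketched in C roughly as below. This is only an illustration under stated
assumptions: `utf8_seq_len` is a made-up name, not a function from Lua's or
LuaJIT's lexer, and it checks byte structure only. Rejecting overlong forms
and surrogates, and deciding which code points count as letters, is where
the real intricacies (and the size) come in.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Lua or LuaJIT.
   Returns the length in bytes (1 to 4) of the UTF-8 sequence starting
   at s, or 0 if the lead byte is invalid or a continuation byte is
   missing or malformed. Only the byte structure is checked: overlong
   encodings, surrogates and code points above U+10FFFF are NOT
   rejected here. */
static size_t utf8_seq_len(const unsigned char *s, size_t avail) {
  size_t need, i;
  if (avail == 0) return 0;
  if (s[0] < 0x80) need = 1;                 /* 0xxxxxxx: plain ASCII */
  else if ((s[0] & 0xE0) == 0xC0) need = 2;  /* 110xxxxx */
  else if ((s[0] & 0xF0) == 0xE0) need = 3;  /* 1110xxxx */
  else if ((s[0] & 0xF8) == 0xF0) need = 4;  /* 11110xxx */
  else return 0;       /* stray continuation byte or invalid lead byte */
  if (need > avail) return 0;                /* sequence is cut short */
  for (i = 1; i < need; i++)
    if ((s[i] & 0xC0) != 0x80) return 0;     /* each must be 10xxxxxx */
  return need;
}
```

A lexer taking this approach would treat any sequence of length 2 or more
as "alphabetic" and advance by the returned length; a return of 0 would be
reported as a malformed-source error.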

I thought it would be interesting to see what could be done with UTF-8 for
source code... but if the result is like the above (source code with emoji
identifiers, for those who can't see it), I think I'd rather code in Italian
BASIC or even that one version of Pascal.

In all my years of coding, I would say everything has been in English 99%
of the time... I really hope emoji identifiers do not become standard; that
would be terrible to code in, Soni's obsession with them notwithstanding! ;)