- Subject: Re: Could Lua itself become UTF8-aware?
- From: Paige DePol <lual@...>
- Date: Tue, 2 May 2017 02:13:58 -0500
Jay Carlson <email@example.com> wrote:
>> On Apr 30, 2017, at 10:44 AM, Soni L. <firstname.lastname@example.org> wrote:
>> On 2017-04-30 07:19 AM, Shmuel Zeigerman wrote:
>>> On 29/04/2017 16:41, Dirk Laurie wrote:
>>>> The next step would be a compiler option under which the lexer
>>>> accepts a UTF-8 first character followed by the correct number
>>>> of UTF-8 continuation characters as being alphabetic for the
>>>> purpose of being an identifier or part of one.
>>> BTW, LuaJIT 2 has it for years already (it allows UTF-8 in identifiers). But it seems nobody needs it.
>> Well, I'd argue mostly nobody needs it because it doesn't allow emoji in identifiers.
> Does it prohibit all astral characters, or are emoji singled out?
> Either way, that's too bad. You can write this Ruby:
> def 🚫(📖)
> 🐦 = 💻(📖)
> 📝🚫🐊 = Array.new
> 🚫📥 = 🐦.blocked_ids
> File.open(File.join(LOGDIR, 'block_ids.current'), 'w+') do |💾|
> 🚫📥.sort.each do |😭🐊|
> 📝🚫🐊 << 😭🐊.to_s
I have been following the conversation about making Lua UTF-8 aware for
source code identifiers; it has been quite a lively discussion!
Roberto's comment about it making Lua twice as large seemed crazy...
that is, until I spent some time looking into UTF-8 libraries and all the
intricacies of implementing a UTF-8 parser. Wow!
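To be fair, the purely structural part of what Dirk describes (a UTF-8 first
byte followed by the correct number of continuation bytes, all treated as
alphabetic) looks small on its own. Here is a minimal C sketch of that check.
The names utf8_seq_len and ident_len are hypothetical and are not taken from
the Lua or LuaJIT sources, and the sketch assumes the lexer sees raw bytes
with a known length. It deliberately skips the hard parts: it does not reject
overlong 3- and 4-byte forms, surrogates, or values above U+10FFFF, and it
knows nothing about which code points are actually letters.

```c
#include <stddef.h>

/* Hypothetical helper: byte length (1-4) of the UTF-8 sequence starting at s,
   or 0 if the lead byte is invalid, the sequence is truncated, or a
   continuation byte is missing. Structural check only: overlong 3/4-byte
   forms, surrogates, and values above U+10FFFF are NOT rejected here. */
static size_t utf8_seq_len(const unsigned char *s, size_t avail) {
    size_t need, i;
    if (avail == 0) return 0;
    if (s[0] < 0x80) return 1;                        /* plain ASCII */
    else if (s[0] >= 0xC2 && s[0] <= 0xDF) need = 2;  /* 2-byte lead */
    else if (s[0] >= 0xE0 && s[0] <= 0xEF) need = 3;  /* 3-byte lead */
    else if (s[0] >= 0xF0 && s[0] <= 0xF4) need = 4;  /* 4-byte lead */
    else return 0;      /* stray continuation byte, 0xC0/0xC1, or > 0xF4 */
    if (avail < need) return 0;                       /* truncated input */
    for (i = 1; i < need; i++)
        if ((s[i] & 0xC0) != 0x80) return 0;          /* must be 10xxxxxx */
    return need;
}

/* Hypothetical helper: number of bytes at the start of str that form an
   identifier, treating every structurally valid multi-byte sequence as
   "alphabetic" (the assumption in the quoted proposal). */
static size_t ident_len(const char *str, size_t n) {
    const unsigned char *s = (const unsigned char *)str;
    size_t pos = 0;
    while (pos < n) {
        unsigned char c = s[pos];
        if (c < 0x80) {
            int alpha = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                        || c == '_';
            int digit = (c >= '0' && c <= '9');
            /* digits are allowed anywhere except the first position */
            if (!(alpha || (digit && pos > 0))) break;
            pos += 1;
        } else {
            size_t len = utf8_seq_len(s + pos, n - pos);
            if (len == 0) break;                      /* malformed: stop */
            pos += len;
        }
    }
    return pos;
}
```

Calling ident_len on the bytes of "abc = 1" would stop after the three ASCII
letters, and a four-byte emoji at the start of the input would be consumed as
a single identifier character. Everything beyond this (deciding which code
points are letters, normalization, look-alike characters) is where the size
of a real implementation comes from.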
I thought it would be interesting to see what could be done with UTF-8 for
source code... but if the result is like the above (source code with emoji
identifiers, for those who can't see it), I think I'd rather code in Italian
BASIC or even that one version of Pascal.
In all my years of coding, I would say that 99% of the time everything has
been in English. I really hope emoji identifiers do not become standard;
they would be terrible to code in, Soni's obsession with them
notwithstanding! ;)