On 15/10/2018 17:55, 奥斯陆君王 wrote:

[...]


It's very easy. Will Lua 5.4 support it?


I hope it never will!

Sorry, it is not about any cultural prejudice (I know many people, especially Asian people, could feel discriminated against by such a stance, but that is not my intention).

It is just a matter of convenience and "safety". It is not worth opening such a big can of worms, IMO.

When I started programming, I learned by trial and error what it means to carelessly use "0", "O" and "o" as characters in identifiers. The same goes for "l" and "1".

That is, any set of characters that are likely to have similar glyphs in some font is going to cause grief in some cases unless proper programming practices are followed.

Allow the whole UNICODE mess into identifiers and the chances of mistaking one symbol for another skyrocket! I'm not a UNICODE guru, but I bet my bottom dollar that there are more than a dozen symbols that, in some font, look like an uppercase Latin "O" (that is, a symbol looking more or less like a circle). The same goes for other simple-looking symbols like an uppercase "I" (a vertical "stick" of some sort).

Now, imagine an identifier like B10010100, where each individual "character" is in fact a different "version" of a "0" or a "1". Nightmare!
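
To make this concrete, here is a minimal sketch in Python (chosen only because its standard unicodedata module makes it easy to name code points; it is not Lua and not something from this thread). It builds three identifiers from letters that look like "O" in many fonts, and checks that they are nevertheless distinct strings. The names LOOKALIKES, describe and build_names are made up for this illustration.

```python
import unicodedata

# Three capital letters that render almost identically in many fonts.
# Written as escapes on purpose, so the source itself stays unambiguous.
LOOKALIKES = ["\u004F", "\u039F", "\u041E"]


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch))


def build_names(prefix="B"):
    """Build one identifier per lookalike letter, e.g. 'B' + Latin 'O'."""
    return [prefix + ch for ch in LOOKALIKES]


if __name__ == "__main__":
    for name in build_names():
        # Each line looks like the same identifier, but the second
        # character is a different code point every time.
        print(name, "->", describe(name[1]))
    # All three compare as different strings.
    print(len(set(build_names())), "distinct identifiers")
```

Python itself accepts all three as valid (and different) identifiers, so a language that allows this has to rely on the programmer, or on tooling, to tell them apart.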

These problems are somewhat small annoyances to cope with when you are dealing with ASCII, where the "problematic" chars are well known, because every programmer more or less knows what's in *the whole ASCII set*.

But what the frigging heck is in UNICODE?!? There are gazillions of code points! There are even not-yet-defined code points!!! WHO knows UNICODE in its entirety?

How can I be sure that whoever must use my code, where I inserted a "unicodishy" identifier, is able to tell unambiguously which "characters" make up the identifier?

Is this worth all the hassle? What advantages would this bring to the programming effort? How much will it cost to track down the bugs caused by such mistakes?

I doubt there are tangible *net* advantages in *standardizing* UNICODE, even in its remarkable UTF-8 encoding, as an alphabet for programming.

UNICODE was meant for linguistics and typesetting, not for programming.

And anyway, as LHF pointed out, you can change lctype.c if you have special needs (which I definitely won't argue against, that's for sure).


BTW, since this is not the first time this "I'd like unicode in my names" thing has come up, I'd like to see some of the UNICODE gurus on this list entering a contest to create the most bedazzling set of seemingly-identical identifiers using their "utf-8 powers". :-D

I think this will have great educational value for those thinking that having *generally standardized* UNICODE identifiers is a good idea. ;-)


Cheers!

--Lorenzo