- Subject: Re: Could Lua itself become UTF8-aware?
- From: Enrico Colombini <erix@...>
- Date: Mon, 1 May 2017 09:17:37 +0200
On 01-May-17 07:28, Daurnimator wrote:
> However, to reply to the issue at hand: are unicode classes wanted?
> i.e. should a unicode space such as U+2001 count as whitespace for
> Furthermore, what should be considered valid characters for identifiers?
> I guess we still want the rule "alpha followed by any number of alphanumeric"?
> Which Unicode standard do we want to pick? (You did realise unicode
> gets updated.... right?)
> We'd need a strategy to deal with updates (which rarely go well: see
> how people are still dealing with fallout from IDNA2003 => IDNA2008)
> Which brings us to the next problem: normalisation of identifiers. It
> would seem perplexing to many that the identifiers U+00C5 and U+0041
> U+030A would refer to different variables.
> Even if you don't think normalisation should occur (like myself), then
> you'll at least have an easy mechanism for obfuscated code
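
To make the two concerns quoted above concrete, here is a minimal sketch. It is written in Python rather than Lua only because Python's standard `unicodedata` module provides normalisation out of the box; it is an illustration of the problem, not a proposal for how Lua should behave. The variable names are made up for the example.

```python
import unicodedata

precomposed = "\u00C5"   # U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE
decomposed = "A\u030A"   # U+0041 followed by U+030A COMBINING RING ABOVE

# Without normalisation the two spellings are different code point sequences,
# so a compiler comparing identifiers as-is would treat them as two names.
same_raw = precomposed == decomposed

# NFC composes the base letter and combining mark into one code point,
# after which both spellings compare equal.
same_nfc = (unicodedata.normalize("NFC", precomposed)
            == unicodedata.normalize("NFC", decomposed))

# U+2001 (EM QUAD) counts as whitespace under Unicode rules,
# but it is not one of the ASCII whitespace characters.
em_quad = "\u2001"
unicode_space = em_quad.isspace()
ascii_space = em_quad in " \t\n\r\f\v"

print(same_raw, same_nfc, unicode_space, ascii_space)
```

Both spellings render identically on screen, which is why the un-normalised case can be used to write deliberately confusing code.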
FWIW, in the '80s there were Italian versions of BASIC, but they luckily
died out. I say "luckily" because language localization means community
fragmentation: you cannot search for ideas and solutions and you cannot
cooperate with people around the world. Not to mention library usability.
So I do not think localized identifiers are a great idea. I realize that
the issue may be more acutely felt by Asian users, but there is always a
trade-off between symbols one can read and symbols that are universal
(even when one does not understand their literal meaning: think of
identifiers as pictograms).
You do not have to think in English to write "if...then...else", even if
it could be very slightly helpful in the first learning steps. In fact,
I knew almost nothing of English when I started programming and that did
not hamper me in the least. But the benefits of using world-standardized
identifiers were immense.