- Subject: Re: Will Lua kernel use Unicode in the future?
- From: Chris Marrin <chris@...>
- Date: Sat, 31 Dec 2005 07:24:24 -0800
Jens Alfke wrote:
(I've replied to a bunch of messages here rather than sending out six
On 29 Dec '05, at 9:17 AM, Chris Marrin wrote:
It allows you to add "incidental" characters without the need for a
fully functional editor for that language. For instance, when I
worked for Sony we had the need to add a few characters of Kanji on
occasion. It's not easy to get a Kanji editor set up for a western
keyboard, so adding Unicode directly was more convenient. There are
also some oddball symbols in the upper registers for math and
chemistry and such that are easier to add using escapes.
Also, in some projects there are guidelines that discourage the use of
non-ASCII characters in source files (due to problems with editors,
source control systems, or other tools). In these situations it's
convenient to be able to use inline escapes to specify non-ASCII
characters that commonly occur in human-readable text. Examples
would include ellipses, curly quotes, em dashes, bullets, currency
symbols, as well as accented letters of course.
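
For the ASCII-only case described above, one workaround that needs no
language change is to generate the escapes with a small tool. Below is a
minimal sketch in Python (not Lua), assuming the target is a Lua string
literal, which accepts decimal byte escapes of the form \ddd. The helper
name to_lua_literal is made up for illustration.

```python
def to_lua_literal(text: str) -> str:
    """Return an ASCII-only, double-quoted Lua string literal for `text`.

    Assumption: the consumer is Lua, whose string literals accept
    decimal byte escapes (\\ddd). Each non-ASCII character is written
    as the escaped bytes of its UTF-8 encoding.
    """
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in ("\\", '"'):
            # Backslash and double quote need a backslash in front.
            out.append("\\" + ch)
        elif 0x20 <= byte < 0x7F:
            # Printable ASCII passes through unchanged.
            out.append(ch)
        else:
            # Control bytes and UTF-8 lead/continuation bytes become \ddd.
            # Always pad to three digits so a following digit character
            # cannot be absorbed into the escape.
            out.append("\\%03d" % byte)
    return '"' + "".join(out) + '"'


if __name__ == "__main__":
    print(to_lua_literal("Wait…"))  # U+2026 HORIZONTAL ELLIPSIS
    print(to_lua_literal("café"))   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
```

Here to_lua_literal("Wait…") yields "Wait\226\128\166", which can be
pasted into a source file that must stay pure ASCII.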
But we're moving into an era where support of English text (the only
language in existence that fits into ASCII) is not good enough. I used
to work at Sony, where this issue is magnified about a million times
compared to "western" languages. The notion of "optional" support for
non-ASCII characters was never acceptable.
IMO, with globalization, languages that don't support Unicode won't
make the cut in the long run.
I find it ironic that the three non-Unicode-savvy languages I use (PHP,
Ruby, Lua) all come from countries whose native languages use non-ascii
But I think Lua mostly does a great job of supporting Unicode with its
agnostic approach, because it allows UTF-8 sequences to pass through
unchallenged... mostly. The few rough edges being discussed here are
really minor. Unicode escapes would simply make it more practical to
handle the full Unicode range. Making it easier to tell the underlying
clib to use UTF-8 for collating avoids putting platform-specific code
outside Lua. And allowing non-ASCII characters in identifiers takes this
restriction away from non-English speakers. Small changes just to
tighten up the i18n support.
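
To make concrete what a Unicode escape would have to expand to: the code
point is turned into one to four UTF-8 bytes, which a byte-agnostic
language can then pass through untouched. A rough Python sketch follows
(illustration only, not a proposed Lua implementation; the name
utf8_encode is hypothetical).

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 bytes, by hand."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        # Outside the Unicode range, or a surrogate: not encodable.
        raise ValueError("not a Unicode scalar value: %r" % (cp,))
    if cp < 0x80:
        # 1 byte: 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


if __name__ == "__main__":
    # Cross-check against Python's built-in encoder for a few code points.
    for cp in (0x41, 0xE9, 0x2026, 0x1D11E):
        assert utf8_encode(cp) == chr(cp).encode("utf-8")
        print("U+%04X -> %s" % (cp, utf8_encode(cp).hex()))
```

The four-byte branch is what covers the "upper registers" beyond U+FFFF,
so an escape built on this scheme reaches the full Unicode range.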
chris marrin
email@example.com
"As a general rule, don't solve puzzles
that open portals to Hell"