lua-users home
lua-l archive


Jens Alfke wrote:
(I've replied to a bunch of messages here rather than sending out six separate replies...)

On 29 Dec '05, at 9:17 AM, Chris Marrin wrote:

It allows you to add "incidental" characters without the need for a fully functional editor for that language. For instance, when I worked for Sony we occasionally needed to add a few Kanji characters. It's not easy to set up a Kanji editor for a western keyboard, so entering the Unicode values directly was more convenient. There are also some oddball symbols in the upper registers for math, chemistry and such that are easier to add using escapes.

Also, in some projects there are guidelines that discourage the use of non-ASCII characters in source files (due to problems with editors, source control systems, or other tools). In these situations it's convenient to be able to use inline escapes to specify non-ASCII characters that commonly occur in human-readable text ... examples would include ellipses, curly quotes, em dashes, bullets, currency symbols, as well as accented letters of course.
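As a rough illustration of the inline-escape idea, here is a minimal sketch of how an ASCII-only source file could still produce such characters. It assumes Lua 5.1 or later (for hex literals) and that only decimal byte escapes are available in string literals. The helper name utf8_encode is hypothetical, not something in the standard library.

```lua
-- Hypothetical helper: encode a Unicode code point as a UTF-8 byte string.
-- Uses only arithmetic and string.char, so it should run on Lua 5.1 and later.
local function utf8_encode(cp)
  local floor, char = math.floor, string.char
  if cp < 0x80 then
    return char(cp)
  elseif cp < 0x800 then
    return char(0xC0 + floor(cp / 0x40),
                0x80 + cp % 0x40)
  elseif cp < 0x10000 then
    return char(0xE0 + floor(cp / 0x1000),
                0x80 + floor(cp / 0x40) % 0x40,
                0x80 + cp % 0x40)
  else
    return char(0xF0 + floor(cp / 0x40000),
                0x80 + floor(cp / 0x1000) % 0x40,
                0x80 + floor(cp / 0x40) % 0x40,
                0x80 + cp % 0x40)
  end
end

-- The source stays pure ASCII; the strings it builds do not.
local ellipsis = utf8_encode(0x2026)  -- horizontal ellipsis
local euro     = utf8_encode(0x20AC)  -- euro sign
local kanji    = utf8_encode(0x6F22)  -- a single Kanji character

-- The same ellipsis written by hand with decimal byte escapes.
assert(ellipsis == "\226\128\166")
```

A built-in escape would make the helper unnecessary, but the sketch shows that the workaround is short, and that the hand-written byte form is much harder to read than a code point.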

But we're moving into an era where support of English text (the only language in existence that fits into ASCII) is not good enough. I used to work at Sony, where this issue is magnified about a million times compared to "western" languages. The notion of "optional" support for non-ASCII characters was never acceptable.

IMO, with globalization, languages that don't support Unicode won't make the cut in the long run.

I find it ironic that the three non-Unicode-savvy languages I use (PHP, Ruby, Lua) all come from countries whose native languages use non-ASCII characters :)

But I think Lua mostly does a great job of supporting Unicode with its agnostic approach, because it allows UTF-8 sequences to pass through unchallenged... mostly. The few rough edges being discussed here are really minor. Unicode escapes would simply make it more practical to handle the full Unicode range. Making it easier to tell the underlying C library to use UTF-8 for collating avoids putting platform-specific code outside Lua. And allowing non-ASCII characters in identifiers lifts a restriction on non-English speakers. These are small changes, just to tighten up the i18n support.
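To make the "pass through unchallenged... mostly" point concrete, here is a small sketch, assuming stock Lua 5.1 or later. The helper name utf8_len and the locale name "en_US.UTF-8" are assumptions for illustration; locale names vary by platform.

```lua
-- "cafe" with an acute accent on the e, written as UTF-8 bytes.
local s = "caf\195\169"

-- Lua treats strings as opaque bytes, so UTF-8 survives
-- concatenation and use as a table key.
local t = {}
t[s] = true
assert(t["caf" .. "\195\169"])

-- One rough edge: the length operator counts bytes, not characters.
assert(#s == 5)

-- Hypothetical helper: count characters by counting every byte that is
-- not a UTF-8 continuation byte (0x80 to 0xBF).
local function utf8_len(str)
  local _, n = string.gsub(str, "[^\128-\191]", "")
  return n
end
assert(utf8_len(s) == 4)

-- Collation is delegated to the C library. os.setlocale returns nil
-- when the requested locale is not installed on this system.
local previous = os.setlocale(nil, "collate")
local ok = os.setlocale("en_US.UTF-8", "collate")
if ok then
  -- String comparison with < now follows that locale's collation rules.
  os.setlocale(previous, "collate")
end
```

Nothing here needs changes to Lua itself, which is the agnostic approach at work. The locale call is the part that tends to leak platform-specific names into scripts.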

chris marrin
"As a general rule, don't solve puzzles that open portals to Hell"