lua-l archive

Asko Kauppi wrote:


I've been thinking about UTF-8 and Lua lately, and wonder how much work it would be to actually support it in Lua "out of the box". Some programming languages (such as Tcl) already claim to do that, and I feel the concept would match Lua's targets and philosophy rather nicely.


-ak

UTF-8 in a nutshell:
    http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8


I would like to see a language that really supports "universal language", and why not Lua?
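
To make the starting point concrete: Lua strings are plain byte sequences, so a string's length counts bytes rather than characters. The smallest piece of "UTF-8 awareness" is counting code points instead. Here is a minimal sketch in C, assuming the input is already well-formed UTF-8; the function name utf8_codepoints is made up for illustration and is not part of Lua or any library:

```c
#include <stddef.h>

/* Hypothetical helper, not part of Lua or ICU.
   Counts code points in a NUL-terminated string that is ASSUMED to be
   well-formed UTF-8. Continuation bytes have the bit pattern 10xxxxxx;
   every other byte starts a new code point. No validation is done. */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

For a string holding the two bytes 0xC3 0xA4 (a-umlaut), strlen reports 2 while this reports 1. Counting is the easy part; case mapping, collation, normalization and the like need real character data tables.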

Consider ICU as a base:

"

ICU is a mature, widely used set of C/C++ and Java libraries for Unicode support, software internationalization and globalization (i18n/g11n). It grew out of the JDK 1.1 internationalization APIs, which the ICU team contributed, and the project continues to be developed for the most advanced Unicode/i18n support. ICU is widely portable and gives applications the same results on all platforms and between C/C++ and Java software.

 *ICU Features*

As computing environments become more heterogeneous, software portability becomes more important. The International Components for Unicode (ICU) libraries provide robust and full-featured Unicode services on a wide variety of platforms, without sacrificing performance. It supports the most current version of the Unicode standard, and provides support for supplementary Unicode characters (needed for support of the repertoires of GB 18030, HKSCS, and JIS X 0213). It offers great flexibility to extend and customize the supplied services, which include:

   * Text: Unicode text handling, full character properties and
     character set conversions (500+ codepages)
   * Analysis: Unicode regular expressions; full Unicode sets;
     character, word and line boundaries
   * Comparison: Language sensitive collation and searching
   * Transformations: normalization, upper/lowercase, script
     transliterations (50+ pairs)
   * Locales: Comprehensive locale data (230+) and resource bundle
     architecture
   * Complex Text Layout: Arabic, Hebrew, Indic and Thai
   * Time: Multi-calendar and time zone
   * Formatting and Parsing: dates, times, numbers, currencies,
     messages and rule based

ICU is an open source development project sponsored, supported, and used by IBM. It is dedicated to providing robust, full-featured, commercial quality, freely available Unicode-based technologies. The ICU project is licensed under the X License <http://www-306.ibm.com/software/globalization/icu/license.jsp> (see also the x.org original <http://www.x.org/Downloads_terms.html>), which is compatible with GPL <http://www.gnu.org/philosophy/license-list.html#GPLCompatibleLicenses> but with fewer restrictions on commercial use of the software. The ICU library supports multi-threading environments, and is available in C, C++ and Java.

"

It actually brings a lot more to the table than just Unicode!

Dave LeBlanc
Seattle, WA USA