2012/7/5 Simon Orde <sorde@gotadsl.co.uk>:
> 1. Lua scripts are currently always written in ANSI only, and probably
> always will be.

Lua code itself (i.e. general syntax outside of string literals) only
uses the portion of the ANSI charset that is also present in other
charsets like UTF-8, namely the ASCII range. So Lua can read UTF-8.
You just have to remember that Lua strings are arrays of bytes, not
arrays of characters. So the encoding of characters to bytes is up to
you.
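
To illustrate the bytes-versus-characters point, here is a minimal
sketch. The helper name utf8_len is hypothetical (it is not part of
the standard library), and it assumes the input is already valid
UTF-8:

```lua
-- "é" is U+00E9, which UTF-8 encodes as the two bytes 0xC3 0xA9.
-- Decimal escapes are used so the file itself stays pure ASCII.
local s = "caf\195\169"

print(#s)  -- 5: the length operator counts bytes, not characters

-- Hypothetical helper: count code points by skipping UTF-8
-- continuation bytes (0x80..0xBF). Assumes valid UTF-8 input.
local function utf8_len(str)
  local count = 0
  for i = 1, #str do
    local b = str:byte(i)
    if b < 0x80 or b >= 0xC0 then
      count = count + 1
    end
  end
  return count
end

print(utf8_len(s))  -- 4
```

The same caveat applies to string.sub, string.upper and friends: they
all operate on bytes.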

> 2. Strings in Lua can be in any format you like (e.g. ANSI, UTF-8 or UTF-16)
> so apps that want to support Unicode can do so by specifying that string
> parameters and return values are in a Unicode encoding such as UTF-8.
>
> 3. There is currently plenty of library support for ANSI Lua strings, but no
> library support for working with UTF-16 Lua strings.  There are one or two
> small libraries for doing some simple string manipulation with UTF-8 strings
> (e.g. http://lua-users.org/wiki/ValidateUnicodeString and
> http://files.luaforge.net/releases/sln/slnunicode).  UTF-8 is likely to have
> more support in the future in Lua libraries than UTF-16.
>
> 4. IUP currently only supports ANSI Lua strings, but support for UTF-8 Lua
> strings will be added soon?  Is that right?  Any timescales on that?
>
> Is the above correct?  Anything important I haven't mentioned?
>
> I'm a great fan of Lua.  Support for Unicode is really important for me
> though.  The above strategy, if correct, is probably OK for my purposes.
> Ideally I'd prefer script-writers to be able to write scripts in UTF-16 and
> work entirely in UTF-16 - but I can live without that.

With a little work you can have your script-writers write in UTF-16,
and then convert that to UTF-8 on the fly (for example during loading,
with a custom loader). The code will be interpreted correctly, and the
content of string literals will be in UTF-8, which is a good
convention IMHO. Then all you have to do is make sure your libraries
accept UTF-8 strings.
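
As a rough sketch of the conversion step, assuming the scripts are
stored as UTF-16LE (optionally with a BOM): the function names
utf8_char, utf16le_to_utf8 and load_utf16_chunk are made up for this
example, and unpaired surrogates are passed through without
validation.

```lua
-- Encode one Unicode code point as a UTF-8 byte sequence.
local function utf8_char(cp)
  local floor, char = math.floor, string.char
  if cp < 0x80 then
    return char(cp)
  elseif cp < 0x800 then
    return char(0xC0 + floor(cp / 0x40), 0x80 + cp % 0x40)
  elseif cp < 0x10000 then
    return char(0xE0 + floor(cp / 0x1000),
                0x80 + floor(cp / 0x40) % 0x40,
                0x80 + cp % 0x40)
  else
    return char(0xF0 + floor(cp / 0x40000),
                0x80 + floor(cp / 0x1000) % 0x40,
                0x80 + floor(cp / 0x40) % 0x40,
                0x80 + cp % 0x40)
  end
end

-- Convert a UTF-16LE byte string to UTF-8.
local function utf16le_to_utf8(s)
  local out, i, n = {}, 1, #s
  -- Skip a leading byte order mark (FF FE) if present.
  if s:byte(1) == 0xFF and s:byte(2) == 0xFE then i = 3 end
  while i + 1 <= n do
    local lo, hi = s:byte(i, i + 1)
    local unit = lo + hi * 0x100
    i = i + 2
    -- A high surrogate followed by a low surrogate encodes one
    -- code point above U+FFFF.
    if unit >= 0xD800 and unit < 0xDC00 and i + 1 <= n then
      local lo2, hi2 = s:byte(i, i + 1)
      local unit2 = lo2 + hi2 * 0x100
      if unit2 >= 0xDC00 and unit2 < 0xE000 then
        unit = 0x10000 + (unit - 0xD800) * 0x400 + (unit2 - 0xDC00)
        i = i + 2
      end
    end
    out[#out + 1] = utf8_char(unit)
  end
  return table.concat(out)
end

-- Compile a UTF-16LE script: loadstring in 5.1, load in 5.2 and later.
local function load_utf16_chunk(source, chunkname)
  local compile = loadstring or load
  return compile(utf16le_to_utf8(source), chunkname)
end
```

Like loadstring, load_utf16_chunk returns the compiled chunk, or nil
plus an error message. To apply it to require, you could wrap it in a
function that reads the file in binary mode and register that in
package.loaders (package.searchers in 5.2).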

> But it will only
> really work when support for UTF-8 becomes available in IUP - and, ideally,
> other Lua libraries.  So does anyone know what work, if any, is
> currently being done on adding support for UTF-8 (or UTF-16?) Lua strings in
> Lua libraries - such as IUP?

Some libraries are already compatible with UTF-8. I believe that on
Unix the io, os and lfs modules are already compatible, and on Windows
they are easy to patch (I have patches for 5.1 if you want).

Maybe you should list the ones that you want, and we can tell you more
precisely if they are or will be compatible with UTF-8.