Bulat Ziganshin wrote:
i think that idea that you may hold utf-16 strings in Lua strings is wrong by itself. you should make specialized datatype
Maybe I wasn't clear enough: my library isn't using utf-16 on the Lua side (though "wrong by itself" is an exaggeration IMHO).
Robert Raschke wrote:
When processing "strings" that may contain non-ASCII data it is imperative you always only ever use the C API functions where you pass in the length of the buffer holding your "string"! Never rely on the C style terminating zero.
That's very true, and I never intended to do so. But things are different with the Windows API: utf-16 strings do contain non-ASCII data, yet these functions expect the strings to end with L'\0' (two consecutive zero bytes).
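To illustrate the point about the terminator, here is a minimal sketch in portable C. It assumes the utf-16 data arrives as a length-counted byte buffer (as a Lua string would provide it); the names dup_utf16_terminated and utf16_len are hypothetical and come from neither Lua nor the Windows API. A single trailing zero byte is not enough because ordinary characters already contain zero bytes ('A' is 0x41 0x00 in UTF-16LE), so the copy appends a full 16-bit zero.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy `len` bytes of utf-16 data into a fresh buffer
 * and append a full 16-bit terminator (two zero bytes).
 * Returns NULL if allocation fails; the caller frees the result. */
uint16_t *dup_utf16_terminated(const char *bytes, size_t len)
{
    size_t units;
    uint16_t *out;

    assert(len % 2 == 0);           /* utf-16 data is a whole number of code units */
    units = len / 2;
    out = malloc((units + 1) * sizeof *out);
    if (out == NULL)
        return NULL;
    if (len > 0)
        memcpy(out, bytes, len);    /* byte-wise copy, so source alignment does not matter */
    out[units] = 0;                 /* the two-byte terminator */
    return out;
}

/* Count code units up to the 16-bit terminator, the way a
 * zero-terminated wide-string function would scan the buffer. */
size_t utf16_len(const uint16_t *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}
```

With the two-byte terminator in place, the buffer can be handed to a function that scans for the end of the string without being given a length.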
Jerome Vuarand wrote:
AFAIK there is little reason to use UTF-16 in Lua rather than UTF-8. With UTF-8 most string library functions can be used to some extent. There are nice functions in the win32 API to convert from UTF-8 to UTF-16 and back : MultiByteToWideChar and WideCharToMultiByte.
I'm using utf-8 on the Lua side and utf-16 on the C side, where it is required by the Windows API. The conversions between them are done by functions downloaded from www.unicode.org (which adds approx. 3 KiB to the size of the executable).
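For readers who want to see what such a conversion involves, below is a simplified sketch of a utf-8 to utf-16 converter. It is not the www.unicode.org code and not MultiByteToWideChar; the function name utf8_to_utf16 and its error convention (returning (size_t)-1) are assumptions made for this example. It takes an explicit length, rejects malformed input, and always writes a 16-bit terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical converter: decode `len` bytes of utf-8 from `src` into `dst`,
 * which has room for `cap` 16-bit code units including the terminator.
 * Returns the number of code units written (terminator excluded), or
 * (size_t)-1 on malformed input or insufficient space. */
size_t utf8_to_utf16(const char *src, size_t len, uint16_t *dst, size_t cap)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t i = 0, n = 0;

    assert(dst != NULL && cap > 0);
    while (i < len) {
        unsigned char b = s[i++];
        uint32_t cp, min;
        size_t extra;

        /* Classify the lead byte: payload bits, continuation count,
         * and the smallest code point this length may encode. */
        if (b < 0x80)                { cp = b;        extra = 0; min = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; min = 0x10000; }
        else return (size_t)-1;      /* stray continuation or invalid lead byte */

        if (len - i < extra)
            return (size_t)-1;       /* truncated sequence */
        for (; extra > 0; extra--) {
            unsigned char c = s[i++];
            if ((c & 0xC0) != 0x80)
                return (size_t)-1;   /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }
        /* Reject overlong forms, surrogates, and values beyond U+10FFFF. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t)-1;

        if (cp < 0x10000) {
            if (n + 1 >= cap)        /* keep one unit free for the terminator */
                return (size_t)-1;
            dst[n++] = (uint16_t)cp;
        } else {
            if (n + 2 >= cap)
                return (size_t)-1;
            cp -= 0x10000;           /* encode as a surrogate pair */
            dst[n++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    dst[n] = 0;                      /* 16-bit terminator */
    return n;
}
```

Since a utf-8 input of N bytes never produces more than N utf-16 code units, a destination of N + 1 units is always large enough.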
Joshua Jensen wrote:
You might try LuaPlus (http://luaplus.org/) which has real wide character string support. Be sure to grab the Subversion build from svn://svn.luaplus.org/LuaPlus/work51. If you would like to merge it into your Lua, look for #ifdef'ed blocks of LUA_WIDESTRING and LUA_WIDESTRING_FILE.
If I write for LuaPlus then the users of my software would have to use LuaPlus too (and this is not desirable). Still, there's an opportunity to study how LuaPlus solves these problems (and even borrow some parts from it, as its license permits that) :)
Many thanks to all who responded. My proposal was intended to ease the manipulation of wide character strings with the Lua API before passing them to Windows functions, in the same way that Lua's keeping one extra zero byte makes things easier with the C-string API.
-- Shmuel