2009/10/16 uri cohen <uri.cohen@gmail.com>:
> Following the feedback that using UTF8 is better than porting LUA to UTF16,
> is there a LUA port or library which allows the user to:
> - define UTF8 literals;
> - open files with UTF8 names;
> - read and write Windows unicode text using only LUA strings
> - bind with external C code which work in UTF16

I wrote such a patch; it is attached to this mail. It adds the
following to the Lua API, declared in the standard Lua headers but
implemented in lwstring.c:

#define LUA_HAS_WSTRING 1

LUA_API const wchar_t *(lua_towstring) (lua_State *L, int index);
LUA_API const wchar_t *(lua_tolwstring) (lua_State *L, int index,
                                         size_t *length);
LUA_API void (lua_pushwstring) (lua_State *L, const wchar_t *value);
LUA_API void (lua_pushlwstring) (lua_State *L, const wchar_t *value,
                                 size_t size);

LUALIB_API const wchar_t *(luaL_optwstring) (lua_State *L, int index,
                                             const wchar_t *def);
LUALIB_API const wchar_t *(luaL_checkwstring) (lua_State *L, int index);
LUALIB_API const wchar_t *(luaL_checklwstring) (lua_State *L, int index,
                                                size_t *length);
LUALIB_API int (luaL_loadwfile) (lua_State *L, const wchar_t *filename);

#define luaL_dowfile(L, fn)	\
	(luaL_loadwfile(L, fn) || lua_pcall(L, 0, LUA_MULTRET, 0))

A "wstring" is a userdata which contains a NUL-terminated string of
wchar_t. lua_towstring converts a string on the stack to a wide char
string. It replaces the string on the stack in place, just as
lua_tostring replaces a number with a string, so the same restriction
applies (don't use it on keys in a lua_next loop).

On Windows these functions use MultiByteToWideChar and
WideCharToMultiByte to convert between UTF-8 and UTF-16. On other
platforms they use mbstowcs and wcstombs, which convert between the
current locale's multibyte encoding and wide strings, usually UCS-4.
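
For reference, here is a minimal sketch of the mbstowcs/wcstombs path
described above. It is not taken from the patch: the helper names
mb_to_wide and wide_to_mb are made up for illustration, and it assumes
that passing a NULL destination returns the required length, which
POSIX and the Microsoft C runtime both do.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper (not from the patch): convert a NUL-terminated
 * multibyte string in the current C locale's encoding to a newly
 * allocated wide string. Returns NULL on an invalid sequence or on
 * allocation failure; the caller frees the result. */
static wchar_t *mb_to_wide(const char *src, size_t *out_len)
{
    /* With a NULL destination, mbstowcs only counts the wide
     * characters needed, excluding the terminator. */
    size_t n = mbstowcs(NULL, src, 0);
    if (n == (size_t)-1)
        return NULL;

    wchar_t *dst = malloc((n + 1) * sizeof *dst);
    if (dst == NULL)
        return NULL;

    mbstowcs(dst, src, n + 1);      /* n + 1 leaves room for the NUL */
    if (out_len != NULL)
        *out_len = n;
    return dst;
}

/* Hypothetical helper: the reverse conversion, wide to multibyte. */
static char *wide_to_mb(const wchar_t *src, size_t *out_len)
{
    size_t n = wcstombs(NULL, src, 0);   /* bytes needed, excluding NUL */
    if (n == (size_t)-1)
        return NULL;

    char *dst = malloc(n + 1);
    if (dst == NULL)
        return NULL;

    wcstombs(dst, src, n + 1);
    if (out_len != NULL)
        *out_len = n;
    return dst;
}
```

Note that both functions depend on the locale selected with setlocale,
so non-ASCII input only round-trips as UTF-8 if the program has
switched to a UTF-8 locale first.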

On Windows only, the patch modifies all calls to the C library that
take filenames, and converts these filenames using the wstring
conversion functions.
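
To illustrate the idea (this is a sketch, not the code from the patch),
a filename call such as fopen could be wrapped roughly like this. The
name fopen_utf8 and the fixed-size mode buffer are assumptions made for
the example; on non-Windows platforms it simply falls through to fopen.

```c
#include <stdio.h>
#ifdef _WIN32
#include <stdlib.h>
#include <windows.h>
#endif

/* Hypothetical helper (not from the patch): open a file whose name is
 * given as a UTF-8 string. On Windows the name and mode are converted
 * to UTF-16 and passed to _wfopen; elsewhere the bytes go to fopen
 * unchanged. Returns NULL on failure, like fopen. */
static FILE *fopen_utf8(const char *name, const char *mode)
{
#ifdef _WIN32
    wchar_t wmode[16];              /* mode strings are short */
    wchar_t *wname;
    FILE *f = NULL;

    /* First call: ask for the required size in wide characters,
     * including the terminator (input length -1 means NUL-terminated). */
    int len = MultiByteToWideChar(CP_UTF8, 0, name, -1, NULL, 0);
    if (len <= 0)
        return NULL;

    wname = malloc((size_t)len * sizeof *wname);
    if (wname == NULL)
        return NULL;

    /* Second call: do the actual conversion of name and mode. */
    if (MultiByteToWideChar(CP_UTF8, 0, name, -1, wname, len) > 0 &&
        MultiByteToWideChar(CP_UTF8, 0, mode, -1, wmode, 16) > 0)
        f = _wfopen(wname, wmode);

    free(wname);
    return f;
#else
    return fopen(name, mode);
#endif
}
```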

It's only been tested on Windows so far (both the
MultiByteToWideChar/WideCharToMultiByte and mbstowcs/wcstombs versions
compile there). It's not perfect (I think it may leak if a
lua_pushstring call raises an error), and I didn't patch the Makefile
since I'm not using it. Any feedback is welcome.

Attachment: lua-wstring.patch
Description: Binary data