- Subject: Re: Managing Unicode (UTF-8 and UTF-16) data in Lua
- From: "Christian N." <cn00@...>
- Date: Fri, 5 Aug 2016 22:08:14 +0200
On 05.08.2016 at 21:54 Paul K wrote:
Hmm, looks like it does more or less what I'm doing at the moment
(converting wide strings to Lua strings), but it doesn't use UTF-8,
but rather uses the ANSI codepage (so it's just as vulnerable to
mojibake as Lua's os.getenv, for example). But thanks for the pointer
It does support UTF8, you just need to set it explicitly:
It will then be passed to MultiByteToWideChar, WideCharToMultiByte,
and other calls.
Have a look at http://utf8everywhere.org/, especially section 10 "How to
do text on Windows". That might answer your question and IMHO the whole
document is very interesting for anyone who works with encodings.
But off the top of my head, using the wide-string APIs and converting
from UTF-8 to UTF-16 is the right thing to do. Unfortunately, the os and
io parts of Lua's standard library will be largely unusable for you,
since Windows does not support setting UTF-8 as the ANSI codepage and
neither does Microsoft's C runtime (setlocale()). You will basically
have to use a self-patched version of Lua that replaces calls such as
fopen with their MS-specific UTF-16 equivalents such as _wfopen.
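
To illustrate the kind of replacement I mean, here is a rough sketch of a
wrapper that takes a UTF-8 path and hands a UTF-16 path to _wfopen. The
names fopen_utf8 and utf8_to_utf16 are made up for this example and are
not part of Lua or of any library; MultiByteToWideChar and _wfopen are the
real Windows calls. On other platforms it simply falls back to fopen. I
have not run the Windows branch against a patched Lua, so treat it as a
starting point only.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef _WIN32
#include <wchar.h>
#include <windows.h>

/* Hypothetical helper: convert a NUL-terminated UTF-8 string to a newly
   allocated UTF-16 string. Returns NULL on invalid input or allocation
   failure. The caller frees the result. */
static wchar_t *utf8_to_utf16(const char *s)
{
    /* First call: ask for the required length in wchar_t units,
       including the terminating NUL (because the input length is -1). */
    int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, s, -1, NULL, 0);
    if (n <= 0)
        return NULL;

    wchar_t *w = malloc((size_t)n * sizeof *w);
    if (w == NULL)
        return NULL;

    /* Second call: perform the actual conversion into the buffer. */
    if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, s, -1, w, n) <= 0) {
        free(w);
        return NULL;
    }
    return w;
}
#endif

/* Hypothetical stand-in for fopen that accepts a UTF-8 path everywhere. */
FILE *fopen_utf8(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);
#ifdef _WIN32
    wchar_t *wpath = utf8_to_utf16(path);
    wchar_t *wmode = utf8_to_utf16(mode);
    FILE *f = NULL;

    if (wpath != NULL && wmode != NULL)
        f = _wfopen(wpath, wmode);

    free(wpath);   /* free(NULL) is a no-op */
    free(wmode);
    return f;
#else
    /* Elsewhere the C runtime passes the bytes through unchanged. */
    return fopen(path, mode);
#endif
}
```

In a patched liolib.c you would then call something like fopen_utf8 where
the stock code calls fopen. The same pattern would apply to remove, rename
and friends in loslib.c (_wremove, _wrename).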