lua-users home
lua-l archive

On Sat, Aug 6, 2016 at 9:54 AM, Paul Moore <p.f.moore@gmail.com> wrote:

> Why do that when the standard Lua string type is UTF-8 safe? Better
> surely to use UTF-8 via Lua strings, and only use UTF-16 for
> interfacing to the Windows APIs?

Lua strings are problematic if you are working with C libraries that
expect UTF-8 input: the string library counts bytes, not characters,
so the offsets it returns no longer correspond to character positions
once a string contains multi-byte UTF-8 characters.
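
To make the offset problem concrete, here is a minimal sketch that uses
only the stock Lua 5.3 utf8 library (not luautf8). The sample string and
the helper name byte_to_char_index are made up for illustration.

```lua
-- Requires Lua 5.3 or later (built-in utf8 library, "\u{...}" escapes).
local s = "caf\u{E9} au lait"   -- "café au lait"; é takes 2 bytes in UTF-8

-- string.find reports BYTE positions, not character positions.
local byte_start = string.find(s, "au", 1, true)

-- #s counts bytes; utf8.len counts characters (code points).
local nbytes = #s
local nchars = utf8.len(s)

-- Hypothetical helper: convert a byte position into a character index
-- by counting the characters that start within s[1 .. byte_pos].
local function byte_to_char_index(str, byte_pos)
  return utf8.len(str, 1, byte_pos)
end

local char_start = byte_to_char_index(s, byte_start)

-- utf8.offset goes the other way: character index -> byte position.
local back_to_byte = utf8.offset(s, char_start)

print(nbytes, nchars, byte_start, char_start, back_to_byte)
```

A C library that counts in characters would want char_start here, not
the byte_start that string.find returns.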

We worked around this issue in NoteCase Pro (GTK2 libraries) by
embedding Xavier Wang's luautf8 library [1], which provides, inter
alia, UTF-8-aware equivalents of the Lua string library functions
that work with character offsets. We did change its namespace to
avoid a naming collision between one of its functions and one of the
new Lua 5.3 Unicode functions. We have encountered no problems on any
supported OS (including Windows) in well over a year.

Best regards,

Paul

[1] https://github.com/starwing/luautf8 (also available via LuaRocks).

-- 
[Notice not included in the above original message:  The U.S. National
Security Agency neither confirms nor denies that it intercepted this
message.]