- Subject: Re: Adding utf-8 handling support to Lua 5.1 in 2017
- From: Vadim Peretokin <vperetokin@...>
- Date: Fri, 14 Apr 2017 16:53:30 +0000
Thanks for the vote of confidence, appreciate it!
On Thu, Apr 13, 2017 at 9:17 PM, Vadim Peretokin <firstname.lastname@example.org> wrote:
> I've had a closer look at utf8 in 5.3 and unfortunately it does not enable
> all of string.* to work with utf-8 which is what I need, so that is a no-go.
> I think the alternatives are starwing/luautf8, Stepets/utf8.lua and
> Mediawiki's ustring.
> Has anyone had experience with any of those or other libraries I've missed
> to provide equivalent utf-8 support of the string.* library?
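To illustrate the limitation described in the quoted question, here is a minimal sketch in plain Lua (no libraries, so it should run on 5.1 through 5.3). The helper name utf8_len is hypothetical and only counts code points; it does no validation. The test string is "héllo" written with decimal escapes so the bytes are unambiguous.

```lua
-- "h", then U+00E9 (two bytes: 195 169), then "llo".
local s = "h\195\169llo"

-- The stock string library works on bytes, so the length is 6, not 5.
print(#s)

-- string.sub can cut a multi-byte character in half:
-- this returns "h" plus only the first byte of the accented letter.
local broken = string.sub(s, 1, 2)

-- Hypothetical helper: count code points by skipping
-- UTF-8 continuation bytes (0x80 to 0xBF).
local function utf8_len(str)
  local n = 0
  for i = 1, #str do
    local b = string.byte(str, i)
    if b < 0x80 or b >= 0xC0 then
      n = n + 1
    end
  end
  return n
end

print(utf8_len(s))
```

A full replacement for string.* needs much more than this (pattern matching, case mapping, validation), which is what the libraries listed above aim to provide.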
We've used starwing/luautf8 with Lua v. 5.2 and 5.3 embedded in
NoteCase Pro [1] without reported issues. We did have to give it
its own namespace (we use "uf8ex") to avoid a function name clash with
Lua 5.3's utf8.len. Fortunately, at the time we implemented starwing's
code, we knew from this list that v. 5.3 would have that naming
conflict so we were able to avoid the conflict before it developed.
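As a rough script-level sketch of the separate-namespace idea: the snippet below assumes luautf8 is installed and loadable under the module name "lua-utf8" (as it is when installed from LuaRocks). How NoteCase Pro registers the library internally is not shown in this message, so the local binding here is only an illustration that reuses the "uf8ex" name mentioned above.

```lua
-- Assumption: the luautf8 module is available as "lua-utf8".
-- pcall keeps the script from aborting when it is not installed.
local ok, uf8ex = pcall(require, "lua-utf8")

if ok then
  -- The library sits under its own name, so the built-in global
  -- `utf8` table of Lua 5.3 (if present) is left untouched.
  print(uf8ex.len("h\195\169llo"))   -- length in characters
  print(uf8ex.upper("h\195\169llo")) -- Unicode-aware upper-casing
else
  print("lua-utf8 not available: " .. tostring(uf8ex))
end
```

With this arrangement a script running on 5.3 can still call the standard utf8.len and the library's uf8ex.len side by side.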
Caveat: Those of us who do a lot of scripting in NoteCase Pro [3], to
my knowledge, only use the luautf8 equivalents to the Lua string
library functions that consume or return offsets. We have scant
experience with luautf8's other functions.
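For readers wondering why the offset-handling functions matter most: the stock string.find reports byte positions, which drift from character positions as soon as a multi-byte character appears. A minimal plain-Lua sketch follows; the helper byte_to_char_offset is hypothetical and assumes valid UTF-8 input.

```lua
-- "h", then U+00E9 (two bytes: 195 169), then "llo".
local s = "h\195\169llo"

-- Plain find (4th argument true) returns a byte offset.
-- The first "l" is the 3rd character but starts at byte 4.
local byte_pos = string.find(s, "l", 1, true)

-- Hypothetical helper: convert a byte offset into a character offset
-- by counting the non-continuation bytes up to and including it.
local function byte_to_char_offset(str, pos)
  local n = 0
  for i = 1, pos do
    local b = string.byte(str, i)
    if b < 0x80 or b >= 0xC0 then
      n = n + 1
    end
  end
  return n
end

print(byte_pos, byte_to_char_offset(s, byte_pos))
```

A UTF-8-aware find or sub does this bookkeeping internally, so scripts can work in character offsets throughout.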
Hope this helps.
1. <http://notecasepro.com/> (the program embeds Lua on a wide variety
of operating systems; see <http://notecasepro.com/download.php>).
2. I did notice that utf8.title apparently returns the same result as
utf8.upper. From my rudimentary understanding of Unicode, this is
correct. But I think it represents a poor choice of nomenclature in
the Unicode world. "Title case" in the English language does not mean
that all alphabetical characters are upper-cased.
3. I've written upward of 600 scripts for NoteCase Pro, most of which
use one or more of the starwing/luautf8 string functions.
[Notice not included in the above original message: The U.S. National
Security Agency neither confirms nor denies that it intercepted this