On 2021-01-29 15:06, Egor Skriptunoff wrote:
Lua documentation is clear about that: the length operator on
strings returns the size in bytes of the string (not in characters!)

Lua manual also claims: "The string library assumes one-byte character
encodings."
So, by redefining string.sub() you have broken compatibility with a
lot of Lua code.
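
To make the byte-oriented behaviour concrete, here is a small sketch in standard Lua 5.3 or later (not LuaRT). The sample string is my own choice; it is written with a \u{...} escape so the result does not depend on how the file is saved.

```lua
-- Standard Lua 5.3+ semantics: strings are byte sequences.
local s = "h\u{E9}llo"        -- "héllo"; U+00E9 takes two bytes in UTF-8

print(#s)                     --> 6  (bytes)
print(utf8.len(s))            --> 5  (code points)

-- string.sub indexes bytes, so it can cut a multi-byte character in half:
print(s:sub(1, 2) == "h\xC3") --> true (only the first byte of 'é' is kept)
```

Code that relies on these byte positions is what would behave differently if string.sub() counted characters instead.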

Nevertheless, a consistent change of both the string library and the
string operators from bytes to Unicode code points might be a good
idea. At least, such a Lua dialect would be interesting to try.

It's a substantial change, I agree, but it makes non-ASCII characters easier to use. LuaRT strings are containers for characters, whereas Lua strings are containers for bytes (compatible only with one-byte character encodings).
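
For comparison, a character-indexed substring can be approximated in standard Lua 5.3+ with the utf8 library. This is only a sketch: usub is a hypothetical helper name, it assumes valid UTF-8 input and positive indices, and it is not how LuaRT implements its strings.

```lua
-- Hypothetical helper: substring by code point index rather than byte index.
-- Assumes s is valid UTF-8 and that i and j are positive (no negative indexing).
local function usub(s, i, j)
  local n = utf8.len(s)
  if not n then error("invalid UTF-8 string") end
  j = j or n
  if i < 1 then i = 1 end
  if j > n then j = n end
  if i > j then return "" end
  local first = utf8.offset(s, i)         -- byte where code point i starts
  local last = utf8.offset(s, j + 1) - 1  -- byte just before code point j + 1
  return s:sub(first, last)
end

local s = "h\u{E9}llo"                    -- "héllo"
print(usub(s, 2, 3) == "\u{E9}l")         --> true (whole characters are kept)
print(#usub(s, 2, 3))                     --> 3    (bytes: 2 for 'é', 1 for 'l')
```

The helper converts character positions to byte positions first and then delegates to the ordinary byte-based string.sub.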

LuaRT provides Buffer objects for manipulating raw bytes without any encoding.

Sam