On 2021-01-29 15:06, Egor Skriptunoff wrote:
Lua documentation is clear about that: the length operator on strings returns the size in bytes of the string (not in characters!)

The Lua manual also claims: "The string library assumes one-byte character encodings." So, by redefining string.sub() you have broken compatibility with a lot of Lua code. Nevertheless, a consistent change of both the string library and the string operators from bytes to Unicode code points might be a good idea. At least, such a Lua dialect would be interesting to try.
It's a significant change, I agree, but it eases the use of non-ASCII characters. LuaRT strings are containers for characters, whereas Lua strings are containers for bytes (compatible only with single-byte character encodings).
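To make the byte/character distinction concrete, here is a minimal sketch in standard Lua 5.3 or later (not LuaRT, whose behavior is not shown here). It assumes the source string is UTF-8 encoded and uses only the stock `#` operator, `string.sub`, and the `utf8` library; the variable names are only illustrative.

```lua
-- Standard Lua 5.3+: strings are sequences of bytes, not characters.
-- "\u{E9}" is the letter e-acute, which takes 2 bytes in UTF-8.
local s = "h\u{E9}llo"

-- The length operator counts bytes.
local byte_len = #s

-- utf8.len counts code points (returns nil if s is not valid UTF-8).
local char_len = utf8.len(s)

print(byte_len, char_len)

-- string.sub indexes bytes, so s:sub(1, 2) would cut the 2-byte
-- character in half. utf8.offset maps a character position to the
-- byte position where that character starts.
local stop = utf8.offset(s, 3) - 1   -- last byte of the 2nd character
local first_two = s:sub(1, stop)

print(first_two)
```

In stock Lua the conversion between character positions and byte positions has to be done by hand like this, which is the step a character-based string type would hide.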
LuaRT brings Buffer objects to manipulate bytes without any encoding.

Sam