The Lua documentation is clear about this: the length operator on strings returns the size of the string in bytes (not in characters!).

With a multibyte string encoding, the size reported by # (in bytes) may differ from the string length (in characters).
UTF-8 encoded strings in standard Lua should not be used with string functions that take positions (such as string:sub), because a position in bytes may not correspond to a character position.
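
To make the byte/character mismatch concrete, here is a minimal sketch. It assumes standard Lua 5.3 or later (for the utf8 library and the \u{...} escape) and says nothing about LuaRT's own behaviour. The helper name usub is hypothetical and is not part of any library.

```lua
-- Assumes standard Lua 5.3+ (utf8 library); not LuaRT-specific.
local s = "h\u{E9}llo"            -- "héllo": the é takes 2 bytes in UTF-8

local byte_len = #s               -- size in bytes
local char_len = utf8.len(s)      -- number of code points

-- Byte-based sub cuts the 2-byte é in half: the result is one stray byte.
local broken = s:sub(2, 2)

-- Hypothetical helper: substring by code point positions i..j (1-based, inclusive).
local function usub(str, i, j)
  local first = utf8.offset(str, i)       -- byte where the i-th code point starts
  local after = utf8.offset(str, j + 1)   -- byte just past the j-th code point
  if not first or not after then
    return nil                            -- position out of range
  end
  return str:sub(first, after - 1)
end

local e_acute = usub(s, 2, 2)     -- the whole 2-byte sequence for é
```

Here byte_len is 6 while char_len is 5, and s:sub(2, 2) yields a lone lead byte that is not valid UTF-8 on its own.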

I agree with you that strings should ideally use character position and not byte position.

On 29 January 2021 at 08:54, "Egor Skriptunoff" <egor.skriptunoff@gmail.com> wrote:
So you should write in LuaRT :
for i = 1, str:len() do
  local c = str:sub(i, i)
  ....
end
IMO, both #str and str:sub() should use the same unit of measurement:
either byte or unicode codepoint.
Otherwise I don't understand what "compatibility with Lua" you're talking about.
Most old Lua scripts working with strings will be broken.
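
For comparison with the loop quoted above, here is a rough sketch of how the same iteration behaves in standard Lua 5.3 or later, where both # and string.sub count bytes. It assumes the standard utf8 library and does not show LuaRT's semantics. The variable names are illustrative only.

```lua
-- Assumes standard Lua 5.3+; LuaRT's own string semantics are not shown here.
local str = "na\u{EF}ve"          -- "naïve": 5 code points, 6 bytes

-- Byte-wise loop: #str and str:sub agree with each other, both in bytes.
local bytes = {}
for i = 1, #str do
  bytes[#bytes + 1] = str:sub(i, i)
end

-- Code-point-wise loop: utf8.codes yields (byte position, code point) pairs.
local chars = {}
for _, cp in utf8.codes(str) do
  chars[#chars + 1] = utf8.char(cp)
end
```

The first loop visits 6 one-byte strings (two of them are fragments of ï), while the second visits 5 complete characters.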