On 2021-01-21 12:40, Egor Skriptunoff wrote:
What does #str (the string length operator) return?

#str means the size of the string in bytes.
Use string.length to get the length in characters.

What does "character" mean in LuaRT?
Does string.length() return the number of Unicode codepoints or the
number of UTF-16 words?
I'm asking because Windows internally works with UTF-16-encoded
strings.

What about positions (pos1...pos4) in

string.sub(str, pos1, pos1)
pos2 = string.find(str, "()%w", pos3)
string.byte(str, pos4)
Are they byte positions or character positions?

A character in LuaRT is a UTF-8 symbol, for example: "A", "B", "C", "Ö", "é", "@" and so on.
A character position is the symbol position in the string.

It's the same as for standard Lua characters, where the byte position is the same as the character position. In LuaRT, the character position is not always equal to the byte position (a UTF-8 character can take from 1 to 4 bytes).
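
To illustrate the difference between byte positions and character positions, here is a minimal sketch in standard Lua 5.3 or later (not LuaRT), using the standard utf8 library. The sample string and variable names are only assumptions for illustration; it does not show how LuaRT works internally.

```lua
-- Standard Lua 5.3+ (not LuaRT): byte length vs. character count.
-- "\u{D6}" is "Ö" and "\u{E9}" is "é"; each takes 2 bytes in UTF-8.
local s = "A\u{D6}\u{E9}@"

local byte_len = #s            -- length in bytes
local char_len = utf8.len(s)   -- length in UTF-8 characters

-- Collect the byte position at which each character starts.
local starts = {}
for pos in utf8.codes(s) do
  starts[#starts + 1] = pos
end

-- A UTF-8 character takes between 1 and 4 bytes.
local one_byte   = #utf8.char(0x41)     -- "A"
local four_bytes = #utf8.char(0x1F600)  -- an emoji outside the BMP

print(byte_len, char_len)
print(table.concat(starts, " "))
print(one_byte, four_bytes)
```

Here the third character ("é") starts at byte 4, so a character position and a byte position name different places as soon as the string contains a multi-byte character.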

To keep it simple, LuaRT uses character positions for all string functions, as standard Lua does, to preserve compatibility.
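
As a rough idea of what character-position semantics mean for a function like string.sub, here is a sketch written in standard Lua 5.3 or later. The helper name usub is hypothetical and this is not LuaRT's implementation; it assumes valid UTF-8 input and 1 <= i <= j.

```lua
-- Hypothetical helper (not LuaRT's code): substring by character
-- positions i..j (1-based, inclusive), built on the standard utf8 library.
local function usub(s, i, j)
  local first = utf8.offset(s, i)      -- byte where character i starts
  if not first then return "" end      -- i is past the end of the string
  local past = utf8.offset(s, j + 1)   -- byte where character j+1 starts
  if not past then past = #s + 1 end   -- j is past the end: take the rest
  return string.sub(s, first, past - 1)
end

local s = "A\u{D6}\u{E9}@"             -- "AÖé@"

local by_chars = usub(s, 2, 3)         -- characters 2..3
local by_bytes = string.sub(s, 2, 3)   -- bytes 2..3 in standard Lua

print(by_chars, by_bytes)
```

With this string, the character-based call returns two characters ("Öé"), while the byte-based string.sub of standard Lua returns only the two bytes that make up "Ö".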

Regards,

Samir