lua-users home
lua-l archive


On 2021-01-19 19:31, samir.tine@luart.org wrote:
1) Yes, the %w pattern will always match what you call "national alphabet
characters". That maintains compatibility with standard Lua.

2) I have tested string.find("Ё", "[А-Я]") in LuaRT: it returns 1, 2.
Does that mean the Ё has been found correctly?

Regards,

Samir

On January 19, 2021 at 07:28, "Egor Skriptunoff" <egor.skriptunoff@gmail.com>
wrote:

On Mon, Jan 18, 2021 at 10:17 PM wrote:

I see you have removed os.setlocale() from the standard library.
How to switch between C locale and national locale?
LuaRT does not use the concept of C locale.

Does it mean that the "%w" pattern will always match national
alphabet characters?
In some projects I prefer "%w" to match 7-bit ASCII only.
Do I have to use "[A-Za-z%d]" instead in LuaRT?

Char intervals are treated the same. An accented character (for example
"à") is not in the interval [A-Z] (which includes letters, but not
accented letters).
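
As a minimal sketch of the ASCII-only alternative mentioned above: the helper name is_ascii_alnum is hypothetical, and the digits are spelled out as 0-9 rather than %d so that the class does not depend on how a given implementation defines %d. The results noted below are for standard Lua 5.3+; the thread suggests LuaRT treats the [A-Z] interval the same way, but that is not verified here.

```lua
-- Hypothetical helper: true only if s is a non-empty run of 7-bit ASCII
-- letters and digits. Uses explicit intervals instead of %w.
local function is_ascii_alnum(s)
  return s:find("^[A-Za-z0-9]+$") ~= nil
end

print(is_ascii_alnum("abc123"))       -- plain ASCII letters and digits
print(is_ascii_alnum("caf\u{00E9}"))  -- "café": é lies outside [A-Za-z0-9]
print(is_ascii_alnum(""))             -- empty string: "+" needs one char
```

In standard Lua this prints true, false, false.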

Is char interval interpreted from the point of view of the alphabet
or from the point of view of codepoint numeric values?
For example, "А" is the first letter of the Russian alphabet, "Я"
is the last letter.
The letter "Ё" is somewhere between them in the alphabet,
but codepoint("Ё") is outside the numeric range
codepoint("А")...codepoint("Я").
So, what would be the result of string.find("Ё", "[А-Я]") in
LuaRT?
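
To illustrate the codepoint ordering in the question above, here is a short sketch using the utf8 library of standard Lua 5.3+ (whether LuaRT exposes the same library is an assumption). The letters are written as \u{...} escapes so the result does not depend on the source file's encoding.

```lua
-- Codepoints of the three Cyrillic letters discussed above.
local yo = utf8.codepoint("\u{0401}")  -- Ё
local a  = utf8.codepoint("\u{0410}")  -- А
local ya = utf8.codepoint("\u{042F}")  -- Я

print(yo, a, ya)

-- A purely numeric interval test: is Ё between А and Я by codepoint?
local in_range = (a <= yo and yo <= ya)
print(in_range)
```

This prints 1025, 1040, 1071 and then false: by codepoint value, Ё sits below А, so an interval compared numerically would not contain it.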

In fact, it returns 1, 1 (matching the right character).
With your help, a bug that miscalculated the end of the match was found :)
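
For reference, the difference between byte and character positions can be sketched in standard Lua 5.3+, where string functions index by byte. Whether this is related to the LuaRT bug mentioned above is not stated in the thread; the sketch only shows why a one-character match on "Ё" can span 1, 2 in bytes but 1, 1 in characters.

```lua
local yo = "\u{0401}"  -- Ё, encoded as two bytes in UTF-8

print(#yo)             -- length in bytes
print(utf8.len(yo))    -- length in characters

-- Plain (non-pattern) find in standard Lua reports byte positions.
local s, e = string.find(yo, yo, 1, true)
print(s, e)
```

In standard Lua this prints 2, then 1, then 1 and 2.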

Thank you!