- Subject: Re: How to extract a floating point number locale-independantly
- From: Daurnimator <quae@...>
- Date: Tue, 26 Apr 2016 23:46:46 +1000
On 26 April 2016 at 22:51, Roberto Ierusalimschy <roberto@inf.puc-rio.br> wrote:
>> This was surprising to me and has a number of consequences:
>> - Using a separator such as "," is impossible in the first place as:
>> [...]
>
> Using a separator such as "," is impossible in the first place because
> print(1,0) already has a meaning in Lua. It would be a big mess if
> the lexer respected locales, and the mess has nothing to do with the
> implementation.
Yes, for the specific case of ",". Before I read the code, I was
thinking other locales might use a separator that didn't conflict.
>> - If running in a locale where '.' is not the decimal separator,
>> parsing Lua could result in *many* calls to `localeconv`.
>
> I am not sure what you count as "many", but the code seems to call
> `localeconv` once per chunk (unless you change locales during
> parsing, in which case it is plus one per change).
Doesn't it get called once per floating point number encountered while lexing?
>> Fixing lua_str2number to be locale independent will:
>> - allow the `trydecpoint` hack to be removed from llex.c
>> - fix tonumber() to *not* be locale dependent.
>
> One is false; as you said, we would have to fall back to it in C89.
I meant it would be moved out of llex.c and into luaconf.h.
> Two is dubious; this "fix" may affect people that count on this
> behavior.
I do agree others might be relying on this. This may mean it can't be
fixed until 5.4.
> Couldn't your library change the locale when reading numbers and
> then convert it back to what it was when done?
No. Libraries should never modify the locale (it's global to the
process and inherently not thread-safe) unless they exist specifically
for the task of locale manipulation.
Besides, the code would be absurd:
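    local function safe_tonumber(s, b)
      -- os.setlocale() with no arguments only queries the current locale
      local old_locale = os.setlocale()
      if old_locale ~= "C" then
        -- temporarily switch to "C" so that '.' is the decimal point
        os.setlocale("C")
        local res = tonumber(s, b)
        os.setlocale(old_locale)
        return res
      else
        return tonumber(s, b)
      end
    end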
--------------------------------
One of the big issues is that there is no way to replicate the way Lua
itself parses numbers,
and that is what tonumber is more or less documented to do.
Again, I'll quote from the manual:
From the tonumber() docs:
> The conversion of strings can result in integers or floats, according to the lexical conventions of Lua (see §3.1).
Also from the lua_stringtonumber() docs:
> The conversion can result in an integer or a float, according to the lexical conventions of Lua (see §3.1).
From §3.1:
> A numeric constant (or numeral) can be written with an optional fractional part and an optional decimal exponent, marked by a letter 'e' or 'E'.
> Lua also accepts hexadecimal constants, which start with 0x or 0X.
> Hexadecimal constants also accept an optional fractional part plus an optional binary exponent, marked by a letter 'p' or 'P'.
> A numeric constant with a fractional dot or an exponent denotes a float; otherwise it denotes an integer.
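To make those conventions concrete, here is a small sketch of what
tonumber returns for a few numerals. It assumes Lua 5.3 or later (for
math.type) and the default "C" locale; `classify` is a hypothetical
helper used only for illustration.

```lua
-- Hypothetical helper: returns the converted value and its subtype.
-- Assumes Lua 5.3+ (math.type) and the default "C" locale.
local function classify(s)
  local n = tonumber(s)
  return n, math.type(n)
end

-- No fractional dot and no exponent: integer. Otherwise: float.
for _, s in ipairs({ "10", "10.0", "1e2", "0x10", "0x1p4" }) do
  print(s, classify(s))
end
```

In a locale whose decimal point is not '.', the "10.0" case is the one
at risk, which is the whole problem being discussed here.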
I guess you could do something like:
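    local function safe_tonumber(s, b)
      if b ~= nil then
        -- an explicit base only ever yields integers, so no decimal point is involved
        return tonumber(s, b)
      else
        -- let the Lua lexer itself parse the numeral
        local func = load("return " .. s)
        if not func then return nil end
        return func()
      end
    end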
That is slightly less bad than the earlier example that uses setlocale.
But still... I consider this an issue with Lua.
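Another possible workaround, offered as a rough sketch rather than a
fix: ask the current locale which decimal point it uses and translate
'.' into it before calling tonumber, so the locale itself is never
modified. This assumes that string.format follows the C locale (it goes
through the C printf family) and that tonumber parses floats with the
same decimal point. `locale_decimal_point` and `tonumber_dot` are
hypothetical names, and only the base-less form of tonumber is handled.

```lua
-- Hypothetical helper: discover the decimal point of the current locale
-- by formatting a known value ("0.5" in the "C" locale, "0,5" in many others).
local function locale_decimal_point()
  return string.format("%.1f", 0.5):match("^0(.+)5$") or "."
end

-- Hypothetical helper: parse a numeral that always uses '.' as its decimal point.
local function tonumber_dot(s)
  if type(s) ~= "string" then return tonumber(s) end
  local dp = locale_decimal_point()
  if dp ~= "." then
    -- reject input that already uses the locale's own separator
    if s:find(dp, 1, true) then return nil end
    -- escape '%' in the replacement, then swap the first '.' for the locale's point
    s = s:gsub("%.", (dp:gsub("%%", "%%%%")), 1)
  end
  return tonumber(s)
end

print(tonumber_dot("1.5"), tonumber_dot("0x10"), tonumber_dot("abc"))
```

It still relies on the locale not changing between the two calls, so it
is no more thread-safe than anything else that reads the locale, and it
is a workaround in user code, not a substitute for fixing this in Lua.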