lua-users home
lua-l archive



> This does not contradict at all what I said.

You suggested that Egor's code golf answer might be using a special
version of Lua that only allows integers as table keys; I said that it
was plain vanilla Lua 5.3 and 5.4 and that some floats are in fact
allowed as table keys. Egor did after all specify "integers in float
format" in the original post, so his method is not meant to apply to
non-integral floats. I am unfortunately not able to fully understand
or respond to the rest of your post, and not really interested in
discussing perceived deficiencies in PUC Lua or its specification.
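
For reference, a minimal sketch of the behavior in question, in stock Lua 5.3/5.4 (no special build assumed; `math.type` is available since 5.3 and reports the subtype):

```lua
-- Plain Lua 5.3/5.4: a float key with an integral value is
-- converted to the corresponding integer when it is inserted
-- into a table.
local t = {}
t[2.0] = true
print(math.type(next(t)))          --> integer

-- Egor's trick relies on this normalization: build a one-entry
-- table keyed by x, then read the key back with next().
local x = 3.0
x = next{[x] = 0}
print(x, math.type(x))             --> 3    integer

-- A non-integral float key is left untouched, so the trick only
-- applies to "integers in float format", as specified.
print(math.type(next{[0.5] = 0}))  --> float
```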

— Gabriel



On Wed, May 22, 2019 at 2:21 PM Philippe Verdy <verdy_p@wanadoo.fr> wrote:
>
> This does not contradict what I said at all. But the initial comment was not clear enough (it spoke of just converting a number to an integer, without specifying how the numeric value was converted).
>
> Everything said in Lua about the distinction between floats and integers is absolutely unnecessary: it is just an internal implementation detail. The internal datatype, or "subtype", used does not matter; Lua, like JavaScript, should only handle "numbers", which is the only datatype at the interface level.
>
> And the same should also be true of the Lua C API: it should be able to use Lua "numbers" in all cases, and should clearly expose special API entries, with their limitations, for the optimized special case. The same is true of JNI in Java, which needs to expose an unrestricted interface and a restricted interface for the "optimized" case. This also applies to the representation of strings as unrestricted arrays of unsigned 16-bit values, not restricted to valid UTF-16, or as 8-bit strings using a "modified UTF-8" encoding (needed to represent arbitrary 16-bit code units in JNI and in the internal representation of classes compiled to bytecode, because arbitrary 16-bit code units are forbidden in standard UTF-16 and therefore not representable in standard UTF-8). The "modified UTF-8" of Java/JNI also uses 6 bytes instead of 4 for code points in the supplementary planes, because it does not require surrogates to be correctly paired as standard UTF-16 does: JNI actually represents the 16-bit code units, not really the code points.
>
> What has long been discussed for Java also applies to other languages (including C/C++, C#, JavaScript, Lua...) whose datatypes are less restricted than the "standard" subtypes they can be used for and are specially optimized for: optimizations should still take into account the "special" values that do not follow the restrictions of the "standard" subtypes, and it is important to identify those cases. Note that for floating point, Lua is supposed to represent NaN with a single value, but IEEE floats/doubles actually have multiple NaN values, some signaling and some quiet, with additional bits in the mantissa. Other details include the "denormal" values (which have the lowest negative exponent value but limited precision) and the special case of "signed zeroes", for which Lua is not clear enough about whether they should be unified into a single unsigned zero value of the Lua "number" type. Lua is also not clear about the exposed precision of numbers and about how Lua numbers are converted from/to native floats/doubles, which have multiple possible representations, not just those of the IEEE standard or those of the float/double/long double datatypes in C/C++. That is a separate complex issue in itself, since it also requires a conversion mechanism between those types and the hardware-native types used by an FPU, a GPU, some other external device, or the "vector" computing units integrated in the CPU or in a separate device, which may be in the same host or accessed virtually via some networking interface.
>
> Lua is also not clear about the representation of its "nil" value, even though it says nil is a separate datatype: is it really unique, with a single instance, or does it depend on the representation of NULL pointers in C/C++? I am not even convinced that "nil" in Lua is represented by a NULL pointer in C/C++; it may be a valid non-NULL pointer to an object, but then NULL pointers used in the internal C/C++ implementation need to be converted to a valid "nil" object reference in Lua.
>
> The basic datatypes in Lua are clearly under-specified (the same is also true of C/C++!): we do not know their limits precisely, and Lua (unlike Java) exposes no clear way for a program to discover those limits and adopt a predictable behavior. This makes it difficult to assert that Lua programs are really "portable" across platforms/implementations, and that Lua implementations are really "conforming". This is a very fuzzy area that shows Lua is still a language in development. Note that some Java programmers did not like such restrictions (and this caused conflicts between them), and the same applies to JavaScript/ECMAScript: various programmers want to sacrifice portability for a performance gain. This is what happened to C/C++, and it is what made them the most unsafe programming languages, with critical security bugs that now impose a huge and growing cost on end users: a nightmare for the world and for our public and personal freedom and security.
>
> Micro-optimizations do not pay at all: there is nothing to win (except for a limited short time) but everything to lose. We must always be strict about datatype specifications, even if this causes some minor performance penalty; that small penalty is rapidly compensated, without doing anything else, in just a few months or even less, by the rapid evolution of technologies and the rapid reduction of their cost.
>
>
> Le mer. 22 mai 2019 à 00:46, Gabriel Bertilson <arboreous.philologist@gmail.com> a écrit :
>>
>> > You are probably using an implementation of Lua which adds the restriction to number keys for forcing them to be native integers
>>
>> The Lua 5.3 and 5.4 behavior is that some floats are converted to
>> integers when used as table keys, as noted in the manual:
>>
>> > The indexing of tables follows the definition of raw equality in the language. The expressions a[i] and a[j] denote the same table element if and only if i and j are raw equal (that is, equal without metamethods). In particular, floats with integral values are equal to their respective integers (e.g., 1.0 == 1). To avoid ambiguities, any float with integral value used as a key is converted to its respective integer. For instance, if you write a[2.0] = true, the actual key inserted into the table will be the integer 2.
>>
>> — Gabriel
>>
>> On Tue, May 21, 2019 at 4:33 PM Philippe Verdy <verdy_p@wanadoo.fr> wrote:
>> >
>> > Maybe you save 1 byte in source code, but at runtime it's horrible: you create a new table object just to use the double value as a key (which is then converted to an integer using an unspecified rounding mode...), initialized to a dummy value 0. Then you use the next iterator to extract that key, and the table is garbage collected. This needlessly stresses the memory allocator and the GC, and it will be very slow compared to math.floor() or math.ceil().
>> > I'm also not really sure that a table index is necessarily an integer and cannot be a double value.
>> > The only restriction is that it must not be a NaN value, because NaN is not comparable. But any other double value can potentially be used as a valid index (including non-integers, infinities, and denormals, which are comparable and ordered within the set of doubles minus NaNs). So this pseudo-"technique" is very likely not portable/supported.
>> >
>> > The official doc currently says "Any value can be used as a key (not just numbers and strings) just make sure it's not nil or NaN (Not a Number)".
>> > It means that a table like {[0]='A', [0.1]='B'} is valid and contains two distinct pairs because keys 0 and 0.1 are both valid and distinct.
>> > You are probably using an implementation of Lua which adds a restriction on number keys, forcing them to be native integers (which then allows the backing store to be optimized: a storage vector is required only for integer keys that are part of a sequence, not for all numeric keys), and which otherwise stores string keys separately.
>> > So, next{[0.1]=2} should return 0.1 (the single key which is not nil and not NaN), and not 0 (some other rounded integer key)...
>> >
>> > And your code "x=next{[x]=0}" must certainly fail with a runtime exception if x is ever NaN or nil (an invalid key): nothing will be stored in x and the next iterator will not even be called; or an empty table may be silently created, in which case the next iterator will return nil (no more keys in the table) and x will be assigned a nil value (which is of the "nil" type, no longer of the "number" type).
>> >
>> >
>> > Le mar. 21 mai 2019 à 22:14, Egor Skriptunoff <egor.skriptunoff@gmail.com> a écrit :
>> >>
>> >> On Tue, May 21, 2019 at 10:55 PM Jonathan Goble <jcgoble3@gmail.com> wrote:
>> >>>
>> >>> Took me a few minutes, but I found it:
>> >>>
>> >>> x=next{[x]=0}
>> >>>
>> >>
>> >> Correct!
>> >>
>>