Re: Bug report in length of simple table

lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]

Subject: Re: Bug report in length of simple table
From: Ross Berteig <Ross@...>
Date: Thu, 15 Sep 2016 12:28:53 -0700

On 9/14/2016 4:42 PM, Miroslav Janíček wrote:

On 15 Sep 2016, at 0:15, Ross Berteig <Ross@CheshireEng.com> wrote:
I'm not sure there are practical versions of Lua where that table
could actually exist. It would have to be built with a small enough
integer type that a table with half of all possible integral keys could
fit within (virtual) memory.


Bah, that’s an implementation detail ;)

Well, I'm an Engineer, not a Mathematician, so implementation detailsare what pays my bills ;-)

But you’re of course right — we probably won’t see systems that
could  handle such tables for 64-bit integers in our lifetimes. However, I
think it’s perfectly possible to have such a table with 32-bit integers,
even today.

And although it would likely be foolish, I suspect with some effort onecould build a Lua using 16-bit integers, even if the float subtyperemained 32-bit. That might even make a kind of sense in some embeddedsystem environment, although having enough RAM to comfortably *use* Luausually implies also having a 32-bit CPU core.

 (As a side note: in my reading of the Lua Reference
  Manual I haven’t been able to conclusively discard the
  possibility that computing sequence length involves
  invoking the __index metamethod. I know it sounds silly,
  but I think that a Lua implementation that would do that
  would still be “legal” according to the manual. Now, in
  such an implementation, the __index metamethod

  function (t, k) if k > 0 then return k; else return nil; end end

  would do the trick for any integer width.)

The 5.3 reference at https://www.lua.org/manual/5.3/manual.html#3.4.7does not say that only raw operations will be used when __len() is notprovided, and nothing in section 2.4's description of __len or __indexwould seem to promise otherwise.

As I read it, your proposed __index would not affect #t. You would needto define __len for that, which makes sense because there really is nosensible meaning to #t in that case.

Dipping into the source code, we find that the stock implementation ofthe length operator is here:


  https://www.lua.org/source/5.3/lvm.c.html#luaV_objlen

which calls

  https://www.lua.org/source/5.3/ltable.c.html#luaH_getn

if __len() is not defined. luaH_getn() is part of the tableimplementation and directly peeks at its internals to do the binarysearch for a boundary value to return.

Now here's another implementation detail: luaH_getn() is declared toreturn `int`, and the part that searches the hash table works with arange i to j where i and j are declared `unsigned int` with theidentified boundary index implicitly cast back to `int` on return.

The catch is not only that `unsigned int` is not the same range as`int`, but that it does not use `LUA_INTEGER` so the range of integerkeys implemented for the length of a sequence is restricted to the rangeof `int` not whatever LUA_INTEGER is configured to be.

Also, there will be a gotcha waiting should #t exceed INT_MAX, which Ithink could cause #t to be negative. If so, I'd argue that is a bug. Ihave not attempted to create a test case, however.

But anyway, that wasn’t really the point. The point is that under
this  definition, it seems that there exists a sequence that does not

> have a length.

Only if there can be a largest integer key. And in practice, in everyimplementation, there does exist a largest integer n such that t[n],t[n-1], and t[n+1] are all distinct entries in the table t.

Then there are many values of i > n, where i == i+1 (and hence t[i] ==t[i+1]) and the whole idea of a sequence falls apart. This is a naturalresult of the quiet and soft roll over from integers to floats in Lua5.3, and always happened in earlier versions of Lua where numbers arealways floats.

The practical answer is to simply note that there is an implementationdefined longest possible sequence, and possibly give some guidance aboutwhat it might be. As I read the sources, that longest length is INT_MAXas currently implemented.

I think the fix could be quite simple, along the following lines:

 * if t[1] == nil, then 0 is a border;
 * let MAX be the maximum value for an integer, then MAX
   is a border if t[MAX] ~= nil;
 * for all positive integers i, if t[i] ~= nil and t[i+1] == nil,
   then i is a border.

Plus: a table is a sequence if and only if it has exactly one border.

That said, I'm not certain that the definition of sequence cares
about whether the keys are specifically integer type numbers. So the
number math.maxinteger will be 2^63-1, and 2^63 can be exactly
represented in 64-bit float. 2^63+1 cannot, and that is probably where
the fuzzy non-border happens.


I must admit I’m not exactly sure what you mean.....

I'm just pointing out that the implementation detail of there beingspecial treatment for integral values is not necessarily important forthe definition of a sequence. 2^63+1 cannot be a border because 2^63+2is, in the fun world of floating point arithmetic, the very same key.


--
Ross Berteig                               Ross@CheshireEng.com
Cheshire Engineering Corp.           http://www.CheshireEng.com/
+1 626 303 1602

References:
- Bug report in length of simple table, Tomas.Lavicka
- Re: Bug report in length of simple table, Marc Balmer
- Re: Bug report in length of simple table, Malma
- Re: Bug report in length of simple table, Oliver Kroth
- RE: Bug report in length of simple table, Tomas.Lavicka
- Re: Bug report in length of simple table, Dirk Laurie
- Re: Bug report in length of simple table, Roberto Ierusalimschy
- Re: Bug report in length of simple table, Miroslav Janíček
- Re: Bug report in length of simple table, Ross Berteig
- Re: Bug report in length of simple table, Miroslav Janíček

Prev by Date: Re: lpeg.Cb bug (?) with repeated evaluation of group capture pattern
Next by Date: Re: The many lengths of Lua
Previous by thread: Re: Bug report in length of simple table
Next by thread: Re: Bug report in length of simple table
Index(es):
- Date
- Thread