lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


On Thu, Sep 18, 2014 at 5:18 AM, John Hind <john.hind@zen.co.uk> wrote:
> Combining 'oldindex' with existing features ('len', 'newindex' and tracking
> variables) we could:
>
> - Constrain a table to be a proper vector/list (so length would always be
> valid).
> - Redefine length so it is a defined value for the previously undefined
> cases (0, -1 or nil for example).
> - Redefine length so it is a true count of all the elements in the table
> regardless of key.
> - Redefine length so it is the largest positive integer key regardless of
> 'holes'.

unfortunately, all these possibilities (except maybe the first one,
i'm not sure) result in a O(n) time profile for each value erasure.
that's a cost most of us are not willing to pay just for more
palatable operators.


On Thu, Sep 18, 2014 at 7:32 AM, Udo Schroeter <udo.schroeter@gmail.com> wrote:
> It's an implementation detail that bleeds through into userland big time,
> though. From the pairs()/ipairs() distinction through to the way table sizes
> are calculated (which I believe was the thing that set off this discussion
> in the first place).


you're mixing two different things: "the array part" is _not_ the
"sequence" (pairs with integer positive key).

it's quite possible to have a sequence with many of its pairs stored
in "the hash part".  check the source code for the '#' operator and
you'll see that it searches in both parts.

----- ltable.c -----
/*
** Try to find a boundary in table `t'. A `boundary' is an integer index
** such that t[i] is non-nil and t[i+1] is nil (and 0 if t[1] is nil).
*/
int luaH_getn (Table *t) {
  unsigned int j = t->sizearray;
  if (j > 0 && ttisnil(&t->array[j - 1])) {
    /* there is a boundary in the array part: (binary) search for it */
    unsigned int i = 0;
    while (j - i > 1) {
      unsigned int m = (i+j)/2;
      if (ttisnil(&t->array[m - 1])) j = m;
      else i = m;
    }
    return i;
  }
  /* else must find a boundary in hash part */
  else if (isdummy(t->node))  /* hash part is empty? */
    return j;  /* that is easy... */
  else return unbound_search(t, j);
}
---------------

The real situation is more like this:

A.- a table keeps key/value pairs where the key can be any Lua value
except nil and NaN

B.- it's quite useful to have consecutive positive integer keys

C.- the '#' operator and some methods are defined only for tables
where all the positive integer keys form a 1...n consecutive sequence.

D.- the current implementation of a table keeps an array to store some
values that have positive integer keys.  the exact size of this array
is dynamically adjusted via some heuristics.  it might be much bigger
or smaller than the set of positive integer keys in the table.

the point that so many of us forget is that D is just a performance
optimization.  the exact heuristic used to place values in the array
doesn't affect the observed behavior (except for measured
performance).  specifically, there can be values that form a sequence
but are not stored in the array part.

-- 
Javier