lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


> I've never seen an incorrect size returned by #t so far but is this
> behaviour subject to change in later lua versions?

As long the table is a "list" (all positive, integer values are from 1
to #t) the moment you call it, # works fine and unequivocal. If it
isn't it still will return a good place where t[#t] is nil, in case
you just want to append anything anywhere not caring where as long it
doesnt overwrite something.

I suppose it helps to look at how Lua does it. This is a rough
interpretation, exact code for 5.1.4 is below.

First it looks to find a X > 1 for which t[X] is nil,

How it does find that X depends on the structure of table, how much of
it is either in the array part / hash part etc. In the array case it
can take the size of the array, in the hash case, it doubles X until
it is t[X| is nil

Then Y is set to 1 so that t[Y| is not nil. If t[1] is nil, the answer is 0.

Then for n and m it makes a binary search, that is trying to increse Y
so that t[Y] stays non nil, and reducing X so that t[X] stays nil. If
X - Y is 1, its finished and found the (or an) "edge".

In an array with holes the actual edge might differ on which parts are
in array and which are in hash, which depends on the history of the
table.
if the table is a "list" then there is only one edge, and #t is an
unequivocally answer.

I suppose the developers did a lot of speed testing, as Lua does care
a lot about speed, but it is still a respectable operation that needs
to be done, everytime # is called. It depends on the usecase, if you
have a list(table) where # is used more often than it is altered,
storing the length in a "n" or "#" variable is going to save time. If
it is altered a lot without # being used, changing "n" or "#" with
each alteration will be slower than running that binary search at the
end of it.

I hope by and large this has been a correct representation.
-----

static int unbound_search (Table *t, unsigned int j) {
  unsigned int i = j;  /* i is zero or a present index */
  j++;
  /* find `i' and `j' such that i is present and j is not */
  while (!ttisnil(luaH_getnum(t, j))) {
    i = j;
    j *= 2;
    if (j > cast(unsigned int, MAX_INT)) {  /* overflow? */
      /* table was built with bad purposes: resort to linear search */
      i = 1;
      while (!ttisnil(luaH_getnum(t, i))) i++;
      return i - 1;
    }
  }
  /* now do a binary search between them */
  while (j - i > 1) {
    unsigned int m = (i+j)/2;
    if (ttisnil(luaH_getnum(t, m))) j = m;
    else i = m;
  }
  return i;
}


/*
** Try to find a boundary in table `t'. A `boundary' is an integer index
** such that t[i] is non-nil and t[i+1] is nil (and 0 if t[1] is nil).
*/
int luaH_getn (Table *t) {
  unsigned int j = t->sizearray;
  if (j > 0 && ttisnil(&t->array[j - 1])) {
    /* there is a boundary in the array part: (binary) search for it */
    unsigned int i = 0;
    while (j - i > 1) {
      unsigned int m = (i+j)/2;
      if (ttisnil(&t->array[m - 1])) j = m;
      else i = m;
    }
    return i;
  }
  /* else must find a boundary in hash part */
  else if (t->node == dummynode)  /* hash part is empty? */
    return j;  /* that is easy... */
  else return unbound_search(t, j);
}