lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


On 5 April 2011 20:26, Axel Kittenberger <axkibe@gmail.com> wrote:
>> I've never seen an incorrect size returned by #t so far but is this
>> behaviour subject to change in later lua versions?
>
> As long the table is a "list" (all positive, integer values are from 1
> to #t) the moment you call it, # works fine and unequivocal. If it
> isn't it still will return a good place where t[#t] is nil, in case
> you just want to append anything anywhere not caring where as long it
> doesnt overwrite something.
>
> I suppose it helps to look at how Lua does it. This is a rough
> interpretation, exact code for 5.1.4 is below.
>
> First it looks to find a X > 1 for which t[X] is nil,
>
> How it does find that X depends on the structure of table, how much of
> it is either in the array part / hash part etc. In the array case it
> can take the size of the array, in the hash case, it doubles X until
> it is t[X| is nil
>
> Then Y is set to 1 so that t[Y| is not nil. If t[1] is nil, the answer is 0.
>
> Then for n and m it makes a binary search, that is trying to increse Y
> so that t[Y] stays non nil, and reducing X so that t[X] stays nil. If
> X - Y is 1, its finished and found the (or an) "edge".
>
> In an array with holes the actual edge might differ on which parts are
> in array and which are in hash, which depends on the history of the
> table.
> if the table is a "list" then there is only one edge, and #t is an
> unequivocally answer.
>
> I suppose the developers did a lot of speed testing, as Lua does care
> a lot about speed, but it is still a respectable operation that needs
> to be done, everytime # is called. It depends on the usecase, if you
> have a list(table) where # is used more often than it is altered,
> storing the length in a "n" or "#" variable is going to save time. If
> it is altered a lot without # being used, changing "n" or "#" with
> each alteration will be slower than running that binary search at the
> end of it.
>
> I hope by and large this has been a correct representation.
> -----
>
> static int unbound_search (Table *t, unsigned int j) {
>  unsigned int i = j;  /* i is zero or a present index */
>  j++;
>  /* find `i' and `j' such that i is present and j is not */
>  while (!ttisnil(luaH_getnum(t, j))) {
>    i = j;
>    j *= 2;
>    if (j > cast(unsigned int, MAX_INT)) {  /* overflow? */
>      /* table was built with bad purposes: resort to linear search */
>      i = 1;
>      while (!ttisnil(luaH_getnum(t, i))) i++;
>      return i - 1;
>    }
>  }
>  /* now do a binary search between them */
>  while (j - i > 1) {
>    unsigned int m = (i+j)/2;
>    if (ttisnil(luaH_getnum(t, m))) j = m;
>    else i = m;
>  }
>  return i;
> }
>
>
> /*
> ** Try to find a boundary in table `t'. A `boundary' is an integer index
> ** such that t[i] is non-nil and t[i+1] is nil (and 0 if t[1] is nil).
> */
> int luaH_getn (Table *t) {
>  unsigned int j = t->sizearray;
>  if (j > 0 && ttisnil(&t->array[j - 1])) {
>    /* there is a boundary in the array part: (binary) search for it */
>    unsigned int i = 0;
>    while (j - i > 1) {
>      unsigned int m = (i+j)/2;
>      if (ttisnil(&t->array[m - 1])) j = m;
>      else i = m;
>    }
>    return i;
>  }
>  /* else must find a boundary in hash part */
>  else if (t->node == dummynode)  /* hash part is empty? */
>    return j;  /* that is easy... */
>  else return unbound_search(t, j);
> }

Thanks for that detailed and interesting reply, I understand now.