
I'm not debating the average performance.  I'm wondering about the worst
case performance.  Tables are the heart of Lua and should be as good as
possible.

For instance, numuse does a linear scan through all the slots counting
elements.  That averages out because it's only called at rehash.  But why
make the worst case worse?  There is something to be said for optimizing the
worst case, especially when it does occur.  A worst-case insert into a large
table can take several hundred times longer than the insert just before it.
Clearly this worst case can be eliminated by keeping a numUsed field in the
table structure, at the cost of 4 bytes per table.
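A minimal sketch of that bookkeeping, assuming Lua 4.0's Hash/Node/TObject
from lobject.h (numUsed is a hypothetical new field in struct Hash, and this
helper is illustrative, not actual Lua source):

/* Maintain the live-entry count incrementally, so numuse() becomes a
   field read instead of an O(size) scan at rehash time.  Every store
   into a node's value slot would go through this. */
static void setval (Hash *t, Node *n, const TObject *val) {
  if (ttype(&n->val) == LUA_TNIL && ttype(val) != LUA_TNIL)
    t->numUsed++;   /* slot goes from empty or deleted to live */
  else if (ttype(&n->val) != LUA_TNIL && ttype(val) == LUA_TNIL)
    t->numUsed--;   /* slot is being deleted */
  n->val = *val;
}

numuse(t) then reduces to returning t->numUsed.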

Now, in slow or tightly timed applications this can be a problem.  One trick
is to create a larger table than is needed so these costs never occur, and
there have been Lua modifications to do just that.
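(From C, the internal constructor already accepts a size hint, so a
modification along those lines mostly just exposes it at the language level.
Roughly, assuming Lua 4.0's luaH_new from ltable.h:)

/* Presize the node vector up front; growth rehashes are then avoided
   for as long as the table stays under roughly this many entries. */
Hash *t = luaH_new(L, 1000);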

> Not exactly. Some elements may get a "better" position (if a previously
> colliding key was removed from the table). It is a time where the table
> is cleaned. And this only happens when there are a minimum number of
> free nodes (1/4 of the table, I guess), not only 1.

That's not what I see.  What I see (the relevant loop from ltable.c is
paraphrased after this list) is:
    1. firstfree only gets decremented
    2. when slots above firstfree are set to nil, firstfree is not changed
    3. when firstfree == node (the bottom of the array) the table is rehashed
    4. the rehash eventually calls setnodevector with size == oldSize, which
reallocates and reinserts all the entries
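For reference, the firstfree correction looks roughly like this (paraphrased
from the tail of luaH_setany in Lua 4.0's ltable.c, not verbatim; mp is the
node where the new key was just placed):

for (;;) {  /* correct firstfree */
  if (ttype(&t->firstfree->key) == LUA_TNIL)
    return &mp->val;        /* still a free place below: done */
  else if (t->firstfree == t->node)
    break;                  /* reached the bottom: no free place left */
  else
    (t->firstfree)--;       /* only ever moves down, never back up */
}
rehash(L, t);               /* full reallocation and reinsertion */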

In some cases that rehash does the same job as a linear search from the top
of the array for a free slot to point firstfree at.  Yes, they are both
linear-time operations, but their constant factors are thousands of times
apart.  Earning a salary is a linear operation too, but I bet we are all
sensitive to the rate.

When does this happen?  Any time I get and set entries and have hash
collisions, even if the table size does NOT grow, because firstfree is
decremented once for each collision.

Could we have tables whose worst-case performance is better than linear?
Not easily, since that would imply an allocator that could deliver node
vectors preinitialized to the empty state.  Sure, it's possible to try to
supply those from another task and spread the cost of initialization over
time, but that's complicated and might fall behind.  Maybe if you bounded
the number of slots in your application...  But I digress.

The point is that there are two obvious ways to improve the worst-case
performance of tables:
1. reset the firstfree pointer with a linear scan rather than a rehash of
the whole table --- downside: a tiny bit of new code
2. keep a numUsed field in the table and get rid of the numuse subroutine
--- downside: every table is 4 bytes larger

#1 is definitely worth it.  Something like
static Node *findfree (const Hash *t) {
  Node *v = t->node;
  int i;
  /* scan down from the top for the highest free slot; a slot is free
     when its key is nil (a deleted entry keeps its key so chains stay
     intact, so testing the value would find slots that are not safe
     to reuse) */
  for (i = t->size - 1; i >= 0; i--) {
    if (ttype(&v[i].key) == LUA_TNIL)
      return &v[i];
  }
  return NULL;  /* no free slot at all: the table is truly full */
}
This would be inserted into rehash, for the case where the size does not
change, as sketched below.
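The call site might then look roughly like this (a sketch against the shape
of Lua 4.0's rehash, not verbatim; only the size-unchanged branch changes):

static void rehash (lua_State *L, Hash *t) {
  int oldsize = t->size;
  int nelems = numuse(t);
  if (nelems >= oldsize - oldsize/4)     /* more than 3/4 used: grow */
    setnodevector(L, t, oldsize*2);
  else if (nelems <= oldsize/4 && oldsize > MINPOWER2)
    setnodevector(L, t, oldsize/2);      /* mostly empty: shrink */
  else {                                 /* size unchanged */
    Node *f = findfree(t);
    if (f != NULL) {
      t->firstfree = f;                  /* O(size) scan, no realloc/reinsert */
      return;
    }
    setnodevector(L, t, oldsize);        /* degenerate: no free slot at all */
  }
  /* ...the existing save and reinsert of the old node vector for the
     realloc cases goes here, unchanged... */
}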

I'll experiment some, as I already have a testing and profiling harness for
tables.

Russ