lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Russell Y. Webb wrote:
> 
> >I had a thinko about the fill rate of the old hash.  I thought, it would
> >only rehash if no free slot was found (would have given a nearly 100% fill
> >rate for numeric keys - the typical lua arrays).  But it always rehashes at
> >66%.
> 
> Interesting idea.  Waiting for no free slots would be bad since that would
> create _large_ collisions in some cases.  Waiting for there to be some
> percentage of the entries colliding would optimize your array case and
> still work well for the rest of the cases.
> 
> Meaning wait until numCollisions/numEntries = (numEntries - (numSlots -
> numSlotsFree))/numEntries reached some value, say 0.3, then rehash.  That
> way tables with are actually rehashed only when there are too many
> collisions.  I forget how collisions are resolved in Lua tables but in some
> cases that would mean that you could have tables be more than 100% full.
> Of course, it probably wouldn't make that much difference and one could run
> into the weird case of everything hashing to the same slot and tables
> having to grow huge to satisfy the low percentage of collisions condition.
> 
> Could be fun to experiment with.

To optimize the numeric style arrays you could do it even easier.  Don't
rehash if the "direct slot" for the key is free.  That way you'ld get
nearly 100% for these arrays.  That hash function for numerics has this
property: hash(i+1)==hash(i)+1  So consecutive keys give consecutive
slots -> 100% fill rate.

Ciao, ET.


PS: I didn't grok the double hashing in the old version.  Looks pretty
strange.  Or should the  "h1 += (h&(tsize-2)) + 1" just add some
random value (0<val<tsize)?  Does this function guaranties that it
probes _all_ slots?