[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: stringtable and Table Hashing
- From: Jamie Webb <j@...>
- Date: Tue, 1 Jun 2004 23:31:50 +0100
On Tuesday 01 June 2004 22:28, Yann Com-Nougue wrote:
> The stringtable uses chaining to resolve hash collisions which seems
> to me to be a good solution to collisions. Whereas the hashing of tables
> (in ltable.c) uses probing to resolve collisions. Intuitivelly I would
> think that chaining is a generally better solution but I`m new at this so I
> might be wrong.
Chaining reduces the expected number of collisions, however it has the
overhead of requiring memory to be allocated for each element of the chain.
For strings, this is not a big deal because memory is being allocated anyway
to hold the newly-created string. Regular tables however store their values
directly in the table, with no new memory allocated (the data is either small
enough to fit there, or is a reference to an already-created object).
Therefore in this case chaining would represent a significant overhead, both
in terms of time to allocate and deallocate the chain nodes, and memory in
terms of bookkeeping and fragmentation for all those small blocks. As Brent
points out, that memory could equally well be used to make the hash table
bigger. So, you may be better off just reducing the load factor if you need
better performance at the expense of memory.
> Anywho, my app inserts data in the hash mostly at init so what I
> need is lookup speed. I will be profiling the number of collisions and
> nodes traversed when probing to have a better idea of how my data set
> reacts to the probe collision handling. If time permits, I might change it
> to chaining and run my profiling again to see what's best for me :)
I'm wondering actually if with a low enough load factor simple sequential
probing might not improve performance, because of the increased chance that
subsequent probes will be cache hits (table nodes are probably 28 bytes, and
most current processors have 64 byte L1 cache lines).
-- Jamie Webb