lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


On 22/04/2014 9:18 PM, Coroutines wrote:
On Tue, Apr 22, 2014 at 2:16 AM, Ross Bencina
<rossb-lists@audiomulch.com> wrote:

A hash table has amortized O(1) insert/lookup.

From what I understand, the memory added by maintaining a hash table
of addresses would be double a binary-searched array/vector (assuming
the keys are word-sized for quick access).

In our case the value is just one bit (present/not present) and can most likely be signaled by the NULL pointer. So the space is the same as a sorted list. i.e. no value field is needed.


Also, while a hash table
would provide O(1) lookup time, the hash function itself could take a
while to do it's thing -- though I'm sure in most cases a simple
minimal hash function would beat a binary-search of a vector/array of
real-world size. :>

Yes. I'm not sure where the trade-off would sit.


I wish I could find more information on creating simple hashing
functions for "perfect" or "minimal" hash tables (with collisions or
without to save on memory -- no vacant table slots).

For arbitrary inputs it doesn't work that way. It's a statistical technique. Part of what you're looking for is this: http://en.wikipedia.org/wiki/Open_addressing
and this:
http://en.wikipedia.org/wiki/Cuckoo_hashing

But if you want O(1) you don't want collisions. So you waste some space.

I know of murmurhash and cityhash for non-crypto stuff, but I'm sure
they're oriented toward hashing data of variable-length and not
fixed-size numbers like a memory address.  Again these hashes seemed
more heavyweight and broad?  I'd love to find more examples on trivial
ways to build perfect & minimal hashes.  One recurring example is:
hash(key) = #table / key  -- obviously this produces collisions

Try searching for "fast integer hashing"

This looks like a good intro:

"Integer Hash Function"
https://gist.github.com/badboy/6267743


Wikipedia hints at creating hash/indexes that scale along powers of 2
and to size hash tables appropriately for this.  (tuning the algorithm
for calculating indexes along powers of 2 I guess?)

Here's some stuff I could find:

http://en.wikipedia.org/wiki/Category:Hash_functions
http://en.wikipedia.org/wiki/Category:Cryptographic_hash_functions
http://en.wikipedia.org/wiki/Hash_table

Anyway, sorry for hijacking the thread with my own curiosity.  Back to
business people ^__^


You should look at how Lua implements hashing too. After all, in part at least Lua Tables are hash tables.

R.