lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Jerome Vuarand <jerome.vuarand <at> gmail.com> writes:
> 2009/2/1 Joshua Haberman <joshua <at> reverberate.org>:
> > Languages like Ruby, Python, etc. allow you to define hash functions that
> > apply to your objects.  You define a hash function and a comparison
> > function, and this lets user-defined types be keys in hash tables based on
> > their values.
> >
> > [...]
> >
> > In any case, like I said I'm mainly wondering if anyone has a PowerPatch,
> > I'm not going to argue that it should go into the main Lua (though I
> > wouldn't mind that either :)
> 
> You don't need a power-patch, with the help of existing metamethods,
> you can implement your own hashing algorithm. Here is an example:
> 
> http://lua-users.org/wiki/ComparisonByValue

This scheme requires that two objects that are not equivalent do *not* have the
same "hash" value.  At this point it's not really a hash anymore, it's a
fingerprint:

http://en.wikipedia.org/wiki/Fingerprint_(computing)

The fingerprint you give to Lua as a table key will then be run through the
usual hash table algorithm, and any two fingerprints that hash to the same value
will undergo the usual hash table collision resolution.

Generating short fingerprints means implementing something like Rabin's
algorithm.  That's complicated, so more likely a fingerprint will just be all
the fields of your object concatenated together in some strange and ad-hoc way.
 And once you've done that you *still* haven't even begun the actual hashing
part yet.

Also, with this scheme the table doesn't have any reference to the real key,
only to the fingerprint.  So you can't use that table to iterate over the
original keys.  Though if you're only using the table as a set, you can always
make the set members the table values also.

All of these things can be worked around in various ways.  But it's awkward and
the performance will probably not be as good as having the real hash table
implementation call your hash function and equality comparisons.

The fingerprint method would probably be more palatable if there was an
implementation of Rabin's algorithm available for Lua and written in C that
worked really nicely on Lua types.  That way you could generate unique
fingerprints without having to do ugly string concatenation of all your table's
data, which is both inefficient and error-prone.  And annoying.  :)

Josh