[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
**Subject**: **Re: A good hash algorithm**
**From**: Paul Hsieh <qed@...>
**Date**: Thu, 04 Jul 2002 17:01:23 -0700

> >Now a comparison with existing Lua code. In lstring.c (Lua v4.x) the hashing
> >algorithm for strings appears to be:
> >
> > for (; l>=step; l-=step)
> > h = h ^ ((h<<5)+(h>>2)+(unsigned char)*(s++));
> >
> >Hmm ... this function has number "two character poles". For example, let's say that
> >the table size is 256, and after some part of the string, the value of h is 1. Then
> >inserting any even number of 'a' characters will map the hash value back to 1
> >(modulo 256.)
>
> I haven't checked it, but have you taken into account that h is initialized
> with the length of the string?
No, that would of course fix the problem. But it just means that the patter for
where the poles are is only slightly more complicated. The larger point is that the
algorithms studied at Bob Jenkin's site have had serious study put into them. The
function you show above is unheard of to me. At the very least you should look at
the "One at a Time" hash function which looks only a little more expensive than the
above, and is apparently competitive with Bob Jenkin's formula in terms of quality.
> >My recommendation is that the Lua people at least consider using the Bob Jenkins
> >algorithm.
>
> We're very satisfied with table performance in Lua. Of course, making it faster
> would not hurt, *if* it can be done with more or less the same amount of code.
> I don't know how large Jenkins's code is.
Jenkin's code requires that the size of the object is large enough to amortize the
start up cost, but as he explain on his site, his is amoungst the fastest hash
algorithms that doesn't have defects. Is being satisfied with its "performance"
sufficient? You should still be concerned with minimizing collisions too shouldn't
you?
> I propose that you try changing Lua to use Jenkins's algorithm and report the
> results here. Roberto probably has testing code for table performance.
If you have a benchmark available that still runs on Lua 4.x, then I'd be happy to
try it out.
--
Paul Hsieh
qed@pobox.com