lua-users home
lua-l archive

On Tue, 3 Nov 2009 18:56:19 +0100
Mike Pall <mikelu-0911@mike.de> wrote:

> Bulat Ziganshin wrote:
> > > Selecting only up to 3*4 bytes was fast, but not good enough. See
> > 
> > why not use those 12 bytes from the *center* of string?
> 
> But that's why it failed on URLs. The front is 'http', the back is
> 'html'. And the center is some common part of the path higher up
> in the hierarchy. Now, if the URLs are the same length we get a
> 100% collision probability. :-/

One approach I've used in the past is to let an (extremely) trivial
PRNG decide which byte to hash next. The length of the string is the
initial seed, and after that the current hash value seeds each step.
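
A minimal sketch of this idea in C follows. It is an illustration under
stated assumptions, not the original implementation: the function name
sampled_hash, the SAMPLES budget of 12 bytes, the LCG constants used as
the "trivial PRNG", and the FNV-style mixing step are all hypothetical
choices. Strings no longer than the budget are simply hashed in full.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed budget: how many bytes to inspect in a long string. */
#define SAMPLES 12

/* Hypothetical helper: mix one byte into the hash (FNV-1a style step). */
static uint32_t mix_byte(uint32_t h, unsigned char c)
{
    return (h ^ c) * 16777619u;
}

/* Hash a string by sampling bytes at pseudo-random positions.
 * The string length is the initial seed; after each byte, the
 * current hash value seeds the choice of the next position. */
uint32_t sampled_hash(const char *s, size_t len)
{
    uint32_t h = (uint32_t)len;          /* initial seed: the length */
    size_t i;

    if (len <= SAMPLES) {
        /* Short string: cheap enough to hash every byte in order. */
        for (i = 0; i < len; i++)
            h = mix_byte(h, (unsigned char)s[i]);
        return h;
    }

    for (i = 0; i < SAMPLES; i++) {
        /* Trivial PRNG step: one LCG round seeded by the current hash. */
        uint32_t r = h * 1664525u + 1013904223u;
        /* Use the upper bits, which are the better-mixed ones in an LCG. */
        size_t pos = (size_t)(r >> 16) % len;
        h = mix_byte(h, (unsigned char)s[pos]);
    }
    return h;
}
```

Two strings of the same length start from the same seed, so they probe
the same positions until a sampled byte differs. The probes are spread
over the whole string rather than fixed at the front, back or center,
but positions can repeat, and a difference that is never sampled still
goes unnoticed.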

B.