[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: Hash Table Collisions (n.runs-SA-2011.004)
- From: Sean Conner <sean@...>
- Date: Wed, 4 Jan 2012 19:43:07 -0500
It was thus said that the Great William Ahern once stated:
> On Wed, Jan 04, 2012 at 05:43:36PM -0500, Sean Conner wrote:
> <snip>
> > good 8.36
> > bad 6.32
> >
> > Note: this on a quad-core 2.8GHz Pentium system. Also note that the "good
> > hash" covers 2,000,000 distinct strings, while the "bad hash" covers only
> > 20,000 (one hundred times less strings). The "good hash" function generates
> > a random, 1024 byte string (not strictly text---it's 1024 bytes of random
> > data). The "bad hash" function generates a similarly random 1024 bytes, but
> > the bytes that comprise the hash (from lstring.c:80) are identical from
> > string to string. One last note: This was coded for Lua 5.1.
>
> I suspect your generation is buggy, because I get something very different.
It's buried in the text, but ...
good 8.36 2,000,000 strings
bad 6.32 20,000 strings
I wasn't patient enough to wait for the bad version to run for 2,000,000
strings. I'm including my code below (since others are posting theirs).
-spc
#include <stdlib.h>
#include <lua.h>
#include <lualib.h>
#include <lauxlib.h>
#if RAND_MAX == 32767
typedef short rand__t;
#else
typedef int rand__t;
#endif
typedef union
{
char garbage[1024];
rand__t noise [1024 / sizeof(rand__t)];
} garbage__t;
static const char m_filler[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef";
static int badhash(lua_State *const L)
{
garbage__t bogus;
size_t count;
size_t step;
size_t i;
for (i = 0 ; i < sizeof(bogus.noise) / sizeof(rand__t) ; i++)
bogus.noise[i] = rand();
count = 0;
step = (sizeof(bogus.garbage) >> 5) + 1;
for (i = sizeof(bogus.garbage) ; i >= step ; i -= step)
bogus.garbage[i - 1] = m_filler[count++];
lua_pushlstring(L,bogus.garbage,sizeof(bogus.garbage));
return 1;
}
static int goodhash(lua_State *const L)
{
garbage__t bogus;
for (size_t i = 0 ; i < sizeof(bogus.noise) / sizeof(rand__t) ; i++)
bogus.noise[i] = rand();
lua_pushlstring(L,bogus.garbage,sizeof(bogus.garbage));
return 1;
}
int luaopen_badhash(lua_State *const L)
{
lua_pushcfunction(L,badhash);
lua_setglobal(L,"badhash");
lua_pushcfunction(L,goodhash);
lua_setglobal(L,"goodhash");
return 0;
}
- References:
- Re: Hash Table Collisions (n.runs-SA-2011.004), Miles Bader
- Re: Hash Table Collisions (n.runs-SA-2011.004), Ashwin Hirschi
- Re: Hash Table Collisions (n.runs-SA-2011.004), Miles Bader
- Re: Hash Table Collisions (n.runs-SA-2011.004), William Ahern
- Re: Hash Table Collisions (n.runs-SA-2011.004), William Ahern
- Re: Hash Table Collisions (n.runs-SA-2011.004), Gé Weijers
- Re: Hash Table Collisions (n.runs-SA-2011.004), William Ahern
- Re: Hash Table Collisions (n.runs-SA-2011.004), Sam Roberts
- Re: Hash Table Collisions (n.runs-SA-2011.004), Sean Conner
- Re: Hash Table Collisions (n.runs-SA-2011.004), William Ahern