Re: Hash Table Collisions (n.runs-SA-2011.004)

lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]

Subject: Re: Hash Table Collisions (n.runs-SA-2011.004)
From: William Ahern <william@...>
Date: Wed, 18 Jan 2012 22:40:34 -0800

On Wed, Jan 18, 2012 at 11:17:16PM -0700, HyperHacker wrote:
> On Fri, Jan 6, 2012 at 12:17, Roberto Ierusalimschy
> <roberto@inf.puc-rio.br> wrote:
> >> If I am not mistaken current Lua implementation computes hash value by
> >> iterating through a string one character at a time. Would it be faster
> >> to iterate one word at a time (4 bytes at a time in 32bit
> >> architecture)? ?It will require changing hash function implementation
> >> but it could speed up the calculations. Hopefully, we can still treat
> >> 127-byte long strings as short strings.
> >
> > I do not see a simple way to write this in a portable way (respecting
> > alignment requirements for word access). For strings already internal to
> > Lua we know about their alignment, but external strings (the ones to be
> > internalized) may start anywhere. We would need to do some arithmetic
> > with addresses, something invalid in ANSI C.
> >
> > -- Roberto
> >
> 
> Could there be a compile-time option to enable some
> platform-specific/non-ANSI tricks such at this to improve performance?
> 

Also, _arguably_ it can be done in ANSI C. The new C11 standard provides an
alignof operator. Exact width integer types like uint32_t are guaranteed not
to have any padding bits, and therefore no trap representations. Therefore
you should be able to get away with type punning.

But if you're a stickler for undefined behavior, per se, you could memcpy()
from the string to a uint32_t or uint64_t. Then you're perfectly legit
according to the standard, but you have to cross your fingers that the
compiler optimizes that out.

References:
- Re: Hash Table Collisions (n.runs-SA-2011.004), Ashwin Hirschi
- Re: Hash Table Collisions (n.runs-SA-2011.004), Miles Bader
- Re: Hash Table Collisions (n.runs-SA-2011.004), David Kolf
- Re: Hash Table Collisions (n.runs-SA-2011.004), Miles Bader
- Re: Hash Table Collisions (n.runs-SA-2011.004), Alexander Gladysh
- Re: Hash Table Collisions (n.runs-SA-2011.004), Roberto Ierusalimschy
- Re: Hash Table Collisions (n.runs-SA-2011.004), Roberto Ierusalimschy
- Re: Hash Table Collisions (n.runs-SA-2011.004), Leo Razoumov
- Re: Hash Table Collisions (n.runs-SA-2011.004), Roberto Ierusalimschy
- Re: Hash Table Collisions (n.runs-SA-2011.004), HyperHacker

Prev by Date: Re: Hash Table Collisions (n.runs-SA-2011.004)
Next by Date: luaL_Buffer Destruction
Previous by thread: Re: Hash Table Collisions (n.runs-SA-2011.004)
Next by thread: Re: Hash Table Collisions (n.runs-SA-2011.004)
Index(es):
- Date
- Thread