Re: Hash Table Collisions (n.runs-SA-2011.004)

lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]

Subject: Re: Hash Table Collisions (n.runs-SA-2011.004)
From: Jay Carlson <nop@...>
Date: Sat, 7 Jan 2012 06:40:23 -0500

On Sat, Jan 7, 2012 at 1:00 AM, HyperHacker <hyperhacker@gmail.com> wrote:
> On Fri, Jan 6, 2012 at 12:17, Roberto Ierusalimschy
> <roberto@inf.puc-rio.br> wrote:
>>> If I am not mistaken current Lua implementation computes hash value by
>>> iterating through a string one character at a time. Would it be faster
>>> to iterate one word at a time (4 bytes at a time in 32bit
>>> architecture)?  It will require changing hash function implementation
>>> but it could speed up the calculations. Hopefully, we can still treat
>>> 127-byte long strings as short strings.
>>
>> I do not see a simple way to write this in a portable way (respecting
>> alignment requirements for word access). For strings already internal to
>> Lua we know about their alignment, but external strings (the ones to be
>> internalized) may start anywhere. We would need to do some arithmetic
>> with addresses, something invalid in ANSI C.
>
> That seems terribly amusing, given that C is supposed to be low-level.
> There's no way to convert the address to an integer and slow-copy the
> first few bytes until we're at an aligned address?

No. Pointers are weird. Doing this depends on implementation-specific
details. As it happens, a complete model for memory given as union {
int n; unsigned char b[sizeof(int)]; } is true almost everywhere.
However, (n & 0xff) may still be b[0] or b[3] or b[7], and I don't
believe there is any guarantee (intptr_t)&n takes on any value of
(intptr_t)&b[i].

Anyway, assuming that union, how do you implement a good hash based on
that int n? This is an honest question.

> --
> Sent from my toaster.

Appropriate enough. A design goal of ANSI C was to be reasonably
implementable on toasters....even if they use sign-magnitude 18-bit
ints. Or 286 protected mode segment descriptors....

Jay

Follow-Ups:
- Re: Hash Table Collisions (n.runs-SA-2011.004), HyperHacker

References:
- Re: Hash Table Collisions (n.runs-SA-2011.004), Vladimir Protasov
- Re: Hash Table Collisions (n.runs-SA-2011.004), Miles Bader
- Re: Hash Table Collisions (n.runs-SA-2011.004), Ashwin Hirschi
- Re: Hash Table Collisions (n.runs-SA-2011.004), Miles Bader
- Re: Hash Table Collisions (n.runs-SA-2011.004), David Kolf
- Re: Hash Table Collisions (n.runs-SA-2011.004), Miles Bader
- Re: Hash Table Collisions (n.runs-SA-2011.004), Alexander Gladysh
- Re: Hash Table Collisions (n.runs-SA-2011.004), Roberto Ierusalimschy
- Re: Hash Table Collisions (n.runs-SA-2011.004), Roberto Ierusalimschy
- Re: Hash Table Collisions (n.runs-SA-2011.004), Leo Razoumov
- Re: Hash Table Collisions (n.runs-SA-2011.004), Roberto Ierusalimschy
- Re: Hash Table Collisions (n.runs-SA-2011.004), HyperHacker

Prev by Date: Re: Readline support
Next by Date: Re: Hash Table Collisions (n.runs-SA-2011.004)
Previous by thread: Re: Hash Table Collisions (n.runs-SA-2011.004)
Next by thread: Re: Hash Table Collisions (n.runs-SA-2011.004)
Index(es):
- Date
- Thread