[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: Hash Table Collisions (n.runs-SA-2011.004)
- From: Jay Carlson <nop@...>
- Date: Sat, 7 Jan 2012 06:40:23 -0500
On Sat, Jan 7, 2012 at 1:00 AM, HyperHacker <hyperhacker@gmail.com> wrote:
> On Fri, Jan 6, 2012 at 12:17, Roberto Ierusalimschy
> <roberto@inf.puc-rio.br> wrote:
>>> If I am not mistaken current Lua implementation computes hash value by
>>> iterating through a string one character at a time. Would it be faster
>>> to iterate one word at a time (4 bytes at a time in 32bit
>>> architecture)? It will require changing hash function implementation
>>> but it could speed up the calculations. Hopefully, we can still treat
>>> 127-byte long strings as short strings.
>>
>> I do not see a simple way to write this in a portable way (respecting
>> alignment requirements for word access). For strings already internal to
>> Lua we know about their alignment, but external strings (the ones to be
>> internalized) may start anywhere. We would need to do some arithmetic
>> with addresses, something invalid in ANSI C.
>
> That seems terribly amusing, given that C is supposed to be low-level.
> There's no way to convert the address to an integer and slow-copy the
> first few bytes until we're at an aligned address?
No. Pointers are weird. Doing this depends on implementation-specific
details. As it happens, a complete model for memory given as union {
int n; unsigned char b[sizeof(int)]; } is true almost everywhere.
However, (n & 0xff) may still be b[0] or b[3] or b[7], and I don't
believe there is any guarantee (intptr_t)&n takes on any value of
(intptr_t)&b[i].
Anyway, assuming that union, how do you implement a good hash based on
that int n? This is an honest question.
> --
> Sent from my toaster.
Appropriate enough. A design goal of ANSI C was to be reasonably
implementable on toasters....even if they use sign-magnitude 18-bit
ints. Or 286 protected mode segment descriptors....
Jay
- References:
- Re: Hash Table Collisions (n.runs-SA-2011.004), Vladimir Protasov
- Re: Hash Table Collisions (n.runs-SA-2011.004), Miles Bader
- Re: Hash Table Collisions (n.runs-SA-2011.004), Ashwin Hirschi
- Re: Hash Table Collisions (n.runs-SA-2011.004), Miles Bader
- Re: Hash Table Collisions (n.runs-SA-2011.004), David Kolf
- Re: Hash Table Collisions (n.runs-SA-2011.004), Miles Bader
- Re: Hash Table Collisions (n.runs-SA-2011.004), Alexander Gladysh
- Re: Hash Table Collisions (n.runs-SA-2011.004), Roberto Ierusalimschy
- Re: Hash Table Collisions (n.runs-SA-2011.004), Roberto Ierusalimschy
- Re: Hash Table Collisions (n.runs-SA-2011.004), Leo Razoumov
- Re: Hash Table Collisions (n.runs-SA-2011.004), Roberto Ierusalimschy
- Re: Hash Table Collisions (n.runs-SA-2011.004), HyperHacker