[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: Hash Table Collisions (n.runs-SA-2011.004)
- From: Roberto Ierusalimschy <roberto@...>
- Date: Fri, 6 Jan 2012 13:41:19 -0200
> > Yes, it would be great to hear "official" opinion from both Lua team
> > and Mike.
>
> Well, this is not an official opinion, but one possible approach would
> be to break strings into two variants: short strings and long strings,
> where long strings would not be interned any more. More details later
> (too late to write it now).
If I understood the problem correctly, a variable hash seed would
solve it, except for the skipping of characters. So, one approach
is to implement strings using two variants: short strings (up to
32 bytes) and long strings.
Short strings would be implemented exactly like they are now, except for
the per-state variable seed. Long strings, on the other hand, would not
be interned any more. So, Lua does not need to compute its hash unless
it is used as a key. In that case, which I assume is not very common,
its hash will be computed considering all bytes plus the variable
seed (and "memoized").
That change also would improve a little the manipulation of large
strings inside Lua, and even opens the door for the implementation of
"external strings" (large strings where the buffer is outside Lua, for
instance in ROM).
The drawbacks would be for programs that use very long keys
(longer than 32 bytes or some other limit) or that regularly
compare such long strings between them for equality or that
create multiple copies of such long strings. I may be wrong,
but I think neither of these cases are common.
-- Roberto
- References:
- Re: Hash Table Collisions (n.runs-SA-2011.004), Vladimir Protasov
- Re: Hash Table Collisions (n.runs-SA-2011.004), Leo Razoumov
- Re: Hash Table Collisions (n.runs-SA-2011.004), Vladimir Protasov
- Re: Hash Table Collisions (n.runs-SA-2011.004), Miles Bader
- Re: Hash Table Collisions (n.runs-SA-2011.004), Ashwin Hirschi
- Re: Hash Table Collisions (n.runs-SA-2011.004), Miles Bader
- Re: Hash Table Collisions (n.runs-SA-2011.004), David Kolf
- Re: Hash Table Collisions (n.runs-SA-2011.004), Miles Bader
- Re: Hash Table Collisions (n.runs-SA-2011.004), Alexander Gladysh
- Re: Hash Table Collisions (n.runs-SA-2011.004), Roberto Ierusalimschy