lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Quoth Lorenzo Donati <lorenzodonatibz@interfree.it>, on 2010-11-20 10:51:19 +0100:
> I'm really curious to know how people solve this problem usually.
> Am I missing something, or there are other techniques to achieve the
> same result? Although I find the string representation technique
> valid and simple, there are cases IMHO where it would introduce
> relevant overhead, especially when the internal "object"
> representation cannot be the same as the "string key"
> representation, so they cannot be "merged"

Keep in mind (in case you weren't already aware) that Lua interns all
strings, so equality and table-key hashing for Lua string values are
O(1).  If you generate a string that already exists, you're actually
just referencing the existing string.

To do a table lookup for a value with some kind of __hash/__eq usage
and some structure inside would (as a rough estimate of a usual case)
have to iterate the internal structure once for the __hash, then once
for each __eq (unless the objects are themselves already interned, in
which case your original problem no longer appears and you can just
use the objects themselves as keys); generating a string key iterates
the structure once and then has to hash the resulting bytes once in
order to intern it, after which it's been unified with the interned
copy.  (It's possible for intermediary combining operations to take
extra time, but I think this is true for the __hash case as well.)  If
there's no hash collisions I'd expect these to come out about the
same; if there are collisions I'd expect the string keys to be a bit
faster because the comparison cost has been minimized.

I suppose collisions in the string table itself might change that.
That and the hash function Lua 5.1 uses is technically O(1) since it
picks a sparse O(1) subset of the string to hash once the string gets
long enough.  It's all back-of-the-envelope calculations with no
actual measurement anyway, though.

If you do experiment with __hash metamethods I'd be interested to see
performance comparison numbers.

   ---> Drake Wilson