- Subject: Re: object identity/address and hashing
- From: "Alex Davies" <alex.mania@...>
- Date: Fri, 29 Jan 2010 18:23:24 +0800
It sounds like you're trying to reimplement something Lua already does ;) -
I hope this is not the case.
spir wrote:
-1- Is the object identity internally implemented as its actual memory
address?
Yes.
-2- Is there a way to get this value (like python's id() func)
Unfortunately no. It would be nice to have a topointer in Lua; you can make
your own very easily via the C API, which already has lua_topointer. The odd
occasion has popped up where I've wanted one.
-3- I read somewhere that table key hashes (*) are obtained from key
addresses (or identities, as opposed to hashing data itself). Is this
correct? If yes, what kind of hash formula/algorithm is then used?
Yes, for tables, userdata and coroutines used as keys, the hash is based on
the object's identity (its address), not on the values it contains.
-4- If the above is true, then why isn't this the common way, since it
seems to have only advantages:
I cannot imagine why any language would do it any other way.
-5- The simplest way to hash integers is certainly to get the modulo N,
where N is the number of indexes (and of "buckets" in case of collisions).
What about this very simple solution?
As the number of buckets is a power of two in Lua, this would produce poor
results on pointers, which tend to have patterns in powers of two. So Lua
takes the value modulo the number of buckets minus 1 (forced odd with | 1,
see hashmod in ltable.c), which works well.
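
To make the difference concrete, here is a minimal sketch in C that compares
the two schemes on addresses spaced 8 bytes apart. The function names and the
distinct_buckets helper are made up for illustration; only the
(nbuckets - 1) | 1 divisor is modelled on hashmod in Lua 5.1's ltable.c, and
the rest of Lua's table code is not reproduced here.

```c
#include <stdint.h>

/* Naive scheme from the question: address modulo the bucket count. */
unsigned bucket_pow2(uintptr_t addr, unsigned nbuckets) {
    return (unsigned)(addr % nbuckets);
}

/* Scheme from the reply: modulo (bucket count - 1). The | 1 keeps the
   divisor odd and nonzero, as in hashmod in Lua 5.1's ltable.c. */
unsigned bucket_odd(uintptr_t addr, unsigned nbuckets) {
    return (unsigned)(addr % ((nbuckets - 1) | 1));
}

/* Hypothetical helper: counts the distinct buckets hit by `count`
   addresses spaced `step` bytes apart, starting at `base`.
   Assumes nbuckets <= 64; returns 0 otherwise. */
unsigned distinct_buckets(unsigned (*hash)(uintptr_t, unsigned),
                          unsigned nbuckets, uintptr_t base,
                          unsigned step, unsigned count) {
    unsigned char used[64] = {0};
    unsigned distinct = 0;
    if (nbuckets == 0 || nbuckets > 64) return 0;
    for (unsigned i = 0; i < count; i++) {
        unsigned b = hash(base + (uintptr_t)i * step, nbuckets);
        if (!used[b]) { used[b] = 1; distinct++; }
    }
    return distinct;
}
```

With 16 buckets and 16 addresses spaced 8 bytes apart starting at 0x1000,
distinct_buckets returns 2 for bucket_pow2 and 15 for bucket_odd. The price
of the odd divisor is that the last bucket is never the direct target of a
pointer hash.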
-6- By exploring this topic, I discovered (on my computer) a minimal
offset of 8 bytes between addresses (I mean they are assigned modulo 8).
(Thus, addresses must be divided by 8 before being hashed.) Has this
something to do with Lua?
This has everything to do with the system you're compiling for. By default
Lua just uses malloc/realloc, which are supposed to return a pointer that is
suitably aligned for any use. As doubles on x86 are slow when not 8-byte
aligned, malloc there should always return pointers that are 8-byte aligned.
On another system, this might not be the case.
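
If you want to check this on your own machine, a small C sketch along these
lines should do. It assumes a C11 compiler (for _Alignof), and the function
names are made up for this example; it only reports what your malloc happens
to do and says nothing about Lua's internals.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if p is a multiple of `alignment` bytes, 0 otherwise. */
int is_aligned(const void *p, size_t alignment) {
    return ((uintptr_t)p % alignment) == 0;
}

/* Allocates `size` bytes and reports whether the block is aligned
   for a double. Returns -1 if the allocation fails. */
int malloc_is_double_aligned(size_t size) {
    void *p = malloc(size);
    if (p == NULL) return -1;
    int ok = is_aligned(p, _Alignof(double));
    free(p);
    return ok;
}
```

On a conforming implementation malloc_is_double_aligned(64) should return 1,
since malloc has to hand back memory that is suitably aligned for any
ordinary object type. The exact alignment you observe (8, 16, ...) depends on
the platform and the allocator.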
(*) What's the common name for "return value of a hash func" in english?
(In french we commonly say "empreinte", meaning _print_ like in "finger
print", but there are other words depending on use case.)
We tend to just call it "the hash" or "hash value".