[Date Prev][Date Next][Thread Prev][Thread Next]
- Subject: Deterministic hashing for lua tables
- From: hollyrosella@...
- Date: Fri, 16 Dec 2011 02:33:26 +0000
I am working on a fully deterministic Lua, and I gather that one source of indeterminacy in a script's execution is that Lua tables hash objects according to their positions in memory, and traversing table contents follows the hash table positions, resulting in different traversal orders depending on memory layout. Other potential indeterminacies could arise from floating-point rounding errors and interaction with the OS
(either through seeding the RNG from system time or through explicitly accessing system
Dave Rodgers published a patch last year which ensures deterministic traversal order by determistically sorting table contents. He very kindly shared this patch with me, including the revision history, and it was tremendously helpful in finding my way around relevant parts of the lua source, as well as a cleanly written modification. Having studied how Dave did it, I am wondering whether it would instead be possible to change the ltable.c:mainposition function to choose the hash table position deterministically. I am imagining changing ltable.c:hashpointer to a function which when called with a pointer p, looks p up in an internal hash table IHT. If p is already a key in IHT, the new hashpointer function would return the value there, and if not, it would
1) generate a hash value h from a deterministic series (just some PRNG with a fixed seed, which does not refer to p at all),
2) store the (p, h) pair in IHT, and
3) return h.
Thus, as long there is a fixed order in which objects are assigned as keys to a Lua table T (which should happen if execution is deterministic), the hash values used to store them in T will be fully determined by the PRNG, and hence fully deterministic.
Does this sound workable? I welcome any comments or criticism.
Another possibility would be to create hash functions for all of the lua types listed under "basic types" in lua.h. This would have the advantage of being more robust if indeterminate behavior creeps into the execution through some other source, a merit which Dave's solution shared. However, I am not sure what I could use for deterministic hashing of the LUA_TFUNCTION and LUA_TTHREAD types, and I would need to require the LUA_T*USERDATA types to provide some kind of hashing function for themselves. Also, my application is very sensitive to indeterminacy, so it would probably be better to fail quickly in my case anyway.