On Sep 17, 2011, at 6:45 AM, Fabien wrote:
If we were to play and introduce a new structure (as a patch of course ;-)), I'd want it to offer more than micro-optimisations by saving a couple of table creations and unpackings.
It's possible to write a tuple constructor satisfying your goals about comparison. Construction is linear in the number of elements, but the constant factor is reasonably high.
Here is a quick sketch of how to implement a "hash cons". General tuples can then be built from this.
A "hash cons" pair is represented as a table or closure referencing the cons'd values. In addition, it will need to hold one back pointer as we will see in a moment.
To ensure that cons( a, b ) always produces the same result for the same input values a and b (essential if we're going to use them as table keys), we use a cascaded lookup through weak tables.
T1 is a fully weak map from values to tables T2. We look up the first parameter in T1 creating a new table if none exists.
T2 is then used to map the second parameter to the actual tuple.
So, if everything were already constructed, we would have:
function cons( a, b ) return T1[ a ][ b ] end
But now we actually need to make it work and this takes addressing a number of details.
First, nil and NaN don't work as keys. One could debate how NaN is supposed to work with tuples, but essentially we need a function "safekey" that remaps these to magic values that won't be used anywhere outside the tuple implementation:
function cons( a, b ) return T1[ safekey( a ) ][ safekey( b ) ] end
Next we get to the need for a back link. If cons( a, b ) is still alive, then we need T1[ safekey( a ) ] to stay alive in case we were to invoke cons( a, b ) again. We handle this by having a back link from the cons pair to the second level table.
For security, the cons pair should be implemented to be immutable and the back link should be kept private. This pushes toward implementing the tuple as a closure though if space is a concern it would be better as a C closure.
Finally, as I noted above given a "hash cons", it is straightforward to implement a general "hash tuple" either by stacking up cons pairs appropriately or by generalizing the implementation to support more intermediate tables (and associated back links).
P.S. While we're at it, given the weak tables here, it's probably worthwhile to see the recent finalization discussion regarding the dangers of mixing finalization and weak tables.