lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


On Mon, Jul 6, 2015 at 8:02 AM, Cosmin Apreutesei
<cosmin.apreutesei@gmail.com> wrote:
>> This IS actually possible to implement in Lua, and it's not TOO
>> horribly inefficient (O(n log m) overhead per construction, no
>> additional overhead for reading and indexing):
>
>
> It gets more complicated and slow[1] if you want the tuple table to be
> self-cleaning though (add in gc/weak-table overhead). I would imagine that a
> built-in tuple type would be much more efficient since it would just index
> on the sum of the hashes of the elements and only check the elements on
> collisions, or am I missing something?
>
> [1] https://github.com/luapower/tuple/blob/master/tuple.lua
>

It depends on what kind of efficiency you're looking for.

As for weak values, your implementation there in luapower doesn't have
a __gc metamethod on tuples, which makes the use of weak values not
even worth the overhead it does add -- you aren't saving that much
memory since the index tables are be populated. And if you DO add __gc
to tuples, then you make the weak-valued case that much more
expensive. So unless you expect to be generating a LOT of unique,
ephemeral tuples (is this even sensible?), it's quite likely to not be
worth the effort at all. (I suppose there are some possible
alternative implementations, like not sweeping the index tree until a
certain number of leaves have reported themselves empty, or requiring
the user to explicitly sweep the index tree from time to time.)

An alternative that hashes the tuple in the index tree is something
that's completely plausible to do in Lua code as well. I suspect this
would be less efficient for short tuples and more efficient for long
tuples.

Yet another option is to simply use non-interned tables, and introduce
a __hash metamethod and add a hash-keyed table structure that calls
the key's __hash to produce a string, either memoized (preferable for
immutable, non-ephemeral tables) or not (preferable for mutable or
ephemeral tables). This is a different side of the tradeoff: it makes
table reads and writes more expensive (only slightly for immutable,
noticeably for mutable), it makes tuple construction cheaper than the
interned method, it avoids the need to maintain an index tree
(substantially less memory usage when you have ephemeral tuples, but
might use more memory if you construct a lot of identical tuples), and
it makes garbage collection free. You can do this entirely in Lua code
as well, although I would recommend a C API function to perform the
hashing.

In other words, there's a lot of different ways to do it, each with
advantages and disadvantages, and implementing any of these in the Lua
core would have similar tradeoffs and I'm fairly certain that you're
not going to satisfy everyone by choosing ONE of these to canonize in
the core. In particular, the most obvious implementation in the core
is the non-interned hashed version, and I think people would complain
vehemently about making table reads more expensive. While memoizing
would fix that, it would make ephemeral tuples more expensive (even
for non-unique ephemeral tuples). Interning would avoid those problems
at the cost of increased memory/GC overhead. These are, of course,
exactly the same tradeoffs you have to make doing it outside of the
core.

/s/ Adam