• Subject: The problem with weak tables (Was: food for thought after being ambiguous syntax)
• From: RLake@...
• Date: Tue, 14 Jan 2003 18:50:11 -0500

Thomas Wrensch wrote:

(useful suggestion snipped)

> The problem here is that the func table will accumulate functions and
> tables, not allowing them to be garbage collected. The answer is, of
> course, to make the "func" table a weak table.

Well, yes and no; this is a minefield I have been avoiding touching.

I think your suggestion would work, but not in the general case.
Specifically, recursive functables would not be garbage collected. (Not
functables whose function is recursive, but recursive functables. There is
a difference, but in practice it is the latter that I would be likely to
use.) And there are many more instances where functables would render
themselves un-garbage-collectable, not all of which are obvious.

The problem with weak tables is that they establish a contingency
relationship between key and value which is not known to the garbage
collection algorithm.

Suppose w is a weak-keyed table, and I do the following:

do
  local t = {}
  w[t] = t
end

Now, t will never be garbage collected even though it is obviously
garbage-collectable immediately. Here's why:

Although w's keys are weak, its values are strong. So the garbage collector
marks all the values. Then, later, it checks the keys. However, the key t
is marked, so it is retained. Result: t is never garbage collected.

More generally, if a weak key is reachable from its own value, the key will
never be collected.
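Here is a self-contained version of the example, runnable in a stock interpreter. (What it prints depends on the collector: under the marking scheme described in this post the entry survives; a collector that tracks the key/value contingency would collect it.)

```lua
-- A weak-keyed table holding a single entry whose value is its own key.
local w = setmetatable({}, {__mode = "k"})

do
  local t = {}
  w[t] = t          -- key and value are the same table
end

-- t is now unreachable from anywhere except w itself.
collectgarbage("collect")
collectgarbage("collect")

local survivors = 0
for _ in pairs(w) do survivors = survivors + 1 end

-- With the algorithm criticised here, survivors is 1: the strongly
-- marked value kept its own key alive. A contingency-aware collector
-- would report 0.
print(survivors)
```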

Whether this is an error or not is a matter of opinion. On the one hand,
the intention of a weak table is to establish a contingency relationship:

weak[k] = v

means "v is reachable through weak if k is reachable."

If it were not for table iteration, that would be the end of the story, but
you could theoretically iterate through the table; it is therefore not
precisely correct to say that the only path through weak to v is via k.

However, I discard this argument on the grounds that normally v will
disappear from the table if k becomes unreachable. My personal preference
(possibly compulsively theoretical) is that the iteration path should not
be available for weak-keyed tables. But I digress.

There is a fix for this problem but it is ugly. The best algorithm I have
come up with -- after almost a year of thinking about it off and on -- is
to give *every* object an extra pointer to a contingency list. When garbage
collecting a weak-keyed table (actually, the same argument applies to
weak-valued tables but the contingency relationship is reversed), instead
of marking all the values, the garbage collector has to construct a linked
list of contingencies, roughly as in the following pseudo code (Lua syntax,
but of course this would have to be in C):

function mark_weak_keyed_table(weak)
  for k, v in pairs(weak) do
    if marked(k) then
      mark(v)
    else
      k.contingency = {obj = v, next = k.contingency}
    end
  end
end

Now, when an object is marked, we have:

function mark(obj)
  local other = obj.contingency
  while other do
    mark(other.obj)
    other = other.next
  end
  -- then mark everything obj references, as usual
end
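To make the pseudo code concrete, here is a toy simulation: plain tables stand in for GC objects, a marked flag stands in for the mark bit, and the contingency field is the per-object list. None of this is real Lua internals; it only models the algorithm.

```lua
-- Toy model of the two functions above.
local function marked(obj) return obj.marked end

local function mark(obj)
  if obj.marked then return end
  obj.marked = true
  local other = obj.contingency
  while other do
    mark(other.obj)
    other = other.next
  end
  -- (a real collector would also mark everything obj references)
end

local function mark_weak_keyed_table(weak)
  for k, v in pairs(weak) do
    if marked(k) then
      mark(v)
    else
      k.contingency = { obj = v, next = k.contingency }
    end
  end
end

-- Usage: one key the mutator can still reach, one it cannot (yet).
local live_key, other_key = {}, {}
local live_val, other_val = {}, {}
local weak = { [live_key] = live_val, [other_key] = other_val }

mark(live_key)                 -- reachable from a root
mark_weak_keyed_table(weak)
print(live_val.marked, other_val.marked)  -- true  nil

-- If other_key later turns out to be reachable, marking it now also
-- marks its contingent value through the list:
mark(other_key)
print(other_val.marked)                   -- true
```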

The problem with this is two-fold. First, the garbage collector needs a
potentially large amount of available memory to store the contingency
linked lists and second, any object can have an arbitrary number of
references, which could explode the mark stack.

There are a number of approaches to solving the second problem; the easiest
one is to chain contingency lists, which requires that every object have
both a contingency-list-pointer and a next-contingency-list pointer.
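That chaining can be sketched as follows (a toy model again; the pending chain, the field names, and the iterative drain are assumptions of this sketch, not a description of any real collector):

```lua
-- Instead of recursively marking each contingent object, a newly marked
-- object's whole contingency list is spliced onto one global "pending"
-- chain via the second pointer, and the chain is drained iteratively.
-- This keeps the mark stack bounded.
local pending = nil

local function mark(obj)
  if obj.marked then return end
  obj.marked = true
  if obj.contingency then
    obj.next_contingency_list = pending  -- the extra, second pointer
    pending = obj
  end
end

local function drain_pending()
  while pending do
    local obj = pending
    pending = obj.next_contingency_list
    local c = obj.contingency
    obj.contingency = nil
    while c do
      mark(c.obj)   -- may splice further lists onto pending
      c = c.next
    end
  end
end

-- Usage: a chain a -> b -> c of contingencies, marked without deep
-- recursion.
local a, b, c = {}, {}, {}
a.contingency = { obj = b }
b.contingency = { obj = c }
mark(a)
drain_pending()
print(a.marked, b.marked, c.marked)  -- true  true  true
```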

One solution to the first problem is to allocate the contingency objects
when we think they might be necessary, rather than at garbage-collection
time. Every element in a weak table is a potential contingency, so the
simple approach is to reserve a contingency link in every element of
every partially weak table. (But in reality it has to be every table,
because Lua allows tables to be made weak at any time.)

This is reasonable because only the link field is necessary (the contingent
object is already part of the key-value pair) and because there is usually
plenty of slack in a Lua hash element as a result of alignment constraints.
Beyond that, only a single bit is needed to indicate whether it is the key
or the value which is the contingent object.

An alternative is to keep all contingency information in a fixed-size
storage area. A fixed-size hash table could be used to maintain contingency
list head pointers, and a fixed-size vector to maintain contingency (obj,
next) pairs. If the garbage collector runs out of space in this structure,
it simply ignores the weakness of the remaining links for the rest of that
collection cycle (treating them as strong), and allocates more space for
the contingency structures the next time.
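A toy sketch of that fixed-size scheme (the capacity, the field names, and the overflow flag are all assumptions of this sketch):

```lua
-- Fixed-size contingency store: a vector of {obj, next} records plus a
-- table of per-key list heads. When the vector fills, registration
-- fails and the caller must fall back to marking the value strongly;
-- a real collector would grow the store before the next cycle.
local CAPACITY = 4
local store = { n = 0 }   -- fixed-size vector of contingency records
local heads = {}          -- contingency-list head per weak key
local overflowed = false

local function add_contingency(k, v)
  if store.n >= CAPACITY then
    overflowed = true
    return false          -- caller marks v strongly instead
  end
  store.n = store.n + 1
  store[store.n] = { obj = v, next = heads[k] }
  heads[k] = store[store.n]
  return true
end

-- Usage: the fifth registration overflows and its value is retained
-- strongly for this cycle.
local marked_strong = {}
for i = 1, 5 do
  local k, v = {}, { id = i }
  if not add_contingency(k, v) then
    marked_strong[#marked_strong + 1] = v
  end
end
print(store.n, #marked_strong)   -- 4  1
```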

All of this will have an impact both on mark speed and storage. So the
question is, is it worth it?

I ran into this problem even before Lua implemented weak tables, when I was
trying to write code that would take advantage of them. Unfortunately, the
code I was writing was almost guaranteed to create circular references
between weak keys and corresponding values, and I realised that it was
unworkable with the proposed implementation. Specifically, I was trying to
make persistent function closures. The easy solution was to keep the
uncompiled text of the function and use the function itself as a key to
extract the text. However, in Lua function objects are closures, not simply
functions; before a closure can be serialised, it is necessary to serialise
all its upvalues. Furthermore, it is necessary to somehow maintain object
identity. All of that meant keeping a weak table keyed by closure whose
value was a complex structure including the actual upvalues. Now, a
recursive function has itself as an upvalue and I realised that recursive
functions would never be garbage collected.
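The circularity can be sketched like this (the registry layout and field names are illustrative assumptions, and a literal source string stands in for the uncompiled text):

```lua
-- A weak table keyed by closure, whose value records the source text
-- and the upvalues needed to re-create the closure with its identity.
local registry = setmetatable({}, {__mode = "k"})

local function make_counter(src)
  local n = 0
  local function counter()
    n = n + 1
    return n, counter     -- counter is one of its own upvalues
  end
  registry[counter] = {
    source   = src,                            -- text to rebuild it
    upvalues = { n = n, counter = counter },   -- value reaches the key!
  }
  return counter
end

local f = make_counter("function counter() ... end")

-- The weak key f is reachable from its own value, so under the marking
-- scheme described in this post the entry could never be collected.
print(registry[f].upvalues.counter == f)  -- true
```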

On the one hand, I think that it is a necessary change. The existing
algorithm is flawed in a way which might bite people unexpectedly. It is
not in general easy to know whether a key is reachable from a value, and it
is this sort of analysis that garbage collection is supposed to avoid. On
the other hand, the situations in which the problem will occur are probably
rare in most people's code, and the extra overhead and complexity are
definitely factors.