# Garbage Collecting Weak Tables

NOTE: The issue described here was addressed with LuaFiveTwo ephemeron tables.

Extracted from a long posting I made to Lua-L explaining why weak tables are not a viable solution to a particular problem, because the key might be reachable from the value.

The problem with weak tables is that they establish a contingency relationship between key and value which is not known to the garbage collection algorithm.

Suppose `w` is a weak-keyed table, and I do the following:

```do
local t = {}
w[t] = t
end
```

Now, `t` will never be garbage collected even though it is obviously garbage-collectable immediately. Here's why:

Although `w`'s keys are weak, its values are strong. So the garbage collectormarks all the values. Then, later, it checks the keys. However, the key `t` is marked, so it is retained. Result: `t` is never garbage collected.

More generally, if a weak key is reachable from its own value, the key will never be collected.

Whether this is an error or not is a matter of opinion. On the one hand, the intention of a weak table is to establish a contingency relationship:

```weak[k] = v
```

means "`v` is reachable through `weak` if `k` is reachable."

If it were not for table iteration, that would be the end of the story, but you could theoretically iterate through the table; it is therefore not precisely correct to say that the only path through `weak` to `v` is via `k`.

However, I discard this argument on the grounds that normally `v` will disappear from the table if `k` becomes unreachable. My personal preference (possibly compulsively theoretical) is that the iteration path should not be available for weak-keyed tables. But I digress.

There is a fix for this problem but it is ugly. The best algorithm I have come up with -- after almost a year of thinking about it off and on -- is to give *every* object an extra pointer to a contingency list. When garbage collecting a weak-keyed table (actually, the same argument applies to weak-valued tables but the contingency relationship is reversed), instead of marking all the values, the garbage collector has to construct a linked list of contingencies, roughly the following in pseudo code (Lua syntax but of course this would have to be in C)

```function mark_weak_keyed_table(weak)
for k, v in weak
if marked(k) then
mark(v)
else
k.contingency = {obj = v, next = k.contingency)
end
end
end
```

Now, when an object is marked, we need:

```function mark(obj)
local other = obj.contingency
while other do
mark(other.obj)
other = other.next
end
-- mark everything obj references
end
```

The problem with this is two-fold. First, the garbage collector needs a potentially large amount of available memory to store the contingency linked lists and second, any object can have an arbitrary number of references, which could explode the mark stack.

There are a number of approaches to solving the second problem; the easiest one is to chain contingency lists, which requires that every object have both a contingency-list-pointer and a next-contingency-list pointer.

One solution to the first problem is to allocate the contingency objects when we think they might be necessary rather than at garbage collection. Every element in a weak table is a potential contingency, so the simple solution is to add an additional link pointer to every key-val pair in every partially weak table. (But in reality it has to be every table because Lua allows tables to be made weak at any time.)

This is reasonable because only the link field is necessary (the contingent object is already part of the key-val) and because there is usually lots of slack in a Lua hash element as a result of alignment constraints. There only needs to be a bit indicating whether it is the key or the val which is the contingent object.

An alternative is to keep all contingency information in a fixed-size storage area. A fixed-size hash table could be used to maintain contingency list head pointers, and a fixed-size vector to maintain contingency (obj, next) pairs. If the garbage collector runs out of space in this structure, it just ignores weak links for the remainder of the garbage collection, and allocates more space for the contingency structures the next time.

All of this will have an impact both on mark speed and storage. So the question is, is it worth it?

I ran into this problem even before Lua implemented weak tables, when I was trying to write code that would take advantage of them. Unfortunately, the code I was writing was almost guaranteed to create circular references between weak keys and corresponding values, and I realised that it was unworkable with the proposed implementation. Specifically, I was trying to make persistent function closures. The easy solution was to keep the uncompiled text of the function and use the function itself as a key to extract the text. However, in Lua function objects are closures, not simply functions; before a closure can be serialised, it is necessary to serialise all its upvalues. Furthermore, it is necessary to somehow maintain object identity. All of that meant keeping a weak table keyed by closure whose value was a complex structure including the actual upvalues. Now, a recursive function has itself as an upvalue and I realised that recursive functions would never be garbage collected.

On the one hand, I think that it is a necessary change. The existing algorithm is flawed in a way which might bite people unexpectedly. It is not in general easy to know whether a key is reachable from a value, and it is this sort of analysis that garbage collection is supposed to avoid. On the other hand, the situations in which the problem will occur are probably rare in most people's code, and the extra overhead and complexity are definitely factors.

--RiciLake

Presently when tables are marked as weak in both the keys and the values, the entry is collected when either the key or the value is unreachable. Could a new option, the entry collected when both the key and value are unreachable, without tracing either of them, address the recursive function problem described above? It seems like it would address the general problem of eliminating the contingency relationship between key and value. -- DougCurrie