In this installment of my series on SGen, Mono’s new garbage collector, we shall be looking at how finalizers and weak references are implemented, and why you (almost certainly) should not use finalizers.
Tracking object lifetime
Both finalizers and weak references need to track the lifetime of certain objects in order to take an action when those objects become unreachable. To that end, SGen keeps lists of finalizable objects and weak references, which it checks at the end of every collection.
If the object referred to by a weak reference has become unreachable, the weak reference is nulled.
If a finalizable object is deemed unreachable by the collector, it is put onto the finalization queue and it is marked, since it must be kept alive until the finalizer has run. Of course, all objects that it references have to be marked as well, so the main collection loop is activated again.
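To make the order of these steps concrete, here is a minimal sketch in Python. It is a toy model of the description above, not SGen's actual code: the names Obj, WeakRef, mark and finish_collection are all made up for illustration, and the "heap" is just a handful of Python objects.

```python
class Obj:
    """Toy heap object; `refs` holds strong references to other objects."""
    def __init__(self, name, refs=()):
        self.name = name
        self.refs = list(refs)


class WeakRef:
    """Toy (non-tracking) weak reference."""
    def __init__(self, target):
        self.target = target


def mark(start, marked):
    """Add everything reachable from `start` to the `marked` set."""
    stack = list(start)
    while stack:
        obj = stack.pop()
        if obj not in marked:
            marked.add(obj)
            stack.extend(obj.refs)


def finish_collection(roots, finalizable, weak_refs, fin_queue):
    """Toy end-of-collection pass; returns the set of surviving objects."""
    marked = set()
    # Objects still waiting for their finalizer are roots as well.
    mark(list(roots) + list(fin_queue), marked)

    # Null weak references whose targets were not reached.
    for ref in weak_refs:
        if ref.target is not None and ref.target not in marked:
            ref.target = None

    # Move unreachable finalizable objects onto the finalization queue.
    newly_queued = [obj for obj in finalizable if obj not in marked]
    finalizable[:] = [obj for obj in finalizable if obj in marked]
    fin_queue.extend(newly_queued)

    # Run the marking loop again so the queued objects, and everything
    # they reference, survive until their finalizers have run.
    mark(newly_queued, marked)
    return marked
```

Note that in this model an object referenced only by a queued finalizable object survives the collection, even though a weak reference to it has already been nulled.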
A nursery collection cannot collect an object in the major heap, so it is not necessary to check the status of old objects after a nursery collection. That is why SGen keeps separate finalization and weak reference lists for the nursery and major heaps.
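The benefit of the split can be sketched as follows. This is a hypothetical Python model, not SGen's data structure: the class FinalizableLists and its methods are invented here, and it assumes that newly registered objects start out in the nursery and that a major collection checks both lists.

```python
class FinalizableLists:
    """Toy model: one list of finalizable objects per generation."""

    def __init__(self):
        self.nursery = []
        self.major = []

    def register(self, obj):
        # Assumption: new objects start out in the nursery.
        self.nursery.append(obj)

    def promote(self, obj):
        # The object survived a nursery collection and moved to the major heap.
        self.nursery.remove(obj)
        self.major.append(obj)

    def candidates(self, kind):
        """Objects whose reachability must be checked after a collection."""
        if kind == "nursery":
            return list(self.nursery)
        # Assumption: a major collection covers both generations.
        return self.nursery + self.major
```

After a nursery collection only the (usually short) nursery list has to be walked, no matter how many old finalizable objects exist.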
SGen uses a dedicated thread for invoking finalizers. The finalization queue is processed one object at a time. As long as an object is in the finalization queue it is also considered live, i.e. the finalization queue is a GC root.
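The shape of such a finalizer thread can be sketched in a few lines of Python. This is only an illustration of the idea, not how SGen implements it: fin_queue, _STOP, finalizer_loop and Resource are hypothetical names, and an ordinary Python queue stands in for the finalization queue. Because the queue holds strong references, an object on it cannot go away before its finalizer has run, which mirrors the queue being a GC root.

```python
import queue
import threading

# The queue holds strong references, so a queued object cannot be
# reclaimed before its finalizer has run.
fin_queue = queue.Queue()
_STOP = object()  # hypothetical sentinel used to shut the thread down


def finalizer_loop():
    """Run queued finalizers strictly one at a time, in queue order."""
    while True:
        obj = fin_queue.get()
        if obj is _STOP:
            break
        obj.finalize()


class Resource:
    """Hypothetical finalizable object that records when it is finalized."""
    log = []

    def __init__(self, name):
        self.name = name

    def finalize(self):
        Resource.log.append(self.name)


worker = threading.Thread(target=finalizer_loop, daemon=True)
worker.start()
for name in ("a", "b", "c"):
    fin_queue.put(Resource(name))
fin_queue.put(_STOP)
worker.join()
print(Resource.log)  # ['a', 'b', 'c']
```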
Resurrecting an object means making it reachable again from within its finalizer or a finalizer that can still reach the object (or via a tracking weak reference). The garbage collector does not have to treat this case specially—until the finalizer has run the object is considered live by virtue of its being in the finalization queue, and afterwards it is live because it is reachable through some other root(s).
Tracking weak references
If an object is weakly referenced and has a finalizer, the weak reference will be nulled during the same collection in which the object is put in the finalization queue. That is not always desirable, especially for objects that might be resurrected.
Tracking references solve the problem by keeping the reference intact at least until the finalizer has run. Once the finalizer has finished, a tracking reference acts like a standard weak reference, i.e. it will be nulled once the object becomes unreachable, typically during the next collection (unless the object was resurrected).
When SGen encounters a tracking reference, instead of nulling it, it turns it into a non-tracking reference. The referenced object is now on the finalization queue and therefore considered live again, so the reference will not be nulled before the finalizer has run.
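The behaviour described in this section can be modelled roughly like this. It is a toy sketch in Python under simplifying assumptions, not SGen's code: Obj, WeakRef and process_weak_refs are made-up names, reachability is passed in as a plain set, and a tracking reference is only demoted when its target is actually registered for finalization.

```python
class Obj:
    """Toy heap object."""
    def __init__(self, name):
        self.name = name


class WeakRef:
    """Toy weak reference; `track` selects the tracking behaviour."""
    def __init__(self, target, track=False):
        self.target = target
        self.track = track


def process_weak_refs(weak_refs, reachable, finalizable):
    """Handle weak references at the end of a collection (toy model)."""
    for ref in weak_refs:
        target = ref.target
        if target is None or target in reachable:
            continue  # nothing to do for cleared or live targets
        if ref.track and target in finalizable:
            # The target is about to be queued for finalization: keep the
            # reference, but demote it so a later collection can null it.
            ref.track = False
        else:
            ref.target = None
```

In this model a plain reference to a dying finalizable object is nulled right away, whereas a tracking reference survives that collection and is nulled by a later one, unless the object has become reachable again in the meantime.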
Why finalization is Evil
Finalization should not be used to manage scarce resources, such as file descriptors.
Time of finalization is not determined
Unless you force a garbage collection and wait for the finalizers to finish, the time at which they run is not determined. If your program does not allocate a lot of memory, or your heap is huge, it might take a very long time until a major collection is triggered, so dead objects on the major heap might not be finalized before your scarce resource is depleted.
In this highly recommended talk (starting at 8:35) Cliff Click gives an example where it was necessary to put a hack into the JVM garbage collector that triggered a collection whenever the system ran out of file handles, because Apache Tomcat relied on finalizers to reclaim them.
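The usual alternative is to release scarce resources explicitly, at a point the program controls. The sketch below shows the idea in Python using a context manager. HandlePool and Handle are hypothetical classes standing in for something like file descriptors; they are not part of any real library, and the pool size of 2 is arbitrary.

```python
class HandlePool:
    """Hypothetical stand-in for a scarce resource such as file descriptors."""

    def __init__(self, size):
        self.free = size

    def open(self):
        if self.free == 0:
            raise RuntimeError("out of handles")
        self.free -= 1
        return Handle(self)


class Handle:
    """Hypothetical handle that returns its slot when closed."""

    def __init__(self, pool):
        self.pool = pool
        self.closed = False

    def close(self):
        # Idempotent: closing twice returns the slot only once.
        if not self.closed:
            self.closed = True
            self.pool.free += 1

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions


pool = HandlePool(size=2)
for _ in range(1000):
    with pool.open():
        pass  # use the handle here
print(pool.free)  # 2
```

Because every handle is returned as soon as its block ends, the loop never runs out of handles, regardless of when or whether a garbage collection happens.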
Finalizers run one after another
A finalizer that performs a time-consuming task will delay the execution of other finalizers. An especially worrisome case is finalizers that do potentially blocking I/O. Depending on the timeout of the operation, other finalizers can be blocked for a long time.
Finalizers make objects live longer
Since finalizable objects are considered live until their finalizers have run, they also consume memory until their finalizers have run and the next garbage collection is triggered. This applies not only to the finalizable objects themselves but also to all objects they reference. This is particularly problematic if finalization is blocked by an ill-behaved finalizer.
Order of finalization is not determined
A finalizer cannot count on other finalizers running before or after it, regardless of whether its object references, or is referenced by, the objects those finalizers belong to. In other words, a finalizer cannot assume that an object it references has not been finalized yet. That is true even if that object is known to be strongly held by a GC root, namely when the runtime is shutting down.
The one exception is critical finalizers, which are run after “normal” finalizers. Among normal finalizers there is no determined order, nor is there one among critical finalizers.
It turns out that SGen’s handling of tracking references, as described above, is not only conceptually wrong but was also incorrectly implemented, which, ironically, mostly fixed the bug in the design.
Our conceptual mistake was to demote tracking references to non-tracking ones the first time the object became unreachable, which would have led to the reference being non-tracking even if its target was re-registered for finalization. The bug in the implementation was that we didn’t actually do that. Instead, SGen would keep tracking references around until their targets became really, truly unreachable, even via finalization, i.e. until it was done finalizing, without any further chance of resurrection. Only at that point would it turn the tracking reference into a non-tracking one, thereby making the object alive once again and keeping it around for one more garbage collection cycle.
The bug is now fixed. Thanks to Alan for noticing this case and investigating.