SGen is Mono’s new garbage collector that we have been working on intensely for almost two years now and that has been becoming stable and competitive during the past few months.
In this series of blog posts I will try to explain how garbage collection works in general, how SGen works in particular, how to get the best performance out of it and, finally, what our plans are for the future.
This first post will give a very brief overview over what a garbage collector does and how it does it, before outlining the broad architecture of SGen.
Why a new garbage collector?
Since its inception Mono, like many other garbage collected runtimes, has been using the Boehm-Demers-Weiser collector, which I shall refer to as “Boehm” henceforth. Boehm’s main advantages are its portability, stability and the ease with which it can be embedded. It is designed as a garbage collector for C and C++, not for managed languages, however, so it does not come as a surprise that it falls short for that purpose compared to collectors dedicated to such languages or runtimes.
Our goal with SGen was to overcome Boehm’s limitations and provide better performance for managed applications, in terms of allocation speed, collector pauses as well as memory utilization. In this and the following posts of this series I will mention points where we improve upon Boehm whenever appropriate.
Before digging into the details of SGen it seems prudent to discuss how garbage collection actually works. I will only discuss topics that are relevant to SGen, and even paint those only with very broad strokes. For more comprehensive overviews of garbage collection see “Uniprocessor Garbage Collection Techniques” by Wilson or, for more detailed information, Jones and Lins’s book.
The garbage collector, in short, is that part of the language’s or platform’s runtime that keeps a program from exhausting the computer’s memory, despite the lack of an explicit way to free objects that have been allocated. It does that by discovering which objects might still be reached by the program and getting rid of the rest. It considers as reachable objects those that are directly or indirectly referenced by the so-called “roots”, which are global variables, local variables on the stacks or in registers and other references explicitly registered as roots. In other words, all those objects are considered reachable, as well as those that they reference, and those that they reference, and so on. The pedantic will note that the program might not actually be able to reach all of those. Computing the minimum set of objects that the program could reach is impossible, however, which we know thanks to this guy.
SGen, as well as Boehm, are “stop-the-world” collectors, meaning that they do their work while the actual program, slightingly called the “mutator” in garbage collection lingo, is stopped.
There are three classic garbage collection algorithms: Mark-and-sweep, Copying, and Compaction, of which only the first two are relevant to our discussion of SGen since we have not implemented a compacting collector and have no plans of doing so in the foreseeable future.
Mark-and-sweep is the oldest and probably most widely implemented garbage collection algorithm. Boehm is a mark-and-sweep collector.
The idea is to have a “mark” bit on each object that is set on all reachable ones in the “mark” phase of the collection, by recursively traversing references starting from the root set. Of course in practice this is not actually implemented recursively, but by using a queue or stack.
The “sweep” phase then frees the memory occupied by the objects that were not marked and resets the mark bits on the marked ones.
Of course many variants of this algorithm exist that vary in details.
Traditional mark-and-sweep has two main weaknesses. First, it needs to visit all objects, including the unreachable ones. Second, it can suffer from memory fragmentation like a
A copying collector addresses both problems by copying all reachable objects to a new memory region, allocating them linearly one after the other. The old memory region can then be discarded wholesale. The classic copying collector is the so called “semi-space” algorithm where the heap is divided into two halves. Allocation is done linearly from one half until it is full, at which point the collector copies the reachable objects to the second half. Allocation proceeds from there and at the next collection the now empty first half is used as the copying destination, the so-called “to-space”.
Since with a copying collector objects change their addresses it is obvious that the collector also has to update all the references to the objects that it moves. In Mono this is not always possible because we cannot in all cases unambiguously identify whether a value in memory (usually on the stack) is really a reference or just a number that points to an object by coincidence. I will discuss how we deal with this problem in a later installment of this series.
Generational garbage collection
It is observed that for most workloads the majority of objects either die very quickly or live for a very long period of time. One can take advantage of this so-called “generational hypothesis” by dividing the heap into two or more “generations” that are collected at different frequencies.
Objects begin their lives in the first generation, also referred to as the “nursery”. If they manage to stick around long enough they get “promoted” to the second generation, etc. Since all objects are born in the nursery it grows very quickly and needs to be collected often. If the generational hypothesis holds only a small fraction of those objects will make it to the next generation so it needs to be collected less frequently. Also, it is expected that while only a small fraction of the objects in the nursery will survive a collection, the percentage will be higher for older generations, so a collection algorithm that is better suited to high survival rates can be used for those. Some collectors go so far as to give objects “tenure” at some point, making them immortal so they don’t burden the collection anymore.
One difficulty with generational collection is that it’s not quite possible to collect a young generation without looking at the older generations at all, because the mutator might have stored a reference to a young generation object in an older generation object. Even if that young object is not referenced from anywhere else it must still be considered alive. Clearly scanning all objects in older generations for such references would defeat the whole purpose of separating them in the first place. To address this problem generational collectors make the mutator register all new references from old to young generations. This registration is referred to as a “write barrier” and will be discussed in a later installment. It is also possible to register reads instead of writes, with a “read barrier”, but this is impractical without hardware support. It’s obvious that the write barrier must be very efficient since it’s invoked for every write to a reference field or array member.
SGen, which historically stands for “Simple Generational”, is a generational garbage collector with two generations. The nursery is collected with a copying collector that immediately promotes all live objects, if possible, to the old generation (or “major heap”).
For the old generation we have two collectors that the user can choose between: A mark-and-sweep and a copying collector. The mark-and-sweep collector is available in four variants, with and without a parallel mark phase and with a dynamic or fixed-size heap.
In addition to that SGen has a separate space for large objects (larger than 8000 bytes), called the “Large Object Space”, or “LOS”, which is logically part of the old generation. Large objects are not born in the nursery but directly in the LOS, and they are not moved.
One major difficulty SGen has to deal with is objects that are “pinned”, which means that they cannot be moved. The reason for this is usually that they are referenced from stack frames that we cannot scan “precisely”, i.e. for which we do not know what is a reference and what is not. Work is currently under way to scan managed stack frames precisely, but we will always have to handle unmanaged stack frames, too, for which no type information is available and which can therefore only be scanned “conservatively”. More on this in one of the following posts.
Having covered a few basic concepts of garbage collection I shall go into more detail about SGen in the blog posts following this one. Here is a tentative list of the posts I intend to write in the coming weeks. Comments and suggestions are of course welcome.
Allocating objects and the minor collector
How is the nursery organized, how are new objects allocated and how is the nursery collected?
The major collectors
An overview of SGen’s major collectors, mark-and-sweep and copying. I’ll discuss how they organize the heap and how they collect.
Several shorter topics: Large objects, GC descriptors, conservative scanning, pinning, write barriers, finalization, weak references and domain unloading.
Debugging a garbage collector
Finding bugs in a garbage collector is ridiculously hard. Here I’ll describe some of the tools we use for assistance.
Getting the best performance out of SGen
How to fine-tune SGen for a workload and how to take write applications to take advantage of SGen’s performance characteristics.
What’s in stock for SGen in the future?