Generic Types are Lazy

In my last post about my work on generics sharing in Mono I expanded on the runtime generic context and its extensible part. Today I want to talk about when the runtime generic context is actually filled, i.e. at which point the actual metadata information is put into the slots.

There are two “natural” strategies for doing that: strictly and lazily. The strict strategy fills the runtime generic context as soon as it is created, whereas the lazy strategy instantiates the slots only when they are actually needed by managed code.
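To make the difference concrete, here is a minimal sketch of the two strategies. The RuntimeGenericContext class, the object[] layout and InstantiateSlot are all invented for this post; Mono's actual runtime generic context lives in unmanaged code and is laid out quite differently.

using System;

// Illustrative only: names and the object[] layout are made up for this post.
class RuntimeGenericContext {
    object [] slots;

    // Strict strategy: fill every slot as soon as the context is created.
    public RuntimeGenericContext (int numSlots, bool strict)
    {
        slots = new object [numSlots];
        if (strict)
            for (int i = 0; i < numSlots; ++i)
                slots [i] = InstantiateSlot (i);
    }

    // Lazy strategy: managed code checks the slot and falls back to the
    // runtime to instantiate it on first use.
    public object GetSlot (int index)
    {
        object value = slots [index];
        if (value == null)
            value = slots [index] = InstantiateSlot (index);
        return value;
    }

    object InstantiateSlot (int index)
    {
        // Stand-in for the unmanaged slow path that looks up the
        // metadata item this slot must hold.
        throw new NotImplementedException ();
    }
}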

At first it seems that strict instantiation is the way to go, because it means less overhead for the managed code, which can simply assume that the values are there instead of having to check and call into unmanaged code if they are not. Unfortunately, we need to be able to handle infinitely deep types, like in this canonical example:

class A<T> { ... }
class B<T> : A<B<B<T>>> { ... }

The transitive closure of the superclasses involved in this short example is infinite. The class B<T> has the superclass A<B<B<T>>>; the class B<B<T>>, which occurs in that type, in turn has the superclass A<B<B<B<T>>>>; and so on:
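B<T>          has superclass A<B<B<T>>>,         which involves B<B<T>>
B<B<T>>       has superclass A<B<B<B<T>>>>,      which involves B<B<B<T>>>
B<B<B<T>>>    has superclass A<B<B<B<B<T>>>>>,   which involves B<B<B<B<T>>>>
...

Nonetheless, that code has to run, so strict instantiation in all cases is not an option.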

At first we thought we could keep using strict instantiation in most cases and use lazy instantiation only in some cases by looking at the depth of the types involved, or, more specifically, the depth at which type variables occur. Very informally speaking, we can define the depth of an occurrence of a type variable in a type as the number of angle brackets you have to open to get to that occurrence. For example, the T in T has a depth of zero because it has no angle brackets around it at all. The T in A<T> has depth one, and the T in Dictionary<List<List<string>>,List<T>> has depth two. Since what we have to avoid is the creation of ever deeper types, it seemed to us that if we used lazy instantiation in all the cases where the type depth increases, and strict instantiation everywhere else, the problem would be solved.
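For illustration, here is how that depth could be computed over a toy type representation. The Ty classes below are invented for this post and bear no relation to Mono's actual metadata structures.

// Toy type representation, invented for illustration.
abstract class Ty {
    // Depth of the deepest type-variable occurrence, or -1 if no
    // type variable occurs in the type at all.
    public abstract int MaxVarDepth ();
}

class TyVar : Ty {
    // A type variable like T: no angle brackets around it, depth zero.
    public override int MaxVarDepth ()
    {
        return 0;
    }
}

class TyApp : Ty {
    // An instantiation like A<...> or Dictionary<...,...>.
    public Ty [] Args;

    public override int MaxVarDepth ()
    {
        int max = -1;
        foreach (Ty arg in Args) {
            int depth = arg.MaxVarDepth ();
            if (depth >= 0 && depth + 1 > max)
                max = depth + 1;    // one more pair of angle brackets
        }
        return max;
    }
}

With that definition, T gives 0, A<T> gives 1, and Dictionary<List<List<string>>,List<T>> gives 2, matching the examples above.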

Unfortunately, that is wrong. What we were missing was that you can “transform” a dependency where the type depth does not increase into one where it does by subclassing, while being stuck with the original decision to instantiate strictly. Let’s look at this example:

class A<T> { ... }

class B<T> {
  someMethod () {
    ... A<T> ...
  }
}

class C<T> : B<C<C<T>>> { ... }

A<T> by itself is completely harmless. B<T> needs A<T> in someMethod(), so a slot for A<T> is added to the extensible part of B<T>'s runtime generic context. The nesting depth of A<T> is the same as that of B<T>, so we decide to instantiate the slot strictly. someMethod() therefore accesses that slot without a check, assuming it is already instantiated, which means the slot must also be instantiated strictly in every subclass of B<T>. Now consider C<T>, which is a subclass of B<C<C<T>>>. Because it is a subclass of B<X>, it needs A<X> in its runtime generic context. For C<T>, however, X is C<C<T>>, so we have to put A<C<C<T>>> into that slot, and we have an infinite recursion again, of the same form as in the first example:
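C<T>          is a B<C<C<T>>>,         so its context must hold A<C<C<T>>>
C<C<T>>       is a B<C<C<C<T>>>>,      so its context must hold A<C<C<C<T>>>>
C<C<C<T>>>    is a B<C<C<C<C<T>>>>>,   so its context must hold A<C<C<C<C<T>>>>>
...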

So, unfortunately, we have to use lazy instantiation for all slots. The (potentially) good news is that we can bring the run-time overhead for it down to almost zero by employing memory management tricks, should we decide that it’s worth the effort to implement that. The current implementation takes about 5 machine instructions to fetch a slot from the runtime generic context.

Hack Week 2: MathMap

This week at Novell we had another Hack Week, during which we are encouraged to work on some new or interesting project of our own choice. I decided to spend the week working on MathMap, which I’ve blogged about before. I’ve improved the composer to a point where it’s actually usable for doing real stuff. That work has made it into the new release 1.3.2.

As a bonus I’ve begun implementing a new feature which will greatly extend the range of image manipulations possible with MathMap. I will talk about this in detail when it is at a point where I can give an actual demonstration.

I’ve also put together a new screencast. This time it’s not about a new feature but is an introduction to the small language that’s at the core of MathMap.

Arc

Paul Graham has released his Lisp dialect Arc. I am underwhelmed. It’s basically a standard Lisp with some syntactic sugar. There are already too many Lisp dialects. Hell, there are too many implementations of Common Lisp and Scheme, the two main Lisp dialects. Do we really need yet another one?