Another Generic Sharing Update

Since my last report on generic code sharing I chased down a few bugs we uncovered when trying out IronPython 2.0. That new version uses the Microsoft Dynamic Language Runtime, which extensively utilizes generics. One issue we came across was how to figure out the actual method for a delegate when only the native code pointer (acquired with ldftn) and no target class is given. For example:

public class Gen<T> {
  public void work () { ... }
}

With generic code sharing the methods Gen<string>.work and Gen<object>.work share the same native code, so given only a pointer to that code it's not possible to tell the two apart. One way to make it possible would be to let ldftn produce a pointer not to the method directly but to a small piece of trampoline code, with one trampoline for each instantiation of the method. Fortunately it seems like we don't have to bother with that, since the .NET CLR doesn't either. Instead it gives you the instantiation of the method where all type arguments are object, so we do the same.
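To make "the instantiation where all type arguments are object" concrete, here is a small reflection-based sketch. It is only an illustration of the mapping, not how the runtime does it internally: the helper name ToObjectInstantiation is made up, and it assumes the class's type parameters have no constraints that object would violate.

```csharp
using System;
using System.Linq;
using System.Reflection;

public class Gen<T> {
  public void work () { }
}

public static class CanonicalInstantiation {
  const BindingFlags AllDeclared = BindingFlags.Public | BindingFlags.NonPublic |
    BindingFlags.Instance | BindingFlags.Static | BindingFlags.DeclaredOnly;

  // Hypothetical helper: maps a method of a constructed generic class to the
  // same method on the instantiation whose type arguments are all object.
  // Assumes unconstrained type parameters; generic methods are not handled.
  public static MethodInfo ToObjectInstantiation (MethodInfo method) {
    Type declaring = method.DeclaringType;
    if (declaring == null || !declaring.IsGenericType)
      return method;

    Type definition = declaring.GetGenericTypeDefinition ();
    Type [] objectArgs = definition.GetGenericArguments ()
      .Select (p => typeof (object)).ToArray ();
    Type canonical = definition.MakeGenericType (objectArgs);

    // The metadata token names the method definition, so it is the same
    // for every instantiation of the class.
    return canonical.GetMethods (AllDeclared)
      .First (m => m.MetadataToken == method.MetadataToken);
  }
}
```

Passing typeof (Gen<string>).GetMethod ("work") to this helper yields the work method of Gen<object>, which is the answer you would get for a delegate built from nothing but the shared code pointer.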

Another thing I did was implement sharing of methods of generic value types. There doesn't seem to be much code out there that uses generic value types extensively, but it wasn't a big deal to implement so I went ahead and did it. Since instances of value types don't contain VTable pointers we need to pass the runtime generic context (RGCTX) explicitly for all methods, like we do for static methods of reference types. One complication arises when the value type implements an interface. When such a value type is cast to the interface type it gets boxed and receives a VTable for the interface methods. Since the caller of those methods doesn't know it's dealing with a value type, much less which particular one, it cannot pass the RGCTX, so the methods in the interface VTable need a wrapper which will pass it. This is very similar to the wrapper we use when taking the address of a static method of a reference type (for constructing a delegate, for example).
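The wrapper idea can be modelled in plain C#. This is only a toy model under loud assumptions: Rgctx, SharedPairCode and BoxedPair are invented names, the context is reduced to a single Type, and the box stores the context itself, whereas in the runtime the wrapper sits in the interface VTable of the boxed value.

```csharp
using System;

public interface IDescribe {
  string Describe ();
}

// Stand-in for the runtime generic context; here it only carries the
// type argument of the value type.
public sealed class Rgctx {
  public readonly Type TypeArg;
  public Rgctx (Type typeArg) { TypeArg = typeArg; }
}

public static class SharedPairCode {
  // Models the one shared body of a hypothetical Pair<T>.Describe. It
  // cannot find out T from the instance, so the context is an explicit
  // extra argument.
  public static string Describe (object boxedThis, Rgctx rgctx) {
    return "Pair<" + rgctx.TypeArg.Name + ">";
  }
}

// Models the boxed value as an interface caller sees it.
public sealed class BoxedPair : IDescribe {
  readonly Rgctx rgctx;
  public BoxedPair (Rgctx rgctx) { this.rgctx = rgctx; }

  // The wrapper: the caller passes no context, so it is supplied here
  // before jumping to the shared code.
  public string Describe () {
    return SharedPairCode.Describe (this, rgctx);
  }
}
```

A caller holding only an IDescribe reference calls Describe () with no extra argument, and the wrapper fills in the context on its behalf.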

I’ll end with an updated table of memory statistics for a few test applications. “Nemerle” is the Nemerle compiler compiling itself. “IronPython 2.0” is running pystone. “F# 1.9” is running a simple “Hello world” program on the command line and “F# 2.0” is compiling a simple program.

                  No sharing             Sharing
                  Methods    Code        Methods    Code        Memory for   Savings
                  compiled   produced    compiled   produced    (M)RGCTXs
Nemerle           7127       2008k       6159       1895k       23k          90k
IronPython 2.0    9060       1607k       5833       1011k       42k          554k
F# 1.9            15268      2187k       9828       1659k       111k         417k
F# 2.0            27186      3781k       15828      2830k       239k         712k


Sharing Generic Methods

All of my posts on sharing generic code so far have been about sharing non-generic methods of generic classes, like this one:

class Gen<T> {
  public T [] newArr (int n) {
    return new T [n];
  }
}

But what about generic methods, like this one:

class Gen<S> {
  public Dictionary<S,T> newDict<T> () {
    return new Dictionary<S,T> ();
  }
}

If we want to share this method between different instantiations, i.e. different types for T, we need to provide a place for the code to look up the type of Dictionary<S,T>. This place cannot be the runtime generic context, because the data in there only depends on the type arguments of the class, i.e. S, but not on those of generic methods.

Our solution is to introduce a data structure very similar to the runtime generic context, called the method runtime generic context, or MRGCTX. It is associated not with generic classes and their type arguments, like the RGCTX, but with generic methods and their type arguments. We use the same MRGCTX for generic methods of a specific class if the method type arguments are the same. As an example, these methods would all share the same MRGCTX:

Gen<object>.foo<object,string> ()
Gen<object>.bar<object,string> ()
Gen<object>.quux<object,string> ()

while no two of these methods would use the same MRGCTX:

Gen<object>.foo<object,string> ()
Gen<object>.foo<string,object> ()
Gen<string>.foo<string,object> ()
Ban<string>.foo<string,object> ()
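Put differently, an MRGCTX is identified by the class instantiation plus the method's type arguments, and the method name plays no part. A minimal sketch of such a lookup, where Mrgctx and MrgctxCache are hypothetical names and a string key is used purely for brevity:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Gen<T> { }
public class Ban<T> { }

// Hypothetical stand-in for a real MRGCTX.
public sealed class Mrgctx {
  public readonly Type OwnerClass;
  public readonly Type [] MethodTypeArgs;

  public Mrgctx (Type ownerClass, Type [] methodTypeArgs) {
    OwnerClass = ownerClass;
    MethodTypeArgs = methodTypeArgs;
  }
}

public static class MrgctxCache {
  static readonly Dictionary<string, Mrgctx> cache =
    new Dictionary<string, Mrgctx> ();

  // The key is built from the class instantiation and the method type
  // arguments only. The method name is deliberately left out.
  static string MakeKey (Type ownerClass, Type [] methodTypeArgs) {
    return ownerClass.AssemblyQualifiedName + "|" +
      string.Join ("|", methodTypeArgs.Select (a => a.AssemblyQualifiedName));
  }

  public static Mrgctx Get (Type ownerClass, params Type [] methodTypeArgs) {
    string key = MakeKey (ownerClass, methodTypeArgs);
    Mrgctx ctx;
    if (!cache.TryGetValue (key, out ctx)) {
      ctx = new Mrgctx (ownerClass, methodTypeArgs);
      cache [key] = ctx;
    }
    return ctx;
  }
}
```

With this, Get (typeof (Gen<object>), typeof (object), typeof (string)) returns the same object whether it is requested on behalf of foo, bar or quux, while each of the four combinations in the second list gets a context of its own.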

The MRGCTX contains, apart from the RGCTX-like slots, two items of data: a pointer to the vtable of the method’s class, and the values of the method’s type arguments. The first one is needed to get to the class’s RGCTX if no this argument is passed, i.e. in static generic methods. The type arguments are needed to instantiate new slots in the MRGCTX: without knowing what the value of T is, for example, we cannot look up the type Dictionary<S,T>.
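Here is a rough model of that slot instantiation using reflection. The class MethodContext and its members are invented for illustration, a Type stands in for the vtable pointer, and only generic parameters and constructed generic types are handled (no arrays, pointers or by-refs).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Gen<S> {
  public Dictionary<S,T> newDict<T> () {
    return new Dictionary<S,T> ();
  }
}

// Hypothetical model of an MRGCTX.
public sealed class MethodContext {
  readonly Type ownerClass;         // stands in for the vtable pointer
  readonly Type [] methodTypeArgs;  // values of the method's type arguments
  readonly Dictionary<Type, Type> slots = new Dictionary<Type, Type> ();

  public MethodContext (Type ownerClass, params Type [] methodTypeArgs) {
    this.ownerClass = ownerClass;
    this.methodTypeArgs = methodTypeArgs;
  }

  // Returns the closed type for an open type such as Dictionary<S,T>,
  // filling the slot on first use.
  public Type Lookup (Type openType) {
    Type closed;
    if (!slots.TryGetValue (openType, out closed)) {
      closed = Instantiate (openType);
      slots [openType] = closed;
    }
    return closed;
  }

  Type Instantiate (Type t) {
    if (t.IsGenericParameter) {
      // Method type parameters come from the context itself, class type
      // parameters are reached through the owner class.
      return t.DeclaringMethod != null
        ? methodTypeArgs [t.GenericParameterPosition]
        : ownerClass.GetGenericArguments () [t.GenericParameterPosition];
    }
    if (t.IsGenericType) {
      Type [] args = t.GetGenericArguments ()
        .Select (a => Instantiate (a)).ToArray ();
      return t.GetGenericTypeDefinition ().MakeGenericType (args);
    }
    return t;
  }
}
```

For example, a context created for Gen<string> with the method type argument int resolves the return type of newDict, taken from typeof (Gen<>), to Dictionary<string,int>. Without the stored method type arguments there would be nothing to substitute for T.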

So how much memory do we save with shared generic methods? In my previous post on sharing generic code I presented a table with the memory savings for my three large test applications. Here it is again, updated with data for sharing generic methods:

              No sharing             Sharing                Sharing w/gen methods
              Methods    Code        Methods    Code        Methods    Code        Memory for   Savings
              compiled   produced    compiled   produced    compiled   produced    (M)RGCTXs
IronPython    3614       719k        3368       691k        3324       687k        7k           25k
Nemerle       7210       2001k       6302       1943k       6150       1891k       34k          76k
F#            15529      2193k       11431      2062k       9823       1652k       154k         387k

Note that this time I’m also counting all the memory used by (M)RGCTXs and the (M)RGCTX templates, which I didn’t do last time.