Mostly Software

October 12, 2007

The Trouble with Shared Generics

Filed under: Software — schani @ 10:12 am

Some time ago I blogged about my work on generics sharing in Mono. Now, about three weeks later, I’m committing around 100k worth of patches and feel the need to explain in more detail what the heck it is exactly that I’m doing, why something that sounds to trivial takes that much code to get working, how far we’ve come and how much still remains to be done.

First I’ll illustrate why it’s not entirely trivial to share generic code. We’ll start with a simple class GenA<T>, like in my previous blog post. Let’s confine ourselves to cases where GenA is instantiated with a class type, i.e., where T is a class and not, for example, the type int or a struct. Ideally, we’d like each of GenA‘s methods of all those instantiations to share the same native code. For now we’ll only look at methods which are non-generic (i.e. the methods themselves don’t have type arguments, but our class has one) and non-static (i.e. they take an implicit this argument).

Some methods can already be shared, like this one:

public T ident (T obj) {
  return obj;
}

Sharing this method between all instances of GenA (with reference type arguments) is easy because its behavior does not depend on the type argument T. All it has to do is return the object pointer you pass it, no additional knowledge about T required.

Once we introduce a construct that requires knowledge about T things become much more complicated:

public T[] newArr () {
  return new T [3];
}

This method has to know the actual class that T stands for because it has to create an array using that class as the element type. Handling the case where the element type is one of the type arguments is not enough, though. We might have a method like this:

public GenA<T>[] newGenAArr () {
  return new GenA<T> [3];
}

Here the element type is not the type argument but the method’s class itself, which of course also depends on the type argument. We can go even further by using a class which isn’t even connected to GenA:

public SomeClass<T>[] newSomeClassArr () {
  return new SomeClass<T> [3];
}

While this case might seem very similar to the one just before it, it is fundamentally different in one important regard, which I’ll come to in a moment.

What’s important to know at this point is that to efficiently create an array in Mono you need to have the element type handy. In the case of newGenAArr this is easy: We have the this pointer through which we can get to the type of GenA<T> for whichever T the method is called. We don’t actually do it that way, for reasons I’ll explain later, but the point is that we could.

For newArr it’s a bit more difficult, but since Mono stores the actual type arguments for an instantiation of a generic class in that instantiation’s class run-time data structure we could again get to it via the this pointer. We don’t, but we could.

The third case, newSomeClassArr is fundamentally different from the other two because a generic class instantiation doesn’t store the types of completely unrelated classes which just happen to be instantiated with the same type arguments as itself. We only actually need those which are used somewhere in one of the class’s methods, but, unfortunately Mono doesn’t first parse all of a class’s methods when it is loaded, so at class initialization time we cannot know which, or even how many, such types we need.

Clearly the last case requires a new extensible data structures to store such type information.

We’ve decided to introduce a new data structure for all of that type information, called the “Run-time Generic Context”. Each instance of a generic class, that is, each closed generic class, has its own run-time generic context from which we can fetch the type information for all the cases described above. There are several reasons for introducing this new data structure. First, getting all the type information through the this pointer often requires a lot of indirections and we want to be more efficient. Second, in static methods we don’t even have a this pointer (of course we also don’t have a run-time generic context there yet, but we’ll pass it as an implicit argument eventually). And third, we need a new data structure anyway, if only for the last case above.

Now let’s see how far I’ve come implementing generics sharing. First off, we can only share non-generic non-static methods at this point. Also, we only share for generic instantiations where all type arguments are reference types. That is, the classes Dictionary<Object,Object> and Dictionary<String,Object> will share methods, but the classes Dictionary<int,Object> and Dictionary<int,String> will not share methods. All of that will be implemented later on, of course.

With that out of the way we can look at the operations (or op-codes) for which sharing is implemented. By that I mean which CIL operations a method is allowed to use and still be shared. All operations whose function does not depend on the type arguments already work, which is why the method ident above, for example, can be shared. Here, then, are the operations which can depend on the type arguments:

Operation Class Type Argument Other
newobj/initobj
isinst
Boxing
castclass
Static field access
newarr
mkrefany
ldtoken
ld(virt)ftn
initobj
sizeof

The last three columns of the table indicate whether a method can be shared when it uses that row’s operation with its own type, with one of its type arguments or with some other generic class instantiated with one of the type arguments. Using those operations with types which do not depend on the type arguments is no problem for sharing.

When I wrote my first blog post on generics sharing the table would have looked exactly like above. That is to say, if a method used any of those operations with a type depending on the class’s type arguments it could not be shared. Now, after about 100k of patches (not all of which are committed yet), here’s what the table looks like:

Operation Class Type Argument Other
newobj/initobj
isinst
Boxing
castclass
Static field access X
newarr
mkrefany
ldtoken
ld(virt)ftn
initobj
sizeof

I’m trying to shock you, of course. Not every X in that table is going to take another big load of patches to get going. Most of what I did for the past few weeks was infrastructure work to make implementing those operations easier. The X for static field access means that this method can be shared now:

static T[] arr;
 
public T[] getArr () {
  return arr;
}

One of the difficulties in getting this to work was that before a static field is accessed the run-time must call the class’s static constructor. Up until now whenever a static constructor had to be called the actual class for which it was called was known at compile time. With generics sharing this is not the case anymore, so I had to fiddle around in the trampoline code to make this work. A consequence of this is that there are some architecture-dependent bits to generics sharing, and they’re only implemented on x86 and AMD64 at this point.

An interesting thing to note is that all the features I’ve talked about which require knowledge about the type argument are not present in Java generics. Basically, generics in Java are implemented by sharing all of the code but not doing any of the additional hard work required to get the non-trivial features going. That’s why Java generics tutorials always use linked lists instead of arrays as examples: You simply can’t create an array.

About these ads

7 Comments

  1. Creating arrays in Java is possible, it’s just not as easy as in C#:

    static T[] safeCreateArray (Class c, int size) {
    return (T[])java.lang.reflect.Array.newInstance(c,size);
    }

    Notice that Class is required, e.g. Class (as returned by String.class), the use of Reflection, and a warning-generating cast.

    See also: http://www.jprl.com/Blog/archive/development/2007/Aug-31.html#jcs-java-runtime

    Comment by Jonathan Pryor — October 12, 2007 @ 12:48 pm

  2. Of course, no preview and WordPress removes < and > retrying…

    Creating arrays in Java is possible, it’s just not as easy as in C#:

    static <T> T[] safeCreateArray (Class<T> c, int size) {
    return (T[])java.lang.reflect.Array.newInstance(c,size);
    }

    Notice that Class<T> is required, e.g. Class<String> (as returned by String.class), the use of Reflection, and a warning-generating cast.

    See also: http://www.jprl.com/Blog/archive/development/2007/Aug-31.html#jcs-java-runtime

    Comment by Jonathan Pryor — October 12, 2007 @ 12:50 pm

  3. Well, yes, if you do the work that the compiler’s supposed to do and if you can live without type safety, which kinda defeats the purpose of using generics in the first place, then you can create arrays in Java generics. Which is like saying that you can do object-oriented programming in assembler.

    Comment by schani — October 12, 2007 @ 12:56 pm

  4. [...] Quick Generics Sharing Update Filed under: Software — by schani @ 7:17 pm In my previous post I said that now that a lot of the generics sharing infrastructure is implemented, sharing support [...]

    Pingback by A Quick Generics Sharing Update « Juggling, Photography, Software, and Atheism — October 15, 2007 @ 7:17 pm

  5. [...] talks about his ongoing work to implement generic code sharing. Unlike Java generics that are a language feature, .NET generics are a virtual machine feature. [...]

    Pingback by Miguel de Icaza: JIT Updates - 工程師的雞排攤 — October 16, 2007 @ 5:30 pm

  6. [...] Filed under: Software — by schani @ 6:43 pm Tags: generics, mono, types In my last longer post on generics sharing in Mono I talked about the three different interesting ways an open type that’s used in a [...]

    Pingback by Other Types « Juggling, Photography, Software, and Atheism — January 29, 2008 @ 6:43 pm

  7. [...] I must first explain a bit about how we started sharing generic code, rehashing a few bits from an older post. The CIL has lots of instructions, only some of which need special care when dealing with generic [...]

    Pingback by Sharing Everything and Saving Memory « Juggling, Photography, Software, and Atheism — April 22, 2008 @ 2:17 pm


RSS feed for comments on this post.

Sorry, the comment form is closed at this time.

The Shocking Blue Green Theme. Create a free website or blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: