Associated Types Considered Weird

In my first post on Swift I expressed mild bewilderment at Apple’s decision to make protocols generic not by giving them generic type arguments, like C# does, but via what they call “associated types”. To reiterate the example, the Sequence protocol in Swift is defined as

protocol Sequence {
    typealias GeneratorType : Generator
    func generate() -> GeneratorType
}

whereas if Swift had generic type arguments in protocols (as it does in classes), the definition would look more like

protocol Sequence<T> {
    func generate() -> Generator<T>
}

When playing around with a set implementation that Alexis Gallagher posted, I came across a problem with associated types: it exposes more type information than the user of a class needs, and worse, more than the user of a class should have.

Here’s how Alexis implemented the generate method on his set type:

func generate() -> SetGeneratorType<KeyType> {
    let dictGenerator = self.dictionaryOfItems.generate()
    return SetGeneratorType<KeyType>(dictGenerator)
}

The type that generate is actually required to return (to adhere to the Sequence interface) is Generator. It’s not possible, however, (as far as I can tell) to declare generate as such:

func generate() -> Generator

The reason being that Generator here is under-specified—it must be a Generator whose associated type Element is KeyType, but that can’t be specified. If Swift had generic type arguments for protocols, it would simply be Generator<KeyType>.

The problem here is that the set type needs to publicly expose an implementation detail, namely the concrete type by which it implements its Generator. If that implementation detail changes, the method’s type changes, and code might break.

This problem is exacerbated by the fact that users of the generator might have to depend on its type. As an example, look at my simplification of Alexis’ code:

typealias GeneratorType = MapSequenceGenerator<Dictionary<KeyType,
                                                          Bool>.GeneratorType,
                                               KeyType>

func generate() -> GeneratorType {
    return dictionaryOfItems.keys.generate()
}

The body of the function explains the weird GeneratorType: The dictionary has a keys property that is exactly what I need—a sequence of all the elements in the set. The type that I’m returning now is the generator type of that sequence, instead of my own, and I must specify the concrete return type. The only way to get the concrete type, however, is to look up the type that keys returns, which means that I now depend on an implementation detail of Dictionary, and, making matters worse, I’m surfacing it to my users.

For comparison, let’s look at how this would work it Swift had generic type arguments for protocols. Sequence and Generator would be defined as

protocol Sequence<T> {
    func generate() -> Generator<T>
}

protocol Generator<T> {
    func next() -> T?
}

Dictionary’s keys property would be

var keys: Sequence<KeyType> { get }

And this would be our set’s generate method:

func generate() -> Generator<KeyType> {
    return dictionaryOfItems.keys.generate()
}

No leaked implementation details and no unneeded dependencies.

Advertisements

Generators in Swift

In my last post on Swift I examined sequences and generators. Today we’ll implement our own sequence.

Many languages have a function like Clojure’s interpose, which takes an item and a sequence, and inserts the item between each two subsequent elements of the sequence:[1]

Array(interpose("in", ["San Francisco", "California", "the USA"]))
=> ["San Francisco", "in", "California", "in", "the USA"]

There are three things we need here: The interpose function itself, which must return a Sequence, which, in turn, generates a Generator, which does the actual work of producing the items.

Let’s start with the Generator:

struct InterposeGenerator<G: Generator, T where T == G.Element>
        : Generator {
    let sep: T
    var gen: G
    var needSep: Bool
    var nextOrNil: T?

    init (sep: T, gen: G) {
        self.sep = sep
        self.needSep = false
        self.gen = gen
        self.nextOrNil = self.gen.next()
    }

    mutating func next() -> T? {
        if needSep {
            needSep = false
            return sep
        } else {
            let n = nextOrNil
            if n {
                nextOrNil = gen.next()
                needSep = nextOrNil != nil
            }
            return n
        }
    }
}

InterposeGenerator is a little state machine that alternates between producing either an element of the original generator or the separator. A bit of care is required to make sure that it doesn’t produce the separator again after the generator is exhausted.

One noteworthy thing here is the invocation of self.gen.next() in init. If we try to call next on the constructor argument gen instead of the property self.gen we get a compiler error:

immutable value of type 'G' only has mutating members named 'next'

This makes sense: the Generator is a value type, and it’s passed in here as an argument, which apparently makes it immutable, and next is a mutating function. At least, that is how it would typically happen—the compiler doesn’t seem to have any information to that effect, though. Generator is a protocol, which can be implemented by reference as well as value types:

protocol Generator {
    typealias Element
    func next() -> Element?
}

As you can see, next is not marked as mutating, either. I suspect there are two things going on here: the protocol definition that Xcode shows is probably incorrect, in that the mutating modifier is missing. The definition that Swift actually uses contains the modifier. In addition, the compiler doesn’t know whether the Generator we are using to create InterposeGenerator is a value type or not, so it must assume that it is, for the sake of safety in the case of mutating functions.

Let’s move on to the Sequence. It doesn’t really have to do anything other than creating an InterposeGenerator when asked to, so apart from all the type fiddling involved it’s rather simple:

struct InterposeSequence<S: Sequence, T
                         where T == S.GeneratorType.Element>
        : Sequence {
    let sep: T
    var seq: S
    func generate () -> InterposeGenerator<S.GeneratorType, T> {
        return InterposeGenerator(sep: sep, gen: seq.generate())
    }
}

Finally, the interpose function just creates an InterposeSequence:

func interpose<S: Sequence, T
               where T == S.GeneratorType.Element>
        (sep: T, seq: S) -> InterposeSequence<S, T> {
    return InterposeSequence(sep: sep, seq: seq)
}

It seems to be a pattern in the base library for the Sequence and Generator to be implemented in the same type, which we could easily do here, as well. I’ll leave it as an exercise for the reader.

Languages with a yield operator, like C# or Python, allow sequence generation to be expressed much more succinctly, with the compiler taking over the task of managing the state machine:

static IEnumerable<T> Interpose<T> (T sep, IEnumerable<T> seq) {
    var en = seq.GetEnumerator ();
    if (!en.MoveNext ())
        yield break;
    for (;;) {
        yield return en.Current;
        if (!en.MoveNext ())
            yield break;
        yield return sep;
    }
}

  1. It doesn’t modify the original sequence, of course, but rather returns a new one.  ↩

Playing with Swift

I really like what we’ve seen so far of Swift, Apple’s new programming language. For starters, I was happy to learn that references cannot be nil[1], and integer arithmetic is checked by default[2].

Let’s look at how Swift deals with sequences, and learn a little bit about generics on the way. Arrays, for example, are sequences, so you can iterate over them with a for loop:

for x in [1,2,3] {
    println(x)
}

Lots of other types are sequences, too, like strings, dictionaries and ranges. What makes them sequences is that they all implement the Sequence protocol:

protocol Sequence {
    typealias GeneratorType : Generator
    func generate() -> GeneratorType
}

I’m not sure why Swift elects to do generic protocols via typealias instead of just having the protocol take type arguments, like C# does, which might look something like this:

protocol Sequence<T> {
    func generate() -> Generator<T>
}

Whatever the reason might be, we can see that a sequence can generate a Generator, which is defined as follows:

protocol Generator {
    typealias Element
    func next() -> Element?
}

Generators have a next method which returns either the next element in the sequence, or nil, when the end has been reached. The for loop from above is probably mostly equivalent to

var gen = [1,2,3].generate()
while let x = gen.next() {
    println(x)
}

The equivalent of Sequence and Generator in C# are IEnumerable and IEnumerator.

The map function takes a sequence and a function, returning a new sequence containing the results of applying the function to the elements of the given sequence. For example:

for x in map([1,2,3], {i in i*2}) {
    println(x)
}

produces

2
4
6

It’s important to realize that the sequence which map returns is not necessarily of the same type as the one it is passed:

println([1,2,3])
println(map([1,2,3], {i in i*2}))

prints

[1, 2, 3]
VSs17MapCollectionView (has 2 children)

Also, sequences can generate their elements lazily:

var seq = map([1,2,3], { (i: Int) -> Int in
    println("processing " + String(i))
    return i*2
    })
var gen = seq.generate()
println(gen.next())

prints

processing 1
2

Let’s write a function that takes a sequence and returns an array containing all the sequence’s elements!

First we need to figure out the type of this function. It turns out that this doesn’t work:

func toArray (seq: Sequence) -> seq.GeneratorType.Element[]

The compiler complains that seq is an undeclared type. This doesn’t work, either:

func toArray (seq: Sequence) -> Sequence.GeneratorType.Element[]

Here the complaint is that Sequence.GeneratorType is ambiguous. That’s because we can’t tell the compiler that the two Sequence types are the same. Maybe we can use generics to solve the problem:

func toArray<S:Sequence> (seq: S) -> S.GeneratorType.Element[] {
    var arr = S.GeneratorType.Element[]()
    ...
    return arr
}

Swift is now happy with the type of the function, but unfortunately it doesn’t let us instantiate the array. Instead it thinks we’re trying to subscribe S.GeneratorType.Element, which isn’t possible, because it’s a type and not a value. As far as I can tell, the only way to get around this problem is to introduce a second generic argument, which we can force to the correct type via a where clause:

func toArray<S : Sequence,
             T where T == S.GeneratorType.Element>
        (seq : S) -> T[] {
    var arr = T[]()
    for x in seq {
        arr.append(x)
    }
    return arr
}

println(toArray(map([1,2,3], {x in x*2})))

And there we have it:

[2, 4, 6]

  1. Swift would be the first mainstream programming language that corrects Tony Hoare’s “billion dollar mistake”, after barely 60 years of all others repeating it.  ↩

  2. It seems we’re finally learning that it’s better for a program to halt than to give away all your secrets and then launch all the nuclear missiles in the case of a bug.  ↩