Why variance matters

One basic and omnipresent part of program design is variance, and it deserves more attention than it gets. I gave a brief crash course in variance back when I discussed the design impact of inheritance. I’ve decided to elevate this discussion to full post of its own.

What is variance?

Variance is about our ability to substitute types for other types. We generally think of this as being more or less specific types. A typical inheritance hierarchy or subtyping relationship can create a chain from less constrained to more constrained types, like num :> int :> nat. We can treat the appearance of a type in three useful ways:

Invariant behavior means that only that exact type is acceptable. Many functional languages lack subtypes, and so are generally invariant.
Covariant behavior means subtypes are acceptable in lieu of the specified type. Many object-oriented languages have a default preference for covariance.
Contravariant behavior means supertypes are acceptable substitutions in this type. Anytime when covariance is possible, contravariance also becomes possible.

The key takeaway I’d like to offer today is simple that variance is a form of added complexity, with some qualifications. This doesn’t make variance bad, of course. When that complexity is called for, it’s desirable, because it helps us cope with the actual challenge of the problem we’re facing.

Where does variance come from?

There are two main sources of variance. The first is simply functions.

If we have a hierarchy like num :> int :> nat and a function type like int -> int, how are we allowed to reinterpret this type? We have a couple of options. We are free to be more constrained in its arguments, and we can freely be more relaxed and forget constraints about its result. So we can cast that function type safely to nat -> num, but not num -> nat. Arguments are covariant, while results are contravariant.

But there’s a second major source of variance in our programs, and that’s IO or state. Any time we’re looking at a type that describes what can be written or sent to it, we’re looking at something that’s covariant. That is, a Sink<int> can be reinterpreted as a Sink<nat>. We’re free to insist we only write more specific things to it. Any time we have a type describing what can be read from something, we’re looking at something that’s contravariant. A Source<int> can be reinterpreted as a Source<num>. We’re free to relax our expectations about what we’ll read from it.

But this is where mutation has something big to say, because a mutable variable is essentially both a sink and a source! This means that any time you have something mutable, like RefCell<int> that we can read and write to, we can no longer legally reinterpret this as a RefCell<num> or a RefCell<nat>. It’s fixed.

Java’s early mistake.

Early on in the design of Java, before it had generics, it was decided that array types should be covariant. At first, this seemed like a reasonable decision, because every language needs be able to implement a sort function, and without generics or array covariance, this would be difficult to do in Java. If sort accepted an Object[], you wouldn’t be able to pass in a MyClass[] without covariant arrays!

But arrays in Java are mutable, and above I just pointed out mutability makes things invariant. So what do we get as a consequence? Every assignment to an array in Java is required to do a run-time type check to make sure the object is an acceptable type for the array. Which also means every assignment to an array in Java potentially throws a type error exception.

Making matters worse, when generics were introduced to Java, they brought along a different and fairly incompatible set of design decisions, pretty much necessarily. Java generics are invariant by default, which is the right decision, considering how pervasively mutable things typically are in Java. But together with a bunch of other design differences, a MyClass[] and Array<MyClass> turn out be very, very different. That surprises people. And it should; it’s unexpected. It’s the required friction resulting from a past design mistake that can’t be fixed.

Java might be forgiven for pioneering these mistakes, or at least popularizing their consequences. But then Golang walked right into the same mess. It really irks me when there’s no one around with the expertise or authority to say “hey, let’s maybe not just repeat all the same old mistakes?”

(Yells old man at cloud, “oh come on Rust, macros again?” But I shouldn’t get too cranky…)

How do we interact with variance?

There are two general modes of interaction with variance: use-site variance, and declaration-site variance.

Sink and Source are good examples of where declaration-site variance is useful. It’s simple, and it does the job. We’re able to say that the type constructor itself has particular variance on each parameter, just like we can with the function type.

But use-site variance is more flexible. It allows local code to make promises about how a type will be used that aren’t guaranteed of just any old use of that type. The classic example here is dealing with array copying. Arrays are invariant, so a copy from Array<T> to Array<T> could only ever do so to arrays of exactly T. But clearly this function is only reading from its input and writing to its output, so it could be covariant and contravariant respectively. With use-site annotations, we can have an invariant Array type, but have a copy function that will happily read from a Array<nat> and write to an Array<num>, permitting variance where it is locally acceptable.

The notation for describing variance, uh, varies. Java uses the rather irritating Array<? extends Type> for “covariant Type” and Array<? super Type> for “contravariant Type”. Scala and friends use Array[+Type] and Array[-Type] respectively. C# and friends use Array<out Type> and Array<in Type>. Each language has varying support and syntax for declaration and use-site variance.

Variance appears even without subtyping

Although the complexity of variance is avoided without subtyping of some kind of create it, the underlying causes of variance still exist even when all of your type constructors are invariant. There’s a pretty neat example of this with Haskell.

One of the standard abstractions in Haskell is the Functor which provides the fmap function we’ve talked about before. But it turns out, this is the covariant functor.

Let’s introduce two simplish types:

data Sink t = Sink (t -> IO ())
data Source t = Source (IO t)

These types represent IO actions that either consume a t or produce a t. And here’s the thing: we can give a Functor instance for Source, but we can’t for Sink. Try it.

For a covariant functor, we operate on the values of type t in the data structure. But when that t appears as the argument to a function, we don’t have that value anymore. We have a function that takes that value. It’s “polarity” gets flipped, and now to deal with those, we need a contravariant functor.

-- found in Data.Functor.Contravariant:
class Contravariant f where
  contramap :: (a -> b) -> f b -> f a 

For people coming from ordinary functors, this looks weird. The types are backwards! How are we going to get to b from an a when all we have is a -> b?

But the key thing to remember is that with a contravariant functor we’re not dealing with a data structure that has values of type a in it, we’re dealing with a data structure that has functions from type a in it. (Typically.) That’s what makes it contravariant.

Now we can write that instance for Sink:

instance Contravariant Sink where
  contramap f (Sink g) = Sink (g . f)

So this let’s us transform a Sink b into a Sink a by writing a function a -> b. The meaning is simply that when we “sink” a value of type a into our new Sink, we apply a -> b and then “sink” the resulting b into the original.

But Why?

Sometimes people get distracted by wondering why we’ve got all these fancy names for these things that once you wrap your head around their seemingly complex descriptions, turn out to be very simple.

Because that’s what abstractions are.

It’s in the very nature of an abstraction that, for any one particular example in isolation, it’s a frivolous added complexity that we don’t need. The point of an abstraction is that it’s a concept we can construct in our heads, and then use elsewhere. Lots of things are covariant or contravariant functors. Or Profunctors, or Bifunctors. Understanding these things in the abstract means we get a mental tool we can then just apply next time we see the pattern.

Avoiding the complexity of variance

I have this hypothesis that I’m not entirely sure how to substantiate. I think a significant part of the complexity that arises from variance and subtyping is a result of nominal subtyping rather than inherent.

I think that structural subtyping (or row polymorphism) results in a lattice structure that reduces or removes most of the complexity involved.

Nominal subtyping exposes a more limited structure, and although the lattice structure we see from subsetting looks more complicated, it’s much more regular. In addition, we gain simple options like `{A,B,C,D}` which from a nominal subtyping perspective requires *multiple inheritance* to achieve, which introduces additional complexity.

If ya’ll got any references I should read related to this, I’d like to hear about them.

But this is part of the reason I think variance impacts dynamic languages less than it does static. It’s not that variance doesn’t apply, or that variance is just complexity that’s created by a static type system. The complexity is partially inherent to variance, but partially created by nominal subtyping in particular. Dynamic languages often make use of “duck typing” which roughly corresponds to structural subtyping or row polymorphism, and so the reasoning required is less cognitively complex.

Or at least, that’s my guess.

End Notes

Besides invariant, covariant, and contravariant, there’s also bivariant, but that’s usually reserved for trivialities, and mostly get brought up due to a misplaced sense of completeness. I think it complicates initial understanding of the concept.

The interaction of variance and the “is a” notion of inheritance was part of the reason I criticized both inheritance and the “is a” philosophy of what inheritance even means awhile back.

So much of functional programming seems to be about making code as simple as it looks: invariance everywhere, immutability by default, more direct data structures, default non-nullability. Sometimes it strikes me as weird that functional programming could ever have been criticized as being complicated. I’m starting to agree with the idea that we do programmers a disservice by teaching the “machine model” of programming first. I’m not sure how else this stuff is more complicated except that people are struggling to map it down to the machine model, instead of understanding it on its own terms.