How we go about creating abstractions

One of the most celebrated hard problems in computer science is naming things. Coming up with names for abstractions has to be the worst. At least with concrete objects you have a specific thing in mind; with abstractions, you have to be careful you aren’t prejudicing your thinking about potential instances of the abstraction.

Just take a look at the effort to come up with “more friendly” alternative names for Monad. It’s… not pretty. (F# is trying out “computation expression” which isn’t scary words, but it’s also an essentially meaningless name that likewise doesn’t convey anything helpful to someone hearing it for the first time. Might as well call it “blorp.” Or monad.)

But besides the choice of which name to use, the decision to name something at all is also a decision, and it also affects semantics. For the most part, this decision is made for us by our programming languages.

This decision is most stark in the distinction between structural and nominal typing. With structural typing, we have a type that describes a “shape,” and anything that fits that shape is an instance of that type. With nominal typing, we create a name. That name comes along with a shape that has to fit instances of that abstraction (such as the methods of an interface; these have to be implemented), but whether another concrete type is an instance is now about intention. You have to actually make it an instance: someone has to decide that the abstraction fits the type.

And making that choice can carry meaning. It’s no longer an accident that a type is an instance of an abstraction.

Meaningful types

A good way to see how types can carry meaning like this is to see what can go wrong when they don’t. One of my favorite examples of this phenomenon was a bug someone encountered with some Go code. The root of the problem here is that the Go standard library has a Duration type… that’s just a thin layer over an integer (it is declared as type Duration int64, and untyped numeric constants convert to it silently). Code that looked like this:

var timeout Duration = 4 * 1000 * 1000
do_a_thing(x, timeout)

Becomes essentially meaningless. What timeout is being used? What does 4 million mean? We can certainly go read the documentation for the Duration type to find out it’s supposed to be nanoseconds (and then if we think about the above code for a second, we can spot a probable bug: 4 million nanoseconds is 4 milliseconds, where 4 seconds was presumably intended), but the type didn’t help us at all. Duration might as well not exist: it has a name, but it carries no real meaning.

The problem is that, at the point of use, a Duration isn’t any different from an int64, so the type doesn’t help us much at all. We aren’t ever prompted to take seriously the conversion from “a number” to “a duration” or vice versa. Ideal code could look like this:

var timeout = Seconds(4)
do_a_thing(x, timeout.Milliseconds())

Now we’re able to read exactly what’s going on: we want a 4 second timeout, and we’re presumably passing a value into some API that didn’t use Duration at all, but wants the timeout value in milliseconds. This code is actually saying what it does.

The difference between these two designs is that in the second one Duration is an abstract type. It’s not a number, even though internally that’s all there is. But we can easily convert between numbers and durations.
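As a rough sketch of what such an abstract type could look like in Go: the struct layout and the unexported nanos field below are assumptions made for illustration, and this is not how the standard library defines its own Duration. Wrapping the number in a struct means a bare integer can no longer stand in for a duration, so every conversion goes through a function that names its unit.

```go
package main

import "fmt"

// Duration is opaque: the struct hides the number, so a bare integer
// cannot be passed where a Duration is expected.
// (Hypothetical type for illustration, not the standard library's.)
type Duration struct {
	nanos int64
}

// Seconds builds a Duration from a whole number of seconds.
func Seconds(n int64) Duration {
	return Duration{nanos: n * 1_000_000_000}
}

// Milliseconds converts back to a plain number, naming the unit at the call site.
func (d Duration) Milliseconds() int64 {
	return d.nanos / 1_000_000
}

func main() {
	timeout := Seconds(4)
	fmt.Println(timeout.Milliseconds()) // prints 4000
}
```

With this shape, writing 4 * 1000 * 1000 where a Duration is expected is a compile error rather than a silent 4 millisecond timeout.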

Java and the infamous names

Part of the underlying design philosophy of Java is a pretty radical commitment to nominal typing. Java doesn’t offer type aliases, or thin renamings of existing types like Go’s Duration above, in any form. Nor does it offer function types.

On the one hand, this was possibly a good decision. Java has one of the most well-behaved and simple module namespacing systems of any language I know of. The language embraces a design aesthetic that encourages creating explicit meaning and directly representing that in your code by giving it a name.

But it’s probably also partially responsible for the famed proliferation of ManyNamesAndPatternsAndNouns names. Naming things is hard, after all, and if you have something that’s inherently “shape-like” and have to come up with a name for it… it can be rough. I think the Java designers had a good idea here, but function types in particular are something they should have embraced. There is actually a balance to be struck in this.

Interface abstractions

In an object-oriented language like Java, declaring an interface creates a name, and it defines a shape. Whenever we declare a new class, we can choose for it to implement some set of interfaces. In this way, we can define a new abstraction as an interface, and modify a set of types to become instances of that abstraction.

But this approach has a significant limitation: given an existing set of classes from third-party code, we are unable to define a new interface and retroactively make these existing classes instances.

Type class abstractions

In Haskell, type classes offer a slight improvement over this particular situation. (We’ve discussed previously how type classes don’t replace objects. This is just about when we’re allowed to make instances.) We’re able to define a new type class, and make our types be instances of this class, just like in OO languages. But we have an added capability: we can define a new type class, and define instances on existing third-party types we have no control over. This can be a significant improvement to our ability to define a new useful notion in our code, and then actually make use of it.

But this comes with restrictions as well. The Haskell “orphaned instances” rule means we’re only able to do exactly what we describe above. We can define new types and give them arbitrary instances, and we can define new type classes and provide instances for arbitrary types. That might sound like everything at first glance, but things are always more complicated than that. Consider wanting to give an instance of a third-party type class for another third-party type. You can’t. That breaks the orphaned instances rule.

The rule has a purpose: it’s to ensure we don’t get duplicate (and potentially conflicting) definitions of type class instances for the same type. That’s a good thing. But even so, you can disable the rule and do it anyway.

But even in that situation you run into problems. Libraries in Haskell routinely want to provide instances in combination with a type or type class from another library. But you might not want to place a hard dependency on that extra library, because your users might not actually use it. You somehow want to provide some optional code to help use two libraries in conjunction, but… without depending on them. I don’t want to get into it, but the result here isn’t very pretty.

Structural abstractions

So we’ve seen two approaches to nominal abstraction, each with their own limitations. The next interesting approach to consider is to abandon nominal typing, and see what we can do structurally. Here we can look to Go interfaces.

Go interfaces describe a shape—some names and their types—and anything that happens to have those methods becomes an instance of that interface. This allows post-hoc interface implementation: define a new interface, and a third-party type that happens to have those methods automatically implements it.

This gives us all the flexibility we were denied with the nominal approaches. We can define a new interface and make types implement it easily enough. We can define a new interface that happens to match existing types easily enough. And the intersection problem doesn’t arise: one library defining an interface, and another library defining a type, means we can freely use them together without weird dependency issues.

But as you probably can guess, we still get problems. The smaller and higher-level problem we can suffer from is that our types mean less. Because Go interfaces are purely structural, we don’t know anything about intention, only shape. Nobody has necessarily sat down and had a good think about the intersection of a particular type and a particular interface. Nor is there any particular location to attach documentation to. This can be an issue because we frequently imperfectly implement some interfaces. (Even the Java standard library is full of these. “Well, this is like a List but don’t try to mutate it or it’ll throw exceptions.”)

But another serious issue is that we still do have an intersection problem, really. Sure, if the interface happens to match existing methods, we get the free instance. But if it doesn’t, and we wish it did, we’re shit out of luck. Go throws at us a similar rule to Haskell’s orphaned instances rule:

You can only declare a method with a receiver whose type is defined in the same package as the method. You cannot declare a method with a receiver whose type is defined in another package (which includes the built-in types such as int).

This rule makes sense for the same reason it makes sense in Haskell. So all we’ve really gained is the ability to recognize patterns already present, but not a true ability to actually retroactively create abstractions.
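To make the restriction concrete, here is a minimal sketch. Shouter and Loud are hypothetical names; the commented-out declaration is the kind the compiler rejects, and the defined type beneath it is the usual way around it.

```go
package main

import (
	"fmt"
	"strings"
)

// Shouter is a hypothetical interface we wish the built-in string satisfied.
type Shouter interface {
	Shout() string
}

// Not allowed: string is not defined in this package, so the compiler
// rejects a method declared on it.
//
//	func (s string) Shout() string { return strings.ToUpper(s) + "!" }

// Allowed: Loud is defined in this package, so it may carry methods.
type Loud string

// Shout upper-cases the text and appends an exclamation mark.
func (l Loud) Shout() string {
	return strings.ToUpper(string(l)) + "!"
}

func main() {
	var s Shouter = Loud("hello") // the value must be a Loud, not a plain string
	fmt.Println(s.Shout())        // prints HELLO!
}
```

The cost is that Loud is a new type: every existing string has to be converted before it counts as a Shouter, which is exactly the retroactive step the interface alone could not give us.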

An aside: Implementations

The three approaches are also interesting in how they get implemented. The object-oriented approach generally passes a single pointer for an interface parameter. The object itself has constraints on its representation: it has to have a “vtable” pointer that describes the type’s methods.

The type class approach takes a different strategy. Instead of modifying the type, a “dictionary” that’s functionally similar to a vtable is explicitly passed to a method. So f :: Class a => a -> a becomes f :: Class_a_dict -> a -> a. Then functions from Class are called by accessing them from the dictionary.
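Dictionary passing can be imitated by hand in other languages. The following is only a loose Go rendering of the idea, not what a Haskell compiler actually emits: ShowDict, showAll, and showInt are hypothetical names, and the class is assumed to have a single method.

```go
package main

import (
	"fmt"
	"strconv"
)

// ShowDict plays the role of the compiler-generated dictionary for a
// hypothetical class with a single method, Show.
type ShowDict[T any] struct {
	Show func(T) string
}

// showAll corresponds to a function with a class constraint: the
// dictionary arrives as an ordinary extra argument.
func showAll[T any](d ShowDict[T], xs []T) []string {
	out := make([]string, 0, len(xs))
	for _, x := range xs {
		out = append(out, d.Show(x))
	}
	return out
}

// showInt is an "instance" for int, defined without touching int itself.
var showInt = ShowDict[int]{Show: strconv.Itoa}

func main() {
	fmt.Println(showAll(showInt, []int{1, 2, 3})) // prints [1 2 3]
}
```

Because the dictionary is a separate value, the type's own representation never has to change, which is the property the paragraph above describes.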

The Go interface approach also takes a very different strategy that I’ll call “existential pairs.” Like type classes, the type representation itself is untouched and doesn’t need to be modified. Instead, each individual interface-typed parameter is a pair: a pointer to a vtable-like dictionary, along with a pointer to the actual value. This is a lot like the type class approach, except that the dictionary is paired up with the pointer to the underlying value, instead of being separate.

The representation ends up mattering. A lot of these languages seem to have been designed by starting with what they wanted, finding a representation that did that, then going back and modifying the designs with that representation in mind. And almost every language has quirks as a result of the representation.

Everyone using C++ knows there’s some bit of memory at the beginning of their classes with virtual methods that makes them different from plain C structs. Haskell gets to have “methods” without objects, also called “return type polymorphism,” like return :: a -> m a for monads, something you can’t do with OO languages. Go gets some very weird equality semantics for interface types because it chose not to equate nil with (vtable, nil), so passing a nil pointer to a function that takes an interface unexpectedly doesn’t produce a nil interface value, exactly…
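The Go quirk is easy to reproduce. In this sketch, MyErr and isNil are hypothetical names; the behavior itself follows from an interface value being a pair that only counts as nil when both halves are empty.

```go
package main

import "fmt"

// MyErr is a hypothetical error type with a pointer receiver.
type MyErr struct{}

func (*MyErr) Error() string { return "boom" }

// isNil reports whether the interface value itself is nil.
func isNil(err error) bool {
	return err == nil
}

func main() {
	var p *MyErr // a nil pointer

	fmt.Println(isNil(nil)) // prints true: no dictionary, no value
	fmt.Println(isNil(p))   // prints false: the pair holds the *MyErr dictionary and a nil pointer
}
```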

Dynamic abstractions

With dynamic languages, we actually end up mostly in the same position as with structural abstraction. Duck typing works up to the point where the classes we want to use happen to have the right methods with the right names.

But dynamic languages often go further with their flexibility. I’m talking, of course, about monkey patching. I’m not going to go into the details today, but I trust I can say this practice comes with dramatic downsides. Haskell has orphaned instances, and Go prevents new methods, for a reason. It keeps our programming language sensible. Giving up reasonable properties in pursuit of ever more power was the subject of one of my earliest essays.

How do we cope?

One of the remarkable things about this situation is that most programmers are happily using object-oriented languages, living with the most restrictive approach we looked at. I can think of three reasons why this doesn’t bother us much. Before you look at them below, pause for a moment and consider it yourself. Maybe you’ll come up with something you can tell me about. :)

First, I believe we actually very rarely reach for creating abstractions of this sort. Much of the programming we do deals with concretions and data, data, data. In this respect, a pretty significant limitation in what looks like the core design of object-oriented languages is really just a nuisance, because we don’t actually use the abstractive capabilities of OO languages as much as the language seems designed for. The language is designed for ad-hoc polymorphism, but everyday programming mostly calls for data and abstract data types instead. Notably, any situation where you define an interface, but actually have a fixed set of implementing classes in mind, avoids this problem entirely.

Second, possibly a majority of the situations this problem would come up in are simple enough that they’re just one function. When it’s just one function, we can pass around a closure instead of a more complicated object. (I’ve mentioned before this is part of how functional programmers get by without objects. The most common and simple cases are often small!) Modern Java in particular might not have function types, but its lambdas are pretty much just syntactic sugar for defining one-method adapter classes.

That leads to the third way of coping: wrappers. Anytime we have a non-trivial, multi-method interface we need to implement for an existing type, we just write a wrapper. Wrappers induce boilerplate, but for complex enough types we might not even think of them as being a wrapper, nor be bothered by the boilerplate. The object-oriented approach can be so ingrained that we might even just see a wrapper of this sort as being natural, a legitimate thing in its own right. Not every instance of the “adapter pattern” advertises itself as such.
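Here is a minimal sketch of such a wrapper, written in Go for consistency with the earlier snippets. The Stack interface and the listStack type are hypothetical; the wrapped type is the standard library’s container/list, which has no Push or Pop of its own.

```go
package main

import (
	"container/list"
	"fmt"
)

// Stack is a hypothetical multi-method interface our code wants.
type Stack interface {
	Push(s string)
	Pop() (string, bool)
	Len() int
}

// listStack adapts an existing list type to the Stack interface.
type listStack struct {
	l *list.List
}

func newListStack() *listStack { return &listStack{l: list.New()} }

// Push appends to the back of the wrapped list.
func (s *listStack) Push(v string) { s.l.PushBack(v) }

// Pop removes and returns the most recently pushed value, if any.
func (s *listStack) Pop() (string, bool) {
	e := s.l.Back()
	if e == nil {
		return "", false
	}
	return s.l.Remove(e).(string), true
}

// Len forwards to the wrapped list.
func (s *listStack) Len() int { return s.l.Len() }

func main() {
	var st Stack = newListStack()
	st.Push("a")
	st.Push("b")
	v, ok := st.Pop()
	fmt.Println(v, ok, st.Len()) // prints b true 1
}
```

Nothing here announces itself as an adapter. It reads like an ordinary stack type, which is much of why the boilerplate rarely bothers anyone.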