One of the nice wins for object-oriented programming was the way an object can be used to handle state. Encapsulate your state internally and ensure each public method preserves the invariants you’re interested in, and you have a very simple and effective design.
But this design victory falls apart for multi-object systems. As soon as we need to manipulate a system of multiple objects in tandem, we no longer have any of those advantages. This flaw isn’t completely fatal: if the system itself can be hidden away behind an object representing the system, we can recover our footing enough to carry on with an object-oriented perspective.
But some systems are just big, inherently. It’s no solace to us that we can wrap the system itself in a nice looking object, if we’re the ones who have to labor inside that large, complicated system.
Today’s post is about a simple technique that has a lot of far-reaching consequences.
The purely OO world-view is, of course, incomplete. Good design requires making choices about type design between objects and other possibilities. In particular, sometimes a type should represent pure data.
Part of the reason OO copes badly with multi-object systems is that we’re being forced (by OO ideology) to give up thinking in terms of data. As soon as we start to admit data types as a possibility again, more options immediately open up.
A design technique that can easily handle some kinds of complex multi-object systems is to represent actions as a data type. Instead of thinking in terms of a function that modifies multiple objects, you think in terms of a function that computes a description of how multiple objects should be affected, then a separate interpreter of that data type that animates that description into real action.
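To make that concrete, here is a minimal sketch in Python. Everything in it is hypothetical and chosen only for illustration: the `Debit` and `Credit` action types, the `plan_transfer` function, and the "no negative balances" invariant. The planner only computes a description; the interpreter is the one place where state actually changes.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical action descriptions: plain, immutable data.
@dataclass(frozen=True)
class Debit:
    account: str
    amount: int


@dataclass(frozen=True)
class Credit:
    account: str
    amount: int


Action = Union[Debit, Credit]


def plan_transfer(source: str, target: str, amount: int) -> list[Action]:
    """Pure function: describes what should happen, touches nothing."""
    return [Debit(source, amount), Credit(target, amount)]


def interpret(balances: dict[str, int], actions: list[Action]) -> None:
    """The only code that mutates the system, so invariants live here."""
    updated = dict(balances)  # work on a copy so failure changes nothing
    for action in actions:
        if isinstance(action, Debit):
            updated[action.account] -= action.amount
        elif isinstance(action, Credit):
            updated[action.account] += action.amount
        else:
            raise TypeError(f"unknown action: {action!r}")
    # Assumed system invariant for this sketch: no account goes negative.
    if any(balance < 0 for balance in updated.values()):
        raise ValueError("invariant violated: negative balance")
    balances.update(updated)
```

Calling `interpret(balances, plan_transfer("alice", "bob", 4))` applies both halves of the transfer or neither, and `plan_transfer` can be checked on its own by comparing its return value to an expected list.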
There are a multitude of advantages that come with this approach:
Stateful code can become dramatically easier to test. Many functions that would perform actions, instead of directly manipulating the system, will now instead just compute a description of the action to perform. This makes many (sometimes quite complicated) stateful functions turn into pure (or at least purer) functions, which can be tested by just looking at their outputs, instead of by having to set up and inspect changes to the program’s state.
All manipulations are performed only by the interpreter, which is a smaller amount of code. This makes it possible to ensure that system invariants will hold, much in the same way we’re able to ensure single-object invariants were preserved in the public methods of an object.
It becomes a lot easier to test the interpreter, which is where the scary multi-object system mutation actually happens, and so is most subject to tricky and complex bugs.
There’s an additional level of de-coupling between code that creates “action descriptions” and the system that would be acted upon. This can make it easier to do some kinds of invasive changes in the future.
There are, of course, several disadvantages:
Many of our programming languages lack good support for representing data (in particular, algebraic data types), making it overly verbose to create such a data type, and so more costly and difficult than it should be.
Designing the intermediate data type describing actions can be hard. Non-trivial data types can become akin to miniature programming languages (hence my choice of the term “interpreter”), and keeping the abstractions sensible is tricky.
You probably do this every day
If this sounds like a weird approach to programming to you, or abstract functional programmer nonsense, I’d like to take a moment to remind you that you do this all the time. Every time a program computes some SQL and sends it off to a database, you’re computing data to describe an action, and sending it from one system to another to be animated by an interpreter.
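Here is a small illustration of that using Python's built-in `sqlite3` module. The `people` table and the `find_adults_query` function are made up for this sketch. Notice that the function returns nothing but data (a query string and its parameters), and the database is the interpreter that turns it into action.

```python
import sqlite3


def find_adults_query(min_age: int) -> tuple[str, tuple]:
    """Pure function: builds a description of the query, runs nothing."""
    return ("SELECT name FROM people WHERE age >= ? ORDER BY name", (min_age,))


# An in-memory database stands in for the remote system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?)", [("Ada", 36), ("Tim", 12)])

# Hand the description to the interpreter (the database engine).
sql, params = find_adults_query(18)
rows = conn.execute(sql, params).fetchall()
```

The query-building code can be tested without any database at all, simply by looking at the tuple it returns.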
The danger of not going far enough
The lack of good support for representing data in many OO languages can lead us into some unfortunate degenerate cases. One of these cases is the “Command Object Pattern”.
At first glance, this seems like a reasonable approximation of what I’m talking about: the objects just hold data describing what’s to be done, until they’re animated by executing the command, right?
The problem with command objects is that they typically do not reduce the dangerous surface area of the program. Each abstract “action” the program wants to perform usually becomes itself another command object.
Think back to our single-object analogy with state encapsulation. A red-black tree object is designed with a reasonably small public interface of methods, each of which we can ensure preserves important invariants. Once we have done so, outside users of that RB tree are supposed to interact with it only through those methods. The degenerate case of the command object pattern would be like turning every one of those outside interactions into the equivalent of a new public method instead. This is the moral equivalent of violating encapsulation.
Part of what makes this tactic (reifying actions as data) effective is when the application has a lot of different sorts of abstract actions that need to be taken, but we’re able to represent that full variety of actions with a relatively small data type. The “SQL” data type might be pretty big, but compare it to all the queries being executed by all the applications out there.
Note that I have nothing against command objects necessarily; I just want to get across that command objects alone are not following this design technique.
A more applicable OO pattern
If we’re going to represent a big range of possible actions with a smaller data type, we’re probably going to need to design for composition somehow. We need to take smaller parts and combine them in ways that accomplish something greater.
That almost always means our data type is going to be tree-like. Again, we suffer from a lack of good support for e.g. algebraic data types in many commonly-used languages.
As with many deficiencies in OO languages, there’s a pattern for that. This one is called the Decorator pattern. It’s pretty much just trees represented as objects, don’t expect anything too mind-blowing there.
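For readers who haven't met the pattern, here is a rough sketch in Python. All of the class names (`Source`, `FlakySource`, `Retrying`, `Cached`) are invented for this example. Each decorator wraps another object with the same interface, so the composed value is a small tree of objects.

```python
class Source:
    """Common interface shared by the leaf and every decorator."""

    def read(self) -> str:
        raise NotImplementedError


class FlakySource(Source):
    """Leaf node: fails a fixed number of times, then succeeds."""

    def __init__(self, text: str, failures: int):
        self.text = text
        self.failures = failures
        self.calls = 0

    def read(self) -> str:
        self.calls += 1
        if self.calls <= self.failures:
            raise IOError("temporary failure")
        return self.text


class Retrying(Source):
    """Decorator: retries the wrapped source on IOError."""

    def __init__(self, inner: Source, attempts: int):
        self.inner = inner
        self.attempts = attempts

    def read(self) -> str:
        for attempt in range(self.attempts):
            try:
                return self.inner.read()
            except IOError:
                if attempt == self.attempts - 1:
                    raise  # out of attempts, let the error through
        raise ValueError("attempts must be at least 1")


class Cached(Source):
    """Decorator: remembers the first successful result."""

    def __init__(self, inner: Source):
        self.inner = inner
        self.value = None

    def read(self) -> str:
        if self.value is None:
            self.value = self.inner.read()
        return self.value


flaky = FlakySource("data", failures=2)
pipeline = Cached(Retrying(flaky, attempts=3))
```

Here `pipeline` is just a nested data structure until someone calls `read()` on it, and the concerns (retrying, caching) can be recombined in a different order without touching the leaf.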
But I’d like to present an example of this sort of thing in the wild: “Use the decorator pattern for clean I/O boundaries.” Here’s the concluding quote from that article:
Decorators are a great compositional pattern allowing the different concerns that inevitably cluster around I/O boundaries to be neatly separated and recombined. This opportunity presents itself several times in every app we write, and does not require any fancy language, type system, or framework. See how you go!
Databases as logs
This design technique shows up in a lot of different places. I’m just going to start pointing to several interesting ones briefly.
One of the internet’s classic great posts is “The Log: What every software engineer should know about real-time data’s unifying abstraction”. It’s not just about “real-time data,” for the record.
Every database (and journaling file system) fundamentally functions by writing out a description of every change it makes. This description (the log) has benefits for reliability, concurrency, replication, atomicity, transactions, and much more.
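As a toy illustration (not how any real database is implemented), here is a key-value store in Python where every change is first recorded as a log entry. The `Put`, `Delete`, and `LoggedStore` names are hypothetical. Because the log is just a list of action descriptions, a replica or a recovery process can rebuild the state by replaying it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Put:
    key: str
    value: int


@dataclass(frozen=True)
class Delete:
    key: str


def apply_entry(state: dict, entry) -> None:
    """Interpreter for a single log entry."""
    if isinstance(entry, Put):
        state[entry.key] = entry.value
    elif isinstance(entry, Delete):
        state.pop(entry.key, None)
    else:
        raise TypeError(f"unknown log entry: {entry!r}")


class LoggedStore:
    def __init__(self):
        self.log = []
        self.state = {}

    def commit(self, entry) -> None:
        self.log.append(entry)  # record the description first
        apply_entry(self.state, entry)  # then animate it


def replay(log) -> dict:
    """Rebuild state from nothing but the log, as a replica would."""
    state = {}
    for entry in log:
        apply_entry(state, entry)
    return state
```

Replaying a prefix of the log gives the state as of that point, which hints at why the same description serves replication and recovery.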
How do you mutate multiple remote objects at the same time? You cannot simply directly manipulate them: the series of operations wouldn’t be done atomically, for one thing.
As soon as we start going beyond very basic object-oriented CRUD, we need endpoints that can perform more complex, atomic actions. The schemas for these actions are essentially the data types describing an action.
Actions can be validated
Some kinds of action descriptions can be analyzed before they’re executed to ensure things aren’t going off the rails. A simple example is to implement centralized permissions checking. If the permissions checked are part of the action description, then it’s not possible to forget to check permissions down any particular code path, assuming the action-interpreter itself did not forget them. Any code can generate the request, but only one small piece of code validates it before actually doing it, and that makes for a smaller attack surface.
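A sketch of what that could look like in Python follows. The action types, the `grants` table, and the permission names are all assumptions made up for this example. Each action type declares the permission it needs, and the interpreter checks it in exactly one place before doing anything.

```python
from dataclasses import dataclass
from typing import ClassVar


class PermissionDenied(Exception):
    pass


@dataclass(frozen=True)
class DeleteDocument:
    actor: str
    doc_id: str
    # Declared on the type, so callers cannot leave it out or change it.
    required_permission: ClassVar[str] = "delete"


@dataclass(frozen=True)
class RenameDocument:
    actor: str
    doc_id: str
    new_name: str
    required_permission: ClassVar[str] = "edit"


def interpret(action, grants: dict, documents: dict) -> None:
    """Single choke point: validate first, then act."""
    if action.required_permission not in grants.get(action.actor, set()):
        raise PermissionDenied(
            f"{action.actor} lacks {action.required_permission!r}"
        )
    if isinstance(action, DeleteDocument):
        del documents[action.doc_id]
    elif isinstance(action, RenameDocument):
        documents[action.new_name] = documents.pop(action.doc_id)
    else:
        raise TypeError(f"unknown action: {action!r}")
```

No matter which code path built the action, it cannot reach the documents without passing through the check at the top of `interpret`.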
Analysis of action descriptions can start doing more involved things, too, like transforming actions mechanistically. Perhaps detecting a certain sort of suspect action and instrumenting it with logging. Or even to implement an optimizer, in the bigger cases.
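To give a flavor of the optimizer idea, here is a tiny peephole pass in Python over a made-up drawing language. The `Move` and `Draw` actions are hypothetical, and the sketch assumes `Move` is a relative offset, so consecutive moves can be added together and a zero move can be dropped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    dx: int
    dy: int


@dataclass(frozen=True)
class Draw:
    shape: str


def optimize(actions: list) -> list:
    """Merge adjacent relative moves and drop moves that go nowhere."""
    result = []
    for action in actions:
        if isinstance(action, Move) and result and isinstance(result[-1], Move):
            previous = result.pop()
            action = Move(previous.dx + action.dx, previous.dy + action.dy)
        if isinstance(action, Move) and action.dx == 0 and action.dy == 0:
            continue  # a zero move has no effect
        result.append(action)
    return result
```

The transformation never touches the real system. It takes a description and returns another description, so it is an ordinary pure function that is easy to test.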
Cutting off dependencies
In “Why an interface with only one implementation?” I pointed out how an interface can decouple two modules. Data types like an action description can serve the same purpose as an interface. I have already mentioned that this technique can help de-couple modules from each other. We saw an example of how this works in the case study: “What can we learn from how compilers are designed?”
The testing benefits
I already mentioned how this design can improve the testability of the code. Since the number of “actions” to be performed can be large, making these actions become pure (or purer) functions can make them a lot easier to test. Testing the interpreter in isolation is helpful, too.
But another interesting result is that the interpreters become more amenable to property testing. It’s pretty easy to generate random actions, and you can often construct simplified models of the real system under test and verify the equivalence of the real system with the model. But even without that, property testing can interact with invariant checking in stateful code in a pretty synergistic way.
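Here is a hand-rolled sketch of the model-based idea in Python, using only the standard library's `random` module (a dedicated property-testing library would add features like shrinking, which this sketch does not attempt). The "real system" is a hypothetical `SortedIntSet`, and the simplified model is Python's built-in `set`.

```python
import bisect
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Add:
    value: int


@dataclass(frozen=True)
class Remove:
    value: int


class SortedIntSet:
    """The 'real' system: a set kept as a sorted list."""

    def __init__(self):
        self.items = []

    def add(self, value: int) -> None:
        i = bisect.bisect_left(self.items, value)
        if i == len(self.items) or self.items[i] != value:
            self.items.insert(i, value)

    def remove(self, value: int) -> None:
        i = bisect.bisect_left(self.items, value)
        if i < len(self.items) and self.items[i] == value:
            del self.items[i]


def interpret(system: SortedIntSet, action) -> None:
    if isinstance(action, Add):
        system.add(action.value)
    else:
        system.remove(action.value)


def interpret_model(model: set, action) -> None:
    if isinstance(action, Add):
        model.add(action.value)
    else:
        model.discard(action.value)


def check_against_model(seed: int, steps: int = 50) -> bool:
    """Run random actions on both and compare after every step."""
    rng = random.Random(seed)
    real, model = SortedIntSet(), set()
    for _ in range(steps):
        action = rng.choice([Add, Remove])(rng.randint(0, 9))
        interpret(real, action)
        interpret_model(model, action)
        if real.items != sorted(model):
            return False
    return True
```

Generating random actions is a one-liner precisely because actions are plain data. Running `check_against_model` over many seeds exercises the interpreter far more broadly than a handful of hand-written cases would.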
You might find my past post interesting: “Designing imperative code with properties in mind”
Another interesting example of this technique in the wild is Elm’s Model-View-Update architecture.
Elm’s got an approach to building single-page applications to compete with React’s, and a big part of the style involves communicating from the “view” using a message data structure that produces a new model via an update function that animates the changes.
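The shape of that idea can be sketched outside of Elm, too. What follows is a loose Python analogue, not Elm code and not Elm's actual API. The counter `Model` and the three message types are invented for the example.

```python
from dataclasses import dataclass, replace
from functools import reduce


@dataclass(frozen=True)
class Model:
    count: int = 0


# Messages the view can send: plain data describing what happened.
@dataclass(frozen=True)
class Increment:
    pass


@dataclass(frozen=True)
class Decrement:
    pass


@dataclass(frozen=True)
class Reset:
    pass


def update(message, model: Model) -> Model:
    """Pure interpreter: takes a message and a model, returns a new model."""
    if isinstance(message, Increment):
        return replace(model, count=model.count + 1)
    if isinstance(message, Decrement):
        return replace(model, count=model.count - 1)
    if isinstance(message, Reset):
        return Model()
    raise TypeError(f"unknown message: {message!r}")


def run(messages: list, initial: Model) -> Model:
    """Fold a sequence of messages into a final model."""
    return reduce(lambda model, message: update(message, model), messages, initial)
```

The view never mutates anything. It only emits messages, and every state change in the application goes through the one `update` function.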
I’ve ended this post by rattling off a small list of examples of this kind of design in the wild, without going into too much depth on each of them. I hope that wasn’t too disjointed.