One of the themes that I’m (apparently) developing with this series is recognizing some of the psychological biases that affect how we as programmers think about program design. The very first post looked at the creative design loop and pointed out how we sometimes feel like this is unprincipled. A similar idea was alluded to in another recent post regarding how we feel about the word “engineering” and whether we measure up to that standard. (And of course, I perhaps didn’t do such a great job here with my own attempt to distinguish two styles of programming when I called them wizarding vs engineering.) A more explicit example of this theme so far is calling ourselves out on our preference for more power when it comes to abstraction design. We have to find techniques to counteract the biases that can push us in the wrong direction when it comes to design. The first step is recognizing what those biases are.
I was recently struck by another common problem we make for ourselves, and felt compelled to write about it: we routinely conflate small with simple, and big with complex.
It’s odd that we continue to do this. This is computer science, land of one-instruction computers that are nevertheless universal (Turing-complete) machines. One is a pretty small number, and it’s hard to get more complex than something that can do anything, right?
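To make the one-instruction point concrete, here’s a sketch of SUBLEQ (“subtract and branch if less than or equal to zero”), one of the classic one-instruction set computer designs. SUBLEQ itself is real; the particular conventions here (three-address instructions read from the same memory as data, halting on a negative jump target) are one common encoding, and the example program is my own illustration:

```python
def run_subleq(mem, pc=0):
    """Run a SUBLEQ program until it jumps to a negative address.

    Each instruction is three memory cells (a, b, c), meaning:
    mem[b] -= mem[a]; if the result is <= 0, jump to c, else fall through.
    That single instruction is enough for Turing-completeness.
    """
    while pc >= 0:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3
    return mem

# A tiny program: subtract the value at cell 6 from the value at cell 7,
# then halt unconditionally (cell 8 holds zero, so the second instruction
# always branches, and its target of -1 stops the machine).
mem = run_subleq([6, 7, 3,   # mem[7] -= mem[6]; fall through to pc=3
                  8, 8, -1,  # mem[8] -= mem[8] == 0, so branch to -1: halt
                  3, 10, 0]) # data: mem[6]=3, mem[7]=10, mem[8]=0
print(mem[7])  # 10 - 3 = 7
```

Ten lines of interpreter, one instruction, and yet anything computable can in principle be compiled down to it. Instruction-set size tells you almost nothing about what the machine can do.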
And big doesn’t have to be complex either. The one case where we do manage to get this right is the standard library of a programming language. Large ones are routinely praised for their “batteries included” approach, and it’s pretty well understood that you don’t need to know about things you’re not using. While some people criticize large standard libraries when they start to include poorly designed abstractions that now can’t be gotten rid of, the criticism isn’t typically that large standard libraries are too complex. Bigger is better, and makes everything easier to get done, right?
Why do we think small is simple?
One of the old Spolsky posts became relatively famous for declaring software re-writes a bad idea. In it, he hypothesized (or rather, rantingly proclaimed with certainty) that rewrites happened more often than they should because “it’s harder to read code than to write it.”
Reading code is certainly a skill, and it’s one you have to develop. I’m not sure we do it enough, or that we talk about how to go about it enough. This might be an interesting topic for a future post.
But I think there is some truth buried in the hyperbolic certainty here. Certainly, people who code got into it because they liked writing programs, or at least I hope they did. Reading code, by contrast, can feel like studying. Especially if the code wasn’t written with reading in mind, if you’re not exactly sure how to go about it, and especially if the documentation doesn’t exist to point you in the right directions. Documentation, better code organization, a mentor, and getting yourself into a position to try small experiments with the code all make it easier to develop an understanding.
So one possible reason we believe small things are simpler things is that it just seems like it’s an easier read. If reading is hard, then less reading is easier.
But I don’t think that’s the whole story. We also sometimes seem to fall into the trap of feeling like we don’t understand a system until we understand everything. This kind of perception is reinforced when we teach a language like C and immediately begin to pierce the veil and talk about stack layouts and registers. If we can’t treat a new language, framework, library, or tool like we treat a typical language standard library—something we understand we don’t need to understand all of—then larger can again seem like a burden. If incremental adoption isn’t allowed, or we just don’t allow it of ourselves, we’ve got another potential problem.
And if we feel like we can’t adopt something until we understand it all, that’s a lot of upfront effort. Time and time again we see that making it easy to start with something matters more than how easy it is to see it through. It’s one of the time management hacks that actually works for me: find a way to eliminate as many barriers as possible (no matter how small) between me and starting the thing I should be working on. Concentration is hard, sure, but just starting at all is usually the biggest barrier. Demand too much up-front effort, and even a truly simple system might not be recognized as such. An awful lot of languages and frameworks out there earned their popularity by how easy they were to get started with, not necessarily how easy they were to maintain.
How does this hurt us?
It’s a common joke that all software just accumulates until it’s overly “complex,” is then replaced by something “simpler,” which then proceeds to become just as “complex” over time. I wonder how much time we’ve collectively wasted on re-inventing the wheel, not badly, not better, just… the same old wheel design, from scratch. Maybe a new paint job.
There are likely a lot of things that play into this joke. One possible explanation may simply be this bias against larger software: believing that size is a measure of complexity.
The devil’s dictionary of programming gives us these definitions:
simple—It solves my use-case.
opinionated—I don’t believe that your use-case exists.
lightweight—I don’t understand the use-cases the alternatives solve.
I think it’s missing one: “bloated—It also solves use-cases I don’t know or care about.”
So if we really are making the same thing over and over again, in pursuit of small when the problem just isn’t small, then that’s a complete waste. We should be looking to figure out how to avoid doing that. Perhaps mistaking small for simple is part of what we’re doing wrong.
Of course, another idea I had on Twitter about this joke is that we might be looking at our own poor insight into design. Poor designs perhaps work fine when they’re small, but become cumbersome when they’re large. If that’s the case, then the same design mistakes can get made over and over, not observed for what they are simply because each time we start over, the result is small and so appears simple again! Our greedy algorithm here gets us trapped in a loop.
But another concern I have is that a good design might simply be larger at first. This scenario is visualized in the image to the right. Here we really are correct to lament the over-complexity of the existing software design, but we’re doomed to repeat its mistakes if we shoot for the simplest way to start over. We have to actually anticipate and understand all the use-cases, and foresee how things will go as we improve (and add to) our software, to really evaluate whether our new design is actually simpler, or just smaller. And if we also have an unrecognized bias towards small, we’re less likely to successfully do that.
There was some recent hubbub about whether Kubernetes was too complex. I think the article linked did a good job of outlining the twisting flow of the conversation.
I’ve felt since I first heard of Kubernetes a couple of years or more ago that it was incredibly well designed. Its success since then has been one of those rare and welcome cases of deserved success, in my opinion. It’s certainly an incredibly huge chunk of stuff to learn, though. And the setup process to stand up a cluster on your own was horrifying last I tried it. I can see why people would find it daunting. But it all comes down to the complexity of the problem being solved. A distributed, fault-tolerant container orchestration system isn’t a small thing. I hope we don’t end up confusing big with overly-complex.