Programmer as wizard, programmer as engineer

One of my goals is to demystify (at least a little) the various things that make up the context in which we’re designing code. One of my earlier essays was on the idea of system boundaries. To recap quickly, a system boundary is where design changes can become “breaking changes” for downstream (perhaps external) users of that code. Outside of a system boundary, we can easily fix our code, but on a system boundary, we can end up cursed with our past mistakes indefinitely. Or we end up cursing our users with breaking changes. (And they sometimes curse you back.)

System boundaries are interesting in part because they can change our design preferences. When I wrote about the problems with inheritance, one note I added was that inheritance really poses no problems except on system boundaries. (And likewise has fewer issues when used internally with abstract data types in contrast with objects. The key idea here being that objects are open to extension, but ADTs are not.) But this still isn’t a completely perfect characterization of when inheritance’s problems really arise.

Another major piece of context is why we’re writing our code. Are we just trying to solve a problem quickly, or are we trying to build a robust solution? I got into programming at least in part because it seemed like the closest thing to magic that really exists. So I’m going to call these two distinct styles wizarding and engineering.

These different styles have different design preferences. Tools on the engineering side tend to be more careful in their design, which often makes them somewhat less “innovative.” Languages tend to be statically typed, and more rigidly so. Proliferation of structure and boilerplate are more tolerated, because they can be written once and persist for years.

Tools on the wizarding side tend to emphasize terseness. The more that can simply be done implicitly (“magically!”), the better off the user is, not having to do it themselves.

One immediate mistake we can make here is to believe that wizarding is somehow the sloppy, unprincipled, illegitimate version of engineering. Programmers can have good (if bitter) reasons for believing this, which we’ll get to in a minute. But on the face of it, wizarding is what programming should be most like. Wizarding is fun. Wizarding is pulling out 10-line script solutions to problems that seemed intractable to others.

I think one of the overarching goals of computer science is to make more programming like wizarding. We want our computers to be human-amplifiers.

But we have different design goals and constraints when we’re doing one or the other. With engineering, we’re tasked with the long term maintenance of a piece of software, and we also become concerned with many lower-priority properties, such as its performance. With wizarding, we’re often best served by designing for ease of throwing code away rather than maintaining it. Well-engineered code is irreplaceable, but well-cast spells should be transient.

I’m not terribly happy with this figure, but at least it gets the idea across of thinking about both when we’re designing on a boundary and when we’re engineering.

Above, I try to draw a picture showing how the combinations of system boundaries and engineering/wizarding give us different design contexts. To loop back around to thinking about inheritance, it’s really only in the upper-left quadrant that we truly suffer all of inheritance’s drawbacks in full force. In the bottom left, we can fix the code. In the bottom right, we can throw away the old code. In the top right, we’ve potentially caused some “breaking changes” for our users, but since that code should be transient anyway, they can just not upgrade, or re-write, fix, or delete their code. Right?

The Problem with wizarding

With capital letters, because there really is just one problem that overshadows all others.

We start off creating something quickly using a wizarding approach rather than an engineering one.
Then we maintain it forever, and that decision comes back to bite us.

The Problem is that we’re really, really terrible at transitioning from the bottom-right to the bottom-left. That is, at initially creating a quick and dirty ~~prototype~~ MVP, and then evolving it into a well-engineered product. The design decisions made to create something quickly are occasionally poisonous to maintaining it long-term. However much fun there is in the wizarding approach, it all turns to pain when we misuse it like this.

The rehabilitation of Java

One of the interesting cultural transitions I’ve had a front seat to watch in the past decade has been people’s impressions of Java as a programming language. Back in, oh, 2004 or so, Java was solidifying its reputation as the Enterprise Programming Language for Architecture Astronauts, to be avoided at all costs by right-thinking programmers everywhere. Over the next decade we saw the rise of anything else: PHP, Python, Ruby, Scala, JavaScript, and more.

Then something weird happened. I can only assume that people suddenly found themselves maintaining applications written in many of these languages, and discovered that all these languages offered a lot of pain, too. Meanwhile, a new genre of Java development was born, one that didn’t touch Java EE with a ten-foot pole. Turns out Java has some of the best build tools, IDEs, refactorings, and so on. And you don’t have to use the stuff that makes EveryNounBuilderFactoryPattern.

In a total reversal, I now know a lot of developers who prefer to use Java.

And there’s little hope of avoiding the wizarding problem. Getting to an MVP as fast as possible will probably always be the first priority. After all, who knows if what we’re building is even a good idea? And once there, we’re almost always going to start shipping that as a product, and thus wanting to maintain and evolve it from there. We can’t throw it away, no do-overs, too expensive.

So really, we need to get better at making this transition.

What do we do?

I have no silver bullets, of course. What I want to do for the moment is just look a little at what people have done to try to cope with this situation. What do we do when something more on the wizarding side starts to drift more to the engineering side?

One of the biggest things that I think happens as a result of the wizarding problem is that testing gets a little distorted. The goal of testing should be to give us confidence that our software works. Testing is great.

But we can start to suffer “test damage” to the design of our code. We can end up with an over-proliferation of mocks. Tests can become rigid and fragile. People have certainly managed to create test suites that make it harder to maintain the code.
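As a rough illustration of what a rigid test can look like, here is a small Python sketch. The `Cart` class and both tests are hypothetical, made up for this example. The first test pins down how `total()` is implemented, and the second only pins down what it returns.

```python
from unittest import mock


class Cart:
    """Hypothetical example class, not taken from any real codebase."""

    def __init__(self, prices):
        self.prices = list(prices)

    def subtotal(self):
        return sum(self.prices)

    def total(self, tax_rate):
        return round(self.subtotal() * (1 + tax_rate), 2)


def test_total_rigid():
    # Asserts on an internal call. If total() is refactored to stop
    # calling subtotal(), this test fails even when the answer is right.
    cart = Cart([10.0, 5.0])
    with mock.patch.object(Cart, "subtotal", return_value=15.0) as fake:
        assert cart.total(0.1) == 16.5
        fake.assert_called_once_with()


def test_total_behavior():
    # Asserts only on the observable result, so it survives refactoring.
    assert Cart([10.0, 5.0]).total(0.1) == 16.5
```

Both tests pass today. The difference only shows up later, when someone changes the internals and the first test breaks for no reason the user would care about.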

Testing becomes less an assurance we haven’t made mistakes, and more a tool to help us do the reasoning we’re no longer able to do directly on the code. It becomes all-important that we can run our tests quickly, as part of an immediate feedback loop as we’re programming. Any tests that take too long become a problem, and we investigate ways to run those tests faster, even if it potentially distorts our code’s design.

Again, I want to emphasize that testing is great. (Even methodologies like test-driven development aren’t necessarily an issue.) The problem we have here is that the priorities get out of whack. If we use wizarding tools, but end up having to do engineering work with them, heavy testing is a natural way to cope, but we also risk going overboard. Testing should help us ensure we didn’t botch our reasoning, but when testing starts to replace reasoning… things are not going well.

Another lucky way we can cope with the wizarding problem is through “micro-services” or similar approaches to isolating applications from each other. It’s perfectly fine to use Rails to develop a web application. The system boundaries we get—the database schema, the HTTP API—are not usually harmed by doing so. Any other services that interact with it are generally agnostic to what language the service is implemented with. If we decide it’s become important enough that we need to rewrite it in something more suitable (probably, higher performing) that’s an option. Well, so long as the application has not grown so large that a rewrite has become infeasible. I get the impression Google has been able to migrate a lot of C++ and Python to Go using this approach.

Meanwhile, many other dynamic languages have begun growing static type systems. Python has developed mypy. For JavaScript, we have Flow and TypeScript. Gradual type systems have started to garner a lot more interest.
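To give a feel for what “gradual” means in practice, here is a minimal Python sketch. The function names are invented for illustration. One function is left unannotated in the wizarding style, and one carries annotations that a checker such as mypy can verify.

```python
def parse_port(raw):
    # No annotations: by default mypy does not type-check this body,
    # and callers see its result as the dynamic type Any.
    return int(raw)


def format_endpoint(host: str, port: int) -> str:
    # Fully annotated: a checker can verify this body and reject
    # call sites such as format_endpoint("localhost", "8080").
    return f"{host}:{port}"


endpoint = format_endpoint("localhost", parse_port("8080"))
```

The two styles coexist in one file, so annotations can be added a function at a time as a script drifts toward being a product. The annotations are not enforced at runtime; the checking happens only when a tool like mypy is run over the code.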

Facebook has dealt with their PHP legacy code problem in a rather interesting way: first developing a PHP compiler to improve performance, then developing a JIT VM, then building a PHP-like typed language on that VM called Hack. And they’re evidently planning on migrating Hack away from being especially PHP-compatible in the future.

So that’s one possible option: if you’re big enough, you can afford to create a whole new language, initially syntax compatible, but allowing you to slowly refactor things in a positive direction. It will be interesting to see how successful this effort is. One worry I’d have is that it ends up only being successful for Facebook itself. They get to co-evolve the language and their codebase, but everyone else is presented with just another programming language (and one with some funky legacy designs from PHP at that).

But regardless of whether this works well for them, that’s the kind of scale of the problem we’re coping with. Doing something that huge and expensive starts to make business sense!

An under-appreciated option

Many people write Python code. And many people use libraries in Python that are wrappers around a C native library. But a lot fewer people think holistically in terms of writing an application in “Python + C”. Off the top of my head, I think maybe the Dropbox client. Any other big ones?

I suspect this is partially a failure of our education. Foreign function interfaces all have some similarity to each other. It’s probably a topic you could teach in class without being overly language-idiosyncratic. But I’ve never heard of it getting taught. (I might write up some notes on this later.) This leaves it seeming like a mysterious and difficult thing to do, when it’s not that bad.

But this is one way in which we can casually evolve good spell-work into good engineering work. FFIs allow us to go from high-level glue code that makes up the overall application, to often lower-level engineered code that does critical work for us. Unlike service isolation where we need totally different applications talking to each other, this is something that can be done within a single application.
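For a sense of how small the first step can be, here is a minimal sketch using Python’s standard `ctypes` module to call `strlen` from the C standard library. It assumes a POSIX-like system where the C library can be loaded this way; the wrapper name `c_strlen` is made up for this example. A real application would load its own compiled library instead of libc.

```python
import ctypes
import ctypes.util

# Locate the C standard library. find_library may return None on some
# systems; on POSIX, CDLL(None) then exposes the symbols already loaded
# into the running process, which include libc.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: str) -> int:
    """Hypothetical wrapper: length in bytes of the UTF-8 encoding."""
    return libc.strlen(text.encode("utf-8"))


length = c_strlen("wizard")
```

The pattern is the same for any C function: load the library, declare argument and return types, then wrap the call in an ordinary Python function so the rest of the glue code never sees the boundary.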

Unfortunately, it’s probably not especially common that C is the engineering language you want, and C is the FFI language of nearly everything. Thinking of a typical web application, the parts we’d use C for are the parts we already have libraries for, or at least external services to use instead (caches, databases). We probably aren’t interested in rewriting any business logic in C.

Perhaps Rust will help, since it is a more attractive option compared to C. And perhaps we’ll see programming languages in the future developed as a pair: a statically-typed engineering language, and a wizarding scripting language, designed to work together. Or maybe gradual typing will work out better than I suspect and we won’t need two separate languages to do this. We’ll see.

Some end notes

I hope I’m not saying anything so controversial that I’ve inadvertently dived face-first into a useless “static vs dynamic” type war with this post. As I said, I think the wizarding approach is probably how most programming “should” be. But the only way we’ll ever really get there is if we have a solid foundation of engineering work to build on top of. And I think overwhelming experience at this point shows we have a lot of trouble transitioning smoothly from prototype to product.

If you liked today’s post, you might also like “Write code that is easy to delete, not easy to extend.” from tef’s blog a couple years ago. As a bonus, it includes some additional references to interesting papers at the end.