How did we end up with containers?

Last week we looked at Maven and the design of build tools. The key takeaways were:

The best focus is on building artifacts as a whole rather than on compiling files.
With artifacts in mind, dependency management comes into scope.
Using a fixed set of versioned dependencies means you’re using the same tested artifact as everyone else, and not putting together a never-before-seen combination and hoping nobody made a mistake.

Today, I want to look at containers. The thesis here is pretty much that containers bring these same benefits to everyone, no matter what tools or language you’re using.

The further benefits of artifacts

I’ve already brought up how building an artifact instead of an executable is beneficial because it puts all users in the same configuration. Containers have essentially the same advantage. When you distribute a container, the user gets exactly the same versions the developer used. This means that whatever testing the developers have done should apply equally to the actual software the user is running. This dramatically improves the ability to troubleshoot and provide support.

Story time: The Minecraft developers switched over to distributing Java along with Minecraft instead of requiring Java already be installed on users’ machines. Besides being a great boon to users (No Java Update nagging notifications! No forgetting to uncheck the “install malware” checkbox!) this is hugely beneficial to the developers as well. It enables them to adopt newer versions of Java as desired, without having to answer angry questions from users who haven’t upgraded. And it means that when they discover an incompatibility between some newer versions of Java and certain CPUs when doing what Minecraft does, they can simply choose to not ship those newer versions yet. In short, they get to support their users by simply shipping a working artifact instead of hoping whatever configuration ends up on users’ machines happens to work, and then having to deal with the fallout when it inevitably doesn’t.

For containers deployed to servers, this comes with added advantages to the deployment process. For one, the build process actually becomes less important. If you’re building a container artifact, testing it, and then deploying it, it matters less if your build process is held together with spit and chewing gum. As long as someone can work the magic and get a container spit out, and it works, then it works.

Obviously, a more reliable build process is better, but I’m exaggerating for effect a bit here. My point is simply that how you get containers is less important. Contrast this with a typical deployment method prior to the “devops” era. Your server has some global state, and you run a script that brings down the old service, mutates the server state to be ready for the new service, and brings it up. Hopefully. You know, unless something went wrong in there. Hope you can roll things back somehow? If you even have a rollback script, hopefully it’s tested? Of course, how can you test it any better than that deployment script, which just failed, so who knows what happens next! What fun.

With the container approach, many of these problem points happen largely at the build time for the container, instead of at deployment time in production. So now, when they appear, it doesn’t hurt. A build failed, or it produced a broken container, that’s all. We’re able to deal with this problem long before anything touches a production system. And even if something does fail when starting the service in production, “just start the old container up again” is a rollback mechanism much more likely to succeed. (Obviously, we haven’t addressed other inherently stateful things like schema migrations here, but at least we’ve reduced the scope of where problems can appear.)

Quick tip: This is surprisingly not widely known yet, at least in my experience, but one year ago, Docker added support for “multi-stage” builds. An example from the Docker documentation:

FROM golang:1.7.3 as builder
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html
COPY app.go    .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /go/src/github.com/alexellis/href-counter/app .
CMD ["./app"]

Notice first that we’re defining two containers (we have two FROMs.)
The first happens to be the build environment and gets labeled as such with as builder. (Nothing special about the name.)
The second is the artifact we’re creating for the application, and gets built binaries using COPY --from=builder.

The most obvious benefit here is that this lets you distribute a smaller container. The approach can also resolve problems with varying development environments. But there’s also a surprising insight into testing that this approach illuminates.

Typically, we’d run the unit tests during the build process, in this case from within the builder container, where we’d have the testing dependencies that we likely won’t carry over to the artifact container. But this doesn’t give us that nice benefit of testing the artifact we’re building. So this design actually casts light on an important distinction in the kinds of tests we write: there are unit tests we run as part of a build, and there are integration/system tests (Oi, more terminology we don’t use in a consistent way) that we subject the container to later.

A typical continuous integration pipeline builds, runs the unit tests, and creates the container. Downstream stages then do further testing on that artifact, before perhaps beginning an actual deploy to production.
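As a rough sketch of where the unit tests could sit, here is the same multi-stage example with a test step added to the builder stage. The `COPY . .` line and the idea that test files (say, an app_test.go) live next to app.go are assumptions for illustration; they are not part of the Docker documentation example.

```dockerfile
FROM golang:1.7.3 as builder
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html
# Assumption: the build context holds app.go plus any *_test.go files.
COPY . .
# Unit tests run inside the build environment; a failing test fails the build,
# so no artifact image is produced.
RUN go test -v .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
# Only the built binary crosses over; test dependencies stay in the builder.
COPY --from=builder /go/src/github.com/alexellis/href-counter/app .
CMD ["./app"]
```

The integration/system tests would then run against the final image from the outside, for example by starting it and talking to it over the network, which is the part a downstream pipeline stage would own.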

Decoupling dependencies

Once again, we look at dependencies as a major focus for containers. Containers obviously bundle their exact (non-external service) dependencies within the container itself. But the even more important part is the second-order effects of this. We end up in a very Maven-like situation, regardless of what tools we’re using.

Outside of the few (usually language-specific) package managers that support it, most package managers do not permit more than one version of a package to be installed. Because each container carries its own set of installed packages, that limitation stops mattering across projects: each project is isolated from the others in terms of its dependencies.

No longer do you need to ensure you get the same environment in all your applications, all your servers, and all your development machines. All these things can become decoupled from each other. That is a LOT of decoupling.

An extremely understated diagram of what environments get decoupled. Everything is free of each other’s constraints. Developers can use Ubuntu. Servers can run CentOS. Applications don’t step on each other’s toes when they need different versions of libraries, in production or in development. Freedom!

Programmatic documentation, and security

Containers typically offer an isolated-by-default experience. This is profoundly beneficial to documentation, as we don’t get working code unless we’ve accurately characterized our dependencies. Instead of finding just any old thing lying around in /usr/include or /usr/lib, we only get precisely what we asked for. This ensures the declared dependencies are accurate.

(I’ve shipped a lot of code with what I thought was good documentation until a user informed me that a script needed wget or something and this apparently wasn’t installed by default. While it’s not containers exactly, Vagrant was huge for helping me actually verify things like this. And make things more precise… turns out -headless is helpful in avoiding unnecessary dependencies for Java in Debian-derivatives.)

This default isolation also helps further document behavior. If you want to expose a port to the outside world, that needs to be documented in the container metadata, otherwise nothing will be able to get in. Likewise, if your container needs access to external resources, you need a mechanism to start passing in those configurations. This also helps discourage building these assumptions into the application.
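A minimal sketch of what that documentation can look like in a Dockerfile. The port number 8080, the DATABASE_URL variable, and the prebuilt app binary in the build context are all hypothetical names chosen for illustration.

```dockerfile
FROM alpine:latest
WORKDIR /root/
# Assumption: a statically linked binary named "app" is in the build context.
COPY app .
# Records in the image metadata that the service listens on this port.
EXPOSE 8080
# Hypothetical setting: external resources are passed in at run time
# rather than baked into the application.
ENV DATABASE_URL=""
CMD ["./app"]
```

In Docker specifically, EXPOSE only records the port in the image metadata; the port is actually opened to the outside when the container is started with `docker run -p` (or `-P`). Either way, the port has to be written down somewhere explicit before anything can get in, and the configuration arrives from outside, for example via `docker run -e DATABASE_URL=...`.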

Something else I don’t see commented upon often: I think containers offer the best security model we’ve had in a long time. Previously, besides the standard DAC permission model, we’ve had tools like SELinux to more stringently lock down the permissions of the code we run. But this has had adoption problems, in part because it’s complicated.

And things that are complicated, besides having trouble getting adopted in the first place, are also things that are easy to get wrong.

Containers offer essentially the same benefit, but in a completely obvious way. Instead of trying to label files appropriately and ensure an application does not open files it’s not labeled to access, we can confine the application inside a container that doesn’t even see files it’s not allowed to look at. The security benefit is near-identical, but the design is such that it’s obvious how everything works, and what’s wrong when something doesn’t. (There’s not some complicated security policy denying you access, you just didn’t map it into the container, duh!)
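To make the "you just didn’t map it in" point concrete, here is a small sketch. The user name app, the /data path, and the app binary are assumptions for illustration only.

```dockerfile
FROM alpine:latest
# Assumption: run the service as an unprivileged user rather than root.
RUN adduser -D app
USER app
WORKDIR /home/app
# Assumption: a prebuilt binary named "app" is in the build context.
COPY app .
# Declares the one path where outside data is expected to be mounted.
VOLUME ["/data"]
CMD ["./app"]
```

Started with something like `docker run -v /srv/app-data:/data:ro <image>` (host path hypothetical), the process sees that one host directory, read-only, and nothing else from the host filesystem. If the application can’t find a file, the first question is simply whether it was mapped in.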

So are containers great?

It’s not enough to compare containers to traditional sysadmining and say this is better in comparison. To learn more about design, the important part is why it’s better.

Here’s the thing: other than the security model, everything that makes containers good is really a workaround for our other tools doing things badly. Partially this is a long way to say: yes, containers are great, that security model bit is nothing to scoff at. But consider a better world:

There’s nothing about building native code (or any code) that prevents us from building artifacts. We’d need a build tool that’s designed to produce dpkg/rpm/other packages the way Maven is designed to produce jars, but that’s completely possible without containers.
There’s nothing about our libraries or compilers that requires us to dump everything into a pile of global state in /usr/include and /usr/lib. We could absolutely have our build tools be packaging-aware.
There’s nothing about our libraries that requires us to hard-code magic paths to /usr/share; they’re perfectly capable of figuring out where they were loaded from and finding their data files in the appropriate relative locations.
There’s nothing about our distro’s package managers that requires there to be only the global system state and no other local installations whatsoever.
There’s nothing about native packages that requires packages to be non-relocatable. They could simply be installed to /usr/pkg/arch/name/version/ by default rather than having a required installation location.

In short, there’s no reason why apt and dnf or whatever couldn’t work in a way similar to how npm or bundle do, except that Linux distributions seem to be uninterested. (And occasionally throw a specious fit about “bundling” when features like this are proposed. Or are unwilling to understand why the ability to install multiple versions of a package is necessary.) I’ll have to stop here before I write a rant.

End notes

In the past, I’ve had a few discussions with people who didn’t understand why people are excited about containers. I hope for those people this pair of essays was illuminating. There are certainly problems with (e.g.) Docker, but the underlying idea has a lot of merit. For people who already got why containers were a thing, I hope I managed to bring some interesting design aspects to your attention.

In the somewhat distant future, I might continue this line of thinking and do a case study about the design of Kubernetes. (It seemed obviously The Future even before I saw Kelsey Hightower do some amazing canary deployment demo that I can’t find the YouTube video for anymore.) But next week, I’ll get back to some more practical software design advice.