Using documentation to improve the design of software

Last week was about the impact of testing on design. I ended up a bit more preoccupied with how that impact can be negative (and how to avoid that) than with how testing can have a positive impact. So today, I want to talk about a closely related topic that usually does have a positive impact when used in the design loop.

Getting pushback before you get users

We frequently implement software with a certain design and have no reason to revisit it, or to think through why we chose it, until it’s too late. Even if we do, it can be hard to see what’s wrong with that design while we’re still fresh from knowing why we implemented it that way. It’s very easy to conclude a design is just fine because it was the easiest thing to implement, even though it’s not the easiest thing to use, without even noticing that’s what we’re doing.

Documentation is one of the best tools we have to overcome that impulse. When we have to explain how to use software, we have to try to take the user’s perspective. This can help us confront how difficult it is to use, without yet having users. Even better, while designing for testing can sometimes have drawbacks, it’s hard to imagine designing for good documentation leading to a poorer quality design.

When we discover special cases that need explanation, we can instead look for a design that lacks that special case. When we discover “gotchas” in our design, we can instead try to smooth away such lurking problems. When we discover some things are hard to illustrate with examples because of excess boilerplate, we can instead work to make that code easier to use in the common cases.

Error messages are documentation, too. The quality of error messages can vary greatly. Here’s a rough scale from worst to best:

The “incomprehensible and un-googleable” error message. (Syntax error)
The “at least it’s uniquely searchable” error message. (T_PAAMAYIM_NEKUDOTAYIM)
A brief message that kinda makes sense, but leaves you wondering what to do about it.
Unique error codes, with associated documentation explaining their subtleties. (MSVC C2001)
Links to documentation instead of an error code. (No need to search!)
Explanation of the problem in the error, no link needed! (e.g. Elm compiler errors)
Figuring out what the right thing to do is, and just doing it! (e.g. Not complaining about missing GOPATH)

Writing documentation allows you to get a handle on these things before you’ve got users. Once you have users, you have breaking changes to worry about. And of course, the system boundaries where breaking changes happen are exactly the places you should be documenting best. But documentation gets even better: you can get feedback on it from potential future users before they have the software itself.
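
To make the upper end of the error message scale concrete, here’s a minimal Python sketch. The load_config function and the ConfigError class are hypothetical, invented purely for illustration. The messages explain the problem and suggest a fix, and the missing-path case skips the complaint entirely by falling back to a default.

```python
import json


class ConfigError(Exception):
    """Hypothetical error type whose message explains the problem and a fix."""


def load_config(path=None):
    """Load a JSON config file (hypothetical example API).

    With no path, return an empty config rather than complaining:
    the "just do the right thing" end of the scale.
    """
    if path is None:
        return {}
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        # Explain the problem and what to do about it, no link needed.
        raise ConfigError(
            f"Config file {path!r} does not exist. "
            "Create it, or call load_config() with no arguments "
            "to use the defaults."
        ) from None
    except json.JSONDecodeError as e:
        # Point at the exact position rather than a bare "syntax error".
        raise ConfigError(
            f"Config file {path!r} is not valid JSON "
            f"(line {e.lineno}, column {e.colno}): {e.msg}."
        ) from None
```

Writing those messages is itself a design exercise: if the “what to do about it” half of the sentence is hard to write, that’s a hint the interface has a gotcha worth removing.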

Understanding users better

It’s a serious problem when we don’t really understand users or what they want. The usual symptom of this is to build a feature-oriented interface, rather than a task-oriented interface. Documenting our APIs, and getting feedback from users on that documentation, can help us understand what our users want.

While “feature-oriented” vs “task-oriented” most often comes up when thinking about UIs, not the software design itself, it still applies. A good example is NaCl / libsodium, which handles a lot of cryptography tasks. Handing users a bunch of primitives and letting them build their own solutions proved to be such a failure (because everyone got it wrong) that this library instead looked at what users wanted and structured things that way. Want to encrypt something so only someone with the private key can read it? No problem, just use crypto_box_seal, the documentation for which helpfully includes an example and doesn’t require you to know anything about how the encryption happens.
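
The same contrast can be sketched in a few lines of Python. To be clear, this is not libsodium’s API: the new_key, authenticate_message, and verify_message names are hypothetical, and HMAC-SHA256 from the standard library merely stands in for the idea. A feature-oriented interface would hand the caller hmac.new and leave them to pick a digest and remember to compare tags in constant time. The task-oriented version names what the user wants to do and makes those decisions for them.

```python
import hashlib
import hmac
import secrets


def new_key() -> bytes:
    """Generate a random 32-byte secret key."""
    return secrets.token_bytes(32)


def authenticate_message(key: bytes, message: bytes) -> bytes:
    """Return an authentication tag for message (HMAC-SHA256 under the hood)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_message(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a tag. The constant-time comparison is done for the caller."""
    expected = authenticate_message(key, message)
    return hmac.compare_digest(expected, tag)
```

A caller only ever needs three calls: make a key, tag a message, check a tag. None of them requires knowing which hash function was chosen or why the comparison has to be constant-time.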

Documentation and system boundaries

Documentation can potentially cause problems similar to those caused by tests. If you go overboard and document everything, you’ve created an impediment to change. Your documentation might become stale, wrong, or otherwise useless or harmful.

As with testing, any amount of documentation about a hard system boundary is helpful. We don’t want to make breaking changes to these APIs, so we shouldn’t suffer any drawbacks from having more documentation. This is not only helpful for users, but you also get to reap the benefits to your design that writing documentation can bring.

The question is what to do about the rest of your code. The design of non-system boundaries is less important, since they are always something you can refactor later. But we don’t want to make this refactoring more difficult to do.

I think we have relatively standard practices here, but I’m not sure how widespread this is, so let me know if this gels with your experience. Outside of hard system boundaries, where we have to write good documentation, I think we get two general kinds of documentation.

First, on “soft” system boundaries (non-exposed interfaces, internal to our application, but which we nevertheless wish to treat like a system boundary), we often settle for just a JavaDoc-style reference. Soft system boundaries are a bit of a spectrum. Hard system boundaries are actually exposed points of breaking changes to external users or applications. Non-system boundaries really are internal and invisible except to a narrow scope. Each of these is a category that some code either is or isn’t. But soft boundaries can vary a lot. Sometimes there will be reason to write more than just some reference documentation, but I think this is a comfortable minimum. Unlike testing, where we sometimes end up with a strong impulse to over-test, I’m not sure we frequently suffer from over-documentation.

Second, for non-system boundaries, I think the discussion changes from “documentation” to something more related to commenting techniques. There’s a large school of thought that believes in writing “clear code” that “doesn’t need comments” and for non-system boundaries, these people may have a good point. I’m fond of the idea that code should be largely self-explanatory in saying “what/how” and comments should be for adding non-obvious “whys.”
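
Here’s a small illustration of that split. The fetch_with_retry function and the rate-limiting scenario are hypothetical, made up for this example. The code itself says what happens and how, and the one comment records only the reason a reader couldn’t recover from the code.

```python
import time


def fetch_with_retry(fetch, attempts=3, delay_seconds=0.5):
    """Call fetch() until it succeeds or the attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except ConnectionError:
            if attempt == attempts:
                raise
            # Why: the (hypothetical) upstream service rate-limits bursts,
            # so an immediate retry tends to fail again. Wait a little
            # longer after each failure instead.
            time.sleep(delay_seconds * attempt)
```

A comment like “sleep, then retry” would add nothing here. The rate-limiting note is the part a future refactorer needs in order not to “simplify” the delay away.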

Kinds of documentation for system boundaries

There are roughly four types of documentation you should try to write.

Good error messages! Never forget these count as documentation, too.
Reference. This is the JavaDoc-style description of each individual exposed piece of an API.
Explanatory/Concepts/Topics. There are frequently unifying ideas that are part of an API but have no explicit code component. These need documentation too.
Examples. Sometimes this gets split up into “tutorials / how-to / guides,” but the idea is simple enough: give people code. Samples. Both introductory and more involved. If someone is asking a Stack Overflow question about your API and the best answer doesn’t involve linking to code in your documentation, you may have something missing in this department.

Each of these presents an opportunity to learn about your own design from a different perspective. Examples can be the most helpful kind of documentation for users. And examples are the category that most requires you to take the user’s perspective in order to come up with the right ones.

In the course of writing the last two kinds, you should have enough code involved that you will want some means of testing your documentation. There are a lot of ways this can happen, but it’s probably quite important that it does happen, even if it’s just the minimal “we manually sync the tested code and the code in the documentation and try not to forget.” Some languages have “literate programming” styles for writing documentation that can help, because the documentation becomes executable. Some languages will even help you extract test cases from reference documentation. Otherwise, these snippets of code risk falling out of date with the real code (which is part of the reason we generate reference documentation from comments in the code, JavaDoc-style).
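
As one concrete instance of extracting tests from reference documentation, Python’s standard doctest module runs the interactive examples embedded in docstrings and checks their output. Here’s a minimal sketch, where the slugify function is a hypothetical API invented for illustration:

```python
import doctest
import re


def slugify(title):
    """Turn a title into a URL-friendly slug.

    >>> slugify("Hello, World!")
    'hello-world'
    >>> slugify("  Docs as   tests ")
    'docs-as-tests'
    """
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


if __name__ == "__main__":
    # Runs every example found in this module's docstrings. An example
    # that no longer matches the code shows up as a reported failure.
    doctest.testmod()
```

If slugify changes behavior and the docstring isn’t updated, running the module reports the mismatch, so the reference documentation can’t silently drift from the code.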

While much of the documentation you write can benefit from extensive cross-linking, examples are a special case. While one should be able to get from an example to references pretty easily (and references themselves should contain small examples, and cross-link to more involved examples), it can sometimes be harder to find the exact example you’re looking for. So part of writing good examples may be paying some attention to search optimization. Make sure that a user googling for something will be able to find the appropriate example that answers their question. This, too, requires understanding what users will be searching for.

Some examples of good documentation

MDN is my favorite reference documentation. MDN embodies one of the deepest insights into users’ needs that I’ve ever seen: it’s a reference for every browser’s behavior, not just Mozilla Firefox. They don’t just tell you about their own implementation, they tell you everything you could want to know about any HTML, CSS, or JavaScript you want some reference documentation for. It’s a gem.

The Rust language documentation is pretty good. In particular, it has a lot of examples throughout. Click the first edition and check out the table of contents: you see both reference and topic-style documentation. If you look at the second edition, you see them de-emphasize reference style further in favor of an even more concept-oriented style.

Django also shows off all of the forms of documentation you could want. It even says so at the start. :) Click through to the “topic guides” and notice how much more there is there than in many of the other sections.

The GitHub API and Stripe documentation both seem to be widely well regarded. I haven’t had personal experience with them, however.

As an anti-example of what happens when documentation is unable to help you improve your design because it’s too late, consider git. I think the porcelain has managed to improve significantly since the bad old days, but it’s still an interface that leaves you searching for documentation to remember how to do all but the most common things, because the necessary commands are non-obvious. One of their best changes was to add output pointing out the commands for common tasks (e.g. to git status).

Further complicating things for git is that the most readily available forms of documentation are the man pages, and man pages are a terrible design for documentation. If you ever want to ensure that you write crappy documentation that will be sure to leave users confused and you prone to screaming “RTFM” over and over, then make it a man page, and follow the rules. Common cases will be cunningly buried underneath carefully alphabetical listings of command line options. (My favorite example: strftime.) Examples, if they exist at all, will be buried at the bottom of the page, but helpfully not so close to the bottom of the page that going to the end and scrolling back will necessarily be much better. It’s an art form.

A final note

All this can only have a positive impact on design if documentation is part of the design loop. This doesn’t necessarily mean documentation has to be done from the start; it just means it has to be done before it’s too late. And it means the people doing the documentation have to be empowered to push back against the design they’re documenting.