You should break the Law of Demeter

In a post from a couple weeks ago, I briefly mentioned the Law of Demeter, and noted I wasn’t a fan. I’d like to go into that today.

The “what” and “why” of the Law of Demeter

This law makes the most sense in an object-oriented context, so I’m going to describe it that way. The “Law of Demeter” suggests that using “dot” to chain from one object to the next is poor design. The rationale for this is that such code induces coupling: the code you’re writing no longer merely uses its direct dependencies, but also starts chaining down deeper and deeper into indirect dependencies.

In the previous post mentioned above, this was presented in a similar context as global variables. Whenever a global is introduced, it could be used from anywhere. That makes it harder to understand everything: there’s just that much more that any given function could be doing.

Aside: While diving into the Rust Future design discussions, this exact sort of problem threw me for a loop at first. The Future type gives no indication about what should wake it up to poll again. At first glance, it seems like it might have to hot-loop over all futures, constantly polling each one, which would be ridiculous.

Instead, at least in the initial design, there was a thread-local bit of global mutable state that was there for the currently running task. A manually implemented Future is supposed to mutate that state to register a file descriptor as something that can wake this task, which would then poll the Future.

So… it’s actually totally sensible behavior, but it’s just not immediately apparent from the types how it works. I vaguely understand that more modern revisions of the design are now passing a Waker down into the future, so that the mechanism for registering file descriptors is more obvious.

Likewise, chaining into indirect dependencies can obscure what a function does and how it works. The function is now operating on unstated objects, and so a similar problem can occur. Observing the Law of Demeter would counter that. It would prevent directly talking to more distant objects than the ones immediately at hand.

But more important is the coupling. If we need to change the design of something, we then need to update everything that depends on it accordingly. But if we’ve violated this law, that becomes harder. Instead of having to update just the direct dependencies, we might have to update their dependencies, and their dependencies’ dependencies, and so on transitively.

So “Demeter-chaining” can cause inter-module dependencies to grow like a spider web across code. This is obviously a bad thing, right?

What goes wrong and why?

Because of this general proscription against “Demeter-chaining,” the knee-jerk practice that often develops as a response is simply banning, or regarding with suspicion, any chaining of method calls. If you see a.b().c() then clearly a is missing a method that calls b().c(). So add that new x method to a and change the expression to a.x().

You should be immediately suspicious. If we’ve found a design problem, we generally need to change the design. Not just… pile things on top.

The obvious problem this practice introduces is that your simple local expression a.b().c() has now been broken up and partially sprinkled into a whole other class: that of a. What was once a simple, concise bit of code is now (potentially quite artificially) broken across modules. One obvious concern with that is now you may paradoxically have more coupling: when making a change to this code in the future, you might now also have to make changes to the class of a, too. This starts to grind against design aesthetics that praise “single responsibility.”
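To make the knee-jerk fix concrete, here is a minimal sketch of both versions side by side. The names (`Customer`, `Address`, `address_city`, and the two label functions) are hypothetical, invented only for illustration: `address()` plays the role of b(), `city()` plays c(), and `address_city()` is the added x().

```python
class Address:
    def __init__(self, city: str) -> None:
        self._city = city

    def city(self) -> str:
        # Plays the role of c() in the running example.
        return self._city


class Customer:
    def __init__(self, name: str, address: Address) -> None:
        self.name = name
        self._address = address

    def address(self) -> Address:
        # Plays the role of b(): it exposes Address to every caller.
        return self._address

    def address_city(self) -> str:
        # Plays the role of x(): added only to hide the chain.
        # A piece of the caller's logic now lives in Customer.
        return self._address.city()


def label_chained(customer: Customer) -> str:
    # a.b().c(): the whole expression stays at the use site.
    return f"{customer.name}, {customer.address().city()}"


def label_wrapped(customer: Customer) -> str:
    # a.x(): same result, but changing it may now mean editing Customer too.
    return f"{customer.name}, {customer.address_city()}"
```

Both functions produce the same label. The wrapped version only relocates the chain into `Customer`, and as long as `address()` stays public, nothing about the design has actually changed.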

Two things have gone wrong here:

Sometimes chaining is perfectly benign, and the Law of Demeter gives us no guidance about when that is. Worse, since the above problem can occur as a result of following the “law,” this kind of chaining can be emphatically good design.
The law focuses on the use-site, where the chaining occurs, but the potential design problem is always at declaration-site. The problem here is not at all that we wrote code that Demeter-chains like a.b().c(), the problem (if there is a problem) is 100% the mere existence of b() at all!

When is chaining a good thing?

There are two ways chaining can be completely benign.

In object-oriented languages, we are often presented with very poor language support for data. “Data forced to masquerade as objects” are obvious candidates for chaining being perfectly innocuous. These “objects” aren’t supposed to encapsulate anything, so don’t try to somehow force encapsulation.
Some kinds of chaining between ADTs or Objects are perfectly expected and legitimate. The correct thing to do isn’t to be suspicious of chaining, it’s to be suspicious of public outgoing dependencies of modules.
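As an illustration of the first case, here is a small sketch of “data forced to masquerade as objects.” The `Point` and `Segment` types and the `horizontal_extent` function are hypothetical examples, not taken from any particular codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


@dataclass(frozen=True)
class Segment:
    start: Point
    end: Point


def horizontal_extent(segment: Segment) -> float:
    # Chaining through plain data: these types encapsulate nothing,
    # so segment.end.x exposes no detail that was meant to stay hidden.
    return abs(segment.end.x - segment.start.x)
```

Adding an `end_x()` method to `Segment` just to avoid the second dot would buy nothing here, since the fields are the interface.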

My post describing how to think about modules introduces the distinction between a public and private dependency. A private dependency is an encapsulated one: a user of the class could have no idea the dependency exists. It’s an implementation detail. A public dependency is exposed by the module’s abstractions.

For instance, a dependency is exposed by a method on one class that returns an object of another class: a method like b().

So the question to ask is simply: should this be a public dependency? Or not? As I said, the problem isn’t the lack of an x() method to limit chaining, the problem (if any) was the mere existence of the b() method. The exposure of that type in the abstraction’s signature (for OO, a public method on a class or interface) means that type is a public dependency of the module. Public dependencies bring all kinds of downsides, hence the justifications given for the Law of Demeter. But many public dependencies are unavoidable, expected, and legitimate.
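A minimal sketch of the distinction, using hypothetical `Engine`, `CarPrivate`, and `CarPublic` classes made up for this example:

```python
class Engine:
    def __init__(self) -> None:
        self.running = False

    def start(self) -> None:
        self.running = True


class CarPrivate:
    """Engine is a private dependency: no public signature mentions it."""

    def __init__(self) -> None:
        self._engine = Engine()

    def start(self) -> None:
        self._engine.start()

    def is_running(self) -> bool:
        return self._engine.running


class CarPublic:
    """Engine is a public dependency: engine() puts it in the interface."""

    def __init__(self) -> None:
        self._engine = Engine()

    def engine(self) -> Engine:
        # The b() of the text. Callers can now write car.engine().start(),
        # and every such caller depends on Engine as well as CarPublic.
        return self._engine
```

Users of `CarPrivate` could not tell whether an `Engine` exists at all, so it can be replaced freely. Users of `CarPublic` are coupled to `Engine` the moment `engine()` is declared, whether or not anyone ever chains through it.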

What should we do instead?

I recommend completely disregarding the Law of Demeter. We could try to regard it as a suggestion, instead of a law, but that isn’t a good idea either. Anything we use as a signal of a potential problem needs to have a low false-positive rate. This is true of warnings and static analysis tools, and it’s true for design rules of thumb. If the signal cries wolf too often, it does more harm than good.

The correct things to think about, to design better code, are public dependencies. Chaining puts our attention in the wrong place. I wish we had better tools for analyzing module dependencies. It might be interesting to have a linting tool that points out any time a patch introduces a new public dependency for a class.
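As a rough idea of what such a lint might compute, here is a sketch that lists the classes named in a class’s public method signatures. It is an assumption-laden toy, not an existing tool: the function name `public_dependencies` and the example classes are hypothetical, it relies on type annotations being present, it ignores generics such as `list[Engine]`, and it assumes Python 3.9 or later.

```python
import inspect
import typing


def public_dependencies(cls: type) -> set[type]:
    """Collect classes named in the annotations of cls's public methods."""
    found: set[type] = set()
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private helpers and dunder methods
        for hint in typing.get_type_hints(member).values():
            # Keep only plain, non-builtin classes (drops int, str, NoneType).
            if inspect.isclass(hint) and hint.__module__ != "builtins":
                found.add(hint)
    found.discard(cls)  # a class mentioning itself is not a new dependency
    return found


class Engine:
    pass


class Car:
    def __init__(self) -> None:
        self._engine = Engine()

    def start(self) -> None:
        pass


class LeakyCar(Car):
    def engine(self) -> Engine:
        # The casually added getter: Engine is now a public dependency.
        return self._engine
```

Here `public_dependencies(Car)` is empty, while `public_dependencies(LeakyCar)` contains `Engine`. The signal comes from the declaration of `engine()`, not from any call site that chains through it.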

A common design error isn’t the pre-existence and use of b(), but the casual unthinking introduction of a method like b() because “well, a has that object I want but it’s private/encapsulated so I’ll just add a getter…” It’s the kind of short-term “path of least resistance” coding that causes design errors: it seems like such a simple and innocuous change, but it’s actually a big one, with far-reaching implications.

Demeter’s demon

One place where the Law of Demeter shines is looking at system boundaries. Well, okay, it still directs our attention in the wrong place (at the chaining, and not the abstraction design). But here’s a good law, one I’ll call “Demeter’s Demon” because why not?

The public dependencies of a system boundary are equally system boundaries themselves.

If we have a hard system boundary (say in the form of an interface), and it has public dependencies (like methods returning another type), then those dependencies are also hard system boundaries. External users (whose code we cannot observe or control) of the one type will have also used the other type. It’s not encapsulated, it’s exposed.

Imagine a simple red-black tree ADT. Such a class would likely have an internal Node type. If its methods can return values of that Node type, encapsulation is violated and that type too has been exposed. We can no longer change it without risking a breaking change. If its interface never mentions Node, it avoids becoming a boundary, stays internal, and can be freely changed in the future.
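Here is a sketch of the encapsulated version. To keep it short, a plain unbalanced binary search tree stands in for the red-black tree, and the names `SortedSet` and `_Node` are hypothetical. The point is only that no public signature mentions the node type.

```python
from typing import Iterator, Optional


class _Node:
    """Internal node type; never appears in SortedSet's public interface."""

    def __init__(self, key: int) -> None:
        self.key = key
        self.left: Optional["_Node"] = None
        self.right: Optional["_Node"] = None


class SortedSet:
    def __init__(self) -> None:
        self._root: Optional[_Node] = None

    def insert(self, key: int) -> None:
        if self._root is None:
            self._root = _Node(key)
            return
        node = self._root
        while True:
            if key == node.key:
                return  # already present
            side = "left" if key < node.key else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, _Node(key))
                return
            node = child

    def __contains__(self, key: int) -> bool:
        node = self._root
        while node is not None:
            if key == node.key:
                return True
            node = node.left if key < node.key else node.right
        return False

    def __iter__(self) -> Iterator[int]:
        # In-order traversal that yields keys, never nodes.
        stack: list[_Node] = []
        node = self._root
        while stack or node is not None:
            while node is not None:
                stack.append(node)
                node = node.left
            node = stack.pop()
            yield node.key
            node = node.right
```

Because callers only ever see keys, `_Node` could gain a color field, parent pointers, or be replaced by an array layout without breaking anyone. A method such as `root() -> _Node` would turn the node type into part of the boundary.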

For non-system boundaries, a public dependency may increase coupling a bit, and might make us do a bit more work to accomplish a change in the future. Oh well. But for system boundaries, an unthinkingly and erroneously exposed public dependency is a disaster.

End notes

Are there any tools out there designed to analyze code over time? I know of lots of static/dynamic analysis tools, and I know they sometimes do things like cache past warnings, so that they raise only new warnings. But it seems like that doesn’t go far enough. The “warn on new public dependencies” thing would be more like analyzing a repository than like analyzing a snapshot of the code, it seems. Though I suppose you could do the same thing by just caching information from one build to the next.
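The build-to-build caching idea could be as simple as diffing two snapshots. The sketch below assumes each build records a mapping from class name to the names of its public dependencies. That data format and the function name `new_public_dependencies` are hypothetical, and Python 3.9 or later is assumed.

```python
def new_public_dependencies(
    previous: dict[str, set[str]],
    current: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Report public dependencies in `current` that `previous` lacked."""
    report: dict[str, set[str]] = {}
    for class_name, deps in current.items():
        # A class absent from the previous build counts as having had none.
        added = deps - previous.get(class_name, set())
        if added:
            report[class_name] = added
    return report
```

With a previous snapshot `{"Car": set()}` and a current snapshot `{"Car": {"Engine"}}`, the report flags `Engine` as newly exposed by `Car`, which is exactly the moment a reviewer would want to ask whether that exposure was intended.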