Controlling module dependencies

Last week, I mentioned this bit about modules:

We often think the important things about a module are what abstractions it exposes, and what dependencies it has. But the two most important questions about a module’s design are:

What must this module NOT expose?

What dependencies must this module NOT have?

(Which creates an implied third:) What other modules must NOT depend upon this module?

When I talked about modules, I spent quite a bit of time looking at what modules consist of, but I neglected this aspect. The first of these questions has the more obvious answer, however: we’re just talking about encapsulation. That’s relatively standard stuff when it comes to software design. A module should hide details, and the best candidate for a “detail” to hide is an assumption that might change.

But the dependencies part is under-appreciated. So today I want to work a few examples.

How do we make two modules work together?

Suppose we have some perfectly fine existing code with two modules: A & B. These modules are presently totally independent (no relationship between them). We get some new requirement, and now we have to make these modules work together. How do we make that happen?

We have 4 general options.

1. We can modify the implementation of A, and introduce a dependency from A to B.
2. We can likewise modify B to depend on A.
3. We can compose the modules: introduce module C, which simply uses A & B, if they are amenable.
4. We can do dependency inversion: modify A & B to involve an interface I, then after that change we can “compose” the result and introduce our new concern in module C.

Four options for getting two modules to work together: 1) modify A, 2) modify B, 3) composition: module C simply uses A and B, or 4) modify A & B to use an interface I, module C can use the lot.

As a general rule, options 1 & 2 are the simplest. They keep the number of modules from proliferating. However, we may not want these modules to be aware of each other. This is what may drive us to options 3 & 4. As you can see in the diagram above, options 3 & 4 keep these modules independent of each other.

Option 3 is nearly as simple, but we may not be able to accomplish it. To make the modules “work together,” we might be forced to modify their implementations to accommodate that kind of composition. In a sense, that’s exactly what option 4 is: one specific kind of modification to A & B, introducing a mediating interface in a separate module, that allows that composition to happen.

A neat thing about the difference between approaches 3 & 4: if you don’t need a fancy interface, option 4 starts to look like option 3. For instance, if you’re writing in a functional language, you can use higher-order functions (which use non-nominal, structural types) instead of having to define a new separate interface. When you don’t need to declare a new interface, you don’t have to find a module to house it.
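Here is a minimal sketch of that idea in Java, using the User and Book example developed later in this post. Java's function types are still nominal interfaces, but `java.util.function.Function` ships with the standard library, so nothing new has to be declared or housed. The fields and the sample data are assumptions made purely for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Module A: mentions no Book anywhere.
final class User {
    final String name;
    User(String name) { this.name = name; }

    // The relationship arrives as a plain function, not a custom interface.
    <R> List<R> lookup(Function<User, List<R>> relation) {
        return relation.apply(this);
    }
}

// Module B: unmodified, mentions no User.
final class Book {
    final String title;
    Book(String title) { this.title = title; }
}

// Module C: the only code that knows about both User and Book.
final class BooksRead {
    // Made-up sample data, keyed by user name.
    private static final Map<String, List<Book>> READ = Map.of(
        "ada", List.of(new Book("SICP"), new Book("TAPL")));

    static List<Book> by_user(User u) {
        return READ.getOrDefault(u.name, List.of());
    }
}
```

A caller would write `new User("ada").lookup(BooksRead::by_user)`. Only the call site names both modules, which is why this ends up looking so much like option 3.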

So this is part of what I mean when I talk about “a dependency that this module should NOT have.” The simplest, most obvious, and most straightforward thing to do is to always stick with 1 & 2. We reach for the others only when we’re in a situation where introducing those dependencies is unacceptable.

Let’s be more concrete.

I’m going to write some example Java code showing each of these options. I’ll use the example of User and Book classes, where we want to introduce a relationship (“books a user has read”). To keep things brief, I have to over-simplify in a couple of ways:

This is a relationship between two classes, but usually that’s the least interesting form of dependency. The ones that matter most are between jar files (i.e. larger collections of code that are distributed and versioned together.)
This example is a bit in the vein of the relational model, and so these classes are all simple ADTs, not interfaces (or “object types” in my preferred sense of the word). Again as a result, things look a bit simple.

So, here’s option 1:

class User {
  // etc...
  List<Book> books_read() { ... }
}
class Book {
  // etc...
}

and of course, option 2:

class User {
  // etc...
}
class Book {
  // etc...
  List<User> users_who_have_read() { ... }
}

Notice how the most natural relationship changes here. When we pick one to depend on the other, we end up in a situation where one “view” on the data is preferred: either we can look at a user and see their books, or look at a book and see its users. We might even want both of these things, in which case we might start to see a mutual dependency. (Or just an awkward API: to get a user’s “books read” list, we don’t have to put a method on User; it could also just be a static method on Book.)

Let’s try option 3:

class User {
  // etc... unmodified
}
class Book {
  // etc... unmodified
}
class BooksRead {
  static BooksRead by_user(User u) { ... }
  static BooksRead by_book(Book b) { ... }
}

So one nice thing about the “composition” approach here is that we can easily support both relationships, in a consistent way, without introducing dependency loops or the like. (Again… for individual classes like User and Book from (presumably) the same package, there’s probably no issue with introducing a dependency loop. But imagine versioned jar files or libraries that you’re distributing.)

One thing you might notice about this example is that it looks exactly like how you would model this relationship with tables in a database. Not a coincidence. That’s a compositional approach.
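To make the table analogy concrete, here is a rough sketch of what the body of BooksRead might look like if it stored rows the way a join table does. It departs from the signatures above (instance methods returning lists rather than static factories), and the `Row` class, the `add` method, and the fields on User and Book are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class User {
    final String name;
    User(String name) { this.name = name; }
}

final class Book {
    final String title;
    Book(String title) { this.title = title; }
}

// Plays the role of a join table: one row per (user, book) pair.
final class BooksRead {
    private static final class Row {
        final User user;
        final Book book;
        Row(User user, Book book) { this.user = user; this.book = book; }
    }

    private final List<Row> rows = new ArrayList<>();

    void add(User u, Book b) { rows.add(new Row(u, b)); }

    // Both "views" are queries over the same rows; neither is preferred.
    // Rows are matched by object identity to keep the sketch short.
    List<Book> by_user(User u) {
        List<Book> out = new ArrayList<>();
        for (Row r : rows) if (r.user == u) out.add(r.book);
        return out;
    }

    List<User> by_book(Book b) {
        List<User> out = new ArrayList<>();
        for (Row r : rows) if (r.book == b) out.add(r.user);
        return out;
    }
}
```

User and Book stay untouched, and adding a third query later would mean touching only BooksRead.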

Option 4:

class User {
  // etc...
  // Returns the entities related to this user under relationship r.
  <U> List<U> lookup(EntityRelationship<User, U> r) { return r.lookup_first(this); }
}
class Book {
  // etc...
  // Returns the entities related to this book under relationship r.
  <T> List<T> lookup(EntityRelationship<T, Book> r) { return r.lookup_second(this); }
}
interface EntityRelationship<T, U> {
  List<U> lookup_first(T e);
  List<T> lookup_second(U e);
}
class BooksRead implements EntityRelationship<User, Book> {
  // etc...
}

Now, this is contrived in several ways, but it’s an example of how an interface could sit in the middle. It’s totally unnecessary in this situation (as our example of option 3 shows, which is clearly preferable). Our need to modify User and Book is solely to give them methods, so we don’t have to go looking for a class like BooksRead that’s somewhere else to find this data. But that’s a pretty small advantage, compared to the added complexity we see here.

This example is also contrived in that we’re using an interface to get some abstraction, but an interface is totally unnecessary here. If we were writing in C#, this example could naturally be implemented (and look more like option 3) using extension methods. We’re not actually using the dynamic dispatch of the interface here, not really.

The trouble is Book and User are really just simple (single-class) ADTs. To make this example “real,” we’d want an example where module A exports an interface with multiple implementations, and the interface we introduce here is more like a visitor, where dynamic dispatch is actually required.

When do we care about non-dependencies?

I think the above examples make it concrete what each of these options looks like. But next we might wonder: how should we choose among them? The earlier options are simpler, so why would we ever go with the less simple ones?

I’m not sure I have a good rule for this. The problem with just always doing the simple thing here is that you can end up with a tangled mess. It’s too easy to introduce a dependency without ever thinking about it. And sometimes it’s just obvious, once you actually think about it, that such a dependency should not exist. The biggest problem is just not being aware of what you’re about to do.

Let’s suppose module A is “a compiler” and module B is “an IDE plug-in.” It should seem obvious that what we want is the plug-in to depend on the compiler, and we do not want the compiler to depend on the plug-in. This is just a natural result of wanting to use the compiler in situations where an IDE plug-in is irrelevant.

I chose this example in part because this is a mistake I made in the past. It might seem odd that anyone would ever do this, but we were thinking in terms of having a “specification language” from which we “generated” a compiler. So obviously, why not also generate an IDE plug-in from that specification, right?

But “specification” is just a fancy word for “program” and “generate” is a fancy way of saying “compile.” So our compiler started getting polluted with IDE concerns.

These mistakes are usually just a case of not thinking about it. As soon as I realized what we were doing, it was obvious we should have done things differently.

We can also start to imagine what things look like for the other options. The simple composition case looks like a compiler (module A), an IDE plug-in runtime library (B), and the plug-in that uses such a library together with a compiler (C).

And if we drill into how exactly that works, we can start to see something like the dependency inversion case. How does that runtime library (B) get “composed” with the compiler? Probably by defining some interfaces that the plug-in (C) implements using the compiler (A). So this is similar, but there was no reason for the interfaces (“I”) to be separated into their own module; they could just remain together with the runtime (B).
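A rough sketch of that arrangement might look like the following. Every name here (`Compiler.check`, `Diagnostics`, `EditorRuntime`, `CompilerPlugin`) is hypothetical, and the "compiler" is a toy that only compares parenthesis counts; the point is just which module mentions which.

```java
import java.util.List;

// Module A: the compiler. It mentions nothing from the IDE side.
final class Compiler {
    // Toy check: reports a problem when '(' and ')' counts differ.
    List<String> check(String source) {
        int depth = 0;
        for (char c : source.toCharArray()) {
            if (c == '(') depth++;
            if (c == ')') depth--;
        }
        return depth == 0 ? List.of() : List.of("unbalanced parentheses");
    }
}

// Module B: the IDE plug-in runtime. It owns the interface ("I")
// and mentions nothing from the compiler.
interface Diagnostics {
    List<String> diagnose(String text);
}

final class EditorRuntime {
    private final Diagnostics diagnostics;
    EditorRuntime(Diagnostics diagnostics) { this.diagnostics = diagnostics; }

    // Called by the (imaginary) editor; returns a status-bar message.
    String onSave(String text) {
        return diagnostics.diagnose(text).size() + " problem(s)";
    }
}

// Module C: the plug-in. The only module that depends on both A and B.
final class CompilerPlugin implements Diagnostics {
    private final Compiler compiler = new Compiler();

    @Override
    public List<String> diagnose(String text) {
        return compiler.check(text);
    }
}
```

Wiring it up is a single line, `new EditorRuntime(new CompilerPlugin())`, and the compiler remains usable from a command line or build tool with no IDE code on its classpath.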

There are some more funky things we can struggle with here. If we want to keep IDE concerns separate, what about things that are really naturally written as part of the compiler? “Jump to declaration” support, for example, is something that may be best implemented as part of the compiler, but is clearly a “concern” of the IDE.

The best approach here is probably to put it into the compiler, and just to ensure that it has no actual dependencies on IDE modules. That is, you can compute the metadata to support a “jump to declaration” feature, but then leave it up to the IDE plug-in to interpret that metadata in a way that actually implements the feature for whatever IDE.
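As a hedged illustration of that split, the compiler side below produces plain data and the plug-in side turns it into an editor action. The `SourceLocation` class, the toy `let NAME` declaration syntax, and the `open file:line` command string are all invented for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Compiler side: plain metadata, no IDE types involved.
final class SourceLocation {
    final String file;
    final int line; // 1-based
    SourceLocation(String file, int line) { this.file = file; this.line = line; }
}

final class Compiler {
    // Toy analysis: a line of the form "let NAME ..." declares NAME.
    Map<String, SourceLocation> declarations(String file, String source) {
        Map<String, SourceLocation> decls = new HashMap<>();
        String[] lines = source.split("\n");
        for (int i = 0; i < lines.length; i++) {
            String[] parts = lines[i].trim().split("\\s+");
            if (parts.length >= 2 && parts[0].equals("let")) {
                decls.put(parts[1], new SourceLocation(file, i + 1));
            }
        }
        return decls;
    }
}

// Plug-in side: interprets the metadata for one particular (imaginary) IDE.
final class JumpToDeclaration {
    static String editorCommand(Map<String, SourceLocation> decls, String symbol) {
        SourceLocation loc = decls.get(symbol);
        return loc == null ? "noop" : "open " + loc.file + ":" + loc.line;
    }
}
```

A different IDE would get its own small interpreter over the same map, and the compiler would not change.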

End notes

One mechanistic way to spot inappropriate dependencies might be through code review. For example, in Java, you could look at the changes to the import lists, instead of letting your eye just skip over them.
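If eyeballing imports proves unreliable, the same check could be roughed out in code. The sketch below is an assumption about how one might do it, not something from an existing tool: it scans source text line by line for imports under a forbidden package prefix, and it is deliberately naive (it would also match an import statement inside a block comment).

```java
import java.util.ArrayList;
import java.util.List;

final class ImportCheck {
    // Returns the imported names in `source` that start with `forbiddenPrefix`.
    static List<String> forbiddenImports(String source, String forbiddenPrefix) {
        List<String> hits = new ArrayList<>();
        for (String line : source.split("\n")) {
            String t = line.trim();
            if (t.startsWith("import ") && t.endsWith(";")) {
                String name = t.substring("import ".length(), t.length() - 1).trim();
                // Treat "import static x.Y.z;" the same as a regular import.
                if (name.startsWith("static ")) {
                    name = name.substring("static ".length()).trim();
                }
                if (name.startsWith(forbiddenPrefix)) hits.add(name);
            }
        }
        return hits;
    }
}
```

For the compiler example, one might run this over the compiler's sources with a prefix such as `ide.` (a made-up package name) and fail the build when the result is non-empty.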