Last week was about concurrency vs parallelism. I believe most of the confusion about the difference between these two topics stems from the synchronous nature of the I/O APIs our operating systems offer us. These APIs tend to trick us into believing we can write sequential code when in fact everything is concurrent, and then lead us to believe we need parallelism to achieve concurrency.

So today let’s look at how to write asynchronous code.

The old tricks

Last week I mentioned that threading was often abused as the workaround for synchronous APIs. If a call might block (because it’s synchronous), then the only way to continue working while you wait is to have another thread around that execution can switch to.

But this isn’t the only option for trying to fake it, because not every API was totally synchronous. Networking APIs were always pretty decent about understanding that you might not want to stop running just because an expected packet hasn’t arrived yet.

Part of the reason we put up with this state of affairs for so long is that if you add one small hack here and a little loophole there, you can get surprisingly far. So what I’d like to describe here is probably the single biggest optimization that was ever deployed to the web.

Let’s talk about database queries in PHP. (Or at least, how it used to work. I’m a bit short of time this week, so I haven’t verified whether things have changed.)

The API you’re presented with is an apparently synchronous one: you call $result = $db->query(...) and then you can iterate over the rows in the result with $row = $result->fetch_assoc(). But quite a bit of machinery hides behind this abstraction. One of the relevant bits is simply that the query gets sent when you call query, but the $result object isn’t actually the result. It just stores the information needed to check on the result later. It’s not until you try to get a $row that you’re really in danger of blocking on the response to the query.
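To see the shape of the trick, here it is transliterated into JavaScript-flavored pseudocode. Every name below is hypothetical; the real machinery lives inside the database driver.

// Hypothetical sketch of the driver's shape, to show where blocking
// actually happens. sendOnWire and readRowsOffWire are made-up
// stand-ins for driver internals.
class ResultHandle {
  constructor(connection, sql) {
    this.connection = connection;
    this.rows = null;
    sendOnWire(connection, sql); // the query leaves immediately
  }

  fetchRow() {
    if (this.rows === null) {
      // Only here do we wait for the server's response.
      this.rows = readRowsOffWire(this.connection);
    }
    return this.rows.shift();
  }
}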

This trick might seem small, but it has an enormous effect. Most web applications need to send out a few queries to the database to render a web page, and many of these are independent. With this trick, and a little care in programming, all those queries can be sent out at the same time, instead of waiting for one to finish before sending the next.

[Figure: Synchronous vs. asynchronous independent requests. Multiple concurrent requests can significantly improve performance.]

We can see another reason for the confusion between parallelism and concurrency here: these look like they’re parallel, right? But in truth, our thread is just sleeping, doing nothing, while all those queries are in flight. We don’t need any parallelism to sleep. Not unless we’re forced to use synchronous APIs.

But this situation has many downsides:

  1. It’s obviously a hack. We continue to have concurrency hiding behind a synchronous-looking API, so it’s not obvious what’s happening. So many applications needed slight tweaking from “query/fetch/query/fetch/query/fetch” to “query/query/query/fetch/fetch/fetch” (sketched just after this list) that I suspect it used to be the easiest consulting gig out there.
  2. This trick has to be re-done for everything. I believe this trick actually happens at the “driver” level for the database. (I seem to vaguely recall that MySQL did this, but earlier on, Postgres did not, and this led to a lot of confused performance claims.) Querying a cache service? That has to do it, too. Querying another REST API? That has to do it, too.
  3. It’s extremely limited. You still have one sequential “thread” of execution: whatever you choose to block on, that’s where you block. If a different query gets its results, you can’t just go process them while you wait.
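To make the first item above concrete, here’s that consulting gig in miniature, using the same hypothetical handle API as the sketch earlier. Both versions run the same queries; only the ordering differs.

// Before: each fetchRow blocks before the next query is even sent.
const r1 = db.query(sql1);
const row1 = r1.fetchRow(); // blocks; query 2 hasn't left yet
const r2 = db.query(sql2);
const row2 = r2.fetchRow(); // blocks again

// After: both queries go out on the wire before we block on either,
// so the server works on them concurrently.
const a = db.query(sql1);
const b = db.query(sql2);
const rowA = a.fetchRow();
const rowB = b.fetchRow();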

Building applications around concurrency

In some sense, the most primitive concurrency tools are for synchronization: mutexes and the like. Certainly, whenever people talk about concurrency bugs, they generally mean errors involving the use of these mechanisms. And these are the lowest-level, highest-performance tools around.

But these concurrency tools are really oriented around parallelism. If concurrency is about asynchronous, nondeterministic events, these address an extremely narrow case. Taking a lock blocks until it’s available; it’s a synchronization primitive. The most important way to think about concurrency isn’t through the tools we use to re-extract synchronous, sequential execution from it.

More accurately, the most fundamental tool for achieving concurrency is the event loop. If we’re going to operate asynchronously, we need the ability to react to events as they come in. Everything that blocks (even locking a mutex) is just a very special-purpose event loop: it only pays attention to a single source of events, waiting for that single response.
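As a sketch of that claim, a blocking call is just a degenerate loop over one source. Here waitForNextEvent is a hypothetical stand-in for however the OS delivers events:

// Hypothetical sketch: a blocking call is an event loop that ignores
// every event except the one it's waiting for.
function blockOn(source) {
  while (true) {
    const event = waitForNextEvent(); // stand-in for the OS facility
    if (event.source === source) return event.data;
    // Any other event that arrives here can't be acted on yet.
  }
}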

If we’re going to do concurrent programming, all we need to do is embrace event loops. We just need to be able to observe and react to all relevant sources of events, and not get tunnel vision.

Designing for concurrency, then, is about figuring out how to design programs that are fundamentally driven by a main event loop.

The most obvious approach

If we’re given an event loop and asked to write programs that work with it, what do we do?

An event loop gets a notification from the OS, and then it has to handle it somehow. So the most obvious thing to do is register a notification source (perhaps a file descriptor) along with a callback that should be executed once the event has happened. The event loop can wait for any events, and once they arrive, dispatch to the associated callback. Indeed, this is something that has to happen, at some level, so this is where we should start!
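Here’s a toy version, just to make that shape concrete. It’s only a sketch under simplifying assumptions, not how any real event loop (Node’s libuv, say) is implemented: a plain array of pending events stands in for the OS notification facility, and every name is made up.

// A toy event loop: register callbacks per source, then wait for
// events and dispatch each to its callback.
const handlers = new Map(); // notification source -> callback
const pending = [];         // events "delivered by the OS"

function register(source, callback) {
  handlers.set(source, callback);
}

function deliver(source, data) {
  pending.push({ source, data });
}

function runEventLoop() {
  while (pending.length > 0) {
    const event = pending.shift();       // wait for the next event
    const callback = handlers.get(event.source);
    if (callback) callback(event.data);  // dispatch to its callback
  }
}

// Usage: register a callback, let an event arrive, run the loop.
register("socket-1", (data) => console.log("got:", data));
deliver("socket-1", "hello");
runEventLoop();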

And this is where Node.js arrived on the scene. It’s not exactly pioneering, but it deserves a decent amount of credit for mainstreaming this approach. Really taking seriously a programming model built around an event loop was part of what distinguished it from the competition.

And to a significant extent, it worked great. You won’t often catch me praising Javascript, but really, the fact that it was all Javascript was not the defining feature of the platform. Oh, and there was a seriously good amount of educational material to draw on. If you’d like to brush up on how concurrency works in Node, here’s an excellent talk in under 30 minutes.

So Node programmers dutifully wrote callbacks for their I/O calls:

const fs = require("fs");

// Here's step 1:
fs.readFile(filename, (err, data) => {
  if (err) throw err;
  // Now do step 2 here
});

And this worked, and it did so quite effectively. A big part of the reason some people began to develop delusions that Javascript was “close to the metal” and high-performance was that actually embracing asynchronous programming was extremely effective. PayPal’s (in?)famous benchmark showing a Node.js equivalent having twice the performance of a Java application sparked a huge amount of hand-wringing trying to figure out how to explain those results away.

What is going on there? In short: their Java application almost certainly used the traditional synchronous threaded model, and that model is dreadful for I/O-heavy applications. On the one hand, any time those threads are doing compute work, you don’t want more threads than CPU cores, because that causes contention, which not only increases latencies but slows throughput, too. On the other hand, any time those threads are doing I/O, you often want significantly more threads than CPU cores; otherwise threads sit blocked, unable to send out as many concurrent requests as they could have, while CPU cores are wasted idling despite there still being compute work to do.

There are various ways of losing less badly here, but there’s basically no winning. This synchronous OS-threaded model is just fundamentally broken. You either have to embrace event loops, or try to hide the event loops behind some kind of “green threading” abstraction.

I’ll have more to say about this in a future post.

But callbacks soon started to exhibit obvious design flaws. The most glaring was nested callback hell. If you have a sequence of steps to perform, each step nests inside the previous step’s callback, at another level of indentation, and wow, does that get tiring.
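It looked roughly like this (file1, file2, output, and combine are placeholders):

const fs = require("fs");

// The pyramid of doom: each sequential step nests one level deeper.
fs.readFile(file1, (err, data1) => {
  if (err) throw err;
  fs.readFile(file2, (err, data2) => {
    if (err) throw err;
    fs.writeFile(output, combine(data1, data2), (err) => {
      if (err) throw err;
      // ...and so on, one indentation level per step
    });
  });
});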

But more surprising is the problem we saw get solved previously in PHP. What if we want to make multiple asynchronous calls (like multiple DB queries), then wait for them all? A Stack Overflow answer from 2013 gives us some… er, “solutions.” Later on, we find some better solutions using library tools like async.parallel.
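The underlying pattern those tools package up is a completion counter. Here’s a sketch of it; waitForAll is my name for it, not async’s API. Each task takes a Node-style callback, and done fires exactly once: on the first error, or when the last result lands.

// Run every task, collect results in task order, and call `done`
// once, when the last task completes or the first error occurs.
function waitForAll(tasks, done) {
  if (tasks.length === 0) return done(null, []);
  const results = new Array(tasks.length);
  let remaining = tasks.length;
  let failed = false;
  tasks.forEach((task, i) => {
    task((err, result) => {
      if (failed) return;           // an error was already reported
      if (err) { failed = true; return done(err); }
      results[i] = result;          // keep results in task order
      if (--remaining === 0) done(null, results);
    });
  });
}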

So callbacks have some problems, but we did get all this nice concurrency. The trouble is that it was mostly concurrency between separate HTTP requests, because it was still slightly harder to accomplish the one intra-request thing that concurrency really helps with in a web application: sending out independent requests at the same time.

Our lazy PHP consultant had one weird trick to make applications respond faster, and unfortunately callbacks helped create a market for lazy Node consultants to do effectively the same thing for many Node applications. Sure, the concurrency was visible now, but it was still slightly tricky to make it do what you actually wanted.

And then, the promise of promises

How did the PHP trick work? The key ingredient was that the asynchronous function—the DB query—returned an object that wasn’t really the response, but only a proxy for a future response.

Neat idea. Let’s steal it, and construct a generic, re-usable, composable type for doing exactly that with any kind of async function, and expose that concurrency, making it work with our “embrace event loops” style of programming.

And now you have Promises. Or Futures. Or Tasks. Or… a few other names. But anyway.

Promises solve most of the problems we’ve talked about so far. Unlike the PHP hack, the concurrency is visible, and we can control it. We’re no longer limited to a single sequential compute thread that can only block on one thing. Unlike callbacks, we reify the control flow into a value, and we can then more readily write re-usable functions to manipulate that value. Multiple promises can easily be composed into one promise, so the “multiple DB queries” case is easier to handle.

Let’s take a look at what that looks like. This post isn’t really about Node, so I’ve simplified this into something more like pseudo-code:

const a = query1().then(processResults1);
const b = query2().then(processResults2);
const c = query3().then(processResults3);

Promise.all([a, b, c]).then(([x, y, z]) => {
  // Now we have all our results
});

This immediately sends all three queries concurrently, then runs each processResults step as soon as its query’s response comes back, in whatever order that happens, and finally uses all three results, once available, to finish the task.

The only real problem remaining here is that we still have to write something that looks a bit like a callback (to pass to then), and that reifying control flow as an object isn’t exactly nice. Our languages already have control flow; why do we suddenly have to re-implement it and start stitching things together as values?

Well, part of the reason is that our languages don’t really come with an innate “do these things in any order” operation, so a little bit of that is unavoidable. But the rest is addressed by async/await, which will have to wait for another post.

End notes

  • Upcoming topics in the next few weeks: async/await with more on promises/futures, C# & Rust’s concurrency models, Haskell, Erlang & Go’s concurrency models, a comparison of C# and Erlang, and a counter-argument about “function coloring.”