Routing PostgreSQL queries between read-write & read-only instances

Gajus Kuizinas

Fullstack Engineer
Node.js
PostgreSQL

Scaling databases is hard. However, perhaps the lowest hanging fruit is introducing read-only replicas.

A typical load balancing requirement is to route all "logical" read-only queries to a read-only instance. This requirement can be implemented in 2 ways:

  1. Create two database clients (read-write and read-only) and pass them around the application as needed.
  2. Use a middleware to assign query to a connection pool based on the query itself.

Option 1: Two distinct clients

The first option is preferable as it is the most explicit. However, it also has the most overhead to implement – adding this to an existing codebase could mean modifying thousands lines of code. The other downside of this approach is that it that it is easy to overlook when a read-only query is using a read-write database client, and while that isn't a major issue, it would be nice if the inverse it true – we would be warned if we are using the wrong connection for the query. That might be possible with a middleware that routes queries.

Option 2: Middleware

The second option routes queries based on the query itself and the context in which it was initiated. The following has to be true for us to safely route a query to a read-only instance:

  1. It must be a SELECT query.
  2. It mustn't be part of a transaction.
  3. It mustn't execute a volatile function.

The first two requirements are relatively straightforward to implement. The third one requires that we introduce a convention.

Handling volatile functions

A volatile function in PostgreSQL is any function that has side-effects (e.g. writes data) or that can change output even within a single table scan (e.g. random()). In our context, only the first part of volatile function is relevant. However, there is no way of telling just by looking at the query if it is volatile or not.

SELECT foo() # Is foo() volatile?

Therefore, we need to have a convention for marking such queries as not safe for read-only routing. There is more than one way of doing it, but we chose a very simple approach: a magic comment that marks query as volatile. Just add @volatile anywhere in the query (comment) to indicate that it has side-effects and have middleware check for that keyword.

We could also do the inverse here and use a convention to mark which queries are safe for routing to the read-only instances using @readOnly keyword. This way we could avoid checking the query for other keywords altogether and entirely rely on what the engineer instruments. However, this approach suffers from the same short-comings as the #1 option: Easy to miss opportunities and requires editing every query one-by-one.

Implementing query routing using Slonik

We are using Slonik PostgreSQL client in our project, and lucky for us, Slonik has a beforePoolConnection middleware that can be used to implement all 3 requirements. In short, beforePoolConnection is called just before the query is assigned a connection. All we have to do is create a second read-only pool and route queries to that read-only pool when all 3 conditions are satisfied. The code is mighty straightforward:

const readOnlyPool = await createPool('postgres://read-only');
const pool = await createPool('postgres://main', {
  interceptors: [
    {
      beforePoolConnection: (connectionContext) => {
        if (!connectionContext.query?.sql.trim().toUpperCase().startsWith('SELECT ')) {
          // Returning null falls back to using the DatabasePool from which the query originates.
          return null;
        }

        // This is a convention for the edge-cases where a SELECT query includes a volatile function.
        // Adding a @volatile comment anywhere into the query bypasses the read-only route, e.g.
        // sql`
        //   # @volatile
        //   SELECT write_log()
        // `
        if (connectionContext.query?.sql.includes('@volatile')) {
          return null;
        }

        // Returning an instance of DatabasePool will attempt to run the query using the other connection pool.
        // Note that all other interceptors of the pool that the query originated from are short-circuited.
        return readOnlyPool;
      }
    }
  ]
});

// This query will use `postgres://read-only` connection.
pool.query(sql`SELECT 1`);

// This query will use `postgres://main` connection.
pool.query(sql`UPDATE 1`);

A few things that are worth clarifying about the above code:

  • We are not handling SELECT INTO in this middleware. We know that our use case does not require it, so I left it out. If you are using SELECT INTO, a simple fix could be to check if the query contains INTO keyword.
  • It is worth emphasizing that if a client initiates connection using pool#connect() or pool#transaction(), then connectionContext.query is null, i.e., we are going to fall back to using the main pool.

And that's it! Short and sweet. We now have an efficient query routing mechanism that reduced the load on our main instance and allows us to better scale our service.

2022

Partner With Gajus
View Services

More Projects by Gajus