Less Work is Faster: Database Caching Done Right

Every engineer eventually confronts the complexities of scaling applications and systems. At first glance, the solution seems straightforward: add more cores, introduce replicas, or distribute workloads. However, at scale, each decision quickly compounds in complexity, introducing unexpected overhead, rising costs, and increasingly fragile systems.

The reality is that effective scaling isn’t solely about adding more resources, but also about strategically reducing redundant computations.

The Path of More Compute

The scalability journey starts early in an application’s lifecycle, with choosing how to store and retrieve the application’s data. Most applications begin with well-established databases like MySQL or Postgres, trusted for decades for their robustness and reliability. Initially, this choice is easy: these traditional databases provide durability, performance, and flexibility at near-zero cost. And as applications become more popular and user counts rise, scaling appears straightforward: add more compute resources and throughput goes up.

And indeed, in the early stages, upgrading cloud instances to higher CPU counts is both practical and cost-effective. However, effectively maintaining and growing these databases over time requires careful management of performance, reliability, automation, observability, and especially scalability. Scaling through resource addition inevitably hits limits, both technical and economic. Eventually, scaling horizontally by distributing the application across multiple machines becomes necessary. And while horizontal scaling mitigates bottlenecks at first, the complexity of such a configuration quickly mounts. Managing multiple database replicas introduces issues such as replication lag, intricate infrastructure requirements, uneven query distribution, and challenging failure recovery scenarios. 

Inevitably, at some point along this path, someone will innocently ask the very reasonable question: "Why do we need five database servers to serve a few thousand users?" It usually comes up shortly after someone else has tried to figure out whether the cloud bill has a typo. It can be a surprisingly hard question to answer. Some of the cost comes from horizontal scaling. Some of it comes from inherently non-scalable parts of the system. Some of it comes from consistency enforcement. But for many modern applications, one cost dominates all the others: doing the same work again and again.

The Path of Less Work

An alternative approach is to reduce the computational workload itself. This spans macro-level choices (like adopting faster technologies) and micro-level optimizations (such as eliminating redundant computations). The biggest savings tend to come from identifying cases where your application does work it didn't need to do in the first place. One common example is moving from a more powerful primitive to a less powerful one when the latter provides everything you need (e.g., from JOIN to EXISTS, or from stable to non-stable sorting). Another, and the one we'll focus on here, is to re-use a previous computation rather than re-computing it; in other words, caching. This is particularly effective for modern database workloads, which frequently exhibit Zipfian access distributions, where a small set of queries dominates database traffic. Efficiently caching these high-frequency queries can significantly reduce database load and resource consumption.
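
To make the Zipfian point concrete, here is a minimal Python sketch of the effect; the query population size and the Zipf-like popularity weights are illustrative assumptions, not measurements from any real workload:

```python
import random
from collections import Counter

# Simulate a skewed ("Zipf-like") query mix: query i is roughly 1/(i+1) as
# popular as the hottest query. Both constants below are made up for illustration.
DISTINCT_QUERIES = 10_000
REQUESTS = 1_000_000

weights = [1 / (i + 1) for i in range(DISTINCT_QUERIES)]
traffic = Counter(random.choices(range(DISTINCT_QUERIES), weights=weights, k=REQUESTS))

# How much of the total traffic would a cache of only the hottest queries absorb?
for cached in (10, 100, 1_000):
    hot = sum(count for _, count in traffic.most_common(cached))
    print(f"caching the top {cached:>5} queries absorbs {hot / REQUESTS:.0%} of requests")
```

The exact percentages depend on how skewed the workload really is, but the shape is always the same: a small head of hot queries accounts for a disproportionate share of the requests, so caching just those queries removes a large fraction of the database's work.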

In fact, this was such a good idea that MySQL had it built in for a while in the form of the MySQL Query Cache. The "for a while", however, is telling; the feature was removed in MySQL 8.0 with no replacement in sight. The story is an all-too-familiar one for caching systems: the invalidation strategy was too coarse-grained, the cache keys were chosen poorly, and the cache itself did not scale. That's not to say the MySQL developers are to blame: caching is easy at a glance, but once you look into the details, it's a complex affair.

You want writes to invalidate only the caches that a given write actually affects (rather than, say, emptying all the caches for a table whenever anything writes to that table). You need to mitigate thundering herds (also known as cache stampedes), either by coalescing requests or by updating the cache in place instead of invalidating it. You need to ensure that semantically equivalent but syntactically different queries share a cache entry so that cache space is used efficiently. If possible, you also want queries that share intermediate state to benefit from caching that state. You need to decide what stays in the cache over time to make the most of the compute resources you have. Not to mention that the caching itself has to scale, which is no simple feat. Ultimately, the way to minimize cache misses is to integrate the cache deeply with every level of the compute; it can't be tacked on at the end.
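
As a concrete illustration of just one item on that list, here is a minimal sketch of request coalescing (sometimes called "single-flight") for mitigating thundering herds. The in-memory dictionary, per-key locks, and compute() callback are simplifying assumptions; a real implementation would also need expiry, error handling, and invalidation:

```python
import threading

class CoalescingCache:
    """When many callers miss on the same key at once, only one of them
    recomputes the value while the rest wait for that result, instead of
    all of them stampeding the database."""

    def __init__(self):
        self._values = {}
        self._locks = {}
        self._guard = threading.Lock()

    def get(self, key, compute):
        with self._guard:
            if key in self._values:
                return self._values[key]              # cache hit
            lock = self._locks.setdefault(key, threading.Lock())

        with lock:                                    # only one thread computes
            with self._guard:
                if key in self._values:               # another thread just filled it
                    return self._values[key]
            value = compute()                         # the expensive query
            with self._guard:
                self._values[key] = value
                self._locks.pop(key, None)
            return value
```

And that covers only the stampede problem; fine-grained invalidation, query equivalence, shared intermediate state, eviction, and scaling each need comparable machinery of their own.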

The MySQL Query Cache isn't special in this regard; the exact same issues arise when trying to apply caching using memcached or Redis directly from your application logic. Either you build extensive systems for managing cached compute (like Meta's TAO), or you solve these issues ad hoc as they arise. The latter inevitably leads to an organic growth of complexity across your code base, as intertwined-but-dispersed caching logic tries to produce the right behaviour. Changing an existing database engine to intelligently manage caching throughout is similarly challenging.
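
In application code, that ad-hoc approach tends to look something like the following hypothetical sketch (the `db` query helper, the key scheme, and the TTL are assumptions for illustration). Note how the write path has to know exactly which keys the read path uses; multiply that by every query and every write path, and the dispersion problem becomes clear:

```python
import json
import redis  # assumes the redis-py client; connection details are illustrative

r = redis.Redis()
# `db` below stands in for whatever database helper the application uses.

def get_user_orders(db, user_id):
    # Read path: try the cache first, fall back to the database on a miss.
    key = f"orders:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    rows = db.query("SELECT item, total FROM orders WHERE user_id = %s", (user_id,))
    r.set(key, json.dumps(rows), ex=60)  # cache for 60 seconds
    return rows

def add_order(db, user_id, item, total):
    # Write path: must remember to invalidate exactly the keys that any read
    # path might have cached, or readers silently see stale data.
    db.execute("INSERT INTO orders (user_id, item, total) VALUES (%s, %s, %s)",
               (user_id, item, total))
    r.delete(f"orders:{user_id}")
```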

Effective caching hinges on addressing these challenges systematically, and on building the solution directly into the system that holds and queries the application’s data.

Database Caching Done Right

It's this pain point that the academic project Noria, and now Readyset, is trying to address in a more holistic fashion. Rather than wrapping caching around the database, Readyset is effectively a database implementation that integrates the cache at every level of query execution. Instead of computing query results each time a query arrives, queries are transformed into a dataflow graph of materialized views (effectively caches of intermediate query state). Updates such as INSERTs and UPDATEs flow through the graph, continuously and incrementally updating the state they pass through; a process known in the database literature as incremental view maintenance. In this model, reads (i.e., SELECTs) become mere[^its-hard] key lookups into the "leaf" materialized view of the dataflow graph. That is, they inherently hit a cache, and there’s no longer any need for ad-hoc cache-management code littered throughout the application.

[^its-hard]: There are plenty of nuances around how such a system handles missing state in materializations and how to incrementally maintain arbitrary SQL, but that's beyond the scope of this article. Suffice it to say that practical incremental view maintenance systems for OLTP workloads have proven elusive and difficult to build.
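
To make the mechanism tangible, here is a toy sketch of the idea for a single hypothetical query (the `votes` table and the query itself are made-up examples). Readyset generalizes this into a dataflow graph that handles arbitrary SQL, joins, deletes, and missing state; this toy does none of that:

```python
from collections import defaultdict

# Incrementally maintained state for one query shaped like:
#   SELECT story_id, COUNT(*) FROM votes GROUP BY story_id;
vote_counts = defaultdict(int)   # the "leaf" materialized view

def on_insert(vote_row):
    # A write flows through and adjusts only the affected key, rather than
    # invalidating everything and recomputing the query from scratch.
    vote_counts[vote_row["story_id"]] += 1

def read_vote_count(story_id):
    # A read is a plain key lookup; no SQL execution on the read path.
    return vote_counts[story_id]

on_insert({"story_id": 7, "user_id": 1})
on_insert({"story_id": 7, "user_id": 2})
print(read_vote_count(7))  # -> 2
```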

This shifts the cost of executing queries from reads to writes, and from executing the query from scratch to "re-executing" it with the benefit of knowing its previous output. Under the assumption that writes are less frequent than reads, and that most reads are for a small subset of the data, the database has to do significantly less work. If two queries are the same, the second automatically re-uses the resultset from the first. If the data changes, any affected cached resultsets are updated in-place once, and the queries against them still only hit the cache throughout[^consistency].

[^consistency]: Caching inherently comes with a consistency trade-off — you either block reads if there are concurrent updates, or you accept occasionally serving stale results. This trade-off isn't to be made lightly, although the performance benefits from accepting occasional eventual consistency are significant. This analysis from Meta is good reading on the topic.

What's particularly neat with this approach is that Readyset can be run alongside your existing database with no changes needed to the application. By piping the replication log from your main database to Readyset, all the same data becomes available in both. Then, you can use a tool like ProxySQL to send only the queries that particularly benefit from caching towards Readyset, while the remaining ones continue to hit your main database. This is attractive not just because cached queries become virtually instantaneous, but also because those queries now come at near-zero compute cost. Assuming your most voluminous queries are also the most cacheable, this frees up CPU cycles for the database to use on handling all your other queries. 
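
As a sketch of what that routing can look like, the hypothetical snippet below adds a ProxySQL query rule that sends one hot query pattern to the hostgroup where Readyset lives. The hostgroup numbers, admin credentials and port, and the matched pattern are illustrative assumptions, not a drop-in configuration:

```python
import pymysql  # assumes the PyMySQL client; any MySQL client works the same way

# Connect to ProxySQL's admin interface (the default admin port 6032 is assumed).
admin = pymysql.connect(host="127.0.0.1", port=6032,
                        user="admin", password="admin", autocommit=True)

with admin.cursor() as cur:
    # Route queries matching this digest to hostgroup 1 (assumed to be Readyset);
    # everything else keeps going to the default hostgroup (the main database).
    cur.execute("""
        INSERT INTO mysql_query_rules
            (rule_id, active, match_digest, destination_hostgroup, apply)
        VALUES
            (1, 1, '^SELECT .* FROM orders WHERE user_id', 1, 1)
    """)
    cur.execute("LOAD MYSQL QUERY RULES TO RUNTIME")
    cur.execute("SAVE MYSQL QUERY RULES TO DISK")
```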

It's worth noting that incremental view maintenance isn't a silver bullet either: you still need to think about what you query for, how you express those queries, which queries are likely to benefit from caching (and not suffer ill consequences from eventual consistency!), how your data is arranged into a schema, and so on. And you may still need caching at the application layer, such as to mitigate complex rendering costs or to memoize state from external systems. However, when it comes to reducing the cost of your database queries specifically, Readyset's incrementally maintained materialized views are the way to go.

Impact on Real-World Workloads

In a recent case study, Readyset was deployed as a caching solution for a platform that enables businesses to build, launch, and optimize online sales funnels quickly and efficiently. In this application, queries become repetitive and resource-intensive during peak traffic periods, such as holiday sales. Readyset enables these common queries to be cached and incrementally updated only when necessary, thus significantly reducing the amount of repeated work the database must perform. Benchmarking results from a real-world deployment demonstrated substantial performance improvements:

  • Database CPU utilization reduced by up to 10x during peak traffic.
  • Average query latency decreased from hundreds of milliseconds to under 10 milliseconds.
  • Infrastructure cost savings by significantly reducing the number of required read replicas.

Conclusion

Scaling systems by simply throwing more compute at the problem only works for so long; eventually, efficiency has to come from doing less work. Concretely, doing less work means recognizing when the work has already been done (i.e., caching) and doing only slightly more work to update a previously cached value (i.e., incremental maintenance). But traditional approaches to reducing workload in this way, like application-layer caching, introduce their own complexities, brittleness, and maintenance overhead. Readyset takes a different path: by pushing caching and incremental computation into a dedicated caching layer that understands queries just like the database does, it eliminates the need for custom caching logic in the application and simplifies the overall system architecture. While Readyset is complex under the hood, its impact is architectural clarity: less work, less risk, and easier scaling.