MySQL

MySQL 5.7 EOL: The End of MySQL Query Cache

6 min read

8 months ago

Back in October 2023, MySQL 5.7 reached end-of-life. With it went support for some of the earlier functionality of MySQL. No more updates, no more security patches, and, in some cases, no more customer support (or, rather, extended support continues, but at a premium).

One of the features on the chopping block was MySQL Query Cache, which has now been completely removed in the current stable major release, MySQL 8.0.

Many engineers and DBAs are happy to see the back Query Cache. But not everyone. If you are stuck on legacy MySQL 5.7, using Query Cache, and need to upgrade with the EOL, you might need help on where to move your caching. The good news is that caching has improved significantly since 2001 when version 4.0.1 introduced the MySQL Query Cache.

Let’s look at why MySQL Query Cache has been deprecated and what solutions are now available for MySQL caching.

Why MySQL Query Cache is Being Deprecated

The MySQL Query Cache was introduced to improve performance by directly caching the results of SELECT queries in MySQL. When an identical query is executed, the results can be served from the cache instead of re-executing the query, thus reducing load.

This was a great idea at the time. Query Cache worked for read-heavy workloads with expensive queries. It was instrumental in scenarios where complex queries examined millions of rows only to return a few results, such as generating drop-down lists for web forms. The Query Cache could also mask performance problems caused by missing indexes, making it helpful for novice users.

But this was before Memcache, Redis, and way before Readyset. Over time, it became apparent that the Query Cache had some serious limitations:

  1. Scalability: The Query Cache does not scale well with high-throughput workloads on multi-core machines. It can quickly become a bottleneck in such environments.
  2. Invalidation: Any modification to the underlying tables invalidates all cached results for those tables. This can lead to frequent cache invalidation and reduced effectiveness in write-heavy workloads.
  3. Byte-for-byte matching: Queries must match byte-for-byte to be served from the cache, which means that even minor differences (e.g., whitespace, comments) can result in cache misses.
  4. Non-deterministic features: Queries using non-deterministic features like temporary tables, user variables, RAND(), NOW(), and UDFs cannot be cached.
  5. Predictability: The Query Cache only improves performance for queries that hit the cache. It does not improve the predictability of performance, which is often more critical for user-facing systems than peak throughput.

Considering these limitations, the MySQL team decided to deprecate the Query Cache. It has been turned off by default since MySQL 5.6 (though it seems plenty of developers were already switching it off) and is completely removed in MySQL 8.0.

The decision to remove support for the Query Cache was based on the realization that engineering resources are finite, and investing in improvements generally applicable to all workloads is more beneficial. Moving the caching layer closer to the application, such as using ProxySQL as a man-in-the-middle cache, improves performance.

Moving Away From MySQL Query Cache

What are your options for replacing MySQL Query Cache?

Application-level Caching

You can implement caching mechanisms directly within application code. In-memory caching libraries or frameworks specific to your programming language, such as Redis, memcached, or Ehcache, can go further than query caching and is suitable for caching at various levels, including object-level, query-level, or page-level caching.

The good news is that these options usually provide fine-grained control over caching logic and allow for custom invalidation strategies. The bad news is that you will have to significantly update your application code to add a caching layer on the front end, and then you have the maintenance of the cache–invalidation, eviction, and consistency–on the back end.

Database Proxy Caching

Database proxy caching involves using a tool like ProxySQL to sit between your application and your database. The proxy will cache the results of queries and serve them to the application without having to hit the database.

This approach offers a simple way to introduce caching without changing application code. You can configure the caching policies, like setting a TTL for cached results or defining invalidation rules. As a bonus, database proxies often have additional useful features like connection pooling, query routing, and load balancing.

Database proxy caching does add another component to your architecture that needs to be managed and scaled. It may also introduce some latency for cache misses, as the proxy needs to forward the request to the database.

ORM Caching

If you're using an ORM framework like Hibernate (Java), Django (Python), or ActiveRecord (Ruby on Rails), you can leverage the built-in caching capabilities. ORM caches can store and retrieve database objects, reducing the need for repeated database queries.

The main advantage of ORM caching is that it provides transparent caching of frequently accessed entities, associations, and query results. It integrates natively with your application code, minimizing the need for manual caching logic. You also have fine-grained control over caching behavior, such as setting cache expiration and eviction policies.

The downside of ORM caching is that it's tied to your specific ORM framework, so it may not be as flexible as other caching solutions. It also requires careful configuration to ensure optimal performance and avoid stale data. You also have all the downsides of an ORM to contend with.

Materialized Views

Materialized views are a way to precompute and store the results of complex queries in your database. They are persisted in the database and can be refreshed periodically or on-demand.

The main benefit of materialized views is that they provide fast access to precomputed results, eliminating the need to repeatedly execute the underlying complex queries. They are particularly suitable for scenarios where the data is relatively static, or updates are infrequent.

However, materialized views do require careful planning and management. You need to define the appropriate refresh strategies to keep the data current. They also consume additional storage space in your database, so you must consider the trade-off between storage and performance.

Query Result Caching

Query result caching, such as Readyset, involves storing the results of expensive queries independently of your database. You use the query as the cache key and then retrieve the cached results for subsequent identical queries, bypassing the need to execute the query again.

This approach is particularly suitable for read-heavy workloads with infrequent updates to the cached data. By serving frequently accessed query results from the cache, it can significantly reduce the load on your database.

Using Readyset as a Drop-In Performance Layer

Readyset is a drop-in performance layer that integrates with your existing MySQL infrastructure without requiring changes to your application code. This makes it ideal when you're already experiencing the stress of updating your database.

A key advantage of Readyset over the MySQL Query Cache is its automatic update of query results. This eliminates manual invalidation and ensures your cached data remains consistent with the underlying database. With MySQL Query Cache, any modification to the tables would invalidate all cached results for those tables, leading to reduced effectiveness in write-heavy workloads. Readyset doesn’t have this problem, as the datagraph recomputes results with new writes.

You can leverage ProxySQL to integrate Readyset with your MySQL setup. By combining Readyset with ProxySQL, you can achieve significant performance improvements while maintaining transparency to your application.

Here's how you can use Readyset with ProxySQL:

  1. Configure ProxySQL:
    1. Adjust the monitor user in ProxySQL to match the user that Readyset and your application use to connect to the database.
    2. Add Readyset as a new hostgroup in the mysql_servers table. As a fallback in case Readyset becomes unavailable, it's a good practice to also add one or more read replicas with a lower weight.
    3. Redirect queries to the Readyset hostgroup using rules in the mysql_query_rules table. You can redirect all read traffic to Readyset or selectively redirect specific problematic queries that Readyset supports.
  2. Create caches in Readyset:
    1. For each query you want to cache, create a cache in Readyset using the CREATE CACHE FROM statement.
    2. Readyset will automatically populate the cache with the query results and keep it up to date as new database writes occur.
  3. Execute queries through ProxySQL:
    1. When your application executes a query that matches a rule in ProxySQL, the query will be redirected to Readyset.
    2. Readyset will serve the cached results if available, significantly reducing the response time compared to executing the query directly on the database.

By leveraging the combination of Readyset and ProxySQL, you can achieve the benefits of query result caching without the limitations of the MySQL Query Cache. Readyset's automatic cache invalidation and consistency guarantees ensure that your application always receives up-to-date results, even in write-heavy workloads.

To further simplify the integration of Readyset with ProxySQL, you can leverage the Readyset ProxySQL Scheduler. This scheduler automatically inspects your workload, identifies problematic queries, and caches them in Readyset. It queries the stats_mysql_query_digest table in ProxySQL to gather the list of queries for a specific hostgroup. For each query, it connects to Readyset, validates if the query is supported, and automatically creates a new cache entry if it is. The scheduler also supports a warm-up period, where it can mirror traffic for a particular query before fully switching the query rule to use Readyset as the destination. This allows Readyset to warm up the dataflow graphs responsible for handling the query, ensuring optimal performance from the start.

Readyset's integration with ProxySQL allows you to take advantage of ProxySQL's advanced features, such as query routing, connection pooling, and load balancing. This can further optimize your database performance and scalability.

By adopting Readyset as a drop-in replacement for the MySQL Query Cache, you can significantly boost your database performance, reduce query response times, scale your application more effectively, and say goodbye to your old friend MySQL Query Cache at the end of its life with confidence.

If you're still on legacy MySQL 5.7 or below, or if you need to replace the deprecated Query Cache, partner with Readyset and get a fully-managed version of Readyset Cloud in production.