Caching real GitLab queries on Azure Postgres with Readyset

GitLab on Azure Postgres is expensive once a few teams start hammering the REST API. The bill grows because Postgres has to serve the same project list, issue count, and dashboard rollup over and over again, on top of the real work (writes, search, CI metadata). The standard answer is to scale up: D4ds_v5 to D8ds_v5 to D16ds_v5, same shape, double the price at each step (check the pricing calculator).
This post walks through a different answer. We put Readyset between GitLab and an Azure Database for PostgreSQL Flexible Server, captured every byte-exact query GitLab issued during a representative API workload, decided which queries actually tolerate a few seconds of staleness, and cached only those. The result is a 62% Azure cost reduction at the same throughput, with the queries that matter (aggregate counts, list endpoints, activity feeds) running 5x to 25x faster.
What you will see in this post
- The Azure topology: GitLab CE 18.11.2 + Readyset + Azure PG Flexible.
- The cache selection rule. We did not cache everything that was cache-eligible; read-your-own-writes paths were left to proxy.
- Per-query speedups against real GitLab queries pulled from pg_stat_statements.
- The cost story: scaling Postgres down from D8ds_v5 to D2ds_v5.
- The integration playbook: every Azure-specific step, in order.
Tested on Azure
| Component | Detail |
|---|---|
| Postgres | Azure Database for PostgreSQL Flexible Server, General Purpose D8ds_v5 (8 vCore, 32 GiB), 128 GiB Premium SSD P10, PostgreSQL 18.3 |
| Readyset VM | Standard D2s v3, 2 vCPU, 8 GiB, Ubuntu 24.04, public.ecr.aws/readyset/readyset:latest-nightly, CACHE_MODE=shallow |
| GitLab VM | Same sizing as Readyset. GitLab CE Omnibus 18.11.2, external Postgres pointed at Azure PG |
| Region | Azure Canada Central (all three resources) |
| Run date | 2026-04-30 |
GitLab is real here. Omnibus install, all 1,043 tables migrated, the Rails web/API/Sidekiq stack live. The workload is a Python script hitting the REST API with a personal access token. This is not a synthetic microbenchmark.
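For a sense of what the driver does, here is a minimal sketch in the same spirit (the host, token, and endpoint list are placeholders; the real script lives in the repo linked at the end):

```python
# Minimal sketch of the API load driver: 8 workers, 180 seconds, PAT auth.
import random
import time
from concurrent.futures import ThreadPoolExecutor

import requests

BASE = "http://gitlab.example.com/api/v4"  # placeholder host
HEADERS = {"PRIVATE-TOKEN": "glpat-..."}   # personal access token (elided)
ENDPOINTS = [
    "/projects",
    "/projects/1/issues?state=opened",
    "/projects/1/merge_requests?state=opened",
    "/projects/1/pipelines",
]

def worker(deadline: float) -> None:
    # Hit random endpoints until the run window closes.
    while time.time() < deadline:
        resp = requests.get(BASE + random.choice(ENDPOINTS),
                            headers=HEADERS, timeout=30)
        resp.raise_for_status()

deadline = time.time() + 180  # 180-second run window
with ThreadPoolExecutor(max_workers=8) as pool:
    for _ in range(8):
        pool.submit(worker, deadline)
```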
The cache selection rule (read this first)
Readyset shallow caching serves cached results for a bounded TTL window (10 seconds by default), refreshed asynchronously. That trade is fine for queries the user expects to be eventually consistent: aggregate counts, project lists, label dropdowns, group activity feeds, dashboards. It is not fine for read-your-own-writes paths.
Examples we deliberately did not cache: "the issue I just closed", "the MR I just opened", "my dashboard right after a self-assign". Those should always reflect the user's most recent write, so they proxy through to Postgres.
This is the rule the rest of the post follows. Out of 250 distinct application query shapes Readyset evaluated, 208 were cache-eligible (the planner can serve them). We cached only the read-mostly, tolerates-staleness subset. The numbers below come from that subset.
Baseline: GitLab against Postgres direct
API workload: 508 HTTP requests, 8 concurrent workers, 180 seconds.
| Endpoint | n | p50 | p95 | mean |
|---|---|---|---|---|
| GET /api/v4/projects/:id/issues?state=opened | 94 | 2861 ms | 7187 ms | 3220 ms |
| GET /api/v4/projects | 90 | 5827 ms | 9985 ms | 6113 ms |
| GET /api/v4/projects/:id/merge_requests?opened | 85 | 789 ms | 2597 ms | 1083 ms |
| GET /api/v4/projects/:id | 81 | 2197 ms | 5124 ms | 2611 ms |
| GET /api/v4/projects/:id/pipelines | 76 | 848 ms | 3505 ms | 1285 ms |
| GET /api/v4/projects/:id/events | 50 | 2632 ms | 5626 ms | 3088 ms |
| GET /api/v4/groups | 22 | 1582 ms | 3757 ms | 1898 ms |
These are end-to-end HTTP latencies, including Rails, Workhorse, Puma, Sidekiq, and the SQL.
Capturing the byte-exact queries
Once pg_stat_statements is enabled and the workload has run, the top-by-impact reads come straight out of the catalog.
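A sketch of that capture query (the filter and LIMIT are our choices; mean_exec_time and total_exec_time are the column names in PostgreSQL 13 and later):

```sql
-- Top read statements by cumulative execution time.
SELECT calls,
       round(mean_exec_time::numeric, 2)  AS mean_ms,
       round(total_exec_time::numeric, 2) AS total_ms,
       query
FROM pg_stat_statements
WHERE query ILIKE 'select%'
ORDER BY total_exec_time DESC
LIMIT 20;
```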
Six of the top reads matched the cache selection rule (aggregations, sort-then-limit, recent-activity rollups). We created a Readyset cache for each.
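For illustration, one declaration of that kind (the statement here is a simplified hypothetical; the statement Readyset caches must match what GitLab actually sends, byte for byte):

```sql
-- Simplified shape; substitute the exact statement captured from pg_stat_statements.
CREATE CACHE FROM
  SELECT state_id, count(*)
  FROM issues
  WHERE project_id = $1
  GROUP BY state_id;
```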
That is the entire integration step on the application side. No GitLab code changes, no schema migrations, no ORM overrides.
Per-query results
Each query was run 100 times after warmup, on three paths: Postgres direct, Readyset transparent (no cache declared, just proxying), and Readyset cached (after CREATE CACHE FROM).
| Query | PG p50 | PG QPS | RS cached p50 | RS cached QPS | Speedup |
|---|---|---|---|---|---|
orderby_issues | 85.16 ms | 12 | 3.24 ms | 308 | 26.3x |
recent_issues_admin | 15.95 ms | 62 | 0.63 ms | 1,573 | 25.3x |
issue_count_by_state | 4.31 ms | 228 | 0.45 ms | 2,173 | 9.5x |
count_issues | 8.37 ms | 119 | 1.18 ms | 845 | 7.1x |
top_authors_recent | 4.07 ms | 243 | 0.70 ms | 1,428 | 5.8x |
issue_count_open | 2.41 ms | 407 | 0.42 ms | 2,347 | 5.7x |
The big wins are the queries that hurt Postgres most under concurrent load: ORDER BY ... LIMIT over issues, COUNT(*) GROUP BY state_id, recent-activity rollups. Caching those does double duty: faster for the user, fewer CPU cycles spent on the database.
Latency parity is also a win
If you replay every cache-eligible query (92 of them, after excluding the read-your-own-writes set), the median speedup is roughly 1x. Most of GitLab's high-call queries are Rails primary-key lookups that Postgres already serves in well under 1 ms via its index. At that scale, Readyset's pgwire overhead competes with the speed of Postgres itself, so per-query latency comes out roughly even.
That is not the same as "the cache did not help". Every query Readyset answers from cache is a query Postgres did not have to execute. Even at identical latency, the database's CPU, buffer pool, and connection slots are now free for the work only Postgres can do: writes, uncacheable analytics, and anything in the proxied bucket.
The cost story: scale Postgres down
The customer was paying for D8ds_v5 to absorb concurrent reads, not because the workload needed 8 vCores of write throughput. With Readyset serving the cached reads from RAM, Postgres can drop to a fraction of that compute.
| Setup | Compute (Canada Central, USD/mo) |
|---|---|
| Without Readyset: Azure PG D8ds_v5 + 128 GiB SSD | $593 |
| With Readyset: Azure PG D2ds_v5 + 128 GiB SSD + Readyset VM (D2s v3) | $226 |
−$367/mo · 62% saved · ≈ $4,400/year
That is $367/mo saved at the same throughput target, or roughly $4,400 per year. The Readyset VM costs $70/mo on its own. The savings come from the database tier moving from D8ds_v5 to D2ds_v5, not from any kind of license arbitrage.
Integration playbook (Azure-specific)
For shallow caching, the Azure-specific surface is small. Logical replication, replication slots, and the WITH REPLICATION role attribute are not required. Shallow caching is TTL-based; it does not stream changes from upstream.
What is required:
- `azure.extensions` allow-list must include PG_TRGM, BTREE_GIST, PG_STAT_STATEMENTS, and PLPGSQL. GitLab needs pg_trgm and btree_gist to install; pg_stat_statements is what makes the byte-exact query capture possible. It is a dynamic parameter, so no restart is needed.
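One way to set it, sketched with the Azure CLI (resource group and server name are placeholders):

```bash
# Allow-list the extensions GitLab and the capture step need.
az postgres flexible-server parameter set \
  --resource-group <rg> \
  --server-name <pg-server> \
  --name azure.extensions \
  --value PG_TRGM,BTREE_GIST,PG_STAT_STATEMENTS,PLPGSQL
```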
- `CREATE DATABASE gitlab_real` as the GitLab database.
- GitLab Omnibus installed on the GitLab VM with `postgresql['enable'] = false` and `gitlab_rails['db_*']` pointing at Azure PG.
- NSG rule on the Readyset VM allowing TCP/5433 inbound from the GitLab VM.
- Readyset container started against the Azure PG endpoint, along the lines sketched below.
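A minimal sketch of that start command (the connection string is a placeholder, and flag spellings vary by Readyset version, so check the README for the image you pull):

```bash
docker run -d --name readyset -p 5433:5433 \
  -e CACHE_MODE=shallow \
  public.ecr.aws/readyset/readyset:latest-nightly \
  --upstream-db-url "postgresql://gitlab:<password>@<pg-server>.postgres.database.azure.com:5432/gitlab_real" \
  --address 0.0.0.0:5433
```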
Pass --verify-skip if Readyset complains about SUPERUSER. Azure Flex does not grant a true SUPERUSER role; the startup check is overly strict for shallow caching.
- Repoint GitLab at the Readyset VM in `/etc/gitlab/gitlab.rb`, as sketched below.
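A sketch of the relevant Omnibus settings (the IP is a placeholder for the Readyset VM; database name, user, and password stay exactly as they were for Azure PG):

```ruby
# Send GitLab's SQL through Readyset on port 5433 instead of straight to Azure PG.
gitlab_rails['db_host'] = '10.0.0.5'   # placeholder: Readyset VM private IP
gitlab_rails['db_port'] = 5433
```

Then apply with `gitlab-ctl reconfigure`.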
GitLab is now reading and writing through Readyset. Without any caches declared, Readyset is a transparent proxy and every query is forwarded to Azure PG. That is your baseline. Now you can declare caches one at a time, watching the speedups roll in.
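From a psql session on the Readyset port, the declare-and-watch loop looks roughly like this (SHOW PROXIED QUERIES and SHOW CACHES are Readyset SQL extensions; output columns vary by version):

```sql
-- What Readyset is currently forwarding, and whether each shape is cache-eligible.
SHOW PROXIED QUERIES;

-- Promote one shape to a cache, then confirm it is live.
CREATE CACHE FROM SELECT count(*) FROM issues WHERE project_id = $1;
SHOW CACHES;
```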
Caveats
Shallow caching is bounded staleness, not zero staleness. The default TTL is 10 seconds. Users who interact with their own writes (open an MR, close an issue, update a label) must hit a non-cached path so they always read their own writes. This is a per-query design decision, not a switch.
Latency parity is the floor, not the ceiling. On hot primary-key lookups Readyset and Postgres land within a millisecond of each other. The database wins on raw QPS for those because there is no protocol round trip beyond the index lookup. The point of caching them is not latency; it is taking that load off Postgres so the database tier can shrink.
Azure Flex's SUPERUSER limitation is real. --verify-skip works around it for shallow mode. If you ever want to test deep mode (this post does not), you would also need a WITH REPLICATION role grant and wal_level=logical, which Azure Flex supports but adds two static parameters and a server restart.
Not every cache-eligible query is worth caching. The 92 cache-eligible queries replayed end-to-end produced a median speedup of 1x. Caches have a memory cost and a TTL refresh cost. Pick the queries that hurt Postgres most under concurrency, and leave the rest as transparent proxy traffic.
Wrap up
The interesting result is not the 26x speedup on orderby_issues. The interesting result is that the same workload, on the same Azure region, on the same data, can run on a Postgres tier 4x smaller as long as Readyset is in front of it, making the cache selection decisions. The $367/mo saving is what falls out of treating shallow caching as a deliberate per-query decision, not a feature flag.
If you want to run the same setup yourself, the demo lives at github.com/vgrippa/mydemos under readyset-gitlab/. The Python load driver, the cache definitions, the benchmark scripts, and the run logs from this post are all there.
You can download Readyset here: github.com/readysettech/readyset