SPQR Architecture in 4 Minutes: Router, Coordinator, Shards

SPQR is a proxy-based sharding layer for PostgreSQL. Applications keep using the PostgreSQL protocol, while SPQR decides which shard should receive each transaction.

TL;DR

Think about SPQR as three layers:

Routers accept PostgreSQL client connections and route transactions.
Coordinator and QDB manage and distribute sharding metadata.
PostgreSQL shards store the actual application data.

This keeps topology logic out of application code, lets routers scale horizontally, and keeps PostgreSQL as the system of record on every shard.

The current design did not appear immediately. Earlier experiments used PostgreSQL extension mechanisms such as FDW and planner-level integration. Those approaches were natural, but they came with too much overhead at high shard counts, high QPS, or major PostgreSQL upgrade time. SPQR took a different path: keep PostgreSQL unchanged and put a lightweight router between applications and shards.

Router Instead of Custom Drivers

The router is the central idea. Applications connect to it over the regular PostgreSQL wire protocol. From the client side, the connection still looks PostgreSQL-like: no custom database driver, no application-side shard map, no new API.

Application connects to an SPQR router, which routes traffic to PostgreSQL shards — Applications talk to SPQR routers. Routers route transactions to PostgreSQL shards.

The router inspects the beginning of a transaction and chooses the shard that should execute the whole transaction. This is important: SPQR is optimized for OLTP workloads where a transaction can usually be routed by a stable shard key such as user_id, tenant_id, or a composite key.

If the key is present in the query, the router can derive the target shard automatically. If the key is not obvious from SQL, an application can pass it explicitly with an SPQR hint:

INSERT INTO orders(id, data) VALUES (10, '...')
/*__spqr__sharding_key: 1, 100*/;

For queries that cannot be mapped to a specific shard, the behavior is configurable. A deployment can route them to a default shard, reject them, or require a more explicit routing decision depending on how strict the workload needs to be.

Many Routers, One Metadata View

A single router would become both a bottleneck and a single point of failure. SPQR avoids that by allowing many routers to run in parallel. Routers are stateless with respect to application data: they do not store user rows and can be added or removed as traffic changes.

Several SPQR routers running in parallel in front of PostgreSQL shards — Multiple routers can serve application traffic at the same time.

The challenge is metadata. Every router needs to know the same sharding rules: which distributions exist, which key ranges belong to which shard, and where the actual PostgreSQL endpoints are. Managing that independently on every router would quickly become unsafe.

Coordinator and QDB

SPQR separates the data path from the control plane. Routers handle application traffic. The coordinator manages cluster metadata and writes it into QDB, a small metadata store backed by etcd in typical deployments.

SPQR coordinator stores metadata in QDB and distributes rules to routers — Coordinator and QDB keep routers aligned on the current sharding metadata.

This split lets operators add routers dynamically without copying configuration by hand. A router can keep sharding rules in memory for fast decisions, while the authoritative view remains in QDB and is coordinated through the control plane.

What Happens to PostgreSQL Shards

Shards remain ordinary PostgreSQL clusters. SPQR does not replace PostgreSQL storage and does not ask teams to adopt a new database engine. Each shard can use the operational patterns teams already know: managed PostgreSQL, Patroni, Stolon, or another high availability setup.

That is the practical trade-off. SPQR does not try to make every possible distributed SQL query look like it runs on a single PostgreSQL instance. Instead, it focuses on a common production shape: high-throughput OLTP traffic that can be routed to one shard by a key.

At this point, the natural question is how to get from one PostgreSQL cluster to 4, 8, 16, 32, 128 shards and beyond without turning operations into a manual migration project. That is where data movement, balancing, and shard management come in. We will cover those parts in the next articles. Stay tuned, don't hesitate to star us on GitHub, and peace.