Replication
Replication in CrateDB ensures data redundancy, high availability, and potentially faster reads by keeping multiple copies of each shard across different nodes.
When replication is enabled:
Each shard has one primary and zero or more replicas.
Writes always go to the primary shard.
Reads can go to either the primary or any replica.
Replicas are kept in sync with primaries via shard recovery.
If a primary shard is lost (e.g., node failure), CrateDB promotes a replica to primary, minimizing the risk of data loss.
More replicas → higher redundancy and availability, but also more disk usage and intra-cluster network traffic.
Replication can also improve read performance: with more copies of each shard distributed across nodes, queries can be executed in parallel on different nodes.
Configuring Replication
You configure replication per table using the number_of_replicas setting.
Example:
CREATE TABLE my_table (
first_column INTEGER,
second_column TEXT
) WITH (number_of_replicas = 0);
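You can verify the setting afterwards; a quick check against the number_of_replicas column exposed by information_schema.tables:
-- Show the configured replica setting for the table.
SELECT table_name, number_of_replicas
FROM information_schema.tables
WHERE table_name = 'my_table';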
Fixed Replica Count
Set number_of_replicas to an integer for a fixed number of replicas per primary shard.
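For example, a minimal sketch (using a hypothetical events table) that keeps exactly two replicas per primary shard:
CREATE TABLE events (
  id BIGINT,
  payload TEXT
) WITH (number_of_replicas = 2);
With this setting, the cluster needs at least three nodes to reach green health (1 primary + 2 replicas).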
Replica Ranges
You can specify a range using "min-max".
This allows CrateDB to adjust the number of replicas dynamically as cluster size changes.
0-1
Default. One node → 0 replicas. Multiple nodes → 1 replica.
2-4
At least 2 replicas are required for green health; up to 4 replicas are used if enough nodes are available. Fewer than 3 nodes (1 primary + 2 replicas) → yellow health.
0-all
One replica per additional node (excluding the primary’s node).
If not set, CrateDB defaults to 0-1.
You can change number_of_replicas at any time with ALTER TABLE.
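For example (the range value here is illustrative):
-- Switch my_table from a fixed count to a replica range.
ALTER TABLE my_table SET (number_of_replicas = '1-3');
-- Revert to the default (0-1).
ALTER TABLE my_table RESET (number_of_replicas);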
See also: CREATE TABLE: WITH clause
Shard Recovery
If a node becomes unavailable:
Promote replicas to primaries for lost primaries.
Rebuild missing replicas from the new primaries until the configured number_of_replicas is met.
This ensures continuous availability and partition tolerance, with some consistency trade-offs. See: CAP theorem
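While recovery is running, you can watch its progress through sys.shards; a sketch, assuming the recovery object column and the routing_state values documented for that table:
-- Show shards that are currently being recovered and how far along they are.
SELECT table_name, id, recovery['stage'] AS stage,
       recovery['size']['percent'] AS percent_recovered
FROM sys.shards
WHERE routing_state IN ('INITIALIZING', 'RELOCATING');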
You can control shard placement with Allocation Settings.
Underreplication
Underreplication occurs when CrateDB cannot fully allocate the configured number of replicas because there aren’t enough suitable nodes.
Rules:
No node ever stores two copies of the same shard.
For 1 primary + n replicas, you need n + 1 nodes to fully replicate.
If not met → table enters underreplication → yellow health.
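You can spot affected tables via sys.health, which also reports an underreplicated_shards count:
-- List tables whose replicas could not all be allocated.
SELECT table_name, health, underreplicated_shards
FROM sys.health
WHERE health != 'GREEN';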
Live Scaling Example
Goal: Start with minimal replication and scale out while letting CrateDB automatically increase redundancy.
1. Start Small
CREATE TABLE orders (
id INT PRIMARY KEY,
customer TEXT,
amount DOUBLE
) WITH (
number_of_replicas = '0-2'
);
Cluster size: 1 node
Effective replicas: 0 (writes go only to the primary shard)
Reason: With one node, replicas can’t be placed elsewhere.
2. Add a Second Node
CrateDB automatically detects the new node and increases replicas within the configured range:
Cluster size: 2 nodes
Effective replicas: 1
Result: One primary + one replica per shard → green health.
Check:
SELECT table_name, COUNT(*) FILTER (WHERE "primary" = false) AS replicas
FROM sys.shards
WHERE table_name = 'orders'
GROUP BY table_name;
3. Add a Third Node
Cluster size: 3 nodes
Effective replicas: 2
Result: One primary + two replicas spread across three nodes.
The system automatically balances shard placement to avoid putting multiple copies of the same shard on a single node.
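To see how the copies are spread out, you can group sys.shards by node (a sketch, assuming the node object column in sys.shards):
-- Count primary and replica shards per node for the orders table.
SELECT node['name'] AS node_name,
       COUNT(*) FILTER (WHERE "primary") AS primaries,
       COUNT(*) FILTER (WHERE NOT "primary") AS replicas
FROM sys.shards
WHERE table_name = 'orders'
GROUP BY node['name'];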
4. Node Failure Scenario
If one node goes offline:
CrateDB promotes replicas to primaries as needed.
Missing replicas are rebuilt from the primaries once the failed node returns or another node is available.
Check replication health:
SELECT health, severity, missing_shards
FROM sys.health
WHERE table_name = 'orders';
5. Back to Full Health
Once the cluster regains enough nodes, CrateDB restores full replication automatically, within the 0-2 limit.
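Re-running the health check should then report green again:
-- Expect health = 'GREEN' and no missing shards once replication has caught up.
SELECT health, missing_shards
FROM sys.health
WHERE table_name = 'orders';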
✅ Outcome: By setting a replica range instead of a fixed count, replication adapts to cluster size without manual reconfiguration, improving both resilience and resource usage.