Rolling Upgrade
CrateDB supports rolling upgrades, which let you upgrade your cluster with zero downtime by upgrading one node at a time.
A rolling upgrade is possible only between compatible versions—typically between consecutive feature releases. Some examples:
✅ You can do a rolling upgrade from X.5.z to X.6.0
✅ You can do a rolling upgrade from the last feature release of version X to the first release of version X+1
❌ You cannot upgrade directly from X.5.x to X.8.x unless explicitly stated in the release notes
Warning: Rolling upgrades are only supported for stable releases. If you are upgrading to a testing version, you must perform a full cluster restart. Always consult the release notes of your target version for specific upgrade guidance.
How It Works
Rolling upgrades involve stopping and upgrading one node at a time using CrateDB’s graceful stop mechanism. This ensures ongoing operations complete before the node shuts down.
Graceful Stop Behavior
The node stops accepting new requests
It completes all in-progress operations
It then reallocates shards based on your availability configuration
Note: Due to CrateDB’s distributed nature, some client requests may fail temporarily during a rolling upgrade.
Data Availability Options
CrateDB offers three levels of minimum data availability during a graceful stop, configurable via the cluster.graceful_stop.min_availability setting:
full: All primary and replica shards are moved off the node; the cluster stays green
primaries: Only primary shards are moved; replicas stay; the cluster may go yellow
none: No guarantees; the node stops even if data becomes temporarily unavailable; the cluster may go red
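You configure this with a cluster setting. As a minimal sketch, the following requires full availability during graceful stops (TRANSIENT settings reset on a full cluster restart; use PERSISTENT if the value should survive one):

```sql
-- Require full data availability during graceful stops.
SET GLOBAL TRANSIENT "cluster.graceful_stop.min_availability" = 'full';
```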
Requirements
For full Minimum Availability
Your cluster must have enough nodes and disk space to hold the full replica count even after one node shuts down.
Rule of thumb:
number_of_nodes > max_number_of_replicas + 1
Examples:
If a table has 1 replica, you need at least 3 nodes
If a table allows a range of replicas (e.g., 0-1), CrateDB uses the maximum number for allocation logic
If the requirements are not met, the graceful stop will fail.
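To check how each table is configured before stopping a node, a query along these lines can help (a sketch; number_of_replicas is stored as text because it may be a range such as '0-1'):

```sql
-- Inspect configured replica counts for user tables.
SELECT table_schema, table_name, number_of_replicas
FROM information_schema.tables
WHERE table_schema NOT IN ('sys', 'information_schema', 'pg_catalog');
```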
For primaries Minimum Availability
Ensure that enough shards remain to maintain write consistency.
By default, CrateDB requires a quorum of active shards:
quorum = floor(replicas / 2) + 1
Note: If a table has 1 replica, a single active shard (primary or replica) is enough for writes to succeed.
Rolling Upgrade Procedure
Warning: Before starting, back up your data using snapshots.
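For example, assuming a snapshot repository named backups already exists, a pre-upgrade snapshot could look like this (names are placeholders):

```sql
-- Snapshot all tables into an existing repository before touching any node.
CREATE SNAPSHOT backups.pre_upgrade ALL WITH (wait_for_completion = true);
```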
Step 1: Disable Allocations (Optional)
To prevent CrateDB from reallocating shards while nodes are offline, temporarily restrict routing:
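For example (a sketch; 'new_primaries' still allows primary shards of newly created tables to be allocated while other shard movement is paused):

```sql
-- Pause regular shard reallocation while nodes are taken offline.
SET GLOBAL TRANSIENT "cluster.routing.allocation.enable" = 'new_primaries';
```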
Skip this step if you are using min_availability = full, as CrateDB will handle shard movement internally.
Step 2: Gracefully Stop the Node
Use the DECOMMISSION SQL command to initiate a graceful shutdown:
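For example (the node name is a placeholder; use the name or ID of the node you are about to stop):

```sql
-- Gracefully decommission one node; shards are moved according to min_availability.
ALTER CLUSTER DECOMMISSION 'node-1';
```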
A graceful stop moves shards off the node according to the min_availability setting and ensures ongoing operations complete before the node shuts down.
Avoid stopping nodes with plain termination signals (e.g., Ctrl+C or systemctl stop) unless you explicitly want a non-graceful shutdown
Monitor Reallocation
You can track shard reallocation progress with:
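One option is to watch for shards that are still moving, e.g. via the sys.shards table (a sketch; the query should eventually return no rows once reallocation has finished):

```sql
-- Shards that are currently relocating or initializing.
SELECT schema_name, table_name, id, routing_state
FROM sys.shards
WHERE routing_state IN ('RELOCATING', 'INITIALIZING');
```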
Note: When using the Admin UI, you may briefly see a red cluster state during shutdown. This is usually a UI timing artifact, not an actual failure.
Step 3: Upgrade CrateDB
Once the node is stopped, perform the upgrade using your preferred method.
Examples:
Tarball:
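A rough sketch (version number, URL, and paths are placeholders; keep your existing configuration and data directories):

```sh
# Download and unpack the new release next to the old installation.
wget https://cdn.crate.io/downloads/releases/crate-x.y.z.tar.gz
tar -xzf crate-x.y.z.tar.gz
# Point the new installation at your existing crate.yml and data path before starting it.
```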
RHEL/YUM:
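Assuming the CrateDB YUM repository is configured and the package is named crate:

```sh
# Upgrade the CrateDB package in place.
sudo yum update crate
```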
Refer to your OS or package manager documentation for specific upgrade instructions.
Step 4: Restart the Node
After upgrading, restart CrateDB:
Tarball:
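For example (the path is a placeholder; this runs CrateDB in the foreground, so use your init system or a process manager in production):

```sh
# Start the upgraded installation.
./crate-x.y.z/bin/crate
```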
RHEL/YUM:
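Assuming CrateDB is managed as a systemd service named crate:

```sh
# Start the CrateDB service again.
sudo systemctl start crate
```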
Step 5: Repeat for All Nodes
Repeat steps 2–4 for each remaining node in your cluster.
Step 6: Re-enable Allocations
Once all nodes are upgraded and running:
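If you restricted routing in Step 1, allow shard allocation again (a sketch mirroring the Step 1 setting; RESET GLOBAL would also restore the default):

```sql
-- Allow all shard allocations again.
SET GLOBAL TRANSIENT "cluster.routing.allocation.enable" = 'all';
```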
Final Notes
Always test the upgrade in a staging environment first
Monitor logs and metrics during and after each upgrade step
Consider enabling alerts for cluster health changes during upgrades
If using snapshots, verify their validity before beginning the upgrade