Planning

Before upgrading CrateDB, it’s essential to plan carefully to minimize risks and downtime. This guide outlines key preparation steps to ensure a smooth and reliable upgrade process.

This is not an exhaustive list. You should adapt these guidelines to your organization’s infrastructure, processes, and operational requirements.

Table of Contents:


1. Review Breaking Changes

Before upgrading, carefully review the release notes and documentation for the target version, as well as any intermediate versions between your current and target release.

This ensures you catch any breaking changes, deprecated features, configuration updates, or behavior changes introduced along the upgrade path.


2. Set Up a Test Environment

Create a realistic test environment that mirrors your production setup, including:

  • The same CrateDB version you plan to upgrade to

  • Similar hardware, operating system, and network conditions

  • A dataset representative of your production data

Conduct comprehensive testing:

  • Functional testing to ensure queries, APIs, and integrations work as expected

  • Non-functional testing to evaluate performance, resource usage, and stability


3. Back Up and Plan for Recovery

Before starting the upgrade, perform a full cluster-wide backup of your CrateDB data. For detailed steps, see the Snapshots Documentation.

Also consider:

  • Using a message queue (e.g., Kafka, RabbitMQ) for recent or incoming data, so it can be replayed if needed

  • Verifying that all snapshot jobs completed successfully


4. Define a Rollback Plan

A rollback plan is critical in case the upgrade fails or negatively impacts operations. Below is a generalized example. Adapt it to your environment and infrastructure.

Example Rollback Steps

1. Identify the Issue Determine what went wrong—this could involve:

  • Data corruption

  • Performance degradation

  • Application errors

  • Query failures

Assess whether the issue compromises system stability, availability, or security.

2. Communicate Notify all relevant stakeholders (engineering, ops, product, etc.) about the issue and the decision to initiate a rollback.

3. Execute the Rollback Rollback strategy depends on upgrade type:

  • Patch version rollback (e.g., X.Y.1 to X.Y.0): A simple in-place downgrade is usually sufficient if no data changes need to be reversed.

  • Major or minor version rollback (e.g., X1.Y to X2.Y): Requires restoring from a pre-upgrade backup.

  • If message queuing is in place, replay queued data after restoring.

4. Validate Data Ensure the integrity and completeness of the data post-rollback by:

  • Running validation scripts

  • Checking key metrics and dashboards

  • Manually inspecting critical records

5. Share Learnings and Plan Next Steps Document what went wrong and outline improvements for the next upgrade attempt. Share this with the relevant teams.


5. Choose an Upgrade Strategy

Depending on your availability requirements and deployment setup, choose one of the following upgrade methods:

🚀 Rolling Upgrade

  • Upgrade nodes one at a time

  • Ensures zero or minimal downtime

  • Ideal for high-availability clusters

  • Requires attention to node compatibility between versions

See Rolling Upgrade Guide for detailed steps.

🔁 Full Restart Upgrade

  • Shut down the entire cluster, then start it again on the new version

  • Suitable for small clusters, development environments, or cases where brief downtime is acceptable

  • Simpler but disruptive

See Full Restart Upgrade Guide for instructions.

Last updated