Shard allocation filtering

Shard allocation filters allow you to control where a table’s primary shards and replicas are stored, based on custom attributes assigned to nodes.

This feature can be used to:

  • Allocate specific tables to specific groups of nodes.

  • Separate workloads by hardware type, location, or role.

  • Restrict data to certain zones or storage types.

Note: Per-table shard allocation filtering works in conjunction with Cluster-Level Allocation. See Custom Attributes for details on how to assign attributes to nodes.


How It Works

  1. Assign attributes to nodes (e.g., storage=ssd, zone=zoneA).

  2. Configure allocation settings for a table using routing.allocation.* parameters.

  3. CrateDB schedules shards according to the filters.


Allocation Settings

The following dynamic settings can be defined:

  • When creating a table → control where shards are initially placed.

  • When altering a table → move shards to a different set of nodes.

Setting
Description

routing.allocation.include.{attribute}

Allocate shards to nodes where {attribute} has at least one of the given comma-separated values.

routing.allocation.require.{attribute}

Allocate shards to nodes where {attribute} has all of the given comma-separated values.

routing.allocation.exclude.{attribute}

Allocate shards to nodes where {attribute} has none of the given comma-separated values.

Example:

ALTER TABLE my_table
SET (
    routing.allocation.require.storage = 'ssd',
    routing.allocation.exclude.zone = 'zoneA'
);

This will:

  • Require nodes to have storage=ssd

  • Exclude nodes in zone=zoneA

Note: These settings can be combined. In the above example, both conditions must be met.


Special Attributes

CrateDB also supports the following built-in attributes:

Attribute
Matches…

_name

Node name

_host_ip

Host IP address (IP associated with hostname)

_publish_ip

Publish IP address

_ip

Either _host_ip or _publish_ip

_host

Hostname


Step-by-Step Example: Pin a Table to SSD Nodes

Scenario: You have a mixed cluster — some nodes have SSD storage, others HDD. You want a table to only store shards on SSD-backed nodes.


1. Assign a Custom Attribute to Nodes

On each SSD node, set an attribute in crate.yml:

node.attr.storage: ssd

Restart the node for the change to take effect.

You can verify attributes via:

SELECT name, attributes FROM sys.nodes;

2. Create a Table with Allocation Rules

CREATE TABLE customer_orders (
    id INT PRIMARY KEY,
    order_date TIMESTAMP,
    amount DOUBLE
)
CLUSTERED INTO 4 SHARDS
WITH (
    routing.allocation.require.storage = 'ssd'
);

This ensures only SSD nodes will host the table’s shards.


3. Verify Shard Placement

Use the sys.shards table:

SELECT table_name, shard_id, node['name'], node['attributes']
FROM sys.shards
WHERE table_name = 'customer_orders';

You should see that all shards are assigned to nodes with storage=ssd.

Last updated