Cluster-wide settings

You can inspect the current values of all CrateDB cluster-wide settings at runtime by querying the settings column of the sys.cluster table:

SELECT settings FROM sys.cluster;

Settings documented as runtime-changeable can be modified on a running cluster with SET GLOBAL. Non-runtime settings must be configured identically on every node before startup for the cluster to behave correctly.
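
Runtime-changeable settings are modified with SET GLOBAL and restored with RESET GLOBAL. A minimal sketch (the setting and value are only illustrative):

-- persist the change across full cluster restarts
SET GLOBAL PERSISTENT "stats.jobs_log_size" = 20000;

-- or apply it only until the next full cluster restart
SET GLOBAL TRANSIENT "stats.jobs_log_size" = 20000;

-- restore the default
RESET GLOBAL "stats.jobs_log_size";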


Table of Contents

  1. Non-runtime Settings

  2. Statistics Collection

  3. Shard Limits

  4. Usage Data Collector (UDC)

  5. Graceful Stop Behavior

  6. Bulk Operations

  7. Cluster Discovery & Setup

    • Unicast, DNS, EC2

  8. Routing & Shard Allocation

  9. Shard Balancing

  10. Attribute-Based Allocation

  11. Disk-Based Allocation

  12. Recovery Limits

  13. Memory Management & Circuit Breakers

  14. Thread Pools & Overload Protection

  15. Metadata / Gateway Settings

  16. Logical Replication Settings


Non-runtime Settings

  • Must be set in each node’s configuration file before startup, with identical values on every node.

  • Crucial: inconsistent values across nodes lead to cluster malfunction.


🔢 Statistics Collection

| Setting | Default | Runtime | Description |
| --- | --- | --- | --- |
| stats.enabled | true | Yes | Enables job/operation statistics collection. May incur slight performance overhead. |
| stats.jobs_log_size | 10000 | Yes | Max number of job log entries per node. 0 disables job logging. |
| stats.jobs_log_expiration | 0s (disabled) | Yes | Time-based eviction of job log entries. Overrides the size-based limit. |
| stats.jobs_log_filter | true | Yes | Boolean expression that filters which jobs are logged, e.g. ended - started > '5 minutes'::interval (see the example below). |
| stats.jobs_log_persistent_filter | false | Yes | Filters jobs written into the CrateDB log file. Should be scoped carefully to avoid blocking I/O. |
| stats.operations_log_size | 10000 | Yes | Max operation log entries per node. 0 disables operations logging. |
| stats.operations_log_expiration | 0s (disabled) | Yes | Time-based expiry for operation log entries. |
| stats.service.interval | 24h | Yes | Frequency at which table statistics used for query planning are refreshed. |
| stats.service.max_bytes_per_sec | 40mb | Yes | Throttles ANALYZE traffic per node when gathering statistics. 0 disables throttling. |
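
For example, to log only long-running jobs, stats.jobs_log_filter can be set with a dollar-quoted expression (the 5-minute threshold is only illustrative):

SET GLOBAL TRANSIENT "stats.jobs_log_filter" = $$ended - started > '5 minutes'::interval$$;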


🛡️ Shard Limits

  • cluster.max_shards_per_node: Default 1000, runtime changeable. Caps primary + replica shards per node and thereby the overall cluster maximum (limit × number of nodes). Operations that would create shards beyond the limit are rejected (see the example below).

  • cluster.routing.allocation.total_shards_per_node: Default -1 (unlimited), runtime. Hard cap on shards per node; nodes at the limit are excluded from further allocation.
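
For example, to raise the per-node cap persistently (the value is only illustrative):

SET GLOBAL PERSISTENT "cluster.max_shards_per_node" = 2000;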


UDC (Usage Data Collector)

  • Read-only settings (cannot be changed at runtime):

| Setting | Default | Purpose |
| --- | --- | --- |
| udc.enabled | true | Toggle UDC on/off |
| udc.initial_delay | 10m | Delay before the first ping after startup |
| udc.interval | 24h | Ping interval |
| udc.url | https://udc.crate.io/ | Ping destination URL |


🚦 Graceful Stop Behavior

  • cluster.graceful_stop.min_availability: runtime, values none | primaries | full.

  • cluster.graceful_stop.timeout: runtime, maximum time to wait for shards to be reallocated before giving up.

  • cluster.graceful_stop.force: runtime; if enabled, the node shuts down even when the timeout expires before reallocation has finished.

Note: These settings are ignored if the cluster has only one node. An example configuration follows below.
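
A decommissioning setup might be configured like this; several settings can be changed in one statement (the values are only illustrative):

SET GLOBAL TRANSIENT
  "cluster.graceful_stop.min_availability" = 'primaries',
  "cluster.graceful_stop.timeout" = '2h',
  "cluster.graceful_stop.force" = false;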


🧱 Bulk Operations

  • bulk.request_timeout: Default 1m, runtime. Timeout for the internal requests issued by large COPY, INSERT, or UPDATE operations (see the example below).
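
For example, to give long-running bulk jobs more headroom and later restore the default (the 5m value is only illustrative):

SET GLOBAL TRANSIENT "bulk.request_timeout" = '5m';
RESET GLOBAL "bulk.request_timeout";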


🧭 Cluster Discovery & Setup

  • Non-runtime only:

    • discovery.seed_hosts: List of host[:port] addresses of master-eligible nodes used to seed discovery.

    • cluster.initial_master_nodes: Required for the first master election when the cluster starts up for the first time.

    • discovery.type: zen (default, multi-node) or single-node; single-node disables auto-joining.

  • Discovery Providers:

    • discovery.seed_providers: srv or ec2.

    • DNS: discovery.srv.query, discovery.srv.resolver.

    • EC2: discovery.ec2.* settings including access key, filtering by security group/tags, host type, endpoint.


🚚 Routing & Shard Allocation

| Setting | Default | Runtime | Behavior |
| --- | --- | --- | --- |
| cluster.routing.allocation.enable | all | Yes | Enables or disables shard allocation (all, primaries, new_primaries, none). |
| cluster.routing.rebalance.enable | all | Yes | Controls which shard types (all, primaries, replicas, none) may be rebalanced. |
| cluster.routing.allocation.allow_rebalance | indices_all_active | Yes | Conditions under which rebalancing is allowed to start. |
| cluster.routing.allocation.cluster_concurrent_rebalance | 2 | Yes | Number of concurrent shard rebalances allowed cluster-wide. |
| cluster.routing.allocation.node_initial_primaries_recoveries | 4 | Yes | Concurrent local primary recoveries per node. |
| cluster.routing.allocation.node_concurrent_recoveries | 2 | Yes | General recovery concurrency per node. |
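
A common maintenance pattern is to restrict allocation before restarting a node and re-enable it afterwards (a sketch using the values listed above):

-- only allow allocation of primaries for newly created tables while the node is down
SET GLOBAL TRANSIENT "cluster.routing.allocation.enable" = 'new_primaries';

-- back to normal once the node has rejoined
SET GLOBAL TRANSIENT "cluster.routing.allocation.enable" = 'all';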


⚖️ Shard Balancing

  • cluster.routing.allocation.balance.shard, cluster.routing.allocation.balance.index, cluster.routing.allocation.balance.threshold: runtime floats controlling how aggressively the cluster balances shards across nodes and indices (see the example below).
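
For instance, raising the threshold makes the balancer less eager to move shards (the value is only illustrative):

SET GLOBAL TRANSIENT "cluster.routing.allocation.balance.threshold" = 2.0;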


🌍 Attribute-Based Allocation

  • Non-runtime:

    • cluster.routing.allocation.awareness.attributes

    • cluster.routing.allocation.awareness.force.*.values

  • Runtime:

    • cluster.routing.allocation.include.*, cluster.routing.allocation.exclude.*, cluster.routing.allocation.require.*: filter the nodes eligible for shard allocation based on custom node attributes (see the example below).
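
Assuming nodes were started with a custom attribute such as zone (the attribute name and value here are hypothetical), allocation could be restricted like this:

-- only allocate shards on nodes whose (hypothetical) zone attribute equals 'us-east-1'
SET GLOBAL TRANSIENT "cluster.routing.allocation.require.zone" = 'us-east-1';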


💾 Disk-Based Allocation

  • Runtime shard-allocation controls:

    • cluster.routing.allocation.disk.threshold_enabled

    • cluster.routing.allocation.disk.watermark.low (default 85%)

    • cluster.routing.allocation.disk.watermark.high (90%)

    • cluster.routing.allocation.disk.watermark.flood_stage (95%; triggers a read-only block)

    • cluster.routing.allocation.total_shards_per_node

  • Watermarks accept either percentages or absolute byte values, but not a mix of both; disk usage information is refreshed every 30s by default (cluster.info.update.interval). An example of adjusting the watermarks follows below.
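
To tighten the watermarks at runtime (the percentages are only illustrative):

SET GLOBAL TRANSIENT
  "cluster.routing.allocation.disk.watermark.low" = '80%',
  "cluster.routing.allocation.disk.watermark.high" = '85%',
  "cluster.routing.allocation.disk.watermark.flood_stage" = '90%';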


🔁 Recovery Limits

Runtime-changeable recovery settings:

  • indices.recovery.max_bytes_per_sec (default 40mb)

  • indices.recovery.retry_delay_state_sync, retry_delay_network

  • indices.recovery.internal_action_timeout, internal_action_long_timeout

  • indices.recovery.recovery_activity_timeout

  • indices.recovery.max_concurrent_file_chunks
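
For example, to let recoveries use more bandwidth during a planned rebuild (the value is only illustrative):

SET GLOBAL TRANSIENT "indices.recovery.max_bytes_per_sec" = '100mb';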


🧠 Memory & Circuit Breakers

  • memory.allocation.type: Default on-heap, runtime. off-heap is experimental.

  • memory.operation_limit: Default 0 (unlimited), runtime. Cluster-wide default for the maximum amount of memory, in bytes, that a single operation may consume. Changes affect new sessions only.

Circuit breakers (limits are expressed as a percentage of the JVM heap):

  • indices.breaker.query.limit (default 60%)

  • indices.breaker.request.limit (60%)

  • indices.breaker.accounting.limit (deprecated)

  • stats.breaker.log.jobs.limit (5%)

  • stats.breaker.log.operations.limit (5%)

  • indices.breaker.total.limit (95%)
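
For example, to give queries a larger share of the heap before the query circuit breaker trips (the percentage is only illustrative):

SET GLOBAL TRANSIENT "indices.breaker.query.limit" = '70%';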


⚙️ Thread Pools & Overload Protection

  • Thread pools: write, search, get, refresh, generic, logical_replication.

    • Each pool has a type, fixed or scaling, configured via thread_pool.<name>.type.

    • Fixed pools additionally expose thread_pool.<name>.size and thread_pool.<name>.queue_size.

  • Overload Protection settings (runtime; see the example below):

    • overload_protection.dml.initial_concurrency

    • overload_protection.dml.min_concurrency, overload_protection.dml.max_concurrency, overload_protection.dml.queue_size
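
The overload-protection limits can be adjusted at runtime, e.g. (the value is only illustrative):

SET GLOBAL TRANSIENT "overload_protection.dml.max_concurrency" = 20;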


🧾 Metadata / Gateway

  • cluster.info.update.interval: Runtime; how often cluster-wide disk usage information is collected (default 30s).

  • Gateway settings (non-runtime):

    • gateway.expected_data_nodes, gateway.recover_after_data_nodes

    • gateway.recover_after_time

    • gateway.expected_nodes and gateway.recover_after_nodes (deprecated in favor of the data_nodes variants)
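
cluster.info.update.interval can be changed on a running cluster, for example to refresh disk usage information more frequently (the value is only illustrative):

SET GLOBAL TRANSIENT "cluster.info.update.interval" = '10s';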


🔁 Logical Replication Settings

Runtime settings:

  • replication.logical.ops_batch_size

  • replication.logical.reads_poll_duration

  • replication.logical.recovery.chunk_size

  • replication.logical.recovery.max_concurrent_file_chunks
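
These can be tuned while subscriptions are active, e.g. (the value is only illustrative):

SET GLOBAL TRANSIENT "replication.logical.ops_batch_size" = 20000;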


✅ Summary

  • Inspect current settings at runtime via the settings column of sys.cluster.

  • Only runtime-changeable settings can be updated (via SET GLOBAL) while the cluster is running.

  • Non-runtime settings must be identical across nodes and set before node startup.

  • Be careful with log filters, logging sizes, and memory limits, especially in production environments.
