JSON data (Daryl - Live)
The content has been updated since!
CrateDB combines the flexibility of NoSQL document stores with the power of SQL. It enables you to store, query, and index semi-structured JSON data using standard SQL, making it an excellent choice for applications that handle diverse or evolving schemas.
CrateDB’s support for dynamic objects, nested structures, and dot-notation querying brings the best of both relational and document-based data modeling—without leaving the SQL world.
Object (JSON) Columns
CrateDB allows you to define object columns that can store JSON-style data structures.
CREATE TABLE events (
id UUID PRIMARY KEY,
timestamp TIMESTAMP,
payload OBJECT(DYNAMIC)
);
This allows inserting flexible, nested JSON data into payload
:
{
"user": {
"id": 42,
"name": "Alice"
},
"action": "login",
"device": {
"type": "mobile",
"os": "iOS"
}
}
Column Policy: Strict vs Dynamic
You can control how CrateDB handles unexpected fields in an object column:
DYNAMIC
New fields are automatically added to the schema at runtime
STRICT
Only explicitly defined fields are allowed
IGNORED
Extra fields are stored but not indexed or queryable
Example with explicitly defined fields:
CREATE TABLE sensor_data (
id UUID PRIMARY KEY,
attributes OBJECT(STRICT) AS (
temperature DOUBLE,
humidity DOUBLE
)
);
Querying JSON Fields
Use bracket notation to access nested fields:
SELECT payload['user']['name'], payload['device']['os']
FROM events
WHERE payload['action'] = 'login';
CrateDB also supports filtering, sorting, and aggregations on nested values:
SELECT COUNT(*)
FROM events
WHERE payload['device']['os'] = 'Android';
Querying DYNAMIC OBJECTs
To support querying DYNAMIC OBJECTs using SQL, where keys may not exist within an OBJECT, CrateDB provides the error_on_unknown_object_key session setting. It controls the behaviour when querying unknown object keys to dynamic objects.
By default, CrateDB will raise an error if any of the queried object keys are unknown. When adjusting this setting to false
, it will return NULL
as the value of the corresponding key.
cr> CREATE TABLE testdrive (item OBJECT(DYNAMIC));
CREATE OK, 1 row affected (0.563 sec)
cr> SELECT item['unknown'] FROM testdrive;
ColumnUnknownException[Column item['unknown'] unknown]
cr> SET error_on_unknown_object_key = false;
SET OK, 0 rows affected (0.001 sec)
cr> SELECT item['unknown'] FROM testdrive;
+-----------------+
| item['unknown'] |
+-----------------+
+-----------------+
SELECT 0 rows in set (0.051 sec)
Arrays of OBJECTs
Store arrays of objects for multi-valued nested data:
CREATE TABLE products (
id UUID PRIMARY KEY,
name TEXT,
tags ARRAY(TEXT),
specs ARRAY(OBJECT AS (
name TEXT,
value TEXT
))
);
Query nested arrays with filters:
SELECT *
FROM products
WHERE 'outdoor' = ANY(tags);
You can also filter by object array fields:
SELECT *
FROM products
WHERE specs['name'] = 'battery' AND specs['value'] = 'AA';
Combining Structured & Semi-Structured Data
CrateDB supports hybrid schemas, mixing standard columns with JSON fields:
CREATE TABLE logs (
id UUID PRIMARY KEY,
service TEXT,
log_level TEXT,
metadata OBJECT(DYNAMIC),
created_at TIMESTAMP
);
This allows you to:
Query by fixed attributes (
log_level
)Flexibly store structured or unstructured metadata
Add new fields on the fly without migrations
Indexing Behavior
CrateDB automatically indexes object fields if:
Column policy is
DYNAMIC
Field type can be inferred at insert time
You can also explicitly define and index object fields:
CREATE TABLE metrics (
id UUID PRIMARY KEY,
data OBJECT(DYNAMIC) AS (
cpu DOUBLE INDEX USING FULLTEXT,
memory DOUBLE
)
);
To exclude fields from indexing, set:
data['some_field'] INDEX OFF
Aggregating JSON Fields
CrateDB allows full SQL-style aggregations on nested fields:
SELECT AVG(payload['temperature']) AS avg_temp
FROM sensor_readings
WHERE payload['location'] = 'room1';
CrateDB also supports GROUP BY
, HAVING
, and window functions on object fields.
Further Learning & Resources
Reference Manual:
Last updated