Creating tables

Tables are the core building blocks of a relational database. Each table stores multiple rows (records), where each row contains columns (fields), and each column holds a single value.

Quick Reference: CREATE TABLE Options

Clause / Parameter
Purpose
Example

schema.table

Define schema and table name. Defaults to doc if schema is omitted.

CREATE TABLE my_schema.my_table (...);

Column definitions

Specify columns with names and data types.

id INT, name TEXT

PRIMARY KEY

Set one or more columns as the primary key.

id INT PRIMARY KEY

PARTITIONED BY

Partition data based on column values.

PARTITIONED BY (country)

CLUSTERED BY

Define shard count and optional routing column.

CLUSTERED BY (id) INTO 6 SHARDS

WITH (...)

Set table parameters (replicas, refresh interval, durability, column policy, etc.).

WITH (number_of_replicas = 1)

column_policy

Allow (dynamic) or disallow (strict) new columns at runtime.

WITH (column_policy = 'dynamic')

Data types

Choose from CrateDB’s supported types (numeric, string, date/time, geo, array, object, etc.).

location GEO_POINT

Schema creation

Schemas are auto-created if they don’t exist (unless reserved).

CREATE TABLE sales.orders (...)


Table of Contents


Table Definition

To create a table, use the CREATE TABLE statement. At a minimum, you must define:

  • A table name

  • One or more columns, each with a name and a data type.

Example:

cr> CREATE TABLE my_table (
     first_column INTEGER,
     second_column TEXT
   );
CREATE OK, 1 row affected (... sec)

This creates a table my_table with two columns:

  • first_column of type INTEGER

  • second_column of type TEXT

Dropping a table

Use DROP TABLE to delete a table:

cr> DROP TABLE my_table;
DROP OK, 1 row affected (... sec)

If the table doesn’t exist, this returns an error. To avoid errors, use IF EXISTS:

cr> DROP TABLE IF EXISTS my_table;
DROP OK, 0 rows affected (... sec)

By default, CrateDB enforces the column definitions you specify (strict column policy). You can switch to a dynamic column policy by changing the column_policy table parameter, allowing INSERT, UPDATE, or COPY FROM statements to create new columns automatically.


Schemas

A schema is a namespace for tables. Schemas are created automatically when you create a table in them.

Example:

cr> CREATE TABLE my_schema.my_table (
     pk INT PRIMARY KEY,
     label TEXT,
     position GEO_POINT
   );
CREATE OK, 1 row affected (... sec)

Check the schema:

cr> SELECT table_schema, table_name 
     FROM information_schema.tables
     WHERE table_name = 'my_table';
+--------------+------------+
| table_schema | table_name |
+--------------+------------+
| my_schema    | my_table   |
+--------------+------------+

Reserved schema names

You cannot use the following schema names:

  • blob

  • information_schema

  • sys

Schema lifecycle

  • A schema exists as long as it contains at least one table.

  • When the last table in a schema is dropped, the schema is removed (except for blob and doc).

Example:

cr> DROP TABLE my_schema.my_table;
DROP OK, 1 row affected (... sec)

If you don’t specify a schema, CrateDB uses the default doc schema:

cr> CREATE TABLE my_doc_table (
     a_column INT,
     another_one GEO_POINT
   );
CREATE OK, 1 row affected (... sec)

Naming Restrictions

General rules

  • Identifiers (table, schema, column names) cannot be reserved keywords. See Lexical Structure for details.

  • Table and schema names:

    • Cannot contain: \ / * ? " < > | <whitespace> , # .

    • Must not exceed 255 bytes in UTF-8 (including schema-qualified names).

Column names

  • Cannot contain a dot (.) — conflicts with internal path definitions.

  • Cannot match virtual system column naming patterns.

  • Cannot use subscript notation patterns (e.g. col['id']).


Table Configuration

CrateDB provides multiple configuration options when creating a table.

Sharding

  • Data is split into shards (4 by default).

  • Use CLUSTERED BY to set the number of shards and specify a routing column.

  • Cluster settings control shard allocation and balancing (attribute-based or disk-based).

See: Tuning Sharding Performance

Replication

  • Configure replicas with number_of_replicas.

  • More replicas:

    • Increase data redundancy and fault tolerance.

    • Use more disk space and intra-cluster bandwidth.

  • Replication can also improve read performance through query parallelization.

Partitioning

  • Use PARTITIONED BY to divide data into partitions based on column values.

  • Each unique combination of partition column values creates a separate partition.

  • Operations such as optimization, import/export, and backups can target specific partitions.

See: Tuning Partitions for Insert Performance

Table parameters

  • Use WITH to set parameters such as:

    • Replication & sharding

    • Refresh intervals

    • Read/write operations

    • Soft deletes

    • Durability

    • Column policy

    • And more

Last updated