Generated columns

CrateDB supports generated columns in SQL DDL statements. A generated column is a virtual column whose value is computed automatically from other columns in the same row using a generation expression. Generated columns are useful for storing derived data without manually computing or maintaining it in application logic.

Syntax

column_name [data_type] GENERATED ALWAYS AS (expression)

The generation expression is evaluated in the context of the current row. It can reference other columns in the same row but cannot refer to other generated columns or use non-deterministic functions (with some exceptions noted below).


Examples

1. Canonical Example

Create a table with a quotient column automatically calculated from two other numeric columns:

CREATE TABLE computed (
  dividend DOUBLE PRECISION,
  divisor DOUBLE PRECISION,
  quotient GENERATED ALWAYS AS (dividend / divisor)
);

In this example, quotient will automatically store the result of dividing dividend by divisor for each row.

2. Using CURRENT_TIMESTAMP

You can use CURRENT_TIMESTAMP to automatically populate a column with the time of insertion:

CREATE TABLE computed_non_deterministic (
  id LONG,
  last_modified TIMESTAMP WITH TIME ZONE GENERATED ALWAYS AS CURRENT_TIMESTAMP
);

Note: While CURRENT_TIMESTAMP is generally non-deterministic, it is permitted in generated columns because it evaluates to the insertion time.

3. Partitioning with a Generated Column

Generated columns can also be used to define partitioning logic, allowing CrateDB to distribute data across shards efficiently:

CREATE TABLE computed_and_partitioned (
  huge_cardinality BIGINT,
  big_data TEXT,
  partition_value GENERATED ALWAYS AS (huge_cardinality % 10)
) PARTITIONED BY (partition_value);

Here, the partition_value is derived from huge_cardinality and used to partition the table into 10 shards.


Usage

Generated columns can be used in:

  • SELECT statements like any regular column

  • WHERE and ORDER BY clauses

  • Partitioning definitions (as shown above)

  • Indexing and constraints, if needed

They cannot:

  • Be written to directly (since they are derived)

  • Reference other generated columns

  • Use non-deterministic functions (other than CURRENT_TIMESTAMP)

Last updated