Vector search

Vector search retrieves the items most semantically similar to a query vector, using Approximate Nearest Neighbor (ANN) algorithms such as HNSW, as implemented in Lucene.

CrateDB supports native vector search, enabling you to perform similarity-based retrieval directly in SQL, without needing a separate vector database or search engine.

Whether you're powering semantic search, recommendation engines, anomaly detection, or AI-enhanced applications, CrateDB lets you store, manage, and search vector embeddings at scale right alongside your structured, JSON, and full-text data.


FLOAT_VECTOR

Store embeddings with up to 2048 dimensions

KNN_MATCH

SQL-native k-nearest neighbor function with _score support

VECTOR_SIMILARITY

Compute similarity scores between vectors in queries

Real-time indexing

Newly inserted vectors become searchable in near real time, after the next table refresh

Hybrid queries

Combine vector search with filters, full-text search, and JSON fields
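
The building blocks above can be tried on a small table. The following is a minimal sketch, assuming a `word_embeddings` table with a `text` column and a 4-dimensional `embedding` column; the rows and their vector values are made up for illustration, and real embeddings typically have hundreds of dimensions.

```sql
-- Hypothetical schema: one text label plus a 4-dimensional embedding.
CREATE TABLE word_embeddings (
    text STRING PRIMARY KEY,
    embedding FLOAT_VECTOR(4)
);

-- Illustrative rows; the vector values are invented for this sketch.
INSERT INTO word_embeddings (text, embedding) VALUES
    ('Exploring the cosmos', [0.1, 0.5, -0.2, 0.8]),
    ('Discovering the moon', [0.2, 0.6, -0.3, 0.9]),
    ('Cooking at home',      [0.9, 0.1,  0.4, -0.2]);

-- Make the new rows visible to the queries that follow,
-- instead of waiting for the periodic refresh.
REFRESH TABLE word_embeddings;
```

The dimension given in `FLOAT_VECTOR(4)` is fixed per column, so every stored vector and every query vector needs exactly that many elements.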


Common Query Patterns

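SELECT text, _score
FROM word_embeddings
WHERE KNN_MATCH(embedding, [0.3, 0.6, 0.0, 0.9], 3)
ORDER BY _score DESC
LIMIT 3;

Returns the 3 embeddings most similar to the query vector, best match first. The third argument of KNN_MATCH is the number of nearest neighbors to search for per shard, so on a table with several shards the query can match more than 3 rows; the LIMIT clause caps the result at 3.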

Combine with Filters
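
A k-nearest-neighbor predicate can sit next to ordinary SQL conditions in the same WHERE clause. The sketch below assumes the `word_embeddings` table has an additional `category` column; that column and the value `'astronomy'` are hypothetical and only serve to illustrate the pattern.

```sql
-- Nearest neighbors of the query vector, restricted to one category.
-- `category` is a hypothetical column, not part of the earlier example.
SELECT text, category, _score
FROM word_embeddings
WHERE KNN_MATCH(embedding, [0.3, 0.6, 0.0, 0.9], 10)
  AND category = 'astronomy'
ORDER BY _score DESC
LIMIT 5;
```

Here `k` is set higher than the LIMIT as a precaution, so that enough candidates remain after the filter is applied; the right value depends on how selective the filter is.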

Compute Similarity Score

Useful if combining scoring logic manually.
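
As a minimal sketch, VECTOR_SIMILARITY can be selected like any other scalar function. The query assumes the same `word_embeddings` table as above; the alias `similarity` is an arbitrary choice.

```sql
-- Compute an explicit similarity value for each row against a query vector.
-- Higher values mean the vectors are closer to each other.
SELECT text,
       VECTOR_SIMILARITY(embedding, [0.3, 0.6, 0.0, 0.9]) AS similarity
FROM word_embeddings
ORDER BY similarity DESC
LIMIT 3;
```

Because the value is an ordinary column in the result, it can be weighted, thresholded, or combined with other ranking signals in the same query. Unlike KNN_MATCH in a WHERE clause, this form evaluates the function for every row it reads, so it is best suited to small tables or to result sets that are already filtered.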


Real-World Examples

E-commerce Recommendations
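
One way to express "customers who viewed this also liked" is to use an existing product's embedding as the query vector. The sketch below assumes a hypothetical `products` table with `id`, `name`, `in_stock`, and `embedding` columns; none of these names come from CrateDB itself.

```sql
-- Find in-stock products similar to product 42, excluding the product itself.
-- Table and column names are hypothetical.
SELECT id, name, _score
FROM products
WHERE KNN_MATCH(
        embedding,
        (SELECT embedding FROM products WHERE id = 42),
        20
      )
  AND in_stock = TRUE
  AND id != 42
ORDER BY _score DESC
LIMIT 10;
```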

Chat Memory Recall
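
For a chat assistant, earlier messages can be recalled by embedding the current user message and searching within the same conversation. This sketch assumes a hypothetical `chat_messages` table with `session_id`, `role`, `content`, `created_at`, and `embedding` columns; the literal vector stands in for the embedding of the current message, which the application would normally pass as a query parameter.

```sql
-- Recall the earlier messages of one session that are closest to the
-- current message. All table and column names are hypothetical.
SELECT role, content, created_at, _score
FROM chat_messages
WHERE KNN_MATCH(embedding, [0.3, 0.6, 0.0, 0.9], 20)
  AND session_id = 'session-123'
ORDER BY _score DESC
LIMIT 5;
```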

Anomaly Detection
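
A simple approach to anomaly detection is to compare each record against a reference vector that represents normal behavior and flag records with low similarity. The sketch below assumes a hypothetical `sensor_readings` table; the reference vector and the threshold `0.5` are placeholders that would need tuning on real data.

```sql
-- Flag readings whose embedding is far from a "normal" reference vector.
-- The table, the reference vector, and the 0.5 threshold are hypothetical.
SELECT id,
       recorded_at,
       VECTOR_SIMILARITY(embedding, [0.3, 0.6, 0.0, 0.9]) AS similarity
FROM sensor_readings
WHERE VECTOR_SIMILARITY(embedding, [0.3, 0.6, 0.0, 0.9]) < 0.5
ORDER BY similarity ASC
LIMIT 100;
```

The expression is repeated in the WHERE clause because a column alias defined in the SELECT list cannot be referenced there.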


Further Learning & Resources
