MongoDB CDC
CrateDB Cloud enables continuous data ingestion from MongoDB using Change Data Capture (CDC), providing seamless, real-time synchronization of your data.
The MongoDB CDC integration keeps a table in your CrateDB Cloud cluster synchronized with a collection in your MongoDB Atlas cluster in real time.
How It Works
The integration has two optional stages:
Initial Sync: The integration performs a complete scan of your selected MongoDB collection, importing all existing data into the CrateDB cluster of your choice.
Continuous Sync: The integration uses MongoDB Change Streams to monitor changes in your selected MongoDB collection and syncs these changes to the table in your CrateDB Cloud cluster in real time, so that your data remains current. This sync supports inserts of new documents, updates to existing documents, and deletions.
Data Consistency and Mode
For continuous sync, CrateDB Cloud uses MongoDB’s full document mode to ensure data consistency. This mode guarantees that MongoDB returns the latest majority-committed version of the updated document.
Although receiving only partial deltas would be more efficient, full document mode is more robust: an update event can insert a document that did not previously exist. It also helps performance, because updates and inserts can be written in batches.
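The effect of full document mode can be illustrated with a small simulation. The sketch below is not CrateDB Cloud's implementation: it assumes events shaped like MongoDB change events (the operationType, documentKey and fullDocument fields), and uses a plain dictionary keyed by _id as a stand-in for the target table. Because every insert or update carries the whole document, both can be applied as the same "write the full row" operation, and a whole batch can be applied in one pass.

```python
def apply_batch(table, events):
    """Apply a batch of change events to an in-memory table keyed by _id."""
    for event in events:
        key = event["documentKey"]["_id"]
        op = event["operationType"]
        if op in ("insert", "update", "replace"):
            doc = event.get("fullDocument")
            if doc is None:
                # The looked-up document can be missing if it was deleted
                # before the lookup happened; nothing to write in that case.
                continue
            # Insert and update collapse into one upsert: the full row is
            # written whether or not it existed before.
            table[key] = doc
        elif op == "delete":
            table.pop(key, None)


table = {}
events = [
    {"operationType": "insert", "documentKey": {"_id": 1},
     "fullDocument": {"_id": 1, "name": "a"}},
    # An update for a document the table has never seen still succeeds.
    {"operationType": "update", "documentKey": {"_id": 2},
     "fullDocument": {"_id": 2, "name": "b"}},
    {"operationType": "delete", "documentKey": {"_id": 1}},
]
apply_batch(table, events)
print(table)
```

With partial deltas, the update for _id 2 would have had no existing row to patch; with the full document attached, it can be written as a new row.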
Create a new Integration
A MongoDB integration allows you to sync a single collection from a MongoDB Atlas cluster. You can reuse an existing connection across multiple integrations to continuously sync data from multiple MongoDB Atlas collections.
Supported authentication methods:
MongoDB SCRAM Authentication
MongoDB X.509 Authentication
Set Up MongoDB Atlas Authentication
The following steps should be performed in the MongoDB Atlas UI.
Step 1: Create a Custom Role
Navigate to Database Access - In the MongoDB Atlas UI, open Database Access for the cluster you want to connect to CrateDB Cloud.
Add a Custom Role - Under Custom Roles, click Add New Custom Role. A form will appear.
Fill in the Custom Role Name - For example, use CrateDB CDC integration.
Set Up Read-Only Access - Assign the following actions or roles to the custom role:
find, to be found under Collection Actions/Query and Write Actions
changeStream, to be found under Collection Actions/Change Stream Actions
collStats, to be found under Collection Actions/Diagnostic Actions
Specify the databases and collections you want to sync for these actions. You can update access permissions in the MongoDB Atlas UI later if needed.
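For reference, the same read-only permissions can be written down as data. The sketch below builds privilege entries in MongoDB's own role format (a resource with db and collection, plus a list of actions). The Atlas UI presents these as checkboxes and may name them differently in its own APIs, and the database and collection names used here (shop, orders, customers) are purely hypothetical.

```python
# The three actions the custom role needs, as named in the steps above.
CDC_ACTIONS = ["find", "changeStream", "collStats"]


def read_only_cdc_privileges(namespaces):
    """Return one privilege entry per (database, collection) pair."""
    return [
        {"resource": {"db": db, "collection": coll}, "actions": list(CDC_ACTIONS)}
        for db, coll in namespaces
    ]


# Hypothetical namespaces to sync.
privileges = read_only_cdc_privileges([("shop", "orders"), ("shop", "customers")])
for privilege in privileges:
    print(privilege)
```

Keeping the list limited to these three actions means the role grants no write access to the source collections.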
Step 2: Create a User
Depending on whether you plan to use SCRAM (password-based) or X.509 (certificate-based) authentication, create a database user with one of the following methods:
SCRAM
Navigate to Database Access - In the MongoDB Atlas UI, go to Database Access and click Add New Database User.
Set Authentication Method - Choose Password as the authentication method and enter a username and password for the database user.
Assign the Role - Under Database User Privileges, select the custom role created in Step 1.
Copy User Credentials - Click Add User, and make sure to record the username and password. These credentials will be used later in the CrateDB Cloud Console.
X.509
Navigate to Database Access - In the MongoDB Atlas UI, go to Database Access and click Add New Database User.
Set Authentication Method - Choose Certificate as the authentication method.
Assign the Role - Under Database User Privileges, select the custom role created in Step 1.
Save the Certificate - Click Add User, and store the certificate securely. This will be required later in the CrateDB Cloud Console.
Step 3: Configure IP Access
To allow CrateDB Cloud to access your MongoDB Atlas cluster, you must add the CrateDB Cloud IP addresses to the IP Access List in MongoDB Atlas.
Navigate to Network Access - In the MongoDB Atlas UI, go to Network Access from the left navigation.
Add IP Address - Click Add IP Address and choose an IP address or range to allow access. For testing purposes, you can select Allow Access from Anywhere, but for production it is recommended to allow only the required IPs. When you create a new MongoDB CDC integration in CrateDB Cloud, the form shows the specific IP addresses you need to allow.
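If you maintain a restrictive access list, it can be useful to check locally that every address shown in the CrateDB Cloud form is covered by an entry. The following is a minimal sketch using Python's standard ipaddress module. The addresses are taken from the reserved documentation ranges and are placeholders, not real CrateDB Cloud IPs.

```python
import ipaddress


def uncovered_ips(required, access_list):
    """Return the required IPs that no access-list entry covers."""
    # An entry may be a single address or a CIDR range.
    networks = [ipaddress.ip_network(entry, strict=False) for entry in access_list]
    return [
        ip for ip in required
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    ]


# Placeholder values; use the IPs shown in the integration form instead.
required = ["203.0.113.10", "198.51.100.7"]
access_list = ["203.0.113.0/24"]
print(uncovered_ips(required, access_list))
```

An entry of 0.0.0.0/0 covers every IPv4 address, which is why it is convenient for testing and unsuitable for production.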
Step 4: Access Connection String
You’ll need to provide the connection string for your MongoDB Atlas cluster so that CrateDB Cloud can connect to it.
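When a connection string embeds a username and password, as it can with SCRAM authentication, special characters in either value must be percent-encoded or the string will not parse correctly. The sketch below shows one way to assemble such a string with Python's standard library. The user name, password and host are made-up placeholders, and whether CrateDB Cloud expects the credentials inside the string or in separate form fields is not stated here, so treat this as an illustration of the encoding only.

```python
from urllib.parse import quote_plus


def build_srv_uri(username, password, host, options=None):
    """Assemble a mongodb+srv URI with percent-encoded credentials."""
    opts = options or {"retryWrites": "true", "w": "majority"}
    query = "&".join(f"{key}={value}" for key, value in opts.items())
    # quote_plus escapes characters such as '@', ':' and '/' that would
    # otherwise be read as URI delimiters.
    return f"mongodb+srv://{quote_plus(username)}:{quote_plus(password)}@{host}/?{query}"


# Placeholder credentials and host, for illustration only.
uri = build_srv_uri("cdc_user", "p@ss:word/1", "cluster0.example.mongodb.net")
print(uri)
```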
Navigate to Your Cluster - In the MongoDB Atlas UI, navigate to the cluster you want to connect to CrateDB Cloud.
Click “Connect” - From the cluster view, click on Connect.
Select “Connect Your Application” - Choose Connect your application as the connection method.
Copy the Connection String - Copy the connection string provided in the MongoDB Atlas UI. It will look like this:
mongodb+srv://:@/?retryWrites=true&w=majority
Limitations
Column Name Restrictions
Column or property names containing square brackets ([]) are not supported; the brackets are replaced with __openbrk__ and __closebrk__ respectively. Likewise, dots (.) in column names are not supported and are replaced with __dot__.
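To predict what a MongoDB field name will look like after these replacements, the rule can be sketched as a simple string substitution. The function name below is hypothetical, and the sketch only mirrors the three replacements described above; any other normalization the integration may perform is not covered.

```python
# Replacement tokens as described above.
REPLACEMENTS = {"[": "__openbrk__", "]": "__closebrk__", ".": "__dot__"}


def sanitize_column_name(name):
    """Replace unsupported characters in a field name with their tokens."""
    for char, token in REPLACEMENTS.items():
        name = name.replace(char, token)
    return name


print(sanitize_column_name("items[0].price"))
print(sanitize_column_name("plain_name"))
```

Names without brackets or dots pass through unchanged.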
Unsupported Data Types
The following MongoDB data types are not supported in the CrateDB Cloud MongoDB CDC integration:
Long strings exceeding 32,766 characters are replaced with a placeholder value.
Binary data types other than UUIDs, which are converted to TEXT, and vectors, which are converted to ARRAYs of numbers.
The Decimal128 data type is not supported and is converted to a string, as CrateDB does not support a decimal data type.
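The conversions above can be approximated locally to see which values in a document would be altered. This is a rough sketch under stated assumptions: the placeholder text is invented (the actual placeholder value is not documented here), Python's Decimal stands in for MongoDB's Decimal128, and vector and other binary types are left out.

```python
import uuid
from decimal import Decimal

MAX_STRING_LENGTH = 32_766
PLACEHOLDER = "<value too long>"  # hypothetical; the real placeholder may differ


def convert_value(value):
    """Approximate how a single value would be converted during sync."""
    if isinstance(value, str) and len(value) > MAX_STRING_LENGTH:
        return PLACEHOLDER
    if isinstance(value, uuid.UUID):
        return str(value)  # UUIDs become text
    if isinstance(value, Decimal):
        return str(value)  # decimals become strings, digits preserved
    return value


print(convert_value("x" * 40_000))
print(convert_value(Decimal("1.10")))
print(convert_value(uuid.UUID(int=0)))
```

Converting decimals to strings keeps every digit, but numeric comparisons and aggregations on such a column need an explicit cast.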