Motivation

CockroachDB has native support for change data capture (CDC), and it supports object storage sinks across all major cloud providers. At the time of writing, a couple of output formats are supported, such as Avro and newline-delimited JSON. Until now, I’ve avoided newline-delimited JSON because I don’t find it easy to work with. Today, I’d like to look at DuckDB as a viable tool for parsing CDC-generated output in newline-delimited format.
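
For readers who have not seen the format before, here is a minimal sketch of what newline-delimited JSON means in practice: every line of a file is a complete JSON object that can be parsed on its own. The sample events, the "after"/"key" envelope, and the column names below are assumptions made up for illustration, not output captured from a real changefeed, and read_ndjson is a hypothetical helper name.

```python
import io
import json

# Hypothetical sample: two change events, one JSON object per line.
# The "after"/"key" envelope and the column names are assumed for illustration.
SAMPLE = (
    '{"after": {"id": 1, "name": "alice"}, "key": [1]}\n'
    '{"after": {"id": 2, "name": "bob"}, "key": [2]}\n'
)


def read_ndjson(stream):
    """Yield one parsed object per non-empty line of the stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines, e.g. a trailing newline at end of file
            yield json.loads(line)


# Parse every event, then keep only the row image under "after".
events = list(read_ndjson(io.StringIO(SAMPLE)))
rows = [event["after"] for event in events]
print(rows)
```

Parsing by hand like this works, but any filtering or aggregation on top of it has to be written manually, which is the friction a query tool such as DuckDB is meant to remove.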

High-Level Steps

Start a CockroachDB cluster
Parse CockroachDB newline-delimited changefeed output using DuckDB
Query CockroachDB tables using DuckDB
Conclusion

Step-By-Step Instructions

Start a CockroachDB Cluster

I am using a serverless instance of CockroachDB. It has enterprise changefeeds enabled by default. You can sign up for a free instance.
