Motivation
CockroachDB has native support for change data capture (CDC). It supports object storage sinks across all major cloud providers. At the time of writing, the supported output formats include Avro and newline-delimited JSON. Up until now, I’ve been avoiding newline-delimited JSON because I don’t find it easy to use. Today, I’d like to look at DuckDB as a viable tool to parse the CDC-generated output in newline-delimited JSON format.
High-Level Steps
Start a CockroachDB cluster
Parse CockroachDB newline-delimited changefeed output using DuckDB
Query CockroachDB tables using DuckDB
Conclusion
Step-By-Step Instructions
Start a CockroachDB Cluster
I am using a serverless instance of CockroachDB. It has enterprise changefeeds enabled by default. You can sign up for a free instance.