A developer's reference to Postgres change data capture (CDC)

Eric Goldman
11 min read

As a developer researching change data capture (CDC) on Postgres, what do you need to know?

We’ve worked with hundreds of developers as they implement CDC specifically on Postgres. We’ve written extensively on the topic. This post aims to synthesize our learnings to help you build a better understanding of CDC, weigh different implementation options, and find the best CDC implementation for your use case.

What is change data capture (CDC)?

As the name suggests, change data capture is a process that detects every incremental change to a database and then delivers (i.e. captures) those changes to a downstream system. When done correctly, the downstream system can process these changes to perfectly replicate the original data set.

In practice, this means that every insert, update, and delete in the Postgres database needs to be captured and delivered in order. Not a single change can be missed. Guaranteeing that every change is captured and delivered is what makes building a reliable CDC system challenging.

Luckily, PostgreSQL 10+ comes with logical replication built in. The database records every change in the Write Ahead Log (WAL). You can tap into the WAL and decode the messages to capture every insert, update, and delete. Moreover, Postgres writes to the WAL during normal operation anyway, so processing changes through the WAL adds virtually no additional overhead to the database.
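
If you want to see logical decoding in action, Postgres ships with a test_decoding output plugin for exactly this kind of experiment. A minimal sketch (the cdc_demo slot name is arbitrary):

```sql
-- Logical decoding requires wal_level = logical
-- (set in postgresql.conf, then restart Postgres):
SHOW wal_level;

-- Create a throwaway slot with the built-in test_decoding plugin:
SELECT * FROM pg_create_logical_replication_slot('cdc_demo', 'test_decoding');

-- Peek at decoded inserts, updates, and deletes without consuming them:
SELECT * FROM pg_logical_slot_peek_changes('cdc_demo', NULL, NULL);

-- Drop the slot when finished - an abandoned slot forces Postgres to retain WAL:
SELECT pg_drop_replication_slot('cdc_demo');
```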

While the WAL is the most reliable way to interface with Postgres to power CDC, it isn’t the only option nor is it necessarily the easiest to work with.

When to reach for CDC

Almost every data-intensive application includes a CDC component. (In fact, we’ve found that almost every company with 10 or more engineers has either built or bought a CDC solution.)

As systems grow, teams begin to move work away from a single Postgres instance towards specialized tools. As they do, common use cases that require CDC begin to emerge:

  • Database replication: Reliably mirror data across databases with greater flexibility than traditional Postgres read-replicas. CDC enables the primary database to operate independently of replicas while providing powerful tools for transforming data during transmission.
  • Real-time analytics: Stream changes directly to OLAP databases and data warehouses, enabling immediate insights and analytics without impacting production database performance.
  • Cache management: Ensure application caches and search indexes remain perfectly synchronized with source data, eliminating staleness issues while reducing unnecessary refresh operations.
  • Microservice consistency: Preserve data integrity across distributed systems by propagating changes to specialized services while maintaining strong transactional guarantees.
  • Event-driven automation: Trigger workflows and jobs with immediate, transaction-guaranteed execution, enabling responsive systems that act on data changes as they occur.
  • Audit logging: Capture every database change for compliance requirements or to power user-facing features like detailed history views and precision rollbacks.

Beyond use cases, CDC also helps reduce technical debt while setting a sturdy foundation for scale.

For instance, imagine that one system constantly polls a table for changes while another uses triggers to log changes from the same table. The load on the database is unnecessarily high, and testing and building new features becomes needlessly complex. CDC provides a single, consistent pattern for detecting changes on that table - without any additional burden on the database.

Postgres CDC performance considerations

A good change data capture system will provide a set of guarantees around reliability, performance, and data integrity. The right CDC system abstracts away a whole class of edge cases and bugs so your team can build with confidence.

Here are some important characteristics to consider as you build or buy a CDC solution:

Delivery guarantees: Can you expect every change in the database to be delivered? In many CDC implementations, it’s possible for a message to be dropped (at-most-once delivery) or delivered more than once (at-least-once delivery). Ideally, the CDC system provides an exactly-once processing guarantee through sophisticated deduplication and a method for acknowledging when a change has been delivered and processed. For Postgres CDC use cases where consistency is critical (e.g. replication, cache invalidation), an at-least-once delivery guarantee is the bare minimum, and exactly-once processing is ideal.
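
As an illustration, one common way consumers turn at-least-once delivery into exactly-once processing is to deduplicate on each change’s WAL position (LSN). A minimal sketch, assuming the consumer tracks processed changes in a Postgres table of its own (the table name and LSN value are hypothetical):

```sql
-- Record each change's LSN; process the change only if its LSN is new.
CREATE TABLE processed_changes (
  lsn          pg_lsn PRIMARY KEY,
  processed_at timestamptz NOT NULL DEFAULT now()
);

-- For each incoming change, attempt to claim its LSN:
INSERT INTO processed_changes (lsn)
VALUES ('0/16B3748')          -- LSN carried on the change message
ON CONFLICT (lsn) DO NOTHING;
-- If the INSERT affects 0 rows, this change was already processed - skip it.
```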

Throughput: What volume of change events can the CDC system handle within a given timeframe? This is best measured as bandwidth (i.e. MB/sec), but operations per second can be a simple proxy. Importantly, your CDC solution should be able to keep pace with your Postgres database’s peak write rate, ideally with room to spare. An undersized CDC solution can fall behind during peak load and never recover.

Latency: The time delay between a change being made in your database and it appearing in your target system. For some use cases (like analytics or reporting), a latency of hours or even days is sufficient. For customer-facing use cases (e.g. updating a search index), millisecond latency ensures that results are accurate. It’s best to measure the p95 or p99 latency of your CDC service to understand the tolerances you can tune your system to. Evaluate latency in the context of throughput as well: when a CDC system can’t keep up, the backlog ultimately shows up as latency.

Retries and error handling: How does your system handle transient errors? One poison pill message can immediately jam and crash your CDC pipeline - which in some circumstances (e.g. replication) is actually the desired behavior. Other implementations move the errant message to a dead letter queue (DLQ), log an alert, and attempt redelivery with exponential backoff.
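
A dead letter queue can be as simple as a table. A minimal sketch of the pattern (table and column names are hypothetical):

```sql
-- Park messages that fail repeatedly so the pipeline keeps moving,
-- then retry them later with exponential backoff.
CREATE TABLE cdc_dead_letters (
  id            bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  payload       jsonb NOT NULL,
  error         text NOT NULL,
  attempts      int NOT NULL DEFAULT 1,
  next_retry_at timestamptz NOT NULL DEFAULT now() + interval '1 minute'
);

-- On each failed redelivery, double the wait before the next attempt:
UPDATE cdc_dead_letters
SET attempts      = attempts + 1,
    next_retry_at = now() + (interval '1 minute' * power(2, attempts))
WHERE id = 42;   -- hypothetical failed message id
```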

Ordering: Maintaining the correct order of change events is critical to avoid data corruption or inconsistencies. This is particularly true when interdependent operations affect the state of a record. Enforcing strict ordering often comes at the expense of throughput and latency, since it makes parallel processing harder. Great CDC options let you configure which changes must be processed in order, so you can strike the right balance between speed and structure.

Schema evolution: The ability to handle modifications to the PostgreSQL schema, such as adding, altering, or removing columns. These changes can break downstream services and either need to be handled gracefully (i.e. by propagating the change downstream) or should proactively halt the system.

Snapshots: How does the CDC service capture the initial state of the Postgres table? Snapshotting (a.k.a. backfilling) isn’t just about initializing the CDC process - it’s often the first thing you’ll need to do after an incident. The ability to snapshot a table completely or target just a subset of it (using SQL) is very helpful. A great CDC system needs to be able to snapshot a massive table while simultaneously capturing new changes.
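
One way this is commonly done is with keyset-paginated batches, so the backfill avoids long-running scans while live changes continue to stream. A sketch, assuming a hypothetical orders table:

```sql
-- Read the snapshot in stable, primary-key-ordered batches:
SELECT id, customer_id, status
FROM orders
WHERE id > 41000      -- high-water mark: last id from the previous batch
ORDER BY id
LIMIT 10000;
```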

Database load: How much CPU and storage does the CDC solution consume on the database? Systems that poll or use triggers to detect changes can slow the database down significantly. Those that read the WAL add virtually no overhead in normal operation - but if a consumer stops reading from its replication slot, retained WAL can rapidly consume available storage.
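
You can watch for this directly in Postgres: the pg_replication_slots view shows how much WAL each slot is forcing the database to retain.

```sql
-- How far behind is each replication slot? Large values mean
-- retained WAL is piling up on disk.
SELECT slot_name,
       active,
       pg_size_pretty(
         pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
       ) AS retained_wal
FROM pg_replication_slots;
```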

Monitoring and observability: Change data capture often becomes a critical component of your application, so you’ll want to measure and observe each of the characteristics mentioned above. Great CDC solutions plug into your existing monitoring and observability tooling. When something goes wrong, the difference between cryptic, convoluted logs and clear, traceable errors is enormous.

Before building or buying a CDC solution, consider what performance characteristics are required for your use case. Often, picking a great CDC tool from the start can quickly set you up with a simpler architecture, higher performance, and an easier build.

Build your own Postgres CDC

For developers looking to implement change data capture with PostgreSQL directly, several methods are available, each with its own advantages and disadvantages. These approaches offer varying levels of control and complexity, as detailed in our guide All the ways to capture changes in Postgres.

One common method involves using triggers with an audit table. This approach entails creating database triggers that fire on data modification events (INSERT, UPDATE, DELETE) on the tables of interest. When a change occurs, the trigger records the details in a separate audit log table. Triggers are reliable because they run inside the same transaction as the change itself, and they offer real-time capture with a high degree of customization. However, they add performance overhead to the primary database and require careful management of the audit log table.
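
A minimal sketch of this pattern, assuming a hypothetical orders table:

```sql
CREATE TABLE audit_log (
  id         bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  table_name text NOT NULL,
  operation  text NOT NULL,
  row_data   jsonb,
  changed_at timestamptz NOT NULL DEFAULT now()
);

CREATE OR REPLACE FUNCTION record_change() RETURNS trigger AS $$
BEGIN
  IF TG_OP = 'DELETE' THEN
    INSERT INTO audit_log (table_name, operation, row_data)
    VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(OLD));
  ELSE
    INSERT INTO audit_log (table_name, operation, row_data)
    VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(NEW));
  END IF;
  RETURN NULL;  -- return value is ignored for AFTER row triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_audit
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE FUNCTION record_change();
```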

Another straightforward method is polling for changes. This involves adding a dedicated timestamp column to each table and periodically querying for records that have been modified since the last check. This approach is simple to implement for basic scenarios and doesn't require complex database configuration. However, it requires a timestamp column on every table, can be resource-intensive with frequent polling, and cannot reliably capture DELETE events unless soft deletes are employed.
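
In its simplest form (again using a hypothetical orders table):

```sql
-- Polling needs an indexed "last modified" column, kept current
-- by the application (or by a trigger) on every write:
ALTER TABLE orders
  ADD COLUMN updated_at timestamptz NOT NULL DEFAULT now();
CREATE INDEX orders_updated_at_idx ON orders (updated_at);

-- Then periodically fetch everything changed since the last poll:
SELECT *
FROM orders
WHERE updated_at > '2025-01-01 12:00:00+00'  -- high-water mark from last poll
ORDER BY updated_at;
```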

Depending on your use case, you might be able to get away with Listen/Notify. This built-in PostgreSQL feature implements a publish-subscribe pattern where database sessions can listen on channels and receive real-time notifications of data changes. By setting up triggers that fire on INSERT, UPDATE, or DELETE operations, you can broadcast JSON payloads containing change details to listening applications. While simple to implement and offering immediate notification capabilities, Listen/Notify has key limitations: it provides only "at-most-once" delivery semantics (requiring active connections when notifications occur), limits payloads to 8000 bytes, and offers no persistence for missed messages. These constraints make it suitable primarily for lightweight change detection scenarios or as a complement to more robust CDC methods, rather than for mission-critical systems requiring guaranteed delivery.
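
A minimal sketch of the pattern - a trigger broadcasts on a channel, and any connected session can subscribe (the trigger, channel, and table names are illustrative):

```sql
CREATE OR REPLACE FUNCTION notify_change() RETURNS trigger AS $$
BEGIN
  -- Payloads must stay under 8000 bytes, so send identifiers, not full rows:
  PERFORM pg_notify(
    'order_changes',
    json_build_object('op', TG_OP, 'id', NEW.id)::text
  );
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_notify
AFTER INSERT OR UPDATE ON orders
FOR EACH ROW EXECUTE FUNCTION notify_change();

-- Any session subscribed to the channel receives payloads in real time:
LISTEN order_changes;
```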

Finally, logical replication (via the WAL) stands out as a robust and efficient method. It's a built-in feature of PostgreSQL that allows for the selective replication of data changes at the table level. Leveraging log-based CDC, it provides real-time, event-driven capture of all change types with minimal impact on the database and broad support across different PostgreSQL environments. Logical replication has become the favored approach for PostgreSQL CDC due to its optimal balance of efficiency, reliability, and comprehensive change capture capabilities.
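
Natively, this looks like a publication on the source and a subscription on the consumer; dedicated CDC tools instead create their own replication slot and decode the WAL with an output plugin such as pgoutput or wal2json. A minimal sketch (table names and connection details are illustrative):

```sql
-- On the source database (requires wal_level = logical):
CREATE PUBLICATION cdc_pub FOR TABLE orders, customers;

-- On another Postgres acting as the consumer:
CREATE SUBSCRIPTION cdc_sub
  CONNECTION 'host=primary.example.com dbname=app user=replicator'
  PUBLICATION cdc_pub;
```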

While each of these approaches captures changes in the database, that is just the first step in building a CDC system. You then need to deliver those changes to other systems with sufficient performance guarantees (see above). This often adds significant complexity and work, which is why most teams choose an off-the-shelf approach.

Off-the-shelf Postgres CDC

To implement PostgreSQL CDC without the complexities of building a solution from scratch, a variety of robust tools and platforms are available, each offering unique features and capabilities. A detailed overview of these options can be found at https://blog.sequinstream.com/choosing_the_right_real_time_postgres_cdc_platform/.

Open source

Among the open-source tools, Sequin and Debezium are the go-to options.

Sequin is a fast, simple, modern open source CDC platform tuned specifically to Postgres. It is known as the fastest, highest-throughput change data capture solution for Postgres. It supports many different destinations with exactly-once processing guarantees. It’s easy to self-host via Docker and exposes Prometheus endpoints for monitoring and observability. It comes with an easy-to-use web console, CLI, and API, as well as helpful developer tools. If you don’t want to self-host, you can also use Sequin Cloud.

Debezium is a widely adopted distributed platform specifically designed for CDC. Built on top of Apache Kafka and Kafka Connect, Debezium captures changes in real time using logical replication and streams them to Kafka topics with at-least-once delivery guarantees. However, it requires the setup and management of Kafka, which can introduce operational complexity. It’s notoriously difficult to configure, maintain, and debug - and its future is unclear since it’s been moved from Red Hat support to the Commonhaus Foundation.

Hosted

For teams seeking the benefits of CDC without the operational overhead, several hosted solutions provide managed CDC capabilities.

Sequin Cloud is a fully hosted, highly available deployment of Sequin. It’s configured to scale out of the box and runs in regions around the world. It’s priced to be the most economical CDC solution for Postgres and comes with additional features around tracing, alerting, team management, and disaster recovery. With Sequin Cloud, you pay for just the data you process through Sequin - the underlying infrastructure is completely abstracted away.

Decodable offers a hosted Debezium plus Apache Flink stack, making it ideal for organizations already invested in these technologies but wanting them managed. It captures every database change with strong delivery guarantees and includes monitoring and schema management tools. While easier than self-hosting, using Decodable still requires learning how to configure pipelines and apply SQL transforms in their dashboard, with Flink pipelines presenting a moderate learning curve. Cost-wise, Decodable is moderately expensive, with one pipeline potentially comprising several "tasks" that can each cost hundreds of dollars monthly.

Confluent provides two options: Confluent Debezium and Direct JDBC Postgres connector. The Debezium version is a fully hosted implementation with enterprise security, compliance, monitoring, and scaling tools. While you won't need to manage the deployment directly, you'll still handle the complex configuration of Debezium and Kafka topics. The Direct JDBC connector uses polling rather than logical replication, meaning it only captures inserts and updates (not deletes) with some delay. While easier to set up, it's less powerful. Both options come with enterprise pricing, trading engineering time for an expensive infrastructure product.

Estuary is a fully managed CDC solution that simplifies real-time data pipelines. It provides a wide selection of pre-built connectors that allow you to quickly connect your sources and destinations. Notably, Estuary loads data into an intermediary store so that subsequent backfills are fast, with minimal overhead on the source database.

Streamkap is a serverless streaming platform designed for real-time CDC. It leverages Apache Kafka and Debezium with connectors to enable fast and efficient data ingestion. Streamkap specializes in delivering CDC data to various destinations, including ClickHouse, Snowflake, and others.

Striim positions itself as an enterprise solution with high reliability at a premium cost. It provides CDC with transformations, schema management, delivery guarantees, and security features designed for Fortune 1000 companies. Using Striim requires learning their proprietary TQL (Transform Query Language) for data transformations and their StreamApps framework for pipeline configuration. While well-documented, these tools have a steep learning curve and are unique to Striim's ecosystem. Budget-wise, Striim requires an all-in enterprise contract.

ETL Providers

Traditional ETL providers like Fivetran and Airbyte offer CDC capabilities, but it's important to note that they typically provide batch CDC rather than real-time solutions. Changes may take minutes to appear in your stream or queue, and they may not maintain atomic change records.

These solutions are primarily intended for non-operational, analytics use cases rather than real-time operational requirements. They offer scheduled, batch delivery of database changes to various destinations, including streams like Kafka. While they're relatively easy to set up, they offer limited configuration options when used for CDC.

Cost can be a significant factor with these providers, as pricing is typically based on row count, which can quickly become expensive for high-volume databases. If your use case can tolerate batch processing and you're primarily focused on analytics, these tools provide a simple solution with minimal technical expertise required.

Cloud providers

The major cloud infrastructure providers offer CDC products that work within their ecosystems. AWS DMS (Database Migration Service), GCP Datastream, and Azure Data Factory can all be configured to stream changes from Postgres to other infrastructure within their respective platforms.

These solutions can be effective if your organization is already committed to a specific cloud provider and comfortable with their tooling. They support logical replication with real-time capture of inserts, updates, and deletes, though delivery guarantees can vary based on configuration and provider.

Setting up CDC through your cloud provider typically requires navigating their web console, permissions systems, tooling, and logging to establish pipelines. You'll need familiarity with the provider's settings and configurations. From a cost perspective, these solutions typically charge for compute hours and data transfer, which can be difficult to predict and may accumulate quickly. Additionally, these setups can become brittle over time as settings and dependencies are distributed across different services.

While convenient for teams already invested in a particular cloud ecosystem, these solutions may not offer the same level of specialized functionality as dedicated CDC platforms.

Conclusion

Change data capture has become an essential component of modern data architectures, enabling real-time integration across increasingly specialized and distributed systems. It’s evolved from a bespoke approach to data integration into a specialized piece of infrastructure with well-defined performance characteristics and features. This post should help you map your use case and requirements to CDC implementation options and tools. Reach out if you have more questions about Postgres CDC!