Types of Postgres CDC – Pros And Cons

PostgreSQL, or simply Postgres, is among the most widely used open-source relational databases. It is used for a range of workloads, including OLTP, analytics, and data warehousing.

Change Data Capture (CDC) is a software design pattern that tracks changes in a database and takes specific action based on those changes. Applied to PostgreSQL, CDC has several benefits.

  • Data warehouses and downstream systems can be synced with PostgreSQL by capturing change events in real-time.
  • Postgres CDC reduces the load on PostgreSQL because only the change events are captured, rather than whole tables being re-read.
  • Changes made in PostgreSQL can be easily accessed without modifying the application code.

Let us now explore the three main approaches to Postgres CDC and their relative benefits and disadvantages.

# Trigger-based Postgres CDC

In this model, triggers on the table of interest fire on every Insert, Update, and Delete. For each change, the trigger inserts a row into a change table, building up a changelog. With this approach, the captured change events are stored only inside PostgreSQL.
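The trigger-and-changelog pattern can be sketched in a few lines. This is a minimal illustration, not a production setup: SQLite (from Python's standard library) stands in for PostgreSQL so the example runs without a server, and the `orders` table, `orders_changelog` table, and trigger names are hypothetical. In PostgreSQL the same idea would normally be written as a PL/pgSQL trigger function attached with `CREATE TRIGGER`.

```python
import sqlite3

# ASSUMPTION: SQLite is a stand-in for PostgreSQL; all table and trigger
# names below are hypothetical and chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);

-- The change table that accumulates the changelog.
CREATE TABLE orders_changelog (
    change_id  INTEGER PRIMARY KEY AUTOINCREMENT,
    operation  TEXT NOT NULL,
    order_id   INTEGER NOT NULL,
    status     TEXT,
    changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP  -- metadata added automatically
);

-- One trigger per operation; each inserts a row into the change table.
CREATE TRIGGER orders_cdc_insert AFTER INSERT ON orders
BEGIN
    INSERT INTO orders_changelog (operation, order_id, status)
    VALUES ('INSERT', NEW.id, NEW.status);
END;

CREATE TRIGGER orders_cdc_update AFTER UPDATE ON orders
BEGIN
    INSERT INTO orders_changelog (operation, order_id, status)
    VALUES ('UPDATE', NEW.id, NEW.status);
END;

CREATE TRIGGER orders_cdc_delete AFTER DELETE ON orders
BEGIN
    INSERT INTO orders_changelog (operation, order_id, status)
    VALUES ('DELETE', OLD.id, OLD.status);
END;
""")

# Ordinary writes; the application code does not know about the changelog.
conn.execute("INSERT INTO orders (id, status) VALUES (1, 'new')")
conn.execute("UPDATE orders SET status = 'shipped' WHERE id = 1")
conn.execute("DELETE FROM orders WHERE id = 1")

changelog = conn.execute(
    "SELECT operation, order_id, status FROM orders_changelog ORDER BY change_id"
).fetchall()
print(changelog)
```

Every write to `orders` now does extra work inside the same statement, which is the performance cost listed under the cons below.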

Pros

  • Change events can be processed in real-time as they are instantly captured.
  • The triggers automatically add required metadata to the change events.

Cons

  • Triggers increase the execution time of the original statement, thereby adversely impacting the performance of Postgres.
  • Triggers only record changes inside the Postgres database; delivering those events to another system requires a separate process that reads the change table.

# Query-based Postgres CDC

Here, the tracked table has a timestamp column that records when each row was last changed. Postgres is queried repeatedly on that column to find all records modified since the previous query. This form of Postgres CDC captures only Insert and Update events; a deleted row simply disappears from the table, so Delete changes are not seen.
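The polling loop described above can be sketched as follows. Again this is only an illustration under stated assumptions: SQLite stands in for PostgreSQL, the `orders` table and `updated_at` column are hypothetical, and integer timestamps are used to keep the example deterministic (a real table would more likely use a timestamp type maintained by the application).

```python
import sqlite3

# ASSUMPTION: SQLite is a stand-in for PostgreSQL; `orders` and `updated_at`
# are hypothetical names, and integers stand in for real timestamps.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, updated_at INTEGER NOT NULL)"
)

def poll_changes(conn, last_seen):
    """Return rows modified after `last_seen` and the new high-water mark."""
    rows = conn.execute(
        "SELECT id, status, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    new_mark = rows[-1][2] if rows else last_seen
    return rows, new_mark

conn.execute("INSERT INTO orders VALUES (1, 'new', 100)")
conn.execute("INSERT INTO orders VALUES (2, 'new', 110)")
first_batch, mark = poll_changes(conn, 0)

# An update is picked up by the next poll; a delete leaves nothing to select.
conn.execute("UPDATE orders SET status = 'shipped', updated_at = 120 WHERE id = 1")
conn.execute("DELETE FROM orders WHERE id = 2")
second_batch, mark = poll_changes(conn, mark)

print(first_batch)
print(second_batch)
```

The second poll returns only the updated row for order 1; the deletion of order 2 produces no event at all, which is the limitation noted in the cons.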

Pros

  • If the schema already has a timestamp column indicating when each row was last modified, query-based CDC can be implemented without making any changes to PostgreSQL.

Cons

  • More stress is put on PostgreSQL, as this form of CDC uses the query layer for data extraction.
  • Since the tracked table has to be polled repeatedly, resources are wasted whenever no changes have taken place.
  • Delete changes are not captured by this form of Postgres CDC.

# Logical Replication-based Postgres CDC

Logical Replication-based Postgres CDC can replicate data between different PostgreSQL instances, even when they run on different systems. It works by decoding the write-ahead log (WAL) that PostgreSQL keeps on disk, which records all change events: Insert, Update, and Delete. Logical replication is enabled through the configuration file, by setting wal_level to logical.
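To give a sense of what the decoded change stream looks like, here is a small sketch that classifies change lines by table and operation. It assumes the text format produced by PostgreSQL's `test_decoding` example output plugin, as read from a replication slot with `pg_logical_slot_get_changes`; the sample lines, the `orders` table, and the `parse_change` helper are made up for illustration, and a real consumer would more likely use a dedicated output plugin or connector than parse this text.

```python
import re

# ASSUMPTION: sample lines follow the text format of the test_decoding
# example plugin; the table name and values are invented for illustration.
SAMPLE_CHANGES = [
    "BEGIN 502",
    "table public.orders: INSERT: id[integer]:1 status[text]:'new'",
    "table public.orders: UPDATE: id[integer]:1 status[text]:'shipped'",
    "table public.orders: DELETE: id[integer]:1",
    "COMMIT 502",
]

CHANGE_LINE = re.compile(
    r"^table (?P<table>\S+): (?P<op>INSERT|UPDATE|DELETE): (?P<columns>.*)$"
)

def parse_change(line):
    """Hypothetical helper: return table, operation, and raw column text,
    or None for lines that are not row changes (BEGIN/COMMIT markers)."""
    match = CHANGE_LINE.match(line)
    if match is None:
        return None
    return {
        "table": match.group("table"),
        "op": match.group("op"),
        "columns": match.group("columns"),
    }

events = [e for e in map(parse_change, SAMPLE_CHANGES) if e is not None]
for event in events:
    print(event["op"], event["table"], event["columns"])
```

Unlike the query-based approach, the Delete appears in the stream as its own event.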

Pros

  • All types of changes are captured by this form of Postgres CDC.
  • This form of CDC reads changes from the write-ahead log rather than through the query layer, so it has little impact on the performance of the PostgreSQL database.

Cons

  • Versions of PostgreSQL older than 9.4 do not support logical decoding, which this approach relies on.

In comparison, logical replication comes out on top, since it captures all types of changes while placing little load on the database.

Lily Mia 2