The site “TCapture Replication Server Documentation” explains the concepts and architecture of TCapture. Note that the current documentation does not cover all aspects of the prototype implementation.

Overview

TCapture is a ‘data movement’ product, which, well, moves data from one place to another. More specifically, it captures database transactions in one database and then applies them to another database. Because TCapture captures only data changes, there is no impact on the applications already running on that database. This means that it can be used in pretty much any database system.

Common uses

A common use of T-Capture is to facilitate migrations of databases or applications that would otherwise require significant downtime: by setting up a replica with T-Capture, this downtime can be almost fully eliminated.

Because of its application transparency, T-Capture may also be seen as a mechanism to integrate different applications by synchronising their data changes.

How it works

T-Capture replicates committed database transactions.
A Java engine continuously monitors the database transaction log for newly committed transactions that have been marked for replication.
When it finds one, the replication engine sends it to the T-Capture structures, which run somewhere outside the primary database.
T-Capture will then forward the transaction to the designated replicate database and apply it.
It should be noted that the replicate database does have to be PostgreSQL.
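
As a rough illustration of the log-monitoring step, the following sketch polls a PostgreSQL logical replication slot over JDBC and prints each decoded change belonging to a committed transaction. The slot name (tcapture_slot), the use of the stock test_decoding output plugin, the polling approach, and the connection details are assumptions made for this example, not necessarily what TCapture itself uses.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LogMonitorSketch {
    public static void main(String[] args) throws Exception {
        // Connection details are placeholders; the PostgreSQL JDBC driver is assumed to be on the classpath.
        try (Connection primary = DriverManager.getConnection(
                "jdbc:postgresql://primary-host:5432/appdb", "tcapture", "secret")) {
            try (Statement st = primary.createStatement()) {
                // Create the logical replication slot once, if it does not exist yet.
                ResultSet exists = st.executeQuery(
                        "SELECT 1 FROM pg_replication_slots WHERE slot_name = 'tcapture_slot'");
                boolean slotExists = exists.next();
                exists.close();
                if (!slotExists) {
                    // 'test_decoding' is a stock PostgreSQL output plugin; TCapture may use its own.
                    st.execute("SELECT pg_create_logical_replication_slot('tcapture_slot', 'test_decoding')");
                }
            }
            while (true) {
                try (Statement st = primary.createStatement();
                     ResultSet rs = st.executeQuery(
                             "SELECT lsn, xid, data FROM pg_logical_slot_get_changes('tcapture_slot', NULL, NULL)")) {
                    while (rs.next()) {
                        // Each row is a decoded change from a committed transaction marked for replication.
                        System.out.printf("lsn=%s xid=%s change=%s%n",
                                rs.getString("lsn"), rs.getString("xid"), rs.getString("data"));
                    }
                }
                Thread.sleep(1000); // simple poll interval; a production engine would stream instead
            }
        }
    }
}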

Structure of a TCapture cluster

T-Capture® RepSrv uses a publish-subscribe model based on queueing tables. On the primary side, changes to the entity to be replicated are queued in a published queue table, which can be subscribed to by one or more replicate databases.
Such an entity can be an individual database table, or an entire database.
The latter is often used for ‘warm standby’ configurations where all changes in a database are replicated to a standby.
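
A minimal sketch of what such publish-subscribe bookkeeping could look like, using JDBC to register one publication (a single table, or ‘*’ for an entire database) and one subscribing replicate database. The table and column names (tc_publication, tc_subscription) are purely illustrative and not TCapture’s actual schema.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class PublishSubscribeSketch {
    public static void main(String[] args) throws Exception {
        try (Connection db = DriverManager.getConnection(
                "jdbc:postgresql://primary-host:5432/appdb", "tcapture", "secret")) {
            try (Statement st = db.createStatement()) {
                // A publication entry: either a single table or a whole database ('*', e.g. for warm standby).
                st.execute("CREATE TABLE IF NOT EXISTS tc_publication ("
                        + " pub_id serial PRIMARY KEY,"
                        + " source_object text NOT NULL)");
                // A subscription ties one replicate database to a publication.
                st.execute("CREATE TABLE IF NOT EXISTS tc_subscription ("
                        + " sub_id serial PRIMARY KEY,"
                        + " pub_id int REFERENCES tc_publication(pub_id),"
                        + " replicate_dsn text NOT NULL)");
            }
            // Publish a single table and subscribe one replicate database to it.
            try (PreparedStatement pub = db.prepareStatement(
                    "INSERT INTO tc_publication(source_object) VALUES (?) RETURNING pub_id")) {
                pub.setString(1, "public.orders");
                try (ResultSet rs = pub.executeQuery()) {
                    rs.next();
                    int pubId = rs.getInt(1);
                    try (PreparedStatement sub = db.prepareStatement(
                            "INSERT INTO tc_subscription(pub_id, replicate_dsn) VALUES (?, ?)")) {
                        sub.setInt(1, pubId);
                        sub.setString(2, "jdbc:postgresql://replicate-host:5432/reportdb");
                        sub.executeUpdate();
                    }
                }
            }
        }
    }
}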


Asynchronous

TCapture performs asynchronous replication. This means that when a database transaction in the primary database commits, it will be picked up and applied to the replicate database shortly afterwards, decoupled from the primary transaction.
Consequently, when using asynchronous replication, there is always some latency: some time will pass before a primary transaction is applied on the replicate side.
The challenge is to keep this latency as short as possible.
In a well-designed and well-tuned replication system, latencies of no more than single-digit seconds are typical, even when the replicate database is on a WAN.
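
To make the latency idea concrete, the sketch below computes replication lag as the difference between the primary commit timestamp and the apply timestamp on the replicate. The bookkeeping table and columns (tc_applied_txn, primary_commit_ts, applied_ts) are hypothetical names chosen only for this illustration.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.time.Duration;
import java.time.Instant;

public class LatencySketch {
    public static void main(String[] args) throws Exception {
        try (Connection replicate = DriverManager.getConnection(
                "jdbc:postgresql://replicate-host:5432/reportdb", "tcapture", "secret");
             Statement st = replicate.createStatement();
             // tc_applied_txn is a hypothetical bookkeeping table recording, per applied transaction,
             // the commit time on the primary and the apply time on the replicate.
             ResultSet rs = st.executeQuery(
                     "SELECT primary_commit_ts, applied_ts FROM tc_applied_txn"
                     + " ORDER BY applied_ts DESC LIMIT 1")) {
            if (rs.next()) {
                Instant committed = rs.getTimestamp("primary_commit_ts").toInstant();
                Instant applied = rs.getTimestamp("applied_ts").toInstant();
                Duration lag = Duration.between(committed, applied);
                // In a well-designed and well-tuned system this stays in the single-digit-second range.
                System.out.println("Replication latency: " + lag.toMillis() + " ms");
            }
        }
    }
}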


Networked environment

Another aspect worth discussing briefly is that a typical replication system operates in a networked environment. For example, let’s assume the primary database runs on a server in New York;
the T-Capture® RepSrv runs on a Linux box in the same data center; one replicate database runs on a server in Houston, Texas; and another replicate database runs on Linux in London, UK.

T-Capture® RepSrv is designed under the assumption that it is normal for network connections to temporarily fail, and then come back into service.
Once a connection fails, T-Capture® RepSrv will automatically keep retrying until the connection is back, and then resume replication where it left off.
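
One plausible shape of that retry behaviour is sketched below: a connect loop with bounded exponential backoff that keeps trying until the replicate database is reachable again. The interval values and connection details are assumptions for illustration, not TCapture configuration.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class RetryConnectSketch {

    // Keep retrying until the database is reachable again, backing off between attempts.
    static Connection connectWithRetry(String url, String user, String password) throws InterruptedException {
        long delayMs = 1_000;           // start with a short wait
        final long maxDelayMs = 60_000; // cap the backoff at one minute
        while (true) {
            try {
                return DriverManager.getConnection(url, user, password);
            } catch (SQLException e) {
                System.err.println("Connection failed (" + e.getMessage() + "), retrying in " + delayMs + " ms");
                Thread.sleep(delayMs);
                delayMs = Math.min(delayMs * 2, maxDelayMs);
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Connection replicate = connectWithRetry(
                "jdbc:postgresql://replicate-host:5432/reportdb", "tcapture", "secret");
        System.out.println("Reconnected; replication can resume where it left off.");
    }
}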


Components

The main component of TCapture is the Replication Server. This is a Java program which mainly coordinates various modules.
It establishes and maintains the connections to the database cluster system: to the backends added to Postgres, which process local transactions, as well as to the modules which process remote transactions. In TCapture Replication Server, transactions are handled by modules, and each module can only process one transaction at a time. To replay transactions from remote nodes, the replication manager starts consumer modules which run under the control of the Replication Server.

With T-Capture RepSrv you replicate committed transactional data from source tables to target tables by using two program modules:
Producer and Consumer.
The Producer reads the recovery logs for changed source data and writes the changes to queues. The Consumer retrieves captured changes from queues and writes the changes to targets.
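
The hand-off between the two modules can be pictured with the small sketch below, where an in-memory queue stands in for the queue tables. The class, record and method names are illustrative assumptions, not TCapture’s actual API.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ProducerConsumerSketch {

    // A single captured change from the recovery (transaction) log.
    record Change(long lsn, long xid, String table, String operation, String rowData) {}

    public static void main(String[] args) {
        BlockingQueue<Change> queue = new LinkedBlockingQueue<>(); // stand-in for the queue tables

        // Producer: reads the recovery log for changed source data and writes the changes to the queue.
        Thread producer = new Thread(() ->
                queue.add(new Change(1001, 42, "public.orders", "INSERT", "{\"id\": 1, \"total\": 99.50}")));

        // Consumer: retrieves captured changes from the queue and writes them to the targets.
        Thread consumer = new Thread(() -> {
            try {
                Change c = queue.take();
                System.out.printf("Applying %s on %s (xid=%d, lsn=%d)%n",
                        c.operation(), c.table(), c.xid(), c.lsn());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
    }
}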

The Producer and the Consumer use a set of control tables to track the information that they require to do their tasks and to store information that they generate themselves, such as information that you can use to find out how well they are performing.

You create these tables when you tell T-Capture what your replication sources and targets are. Before you can publish or replicate data, you must create control tables and functions for a T-Capture® RepSrv database.

The Producer program uses a set of control tables that contain information about your replication sources, the targets that correspond to them, and which queue manager and queues are being used by the Producer program.
These tables also contain data that you can use to check and monitor the performance, such as data about the program’s current position in the transaction log.

The schema that is associated with a set of control tables identifies the program that uses those control tables.
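
To illustrate the kind of monitoring data these control tables hold, the sketch below reads a hypothetical Producer control table for its current position in the transaction log. The schema and column names (tcap.producer_status, restart_lsn, last_commit_ts) are assumed for the example and do not describe TCapture’s real control tables.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ControlTableSketch {
    public static void main(String[] args) throws Exception {
        try (Connection db = DriverManager.getConnection(
                "jdbc:postgresql://primary-host:5432/appdb", "tcapture", "secret");
             Statement st = db.createStatement();
             // 'tcap' as the schema identifying the Producer, and the columns below,
             // are hypothetical names chosen for this example.
             ResultSet rs = st.executeQuery(
                     "SELECT restart_lsn, last_commit_ts FROM tcap.producer_status")) {
            while (rs.next()) {
                System.out.printf("Producer position: lsn=%s, last commit processed at %s%n",
                        rs.getString("restart_lsn"), rs.getTimestamp("last_commit_ts"));
            }
        }
    }
}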


Lifecycle of a Replicated Transaction

Read-only transactions are handled locally and are treated no differently than in standard single-node operation of Postgres. As soon as a transaction writes data (UPDATE, INSERT or DELETE SQL commands), the new data is collected into a changeset. The local backend processes the transaction and continuously collects the changes into the changeset until it receives the commit request from the client. Once the commit from the client is confirmed, the local backend sends the changeset to the TCapture replication structures, which in turn send it out to all other nodes using a reliable Postgres replication slot mechanism. In addition, they maintain the ordering of this transaction within a totally ordered series of transactions to apply. The local backend which started the transaction can commit and continue scanning the stream of local modifications. On another node, which receives the changeset for replay, the replication server passes the changeset on to an applier module, which replays the transactions in the agreed order from the data in the changeset.
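
A rough sketch of the replay step on a receiving node: each changeset is applied atomically, with its statements executed in the agreed order, and the whole changeset rolled back if anything fails. The Changeset shape and the sample statements are assumptions made for illustration.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.List;

public class ChangesetApplierSketch {

    // One ordered changeset: all write statements produced by a single primary transaction.
    record Changeset(long sequenceNumber, List<String> sqlStatements) {}

    // Apply a changeset atomically on the replicate node, preserving statement order.
    static void apply(Connection replicate, Changeset cs) throws Exception {
        replicate.setAutoCommit(false);
        try (Statement st = replicate.createStatement()) {
            for (String sql : cs.sqlStatements()) {
                st.execute(sql); // statements replayed in the agreed order
            }
            replicate.commit();
        } catch (Exception e) {
            replicate.rollback();
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        try (Connection replicate = DriverManager.getConnection(
                "jdbc:postgresql://replicate-host:5432/reportdb", "tcapture", "secret")) {
            // A sample changeset; in the real product this arrives from the replication server.
            apply(replicate, new Changeset(7, List.of(
                    "INSERT INTO public.orders(id, total) VALUES (1, 99.50)",
                    "UPDATE public.orders SET total = 120.00 WHERE id = 1")));
        }
    }
}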
