PGMP Logical Replication in PostgreSQL 15

4 minute read

PGMP supports the real-time logical replication of PostgreSQL tables into ETS out of the box on Erlang/OTP 25. It includes the new features: row filters and column lists, which are part of the recent PostgreSQL 15 release.

Logical replication is a method of replicating data objects and their changes, based upon their replication identity. In pgmp tables with a single (primary) or multiple columns (composite) keys are supported.

Logical replication uses a publish and subscribe model with one or more subscribers subscribing to one or more publications on a publisher node. PGMP acts as a subscriber to one or publications offered by a PostgreSQL database.

The process initiated by pgmp creates a replication slot for each subscribed publication. As part of this a transactional identifier is created by the PostgreSQL database. A snapshot of the published data at that point in time is created by pgmp into a ETS replica. Once that is done, the changes on the publisher are sent to pgmp as they occur in real-time via a socket. The subscriber applies the data in the same order as the publisher so that transactional consistency is guaranteed for publications within a single subscription.

Enabling Logical Replication In PostgreSQL

If you have an existing PostgreSQL 15 database, your postgresql.conf must contain to enable replication:

wal_level = logical

For the remainder of this article we will be using PostgreSQL 15 running via docker:

docker run \
    --rm \
    --name postgres \
    --detach \
    --publish 5432:5432 \
    --env POSTGRES_PASSWORD=postgres \
    postgres:15 \
    -c wal_level=logical

You can then run a SQL shell with:

docker exec \
    --interactive \
    --tty \
    postgres \
    psql \
    postgres \
    postgres

You should see the following prompt:

psql (15.0 (Debian 15.0-1.pgdg110+1))
Type "help" for help.

postgres=#

Building PGMP

We will be using Erlang/OTP 25 running via docker:

docker run \
    --rm \
    --interactive \
    --tty \
    erlang:25 \
    /bin/bash

You should see an interactive prompt:

root@df8e90c4cefd:/#

Where df8e90c4cefd will be different for your container. Clone pgmp with the following command:

git clone https://github.com/shortishly/pgmp.git

Building:

cd pgmp && make

PostgreSQL 15 New Logical Replication Features

The new logical replication features in the recent PostgreSQL 15 release, are: row filters and column lists. Their usage with pgmp are described in this section.

Column Lists

From PostgreSQL 15, each publication can specify which columns of each table are replicated to subscribers.

Create a table t1 to be used in the following example:

create table t1 (id int,
                 a text,
                 b text,
                 c text,
                 d text,
                 e text,
                 primary key(id));

Create a publication p1, with a column list to reduce the number of columns that will be replicated:

create publication p1
  for table t1 (id, a, b, d);

Insert some test data into the newly created t1 table:

insert into t1
   values
     (1, 'a-1', 'b-1', 'c-1', 'd-1', 'e-1'),
     (2, 'a-2', 'b-2', 'c-2', 'd-2', 'e-2'),
     (3, 'a-3', 'b-3', 'c-3', 'd-3', 'e-3');

Start pgmp with replication of the p1 publication:

PGMP_REPLICATION_LOGICAL_PUBLICATION_NAMES=p1 \
    PGMP_DATABASE_HOSTNAME=host.docker.internal \
    PGMP_DATABASE_USER=postgres \
    PGMP_DATABASE_PASSWORD=postgres \
    make shell

As part of the replication process pgmp introspects the table metadata for t1 determining that column id is the primary key. This is the column that is used as the primary key in ETS replica of t1. The contents of the replicated table:

2> lists:sort(ets:tab2list(t1)).
[{1, <<"a-1">>, <<"b-1">>, <<"d-1">>},
 {2, <<"a-2">>, <<"b-2">>, <<"d-2">>},
 {3, <<"a-3">>, <<"b-3">>, <<"d-3">>}]

The rows from the initial replication snapshot are in an ETS replica called t1. Note that only the columns listed in the publication are present.

Any changes on the PostgreSQL database will now be replicated in real-time to the ETS replica.

Back to PostgreSQL, inserting more data will be replicated in real-time:

insert into t1
  values (4, 'a-4', 'b-4', 'c-4', 'd-4', 'e-4');

Meanwhile in pgmp:

3> lists:sort(ets:tab2list(t1)).
[{1,<<"a-1">>,<<"b-1">>,<<"d-1">>},
 {2,<<"a-2">>,<<"b-2">>,<<"d-2">>},
 {3,<<"a-3">>,<<"b-3">>,<<"d-3">>},
 {4,<<"a-4">>,<<"b-4">>,<<"d-4">>}]

Row Filters

By default, all data from all published tables will be replicated to the appropriate subscribers. The replicated data can be reduced by using a row filter. A user might choose to use row filters for behavioral, security or performance reasons. If a published table sets a row filter, a row is replicated only if its data satisfies the row filter expression. This allows a set of tables to be partially replicated. The row filter is defined per table.

Create a table t2 to be used in the following example:

create table t2 (a int,
                 b int,
                 c text,
                 primary key(a, c));

Creating a publication p2 with a row filter:

create publication p2
  for table t2
  where (a > 5 and c = 'NSW');

With some initial data:

insert into t2
  values
    (2, 102, 'NSW'),
    (3, 103, 'QLD'),
    (4, 104, 'VIC'),
    (5, 105, 'ACT'),
    (6, 106, 'NSW'),
    (7, 107, 'NT'),
    (8, 108, 'QLD'),
    (9, 109, 'NSW');

Start pgmp with replication of the p2 publication:

PGMP_REPLICATION_LOGICAL_PUBLICATION_NAMES=p2 \
    PGMP_DATABASE_HOSTNAME=host.docker.internal \
    PGMP_DATABASE_USER=postgres \
    PGMP_DATABASE_PASSWORD=postgres \
    make shell

Only those rows that match the filter are replicated where (a > 5 and c = 'NSW'):


1> lists:sort(ets:tab2list(t2)).
[{{6, <<"NSW">>}, 109},
 {{9, <<"NSW">>}, 109}]

Updates are streamed in real-time into ETS:

update t2 SET b = 999 where a = 6;

2> lists:sort(ets:tab2list(t2)).
[{{6, <<"NSW">>}, 999},
 {{9, <<"NSW">>}, 109}]

Changing data so that it meets the row filter will be replicated in real-time into ETS:

update t2 set a = 555 where a = 2;

3> lists:sort(ets:tab2list(t2)).

[{{6, <<"NSW">>}, 999},
 {{9, <<"NSW">>}, 109},
 {{555, <<"NSW">>}, 102}]

Similarly changing data so that it no longer meets the row filter will be replicated in real-time into ETS:

update t2 set c = 'VIC' where a = 9;

4> lists:sort(ets:tab2list(t2)).

[{{6, <<"NSW">>}, 999},
 {{555, <<"NSW">>}, 102}]

This article has shown two new features of logical replication in PostgreSQL 15 being used by the pgmp in Erlang/OTP: row filters and column lists.