Postgres 9.4 feature highlight: REPLICA IDENTITY and logical replication

Among the logical replication features added in PostgreSQL 9.4, REPLICA IDENTITY is a new table-level parameter that controls the information written to WAL to identify the tuple data being deleted or updated (in MVCC, an update is a delete of the old tuple version followed by an insert of the new one).

This parameter has 4 modes:

  • DEFAULT
  • USING INDEX index
  • FULL
  • NOTHING
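
Each of these modes is selected with ALTER TABLE. As a minimal sketch of the syntax (the table demo_tbl and the index demo_tbl_uniq are hypothetical names used only for illustration, not part of the test case in this post):

```sql
-- Hypothetical table and index, used only to illustrate the syntax.
CREATE TABLE demo_tbl (id int PRIMARY KEY, val int NOT NULL);
CREATE UNIQUE INDEX demo_tbl_uniq ON demo_tbl (val);

-- One statement per mode; the last one executed wins.
ALTER TABLE demo_tbl REPLICA IDENTITY DEFAULT;
ALTER TABLE demo_tbl REPLICA IDENTITY USING INDEX demo_tbl_uniq;
ALTER TABLE demo_tbl REPLICA IDENTITY FULL;
ALTER TABLE demo_tbl REPLICA IDENTITY NOTHING;
```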

First let's set up an environment, following some of the instructions in a previous post on the basics of logical decoding, to get a server using test_decoding in a replication slot.

=# SELECT * FROM pg_create_logical_replication_slot('my_slot', 'test_decoding');
 slot_name | xlog_position
-----------+---------------
 my_slot   | 0/16CB0F8
(1 row)
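
Creating a logical slot only works on a server configured for logical decoding. The following is a sketch of the settings involved, assuming a 9.4 server where ALTER SYSTEM is available. The value 4 is an arbitrary choice, and both parameters need a server restart to take effect:

```sql
-- Logical decoding needs wal_level = logical and at least one replication slot.
ALTER SYSTEM SET wal_level = 'logical';
ALTER SYSTEM SET max_replication_slots = 4;  -- arbitrary value, must be >= 1

-- After restarting the server, verify the settings:
SHOW wal_level;
SHOW max_replication_slots;
```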

The replication slot is used here in combination with pg_logical_slot_get_changes, which consumes each change of the slot (compare with pg_logical_slot_peek_changes, which shows the changes without consuming them).
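
The difference between the two functions can be sketched as follows, reusing the slot my_slot created above. The final count(*) query is one possible way to perform the clean-up steps that appear as placeholders in the sessions of this post; that is an assumption, not necessarily what was actually run:

```sql
-- Look at pending changes; they stay in the slot and can be read again.
SELECT * FROM pg_logical_slot_peek_changes('my_slot', NULL, NULL);

-- Read the same changes and consume them; a second call no longer returns them.
SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, NULL);

-- Discard whatever is pending without displaying it.
SELECT count(*) FROM pg_logical_slot_get_changes('my_slot', NULL, NULL);
```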

In the case of DEFAULT, old tuple data is only identified with the primary key of the table. This data is written into WAL only when at least one column of the primary key is updated. Columns that are not part of the primary key do not have their old value written.

=# CREATE TABLE aa (a int, b int, c int, PRIMARY KEY (a, b));
CREATE TABLE
=# INSERT INTO aa VALUES (1,1,1);
INSERT 0 1
=# [ ... Clean up of slot information up to now ... ]
=# UPDATE aa SET c = 3 WHERE (a, b) = (1, 1);
UPDATE 1
=# SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, NULL);
 location  | xid  |                              data
-----------+------+-----------------------------------------------------------------
 0/1728D50 | 1013 | BEGIN 1013
 0/1728D50 | 1013 | table public.aa: UPDATE: a[integer]:1 b[integer]:1 c[integer]:3
 0/1728E70 | 1013 | COMMIT 1013
(3 rows)
=# UPDATE aa SET a = 2 WHERE (a, b) = (1, 1);
UPDATE 1
=# SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, NULL);
 location  | xid  |                                                     data
-----------+------+---------------------------------------------------------------------------------------------------------------
 0/1728EA8 | 1014 | BEGIN 1014
 0/1728EA8 | 1014 | table public.aa: UPDATE: old-key: a[integer]:1 b[integer]:1 new-tuple: a[integer]:2 b[integer]:1 c[integer]:3
 0/1728FF0 | 1014 | COMMIT 1014
(3 rows)

It is important to know that REPLICA IDENTITY can only be changed using ALTER TABLE, and that '\d+' shows the parameter value only when the default behavior is not used. Also, after creating a table, REPLICA IDENTITY is set to DEFAULT (Surprise!).

=# \d+ aa
                          Table "public.aa"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 a      | integer | not null  | plain   |              |
 b      | integer | not null  | plain   |              |
 c      | integer |           | plain   |              |
Indexes:
    "aa_pkey" PRIMARY KEY, btree (a, b)
=# ALTER TABLE aa REPLICA IDENTITY FULL;
ALTER TABLE
=# \d+ aa
                          Table "public.aa"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 a      | integer | not null  | plain   |              |
 b      | integer | not null  | plain   |              |
 c      | integer |           | plain   |              |
Indexes:
    "aa_pkey" PRIMARY KEY, btree (a, b)
Replica Identity: FULL
=# [ ... Replication slot changes are consumed here ... ]
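
Since '\d+' stays silent when DEFAULT is in use, the setting can also be read directly from the system catalog. A minimal sketch, relying on the relreplident column of pg_class (the single-letter codes are mapped back to the mode names here only for readability):

```sql
-- relreplident: d = DEFAULT, n = NOTHING, f = FULL, i = USING INDEX
SELECT relname,
       CASE relreplident
           WHEN 'd' THEN 'DEFAULT'
           WHEN 'n' THEN 'NOTHING'
           WHEN 'f' THEN 'FULL'
           WHEN 'i' THEN 'USING INDEX'
       END AS replica_identity
FROM pg_class
WHERE oid = 'public.aa'::regclass;
```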

In the case of FULL, the old values of all the columns are written to WAL every time. This is the most verbose mode, and also the most resource-consuming one. Be careful here, particularly with heavily-updated tables.

=# UPDATE aa SET c = 4 WHERE (a, b) = (2, 1);
UPDATE 1
=# SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, NULL);
 location  | xid  |                                                            data
-----------+------+----------------------------------------------------------------------------------------------------------------------------
 0/172EC70 | 1016 | BEGIN 1016
 0/172EC70 | 1016 | table public.aa: UPDATE: old-key: a[integer]:2 b[integer]:1 c[integer]:3 new-tuple: a[integer]:2 b[integer]:1 c[integer]:4
 0/172EE00 | 1016 | COMMIT 1016

In contrast, NOTHING records... nothing about the old tuple; only the new tuple shows up. (Note: this was run after an appropriate ALTER TABLE and after consuming the replication slot information.)

=# UPDATE aa SET c = 4 WHERE (a, b) = (2, 1);
UPDATE 1
=# SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, NULL);
 location  | xid  |                              data
-----------+------+-----------------------------------------------------------------
 0/1730F58 | 1018 | BEGIN 1018
 0/1730F58 | 1018 | table public.aa: UPDATE: a[integer]:2 b[integer]:1 c[integer]:4
 0/1731100 | 1018 | COMMIT 1018
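
The preparation mentioned in the note for this case is not shown in the session. It presumably amounts to something like the following sketch (an assumption, not a transcript of what was actually run):

```sql
-- Switch the table to NOTHING: no old-row information is logged.
ALTER TABLE aa REPLICA IDENTITY NOTHING;

-- Consume the changes accumulated so far so that only the next UPDATE shows up.
SELECT count(*) FROM pg_logical_slot_get_changes('my_slot', NULL, NULL);
```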

Finally, there is USING INDEX, which writes to WAL the old values of the columns covered by the index named in this option. The index must be unique, cannot contain expressions, and must cover only NOT NULL columns.

=# ALTER TABLE aa ALTER COLUMN c SET NOT NULL;
ALTER TABLE
=# CREATE UNIQUE INDEX aai ON aa(c);
CREATE INDEX
=# ALTER TABLE aa REPLICA IDENTITY USING INDEX aai;
ALTER TABLE
=# [ ... Consuming all information from slot ... ]
=# UPDATE aa SET c = 5 WHERE (a, b) = (2, 1);
UPDATE 1
=# SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, NULL);
 location  | xid  |                                               data
-----------+------+--------------------------------------------------------------------------------------------------
 0/1749A68 | 1029 | BEGIN 1029
 0/1749A68 | 1029 | table public.aa: UPDATE: old-key: c[integer]:4 new-tuple: a[integer]:2 b[integer]:1 c[integer]:5
 0/1749D40 | 1029 | COMMIT 1029
(3 rows)

Note that in this case the primary key information is not decoded, only the NOT NULL column c that the index covers.
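
To see which index of a table currently serves as its replica identity, the indisreplident flag of pg_index can be queried. A minimal sketch against the table aa used above:

```sql
-- indisreplident is true for the index set with REPLICA IDENTITY USING INDEX.
SELECT indexrelid::regclass AS index_name,
       indisunique,
       indisreplident
FROM pg_index
WHERE indrelid = 'public.aa'::regclass;
```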

REPLICA IDENTITY should be chosen carefully for each table of a given application: FULL, for example, generates extra WAL that may not be necessary, while NOTHING may leave out information that is essential to identify the old row. In most cases, though, DEFAULT provides good coverage.

REPLICA IDENTITY

This form changes the information which is written to the write-ahead log to identify rows which are updated or deleted. This option has no effect except when logical replication is in use. DEFAULT (the default for non-system tables) records the old values of the columns of the primary key, if any. USING INDEX records the old values of the columns covered by the named index, which must be unique, not partial, not deferrable, and include only columns marked NOT NULL. FULL records the old values of all columns in the row. NOTHING records no information about the old row. (This is the default for system tables.) In all cases, no old values are logged unless at least one of the columns that would be logged differs between the old and new versions of the row.
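
The last sentence of this excerpt can be checked against the table from the examples above. This is a sketch, assuming aa still uses REPLICA IDENTITY USING INDEX aai and still contains the row (2, 1, 5). The update leaves column c untouched, so by that rule the decoded change would be expected to carry no old-key part:

```sql
-- Start from an empty slot so only the next change is displayed.
SELECT count(*) FROM pg_logical_slot_get_changes('my_slot', NULL, NULL);

-- Column c, the only column covered by the identity index aai, is not modified.
UPDATE aa SET b = 2 WHERE c = 5;

-- Inspect how the change was decoded.
SELECT * FROM pg_logical_slot_get_changes('my_slot', NULL, NULL);
```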

References:

http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replica-identity-logical-replication/

http://www.postgresql.org/docs/devel/static/sql-altertable.html

Time: 2024-07-30 10:18:22
