14 - How to check replication status

People using PostgreSQL Streaming Replication tend to ask many of the same questions:

1. How best to monitor Streaming Replication?

2. What is the best way to do that?

3. Are there alternatives, when monitoring on Standby, to using the pg_stat_replication view on Master?

4. How should I calculate replication lag-time, in seconds, minutes, etc.?

In light of these commonly asked questions, I thought a blog post would help. The following are some methods I’ve found useful.

Monitoring is critical for large infrastructure deployments where you have Streaming Replication for:

1. Disaster recovery

2. High availability

3. Load balancing, when using Streaming Replication with Hot Standby

PostgreSQL provides some building blocks for replication monitoring. The following are the important functions and views that can be used to monitor replication:

1. pg_stat_replication view on master/primary server.

This view helps in monitoring the standby on Master. It gives you the following details:


pid:              Process ID of the WAL sender process.

usesysid:         OID of the user used for Streaming Replication.

usename:          Name of the user used for Streaming Replication.

application_name: Name of the application connected to the master.

client_addr:      IP address of the standby connected via Streaming Replication.

client_hostname:  Hostname of the standby.

client_port:      TCP port number the standby uses to communicate with the WAL sender.

backend_start:    Time at which the standby connected to the master.

state:            Current WAL sender state, e.g. streaming.

sent_location:    Last transaction log (WAL) location sent to the standby.

write_location:   Last transaction log location written to disk on the standby.

flush_location:   Last transaction log location flushed to disk on the standby.

replay_location:  Last transaction log location replayed on the standby.

sync_priority:    Priority of the standby for being chosen as the synchronous standby.

sync_state:       Sync state of the standby (for example, async or sync).

e.g.:


postgres=# select * from pg_stat_replication ;

-[ RECORD 1 ]----+---------------------------------

pid              | 1114

usesysid         | 16384

usename          | repuser

application_name | walreceiver

client_addr      | 172.17.0.3

client_hostname  |

client_port      | 52444

backend_start    | 15-MAY-14 19:54:05.535695 -04:00

state            | streaming

sent_location    | 0/290044C0

write_location   | 0/290044C0

flush_location   | 0/290044C0

replay_location  | 0/290044C0

sync_priority    | 0

sync_state       | async

2. pg_is_in_recovery(): Function that tells whether the standby is still in recovery mode.

e.g.


postgres=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 t

(1 row)

3. pg_last_xlog_receive_location(): Function that returns the last transaction log location received by the standby through streaming and synced to the standby's disk.

e.g.


postgres=# select pg_last_xlog_receive_location();

 pg_last_xlog_receive_location

-------------------------------

 0/29004560

(1 row)

4. pg_last_xlog_replay_location(): Function that returns the last transaction log location replayed during recovery. An example is given below:


postgres=# select pg_last_xlog_replay_location();

 pg_last_xlog_replay_location

------------------------------

 0/29004560

(1 row)

5. pg_last_xact_replay_timestamp(): Function that returns the timestamp of the last transaction replayed during recovery. Below is an example:


postgres=# select pg_last_xact_replay_timestamp();

  pg_last_xact_replay_timestamp

----------------------------------

 15-MAY-14 20:54:27.635591 -04:00

(1 row)

These are the important functions and views already available in PostgreSQL for monitoring Streaming Replication.

So, the logical next question is, “What’s the right way to monitor the Hot Standby with Streaming Replication on Standby Server?”

If you have Hot Standby with Streaming Replication, the following are the points you should monitor:

1. Check if your Hot Standby is in recovery mode or not:

For this you can use the pg_is_in_recovery() function.

2. Check whether Streaming Replication is working or not.

An easy way of doing this is to check the pg_stat_replication view on the Master/Primary. This view returns rows on the master only while Streaming Replication is working.
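As a rough sketch of how a monitoring script might interpret this view, the Python helper below takes rows that were already fetched from pg_stat_replication on the master and reports a problem when no standby is connected or when a WAL sender is not in the streaming state. The function name check_streaming, the list-of-dictionaries row format, and the message wording are assumptions made for illustration; they are not part of PostgreSQL.

```python
def check_streaming(rows):
    """Return a list of human-readable problems; an empty list means all is well.

    `rows` is assumed to be the result of `select * from pg_stat_replication`
    on the master, already fetched as a list of dicts keyed by column name
    (a hypothetical format chosen for this sketch).
    """
    if not rows:
        # The view has no rows when no standby is connected via streaming.
        return ["no standby is connected via Streaming Replication"]

    problems = []
    for row in rows:
        if row["state"] != "streaming":
            problems.append(
                "standby %s is in state %r, not streaming"
                % (row["client_addr"], row["state"])
            )
    return problems


# Values taken from the sample pg_stat_replication output shown earlier.
sample = [{"client_addr": "172.17.0.3", "state": "streaming", "sync_state": "async"}]
print(check_streaming(sample))  # prints []
print(check_streaming([]))
```

Returning a list of problems rather than a boolean lets the same helper cover several standbys at once.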

3. Check whether Streaming Replication is not working and the Hot Standby is instead recovering from archived WAL files.

For this, the DBA can either watch the PostgreSQL log file or use the following functions, available in PostgreSQL 9.3:


pg_last_xlog_replay_location();

pg_last_xact_replay_timestamp();

4. Check how far the Standby is behind the Master.

There are two ways to monitor lag for Standby.



   i. Lag in bytes: To calculate the lag in bytes, use the pg_stat_replication view on the master together with the pg_xlog_location_diff function. Below is an example:


pg_xlog_location_diff(pg_stat_replication.sent_location, pg_stat_replication.replay_location)

which gives the lag in bytes.
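To make the arithmetic behind this byte difference concrete, here is a small Python sketch that reproduces it outside the database. It assumes the textual WAL location format seen in the examples above (two hexadecimal numbers separated by a slash, the first being the high 32 bits and the second the low 32 bits). The function names lsn_to_bytes and lag_in_bytes are hypothetical, and the two locations are paired only for illustration.

```python
def lsn_to_bytes(lsn):
    """Convert a WAL location such as '0/290044C0' to an absolute byte position."""
    high, low = lsn.split("/")
    # The part before the slash holds the high 32 bits, the part after the low 32 bits.
    return (int(high, 16) << 32) + int(low, 16)


def lag_in_bytes(sent_location, replay_location):
    """Bytes sent by the master but not yet replayed on the standby."""
    return lsn_to_bytes(sent_location) - lsn_to_bytes(replay_location)


print(lag_in_bytes("0/29004560", "0/290044C0"))  # prints 160
print(lag_in_bytes("0/290044C0", "0/290044C0"))  # prints 0
```

A result of 0 means the standby has replayed everything the master has sent.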

  ii. Lag in seconds: The following is the SQL that most people use to find the lag in seconds:


SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location()

              THEN 0

            ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp())

       END AS log_delay;
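The logic of this CASE expression can be sketched in Python for use in a monitoring script that has already fetched the three values from the standby. The function name lag_in_seconds and its parameters are hypothetical, and the timestamps in the usage lines are made-up values. None stands in for SQL NULL: a NULL location never compares equal, and pg_last_xact_replay_timestamp() returns NULL when no transaction has been replayed yet.

```python
from datetime import datetime, timezone


def lag_in_seconds(receive_location, replay_location, last_replay_time, now):
    """Mirror the CASE expression: 0 when everything received has been replayed."""
    # In SQL, a NULL location makes the comparison NULL, so the ELSE branch runs.
    if receive_location is not None and receive_location == replay_location:
        return 0.0
    if last_replay_time is None:
        # No transaction replayed yet: the SQL expression would yield NULL.
        return None
    return (now - last_replay_time).total_seconds()


# Hypothetical sample values: replay is 30 seconds behind the current time.
now = datetime(2014, 5, 15, 20, 55, 0, tzinfo=timezone.utc)
last = datetime(2014, 5, 15, 20, 54, 30, tzinfo=timezone.utc)
print(lag_in_seconds("0/29004560", "0/290044C0", last, now))  # prints 30.0
print(lag_in_seconds("0/29004560", "0/29004560", last, now))  # prints 0.0
```

Passing the current time in as a parameter, rather than reading the clock inside the function, keeps the calculation easy to test.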

Adding the checks above to your repertoire gives you solid monitoring for PostgreSQL Streaming Replication.

In a future post, I will include a script that can be used to monitor a Hot Standby with PostgreSQL Streaming Replication.

Time: 2024-11-04 07:44:32
