Greenplum (5.10) production master/standby switchover

Cluster installation references:
System initialization: http://blog.51cto.com/michaelkang/2167195
Cluster installation and configuration: http://blog.51cto.com/michaelkang/2170627

Note: sensitive information in this article has been replaced!

Cluster master and standby nodes

master  => gpnode615.kjh.com
standby => gpnode616.kjh.com

Status tool: gpstate

Command / option => purpose
gpstate -b => show a brief status summary
gpstate -c => show the primary-to-mirror mapping
gpstate -d => specify the master data directory (default: $MASTER_DATA_DIRECTORY)
gpstate -e => show segments with mirror status problems
gpstate -f => show standby master details
gpstate -i => show the Greenplum Database version
gpstate -m => show mirror instance synchronization status
gpstate -p => show the ports in use
gpstate -Q => quick check of segment status
gpstate -s => show detailed cluster information
gpstate -v => verbose output

Check the cluster standby node status

 gpstate -f

=>]:-Standby master details
=>]:-----------------------
=>]:-   Standby address          = gpnode616.kjh.com
=>]:-   Standby data directory   = /usr/local/gpdata/gpmaster/gpseg-1
=>]:-   Standby port             = 5432
=>]:-   Standby PID              = 45634
=>]:-   Standby status           = Standby host passive
=>]:--------------------------------------------------------------
=>]:--pg_stat_replication
=>]:--------------------------------------------------------------
=>]:--WAL Sender State: streaming
=>]:--Sync state: sync
=>]:--Sent Location: 0/C0006C0
=>]:--Flush Location: 0/C0006C0
=>]:--Replay Location: 0/C0006C0
=>]:--------------------------------------------------------------

The output above shows the standby is healthy.
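The standby is fully caught up when the Sent, Flush, and Replay locations match. A WAL location such as 0/C0006C0 is a pair of 32-bit hex values ("high/low"); here is a minimal sketch for turning one into a comparable integer (lsn_to_int is a hypothetical helper for monitoring scripts, not a Greenplum tool):

```shell
# Convert a WAL location ("high/low" in hex, e.g. 0/C0006C0) into a
# single integer so two locations can be compared numerically.
# Hypothetical helper, not part of Greenplum.
lsn_to_int() {
  high=${1%%/*}   # part before the slash (upper 32 bits)
  low=${1##*/}    # part after the slash (lower 32 bits)
  echo $(( 0x$high * 4294967296 + 0x$low ))
}

lsn_to_int 0/C0006C0   # prints 201328320
```

When the Sent Location and Replay Location compare equal, no replay lag remains.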

Switchover procedure:

Definitions:
master  => gpnode615.kjh.com (server1)
standby => gpnode616.kjh.com (server2)

MASTER_DATA_DIRECTORY => /usr/local/gpdata/gpmaster/gpseg-1

Simulate a master failure (run on the master node)

pg_ctl stop -D $MASTER_DATA_DIRECTORY
or
pg_ctl stop -D /usr/local/gpdata/gpmaster/gpseg-1
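A note on the shell syntax: `$(MASTER_DATA_DIRECTORY)` would be command substitution (it tries to run `MASTER_DATA_DIRECTORY` as a command); variable expansion is `$MASTER_DATA_DIRECTORY` or `${MASTER_DATA_DIRECTORY}`. A quick sketch of the correct form (plain shell, no Greenplum needed; `-m fast` is pg_ctl's fast shutdown mode, while the default smart mode waits for sessions to disconnect):

```shell
# ${VAR} expands a variable; $(VAR) would try to run "VAR" as a command.
MASTER_DATA_DIRECTORY=/usr/local/gpdata/gpmaster/gpseg-1
stop_cmd="pg_ctl stop -D ${MASTER_DATA_DIRECTORY} -m fast"
echo "$stop_cmd"
```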

Check the standby node status again

(run on the master node)

$ gpstate -f
=> gpadmin-[INFO]:-Starting gpstate with args: -f
=> gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.10.2 build => '
=> gpadmin-[CRITICAL]:-gpstate failed. (Reason='could not connect to server: Connection refused
        Is the server running on host "localhost" (::1) and accepting
        TCP/IP connections on port 5432?
could not connect to server: Connection refused
        Is the server running on host "localhost" (127.0.0.1) and accepting
        TCP/IP connections on port 5432?
') exiting...

The cluster is now in a failed state and no status information can be retrieved.

Activate the standby node

Set up the gpadmin account environment and apply it

(run on the standby node as gpadmin)

Edit .bashrc:
su - gpadmin
cat >>/home/gpadmin/.bashrc<<-EOF
source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/usr/local/gpdata/gpmaster/gpseg-1
export PGPORT=5432
EOF
Edit .bash_profile:
su - gpadmin
cat >>/home/gpadmin/.bash_profile<<-EOF
source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/usr/local/gpdata/gpmaster/gpseg-1
export PGPORT=5432
EOF
Apply the changes:
source ~/.bashrc
source ~/.bash_profile
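Before activating, it is worth verifying the environment actually took effect. A minimal sanity-check sketch (check_gp_env is a hypothetical helper, not part of Greenplum):

```shell
# Fail fast if the gpadmin environment is incomplete before running
# gpactivatestandby. Hypothetical helper, not part of Greenplum.
check_gp_env() {
  : "${MASTER_DATA_DIRECTORY:?MASTER_DATA_DIRECTORY is not set}"
  : "${PGPORT:?PGPORT is not set}"
  if [ ! -d "$MASTER_DATA_DIRECTORY" ]; then
    echo "missing data directory: $MASTER_DATA_DIRECTORY" >&2
    return 1
  fi
  echo "env ok: port=$PGPORT dir=$MASTER_DATA_DIRECTORY"
}
```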

Activate the standby node

(run on the standby node as gpadmin)

gpactivatestandby -d /usr/local/gpdata/gpmaster/gpseg-1

=>]:------------------------------------------------------
=>]:-Standby data directory    = /usr/local/gpdata/gpmaster/gpseg-1
=>]:-Standby port              = 5432
=>]:-Standby running           = yes
=>]:-Force standby activation  = no
=>]:------------------------------------------------------

Confirm the switchover:

Do you want to continue with standby master activation? Yy|Nn (default=N):
> y    <== !!!

Activation output:

=>]:-found standby postmaster process
......
=>]:-Writing the gp_dbid file - /usr/local/gpdata/gpmaster/gpseg-1/gp_dbid...
=>]:-But found an already existing file.
=>]:-Hence removed that existing file.
=>]:-Creating a new file...
=>]:-Wrote dbid: 1 to the file.
=>]:-Now marking it as read only...
=>]:-Verifying the file...
=>]:------------------------------------------------------
=>]:-The activation of the standby master has completed successfully.
=>]:-gpnode616.kjh.com is now the new primary master.
=>]:-You will need to update your user access mechanism to reflect
=>]:-the change of master hostname.
=>]:-Do not re-start the failed master while the fail-over master is
=>]:-operational, this could result in database corruption!
=>]:-MASTER_DATA_DIRECTORY is now /usr/local/gpdata/gpmaster/gpseg-1 if
=>]:-this has changed as a result of the standby master activation, remember
=>]:-to change this in any startup scripts etc, that may be configured
=>]:-to set this value.
=>]:-MASTER_PORT is now 5432, if this has changed, you
=>]:-may need to make additional configuration changes to allow access
=>]:-to the Greenplum instance.
......

Check the standby node status again

(run on gpnode616.kjh.com, now the primary, as gpadmin)

gpstate -f
=>]:-Starting gpstate with args: -f
=>]:-local Greenplum Version: ‘postgres (Greenplum Database) 5.10.2 build =>
=>]:-Obtaining Segment details from master...
=>]:-Standby master instance not configured
=>]:--------------------------------------------------------------
=>]:--pg_stat_replication
=>]:--------------------------------------------------------------
=>]:-No entries found.
=>]:--------------------------------------------------------------

The cluster status can be queried again, but no standby is configured; the cluster is accessible at this point.

Add the former master back as a standby (run on the former master node, gpnode615.kjh.com, as gpadmin)

Note: gpinitstandby checks for an existing master data directory on the target host, so first delete or rename the former master's data directory:

mv /usr/local/gpdata/gpmaster/gpseg-1 /usr/local/gpdata/gpmaster/gpseg-1.bak
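The plain `mv ... gpseg-1.bak` above works once but fails (or clobbers the earlier backup) on a second switchover if `gpseg-1.bak` already exists. A timestamped variant, sketched as a hypothetical helper (backup_master_dir is not a Greenplum tool):

```shell
# Rename the old master's data directory out of the way so that
# gpinitstandby's directory check passes; a timestamp suffix keeps
# repeated switchovers from clobbering earlier backups.
backup_master_dir() {
  src=$1
  if [ ! -d "$src" ]; then
    echo "no such directory: $src" >&2
    return 1
  fi
  dest="${src}.bak.$(date +%Y%m%d%H%M%S)"
  mv "$src" "$dest"
  echo "$dest"
}
```

Usage: `backup_master_dir /usr/local/gpdata/gpmaster/gpseg-1` prints the backup path it moved the directory to.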

Now register the former master as the standby node

(run on the new master node, gpnode616.kjh.com, as gpadmin)

 gpinitstandby -s gpnode615.kjh.com 

=>]:-Validating environment and parameters for standby initialization...
=>]:-Checking for filespace directory /usr/local/gpdata/gpmaster/gpseg-1 on gpnode615.kjh.com
=>]:------------------------------------------------------
=>]:-Greenplum standby master initialization parameters
=>]:------------------------------------------------------
=>]:-Greenplum master hostname               = gpnode616.kjh.com
=>]:-Greenplum master data directory         = /usr/local/gpdata/gpmaster/gpseg-1
=>]:-Greenplum master port                   = 5432
=>]:-Greenplum standby master hostname       = gpnode615.kjh.com
=>]:-Greenplum standby master port           = 5432
=>]:-Greenplum standby master data directory = /usr/local/gpdata/gpmaster/gpseg-1
=>]:-Greenplum update system catalog         = On
=>]:------------------------------------------------------
=>]:- Filespace locations
=>]:------------------------------------------------------
=>]:-pg_system -> /usr/local/gpdata/gpmaster/gpseg-1

# Confirm standby master initialization

Do you want to continue with standby master initialization? Yy|Nn (default=N):
> y

=>]:-Syncing Greenplum Database extensions to standby
=>]:-The packages on gpnode615.kjh.com are consistent.
=>]:-Adding standby master to catalog...
=>]:-Database catalog updated successfully.
=>]:-Updating pg_hba.conf file...
=>]:-pg_hba.conf files updated successfully.
=>]:-Updating filespace flat files...
=>]:-Filespace flat file updated successfully.
=>]:-Starting standby master
=>]:-Checking if standby master is running on host: gpnode615.kjh.com  in directory: /usr/local/gpdata/gpmaster/gpseg-1
=>]:-Cleaning up pg_hba.conf backup files...
=>]:-Backup files of pg_hba.conf cleaned up successfully.
=>]:-Successfully created standby master on gpnode615.kjh.com

Check the cluster standby node status

 gpstate -f

=>]:-Obtaining Segment details from master...
=>]:-Standby master details
=>]:-----------------------
=>]:-   Standby address          = gpnode615.kjh.com
=>]:-   Standby data directory   = /usr/local/gpdata/gpmaster/gpseg-1
=>]:-   Standby port             = 5432
=>]:-   Standby PID              = 29968
=>]:-   Standby status           = Standby host passive
=>]:--------------------------------------------------------------
=>]:--pg_stat_replication
=>]:--------------------------------------------------------------
=>]:--WAL Sender State: streaming
=>]:--Sync state: sync
=>]:--Sent Location: 0/140000A8
=>]:--Flush Location: 0/140000A8
=>]:--Replay Location: 0/140000A8
=>]:--------------------------------------------------------------

The output above shows the newly added standby gpnode615.kjh.com is healthy.

Switching the master back to the original node

gpnode615.kjh.com

The procedure is simply the switchover above performed in reverse:

#### Current cluster state:

standby => gpnode615.kjh.com
master  => gpnode616.kjh.com

The steps below follow the current cluster roles!

#### Simulate a master failure
(run on gpnode616.kjh.com)

pg_ctl stop -D /usr/local/gpdata/gpmaster/gpseg-1     

#### Activate the standby node
(run on gpnode615.kjh.com as gpadmin)

gpactivatestandby -d /usr/local/gpdata/gpmaster/gpseg-1     

#### Add the standby
(run on gpnode616.kjh.com as gpadmin)

Note: gpinitstandby checks for an existing master data directory on the target host, so first delete or rename the former master's data directory:

mv /usr/local/gpdata/gpmaster/gpseg-1 /usr/local/gpdata/gpmaster/gpseg-1.bak

#### Register the former master as the standby node
(run on gpnode615.kjh.com as gpadmin)

gpinitstandby -s gpnode616.kjh.com 

#### Verification

gpstate -f

Standby address          =
Standby data directory   = /usr/local/gpdata/gpmaster/gpseg-1
Standby port             = 5432
Standby PID              = 136489
Standby status           = Standby host passive
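For repeated verification (or a monitoring script), the status line can be extracted from the gpstate -f report. A sketch that reads the report on stdin (standby_status is a hypothetical helper based on the output format shown above):

```shell
# Print just the value of the "Standby status" line from a gpstate -f
# report piped on stdin, e.g.:  gpstate -f | standby_status
standby_status() {
  sed -n 's/.*Standby status[[:space:]]*=[[:space:]]*//p'
}
```

A healthy standby prints "Standby host passive"; no output means gpstate reported no standby at all.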

Original post: http://blog.51cto.com/michaelkang/2170637
