Greenplum (5.10) production master/standby switchover

Cluster installation references:
System initialization: http://blog.51cto.com/michaelkang/2167195
Cluster installation and configuration: http://blog.51cto.com/michaelkang/2170627

Note: sensitive information in this article has been replaced!

Cluster master and standby nodes

master  => gpnode615.kjh.com
standby => gpnode616.kjh.com

Status tool: gpstate

Command / option => purpose
gpstate -b => show a brief status summary
gpstate -c => show the primary-to-mirror mapping
gpstate -d => specify the master data directory (default: $MASTER_DATA_DIRECTORY)
gpstate -e => show segments with mirror status problems
gpstate -f => show standby master details
gpstate -i => show the Greenplum Database version
gpstate -m => show mirror instance synchronization status
gpstate -p => show the ports in use
gpstate -Q => quick check of segment status
gpstate -s => show detailed cluster information
gpstate -v => verbose output

Check the cluster standby node status

 gpstate -f

=>]:-Standby master details
=>]:-----------------------
=>]:-   Standby address          = gpnode616.kjh.com
=>]:-   Standby data directory   = /usr/local/gpdata/gpmaster/gpseg-1
=>]:-   Standby port             = 5432
=>]:-   Standby PID              = 45634
=>]:-   Standby status           = Standby host passive
=>]:--------------------------------------------------------------
=>]:--pg_stat_replication
=>]:--------------------------------------------------------------
=>]:--WAL Sender State: streaming
=>]:--Sync state: sync
=>]:--Sent Location: 0/C0006C0
=>]:--Flush Location: 0/C0006C0
=>]:--Replay Location: 0/C0006C0
=>]:--------------------------------------------------------------

The output above shows the standby is healthy.
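The standby is fully caught up when the Sent, Flush, and Replay locations match. A WAL location such as 0/C0006C0 is a pair of 32-bit hex values ("high/low"); here is a minimal sketch for turning one into a comparable integer (lsn_to_int is a hypothetical helper for monitoring scripts, not a Greenplum tool):

```shell
# Convert a WAL location ("high/low" in hex, e.g. 0/C0006C0) into a
# single integer so two locations can be compared numerically.
# Hypothetical helper, not part of Greenplum.
lsn_to_int() {
  high=${1%%/*}   # part before the slash (upper 32 bits)
  low=${1##*/}    # part after the slash (lower 32 bits)
  echo $(( 0x$high * 4294967296 + 0x$low ))
}

lsn_to_int 0/C0006C0   # prints 201328320
```

When the Sent Location and Replay Location compare equal, no replay lag remains.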

Switchover procedure:

Definitions:
master  => gpnode615.kjh.com (server1)
standby => gpnode616.kjh.com (server2)

MASTER_DATA_DIRECTORY => /usr/local/gpdata/gpmaster/gpseg-1

Simulate a master failure (run on the master node)

pg_ctl stop -D $MASTER_DATA_DIRECTORY
or
pg_ctl stop -D /usr/local/gpdata/gpmaster/gpseg-1
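A note on the shell syntax: `$(MASTER_DATA_DIRECTORY)` would be command substitution (it tries to run `MASTER_DATA_DIRECTORY` as a command); variable expansion is `$MASTER_DATA_DIRECTORY` or `${MASTER_DATA_DIRECTORY}`. A quick sketch of the correct form (plain shell, no Greenplum needed; `-m fast` is pg_ctl's fast shutdown mode, while the default smart mode waits for sessions to disconnect):

```shell
# ${VAR} expands a variable; $(VAR) would try to run "VAR" as a command.
MASTER_DATA_DIRECTORY=/usr/local/gpdata/gpmaster/gpseg-1
stop_cmd="pg_ctl stop -D ${MASTER_DATA_DIRECTORY} -m fast"
echo "$stop_cmd"
```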

Check the standby node status again

(run on the master node)

$ gpstate -f
=> gpadmin-[INFO]:-Starting gpstate with args: -f
=> gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.10.2 build => '
=> gpadmin-[CRITICAL]:-gpstate failed. (Reason='could not connect to server: Connection refused
        Is the server running on host "localhost" (::1) and accepting
        TCP/IP connections on port 5432?
could not connect to server: Connection refused
        Is the server running on host "localhost" (127.0.0.1) and accepting
        TCP/IP connections on port 5432?
') exiting...

The cluster is now in a failed state and no status information can be retrieved.

Activate the standby node

Set up the gpadmin account environment and apply it

(run on the standby node as gpadmin)

Edit .bashrc:
su - gpadmin
cat >>/home/gpadmin/.bashrc<<-EOF
source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/usr/local/gpdata/gpmaster/gpseg-1
export PGPORT=5432
EOF
Edit .bash_profile:
su - gpadmin
cat >>/home/gpadmin/.bash_profile<<-EOF
source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/usr/local/gpdata/gpmaster/gpseg-1
export PGPORT=5432
EOF
Apply the changes:
source ~/.bashrc
source ~/.bash_profile
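Before activating, it is worth verifying the environment actually took effect. A minimal sanity-check sketch (check_gp_env is a hypothetical helper, not part of Greenplum):

```shell
# Fail fast if the gpadmin environment is incomplete before running
# gpactivatestandby. Hypothetical helper, not part of Greenplum.
check_gp_env() {
  : "${MASTER_DATA_DIRECTORY:?MASTER_DATA_DIRECTORY is not set}"
  : "${PGPORT:?PGPORT is not set}"
  if [ ! -d "$MASTER_DATA_DIRECTORY" ]; then
    echo "missing data directory: $MASTER_DATA_DIRECTORY" >&2
    return 1
  fi
  echo "env ok: port=$PGPORT dir=$MASTER_DATA_DIRECTORY"
}
```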

Activate the standby node

(run on the standby node as gpadmin)

gpactivatestandby -d /usr/local/gpdata/gpmaster/gpseg-1

=>]:------------------------------------------------------
=>]:-Standby data directory    = /usr/local/gpdata/gpmaster/gpseg-1
=>]:-Standby port              = 5432
=>]:-Standby running           = yes
=>]:-Force standby activation  = no
=>]:------------------------------------------------------

Confirm the switchover:

Do you want to continue with standby master activation? Yy|Nn (default=N):
> y    <== !!!

Activation output:

=>]:-found standby postmaster process
......
=>]:-Writing the gp_dbid file - /usr/local/gpdata/gpmaster/gpseg-1/gp_dbid...
=>]:-But found an already existing file.
=>]:-Hence removed that existing file.
=>]:-Creating a new file...
=>]:-Wrote dbid: 1 to the file.
=>]:-Now marking it as read only...
=>]:-Verifying the file...
=>]:------------------------------------------------------
=>]:-The activation of the standby master has completed successfully.
=>]:-gpnode616.kjh.com is now the new primary master.
=>]:-You will need to update your user access mechanism to reflect
=>]:-the change of master hostname.
=>]:-Do not re-start the failed master while the fail-over master is
=>]:-operational, this could result in database corruption!
=>]:-MASTER_DATA_DIRECTORY is now /usr/local/gpdata/gpmaster/gpseg-1 if
=>]:-this has changed as a result of the standby master activation, remember
=>]:-to change this in any startup scripts etc, that may be configured
=>]:-to set this value.
=>]:-MASTER_PORT is now 5432, if this has changed, you
=>]:-may need to make additional configuration changes to allow access
=>]:-to the Greenplum instance.
......

Check the standby node status again

(run on gpnode616.kjh.com, now the primary, as gpadmin)

gpstate -f
=>]:-Starting gpstate with args: -f
=>]:-local Greenplum Version: ‘postgres (Greenplum Database) 5.10.2 build =>
=>]:-Obtaining Segment details from master...
=>]:-Standby master instance not configured
=>]:--------------------------------------------------------------
=>]:--pg_stat_replication
=>]:--------------------------------------------------------------
=>]:-No entries found.
=>]:--------------------------------------------------------------

The cluster status can be queried again, but no standby is configured; the cluster is accessible at this point.

Add the former master back as a standby (run on the former master node, gpnode615.kjh.com, as gpadmin)

Note: gpinitstandby checks for an existing master data directory on the target host, so first delete or rename the former master's data directory:

mv /usr/local/gpdata/gpmaster/gpseg-1 /usr/local/gpdata/gpmaster/gpseg-1.bak
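The plain `mv ... gpseg-1.bak` above works once but fails (or clobbers the earlier backup) on a second switchover if `gpseg-1.bak` already exists. A timestamped variant, sketched as a hypothetical helper (backup_master_dir is not a Greenplum tool):

```shell
# Rename the old master's data directory out of the way so that
# gpinitstandby's directory check passes; a timestamp suffix keeps
# repeated switchovers from clobbering earlier backups.
backup_master_dir() {
  src=$1
  if [ ! -d "$src" ]; then
    echo "no such directory: $src" >&2
    return 1
  fi
  dest="${src}.bak.$(date +%Y%m%d%H%M%S)"
  mv "$src" "$dest"
  echo "$dest"
}
```

Usage: `backup_master_dir /usr/local/gpdata/gpmaster/gpseg-1` prints the backup path it moved the directory to.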

Now register the former master as the standby node

(run on the new master node, gpnode616.kjh.com, as gpadmin)

 gpinitstandby -s gpnode615.kjh.com 

=>]:-Validating environment and parameters for standby initialization...
=>]:-Checking for filespace directory /usr/local/gpdata/gpmaster/gpseg-1 on gpnode615.kjh.com
=>]:------------------------------------------------------
=>]:-Greenplum standby master initialization parameters
=>]:------------------------------------------------------
=>]:-Greenplum master hostname               = gpnode616.kjh.com
=>]:-Greenplum master data directory         = /usr/local/gpdata/gpmaster/gpseg-1
=>]:-Greenplum master port                   = 5432
=>]:-Greenplum standby master hostname       = gpnode615.kjh.com
=>]:-Greenplum standby master port           = 5432
=>]:-Greenplum standby master data directory = /usr/local/gpdata/gpmaster/gpseg-1
=>]:-Greenplum update system catalog         = On
=>]:------------------------------------------------------
=>]:- Filespace locations
=>]:------------------------------------------------------
=>]:-pg_system -> /usr/local/gpdata/gpmaster/gpseg-1

# Confirm standby master initialization

Do you want to continue with standby master initialization? Yy|Nn (default=N):
> y

=>]:-Syncing Greenplum Database extensions to standby
=>]:-The packages on gpnode615.kjh.com are consistent.
=>]:-Adding standby master to catalog...
=>]:-Database catalog updated successfully.
=>]:-Updating pg_hba.conf file...
=>]:-pg_hba.conf files updated successfully.
=>]:-Updating filespace flat files...
=>]:-Filespace flat file updated successfully.
=>]:-Starting standby master
=>]:-Checking if standby master is running on host: gpnode615.kjh.com  in directory: /usr/local/gpdata/gpmaster/gpseg-1
=>]:-Cleaning up pg_hba.conf backup files...
=>]:-Backup files of pg_hba.conf cleaned up successfully.
=>]:-Successfully created standby master on gpnode615.kjh.com

Check the cluster standby node status

 gpstate -f

=>]:-Obtaining Segment details from master...
=>]:-Standby master details
=>]:-----------------------
=>]:-   Standby address          = gpnode615.kjh.com
=>]:-   Standby data directory   = /usr/local/gpdata/gpmaster/gpseg-1
=>]:-   Standby port             = 5432
=>]:-   Standby PID              = 29968
=>]:-   Standby status           = Standby host passive
=>]:--------------------------------------------------------------
=>]:--pg_stat_replication
=>]:--------------------------------------------------------------
=>]:--WAL Sender State: streaming
=>]:--Sync state: sync
=>]:--Sent Location: 0/140000A8
=>]:--Flush Location: 0/140000A8
=>]:--Replay Location: 0/140000A8
=>]:--------------------------------------------------------------

The output above shows the newly added standby gpnode615.kjh.com is healthy.

Switching the master back to the original node

gpnode615.kjh.com

The procedure is simply the switchover above performed in reverse:

#### Current cluster state:

standby => gpnode615.kjh.com
master  => gpnode616.kjh.com

The steps below follow the current cluster roles!

#### Simulate a master failure
(run on gpnode616.kjh.com)

pg_ctl stop -D /usr/local/gpdata/gpmaster/gpseg-1     

#### Activate the standby node
(run on gpnode615.kjh.com as gpadmin)

gpactivatestandby -d /usr/local/gpdata/gpmaster/gpseg-1     

#### Add the standby
(run on gpnode616.kjh.com as gpadmin)

Note: gpinitstandby checks for an existing master data directory on the target host, so first delete or rename the former master's data directory:

mv /usr/local/gpdata/gpmaster/gpseg-1 /usr/local/gpdata/gpmaster/gpseg-1.bak

#### Register the former master as the standby node
(run on gpnode615.kjh.com as gpadmin)

gpinitstandby -s gpnode616.kjh.com 

#### Verification

gpstate -f

Standby address          =
Standby data directory   = /usr/local/gpdata/gpmaster/gpseg-1
Standby port             = 5432
Standby PID              = 136489
Standby status           = Standby host passive
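For repeated verification (or a monitoring script), the status line can be extracted from the gpstate -f report. A sketch that reads the report on stdin (standby_status is a hypothetical helper based on the output format shown above):

```shell
# Print just the value of the "Standby status" line from a gpstate -f
# report piped on stdin, e.g.:  gpstate -f | standby_status
standby_status() {
  sed -n 's/.*Standby status[[:space:]]*=[[:space:]]*//p'
}
```

A healthy standby prints "Standby host passive"; no output means gpstate reported no standby at all.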

Original post: http://blog.51cto.com/michaelkang/2170637
