GaussDB T Distributed Cluster Deployment and Upgrade Guide

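This article deploys a GaussDB T 1.0.1 distributed cluster on four nodes and then upgrades it to version 1.0.2. (Installing 1.0.2 directly fails with a segmentation fault during installation, which has not been resolved yet.) For the preliminary operating system preparation, refer to the earlier articles in this series.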

1. Deploy the distributed cluster

1.1 Node information

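The nodes are listed in the table below (host names and IP addresses as they appear in the cluster configuration file in section 1.2):

Host name    IP address
hwd08        192.168.120.29
hwd09        192.168.120.30
hwd10        192.168.120.31
hwd11        192.168.120.49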

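1.2 Cluster configuration file

Edit the cluster parameters to match your environment, or generate the file with the Database Manager tool. The content is as follows: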

[root@hwd08 db]# vi clusterconfig.xml
<?xml version="1.0" encoding="UTF-8"?><ROOT>
 <CLUSTER>
  <PARAM name="clusterName" value="GT100"/>
  <PARAM name="nodeNames" value="hwd08,hwd09,hwd10,hwd11"/>
  <PARAM name="gaussdbAppPath" value="/opt/huawei/gaussdb/app"/>
  <PARAM name="gaussdbLogPath" value="/var/log/huawei/gaussdb"/>
  <PARAM name="archiveLogPath" value="/opt/huawei/gaussdb/arch_log"/>
  <PARAM name="redoLogPath" value="/opt/huawei/gaussdb/redo_log"/>
  <PARAM name="tmpMppdbPath" value="/opt/huawei/gaussdb/temp"/>
  <PARAM name="gaussdbToolPath" value="/opt/huawei/gaussdb/gaussTools/wisequery"/>
  <PARAM name="datanodeType" value="DN_ZENITH_HA"/>
  <PARAM name="WhetherDoFailoverAuto" value="OFF"/>
  <PARAM name="clusterType" value="mutil-AZ"/>
  <PARAM name="coordinatorType" value="CN_ZENITH_ZSHARDING"/>
  <PARAM name="SetDoubleIPForETCD" value="false"/>
 </CLUSTER>
 <DEVICELIST>
  <DEVICE sn="1000001">
   <PARAM name="name" value="hwd08"/>
   <PARAM name="azName" value="AZ1"/>
   <PARAM name="azPriority" value="1"/>
   <PARAM name="backIp1" value="192.168.120.29"/>
   <PARAM name="sshIp1" value="192.168.120.29"/>
   <PARAM name="innerManageIp1" value="192.168.120.29"/>
   <PARAM name="cmsNum" value="1"/>
   <PARAM name="cmServerPortBase" value="21000"/>
   <PARAM name="cmServerListenIp1" value="192.168.120.29,192.168.120.30,192.168.120.31"/>
   <PARAM name="cmServerHaIp1" value="192.168.120.29,192.168.120.30,192.168.120.31"/>
   <PARAM name="cmServerlevel" value="1"/>
   <PARAM name="cmServerRelation" value="hwd08,hwd09,hwd10"/>
   <PARAM name="cmDir" value="/opt/huawei/gaussdb/data/data_cm"/>
   <PARAM name="dataNum" value="1"/>
   <PARAM name="dataPortBase" value="40000"/>
   <PARAM name="dataNode1" value="/opt/huawei/gaussdb/data_db/dn1,hwd09,/opt/huawei/gaussdb/data_db/dn1,hwd10,/opt/huawei/gaussdb/data_db/dn1"/>
   <PARAM name="quorumAny1" value="1"/>
   <PARAM name="gtsNum" value="1"/>
   <PARAM name="gtsPortBase" value="7000"/>
   <PARAM name="gtsDir1" value="/opt/huawei/gaussdb/data/gts,hwd09,/opt/huawei/gaussdb/data/gts"/>
   <PARAM name="cooNum" value="1"/>
   <PARAM name="cooPortBase" value="8000"/>
   <PARAM name="cooListenIp1" value="192.168.120.29"/>
   <PARAM name="cooDir1" value="/opt/huawei/gaussdb/data/data_cn"/>
   <PARAM name="etcdNum" value="1"/>
   <PARAM name="etcdListenPort" value="2379"/>
   <PARAM name="etcdHaPort" value="2380"/>
   <PARAM name="etcdListenIp1" value="192.168.120.29"/>
   <PARAM name="etcdHaIp1" value="192.168.120.29"/>
   <PARAM name="etcdDir1" value="/opt/huawei/gaussdb/data_etcd1/data"/>
  </DEVICE>
  <DEVICE sn="1000002">
   <PARAM name="name" value="hwd09"/>
   <PARAM name="azName" value="AZ1"/>
   <PARAM name="azPriority" value="1"/>
   <PARAM name="backIp1" value="192.168.120.30"/>
   <PARAM name="sshIp1" value="192.168.120.30"/>
   <PARAM name="innerManageIp1" value="192.168.120.30"/>
   <PARAM name="dataNum" value="1"/>
   <PARAM name="dataPortBase" value="40000"/>
   <PARAM name="dataNode1" value="/opt/huawei/gaussdb/data_db/dn2,hwd10,/opt/huawei/gaussdb/data_db/dn2,hwd11,/opt/huawei/gaussdb/data_db/dn2"/>
   <PARAM name="quorumAny1" value="1"/>
   <PARAM name="cooNum" value="1"/>
   <PARAM name="cooPortBase" value="8000"/>
   <PARAM name="cooListenIp1" value="192.168.120.30"/>
   <PARAM name="cooDir1" value="/opt/huawei/gaussdb/data/data_cn"/>
   <PARAM name="etcdNum" value="1"/>
   <PARAM name="etcdListenPort" value="2379"/>
   <PARAM name="etcdHaPort" value="2380"/>
   <PARAM name="etcdListenIp1" value="192.168.120.30"/>
   <PARAM name="etcdHaIp1" value="192.168.120.30"/>
   <PARAM name="etcdDir1" value="/opt/huawei/gaussdb/data_etcd1/data"/>
  </DEVICE>
  <DEVICE sn="1000003">
   <PARAM name="name" value="hwd10"/>
   <PARAM name="azName" value="AZ1"/>
   <PARAM name="azPriority" value="1"/>
   <PARAM name="backIp1" value="192.168.120.31"/>
   <PARAM name="sshIp1" value="192.168.120.31"/>
   <PARAM name="innerManageIp1" value="192.168.120.31"/>
   <PARAM name="dataNum" value="1"/>
   <PARAM name="dataPortBase" value="40000"/>
   <PARAM name="dataNode1" value="/opt/huawei/gaussdb/data_db/dn3,hwd11,/opt/huawei/gaussdb/data_db/dn3,hwd08,/opt/huawei/gaussdb/data_db/dn3"/>
   <PARAM name="quorumAny1" value="1"/>
   <PARAM name="cooNum" value="1"/>
   <PARAM name="cooPortBase" value="8000"/>
   <PARAM name="cooListenIp1" value="192.168.120.31"/>
   <PARAM name="cooDir1" value="/opt/huawei/gaussdb/data/data_cn"/>
   <PARAM name="etcdNum" value="1"/>
   <PARAM name="etcdListenPort" value="2379"/>
   <PARAM name="etcdHaPort" value="2380"/>
   <PARAM name="etcdListenIp1" value="192.168.120.31"/>
   <PARAM name="etcdHaIp1" value="192.168.120.31"/>
   <PARAM name="etcdDir1" value="/opt/huawei/gaussdb/data_etcd1/data"/>
  </DEVICE>
  <DEVICE sn="1000004">
   <PARAM name="name" value="hwd11"/>
   <PARAM name="azName" value="AZ1"/>
   <PARAM name="azPriority" value="1"/>
   <PARAM name="backIp1" value="192.168.120.49"/>
   <PARAM name="sshIp1" value="192.168.120.49"/>
   <PARAM name="innerManageIp1" value="192.168.120.49"/>
   <PARAM name="cooNum" value="1"/>
   <PARAM name="cooPortBase" value="8000"/>
   <PARAM name="cooListenIp1" value="192.168.120.49"/>
   <PARAM name="cooDir1" value="/opt/huawei/gaussdb/data/data_cn"/>
  </DEVICE>
 </DEVICELIST>
</ROOT>
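
Before handing the file to gs_preinstall, it can help to sanity-check it. The following is a minimal sketch, not part of the GaussDB T tooling: it assumes only the ROOT/CLUSTER/DEVICELIST layout shown above, and the function name check_cluster_config and the trimmed sample are made up for illustration. It verifies that every host in nodeNames has a DEVICE entry and flags values containing whitespace, which is usually a sign that a long line was wrapped in the editor.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down sample in the same layout as clusterconfig.xml.
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8"?><ROOT>
 <CLUSTER>
  <PARAM name="nodeNames" value="hwd08,hwd09"/>
 </CLUSTER>
 <DEVICELIST>
  <DEVICE sn="1000001">
   <PARAM name="name" value="hwd08"/>
   <PARAM name="backIp1" value="192.168.120.29"/>
  </DEVICE>
  <DEVICE sn="1000002">
   <PARAM name="name" value="hwd09"/>
   <PARAM name="backIp1" value="192.168.120.30"/>
  </DEVICE>
 </DEVICELIST>
</ROOT>"""


def check_cluster_config(xml_text):
    """Return (declared node names, {device name: backIp1}, list of problems)."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    problems = []

    # Whitespace inside a value usually means the line was wrapped by the editor.
    for param in root.iter("PARAM"):
        value = param.get("value", "")
        if any(ch.isspace() for ch in value):
            problems.append("whitespace in value of " + param.get("name"))

    cluster = {p.get("name"): p.get("value")
               for p in root.find("CLUSTER").findall("PARAM")}
    declared = cluster["nodeNames"].split(",")

    devices = {}
    for device in root.find("DEVICELIST").findall("DEVICE"):
        params = {p.get("name"): p.get("value") for p in device.findall("PARAM")}
        devices[params["name"]] = params.get("backIp1")

    # Every host listed in nodeNames should have its own DEVICE section.
    for name in declared:
        if name not in devices:
            problems.append("no DEVICE entry for " + name)
    return declared, devices, problems


if __name__ == "__main__":
    print(check_cluster_config(SAMPLE_XML))
```

To check the real file, read it with open("/mnt/Huawei/db/clusterconfig.xml", encoding="utf-8").read() and pass the text to the function.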

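1.3 Prepare the installation user and environment

After extracting the installation package, run gs_preinstall to prepare the installation environment: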

[root@hwd08 script]# ./gs_preinstall -U omm -G dbgrp -X /mnt/Huawei/db/clusterconfig.xml
Parsing the configuration file.
Successfully parsed the configuration file.
Installing the tools on the local node.
Successfully installed the tools on the local node.
Are you sure you want to create trust for root (yes/no)? yes
Please enter password for root.
Password:
Creating SSH trust for the root permission user.
Checking network information.
All nodes in the network are Normal.
Successfully checked network information.
Creating SSH trust.
Creating the local key file.
Successfully created the local key files.
Appending local ID to authorized_keys.
Successfully appended local ID to authorized_keys.
Updating the known_hosts file.
Successfully updated the known_hosts file.
Appending authorized_key on the remote node.
Successfully appended authorized_key on all remote node.
Checking common authentication file content.
Successfully checked common authentication content.
Distributing SSH trust file to all node.
Successfully distributed SSH trust file to all node.
Verifying SSH trust on all hosts.
Successfully verified SSH trust on all hosts.
Successfully created SSH trust.
Successfully created SSH trust for the root permission user.
All host RAM is consistent
Pass over configuring LVM
Distributing package.
Successfully distributed package.
Are you sure you want to create the user[omm] and create trust for it (yes/no)? yes
Please enter password for cluster user.
Password:
Please enter password for cluster user again.
Password:
Creating [omm] user on all nodes.
Successfully created [omm] user on all nodes.
Installing the tools in the cluster.
Successfully installed the tools in the cluster.
Checking hostname mapping.
Successfully checked hostname mapping.
Creating SSH trust for [omm] user.
Please enter password for current user[omm].
Password:
Checking network information.
All nodes in the network are Normal.
Successfully checked network information.
Creating SSH trust.
Creating the local key file.
Successfully created the local key files.
Appending local ID to authorized_keys.
Successfully appended local ID to authorized_keys.
Updating the known_hosts file.
Successfully updated the known_hosts file.
Appending authorized_key on the remote node.
Successfully appended authorized_key on all remote node.
Checking common authentication file content.
Successfully checked common authentication content.
Distributing SSH trust file to all node.
Successfully distributed SSH trust file to all node.
Verifying SSH trust on all hosts.
Successfully verified SSH trust on all hosts.
Successfully created SSH trust.
Successfully created SSH trust for [omm] user.
Checking OS version.
Successfully checked OS version.
Creating cluster's path.
Successfully created cluster's path.
Setting SCTP service.
Successfully set SCTP service.
Set and check OS parameter.
Successfully set NTP service.
Setting OS parameters.
Successfully set OS parameters.
Set and check OS parameter completed.
Preparing CRON service.
Successfully prepared CRON service.
Preparing SSH service.
Successfully prepared SSH service.
Setting user environmental variables.
Successfully set user environmental variables.
Configuring alarms on the cluster nodes.
Successfully configured alarms on the cluster nodes.
Setting the dynamic link library.
Successfully set the dynamic link library.
Fixing server package owner.
Successfully fixed server package owner.
Create logrotate service.
Successfully create logrotate service.
Setting finish flag.
Successfully set finish flag.
check time consistency(maximum execution time 10 minutes).
Time consistent is running(1/20)...
Time consistent has been completed.
Preinstallation succeeded.

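1.4 Run the installation

First switch to the omm user and check the operating system. If any errors are reported, investigate and fix them according to the error messages: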

[root@hwd08 script]# su - omm
[omm@hwd08 ~]$ gs_checkos -i A12 -h hwd08,hwd09,hwd10,hwd11 -X /mnt/Huawei/db/clusterconfig.xml
Checking items
    A12.[ Time consistency status ]                             : Normal
Total numbers:1. Abnormal numbers:0. Warning numbers:0.
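
When the check is scripted, the summary line can be turned into numbers. The sketch below assumes only the "Total numbers / Abnormal numbers / Warning numbers" line format shown above; parse_checkos_summary is a hypothetical helper name, not a GaussDB T API.

```python
import re

# Output copied from the gs_checkos run above.
SAMPLE_CHECKOS = """Checking items
    A12.[ Time consistency status ]                             : Normal
Total numbers:1. Abnormal numbers:0. Warning numbers:0.
"""

SUMMARY_RE = re.compile(
    r"Total numbers:(\d+)\. Abnormal numbers:(\d+)\. Warning numbers:(\d+)\."
)


def parse_checkos_summary(output):
    """Extract the three counters from the gs_checkos summary line."""
    match = SUMMARY_RE.search(output)
    if match is None:
        raise ValueError("summary line not found")
    total, abnormal, warning = (int(group) for group in match.groups())
    return {"total": total, "abnormal": abnormal, "warning": warning}


if __name__ == "__main__":
    summary = parse_checkos_summary(SAMPLE_CHECKOS)
    # Only continue to gs_install when nothing is abnormal.
    print("safe to install" if summary["abnormal"] == 0 else "fix the OS first")
```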

If no errors are reported, run the following command to install:

[omm@hwd08 ~]$ gs_install -X /mnt/Huawei/db/clusterconfig.xml
Parsing the configuration file.
Check preinstall on every node.
Successfully checked preinstall on every node.
Creating the backup directory.
Successfully created the backup directory.
Check the time difference between hosts in the cluster.
Installing the cluster.
Installing applications on all nodes.
Successfully installed APP.
Distribute etcd communication keys.
Successfully distrbute etcd communication keys.
Initializing cluster instances
.4861s
Initializing cluster instances is completed.
Configuring standby datanode.
...................1309s
Successfully configure datanode.
Cluster installation is completed.
.Configuring.
Load cluster configuration file.
Configuring the cluster.
Successfully configuring the cluster.
Configuration is completed.
Start cm agent.
Successfully start cm agent and ETCD in cluster.
Starting the cluster.
==============================================
..32s
Successfully starting the cluster.
==============================================

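The installation time varies with the environment; here it took about half an hour. After the installation completes, run the following command to verify the cluster status: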

[omm@hwd08 ~]$ gs_om -t status
Set output to terminal.
--------------------------------------------------------------------Cluster Status--------------------------------------------------------------------
az_state :      single_az
cluster_state : Normal
balanced :      true
----------------------------------------------------------------------AZ Status-----------------------------------------------------------------------
AZ:AZ1                ROLE:primary            STATUS:ONLINE
---------------------------------------------------------------------Host Status----------------------------------------------------------------------
HOST:hwd08            AZ:AZ1                  STATUS:ONLINE       IP:192.168.120.29
HOST:hwd09            AZ:AZ1                  STATUS:ONLINE       IP:192.168.120.30
HOST:hwd10            AZ:AZ1                  STATUS:ONLINE       IP:192.168.120.31
HOST:hwd11            AZ:AZ1                  STATUS:ONLINE       IP:192.168.120.49
----------------------------------------------------------------Cluster Manager Status----------------------------------------------------------------
INSTANCE:CM1          ROLE:slave              STATUS:ONLINE       HOST:hwd08            ID:601
INSTANCE:CM2          ROLE:slave              STATUS:ONLINE       HOST:hwd09            ID:602
INSTANCE:CM3          ROLE:primary            STATUS:ONLINE       HOST:hwd10            ID:603
---------------------------------------------------------------------ETCD Status----------------------------------------------------------------------
INSTANCE:ETCD1        ROLE:follower           STATUS:ONLINE       HOST:hwd08            ID:701      PORT:2379         DataDir:/opt/huawei/gaussdb/data_etcd1/data
INSTANCE:ETCD2        ROLE:follower           STATUS:ONLINE       HOST:hwd09            ID:702      PORT:2379         DataDir:/opt/huawei/gaussdb/data_etcd1/data
INSTANCE:ETCD3        ROLE:leader             STATUS:ONLINE       HOST:hwd10            ID:703      PORT:2379         DataDir:/opt/huawei/gaussdb/data_etcd1/data
----------------------------------------------------------------------CN Status-----------------------------------------------------------------------
INSTANCE:cn_401       ROLE:no role            STATUS:ONLINE       HOST:hwd08            ID:401      PORT:8000         DataDir:/opt/huawei/gaussdb/data/data_cn
INSTANCE:cn_402       ROLE:no role            STATUS:ONLINE       HOST:hwd09            ID:402      PORT:8000         DataDir:/opt/huawei/gaussdb/data/data_cn
INSTANCE:cn_403       ROLE:no role            STATUS:ONLINE       HOST:hwd10            ID:403      PORT:8000         DataDir:/opt/huawei/gaussdb/data/data_cn
INSTANCE:cn_404       ROLE:no role            STATUS:ONLINE       HOST:hwd11            ID:404      PORT:8000         DataDir:/opt/huawei/gaussdb/data/data_cn
----------------------------------------------------------------------GTS Status----------------------------------------------------------------------
INSTANCE:GTS1         ROLE:primary            STATUS:ONLINE       HOST:hwd08            ID:441      PORT:7000         DataDir:/opt/huawei/gaussdb/data/gts
INSTANCE:GTS2         ROLE:standby            STATUS:ONLINE       HOST:hwd09            ID:442      PORT:7000         DataDir:/opt/huawei/gaussdb/data/gts
---------------------------------------------------------Instances Status in Group (group_1)----------------------------------------------------------
INSTANCE:DB1_1        ROLE:primary            STATUS:ONLINE       HOST:hwd08            ID:1        PORT:40000        DataDir:/opt/huawei/gaussdb/data_db/dn1
INSTANCE:DB1_2        ROLE:standby            STATUS:ONLINE       HOST:hwd09            ID:2        PORT:40021        DataDir:/opt/huawei/gaussdb/data_db/dn1
INSTANCE:DB1_3        ROLE:standby            STATUS:ONLINE       HOST:hwd10            ID:3        PORT:40021        DataDir:/opt/huawei/gaussdb/data_db/dn1
---------------------------------------------------------Instances Status in Group (group_2)----------------------------------------------------------
INSTANCE:DB2_4        ROLE:primary            STATUS:ONLINE       HOST:hwd09            ID:4        PORT:40000        DataDir:/opt/huawei/gaussdb/data_db/dn2
INSTANCE:DB2_5        ROLE:standby            STATUS:ONLINE       HOST:hwd10            ID:5        PORT:40042        DataDir:/opt/huawei/gaussdb/data_db/dn2
INSTANCE:DB2_6        ROLE:standby            STATUS:ONLINE       HOST:hwd11            ID:6        PORT:40000        DataDir:/opt/huawei/gaussdb/data_db/dn2
---------------------------------------------------------Instances Status in Group (group_3)----------------------------------------------------------
INSTANCE:DB3_9        ROLE:standby            STATUS:ONLINE       HOST:hwd08            ID:9        PORT:40021        DataDir:/opt/huawei/gaussdb/data_db/dn3
INSTANCE:DB3_7        ROLE:primary            STATUS:ONLINE       HOST:hwd10            ID:7        PORT:40000        DataDir:/opt/huawei/gaussdb/data_db/dn3
INSTANCE:DB3_8        ROLE:standby            STATUS:ONLINE       HOST:hwd11            ID:8        PORT:40021        DataDir:/opt/huawei/gaussdb/data_db/dn3
-----------------------------------------------------------------------Manage IP----------------------------------------------------------------------
HOST:hwd08            IP:192.168.120.29
HOST:hwd09            IP:192.168.120.30
HOST:hwd10            IP:192.168.120.31
HOST:hwd11            IP:192.168.120.49
-------------------------------------------------------------------Query Action Info------------------------------------------------------------------
HOSTNAME: hwd08     TIME: 2020-03-19 14:04:51.763762
------------------------------------------------------------------------Float Ip------------------------------------------------------------------
HOST:hwd10    DB3_7:192.168.120.31    IP:
HOST:hwd08    DB1_1:192.168.120.29    IP:
HOST:hwd09    DB2_4:192.168.120.30    IP:
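
Reading this report by eye gets tedious on larger clusters. As a rough sketch, assuming only the KEY:value column layout shown above (fields separated by two or more spaces), the following reports the cluster state and any AZ, host, or instance whose STATUS is not ONLINE. The function name parse_status is made up for illustration.

```python
import re

# A few lines copied from the gs_om -t status output above.
SAMPLE_STATUS = """\
cluster_state : Normal
balanced :      true
AZ:AZ1                ROLE:primary            STATUS:ONLINE
HOST:hwd08            AZ:AZ1                  STATUS:ONLINE       IP:192.168.120.29
INSTANCE:CM1          ROLE:slave              STATUS:ONLINE       HOST:hwd08            ID:601
INSTANCE:cn_401       ROLE:no role            STATUS:ONLINE       HOST:hwd08            ID:401      PORT:8000         DataDir:/opt/huawei/gaussdb/data/data_cn
INSTANCE:DB1_2        ROLE:standby            STATUS:ONLINE       HOST:hwd09            ID:2        PORT:40021        DataDir:/opt/huawei/gaussdb/data_db/dn1
HOST:hwd10    DB3_7:192.168.120.31    IP:
"""


def parse_status(output):
    """Return (cluster_state, names of entries whose STATUS is not ONLINE)."""
    cluster_state = None
    not_online = []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("cluster_state"):
            cluster_state = line.split(":", 1)[1].strip()
            continue
        # Columns are separated by runs of two or more spaces; values such as
        # "no role" contain a single space and therefore stay intact.
        fields = dict(
            part.split(":", 1)
            for part in re.split(r"\s{2,}", line)
            if ":" in part
        )
        if "STATUS" in fields and fields["STATUS"] != "ONLINE":
            name = fields.get("INSTANCE") or fields.get("HOST") or fields.get("AZ")
            not_online.append(name)
    return cluster_state, not_online


if __name__ == "__main__":
    print(parse_status(SAMPLE_STATUS))
```

In practice the text could come from subprocess.run(["gs_om", "-t", "status"], capture_output=True, text=True).stdout, run as the omm user.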

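2. Upgrade the cluster

A GaussDB T distributed cluster is upgraded with the gs_upgradectl command. There are two upgrade modes: offline upgrade and online rolling upgrade. GaussDB T 1.0.1 supports only offline upgrade, and only two offline types: offline binary upgrade and offline minor-version upgrade. For an upgrade to 1.0.2, the source version and the target version must both satisfy the [upgrade version rules] and the [upgrade whitelist] before the upgrade can be run.
Because GaussDB T 1.0.1 supports only offline upgrade, note the following points (read them carefully and confirm each one before upgrading):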

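  • All nodes in the cluster must have Python 3.7.2 or later.
  • SSH mutual trust for the cluster user must be working.
  • Before upgrading, the cluster must be healthy, with all CN, DN, and GTS instances in a normal state.
  • Only offline upgrade is supported, and only as an offline minor-version upgrade or an offline binary upgrade.
  • Only upgrades from a lower version to a higher version are supported.
  • Stop all workloads before an offline upgrade, and make sure no workload runs during the upgrade.
  • Before an offline upgrade, the primary and standby instances in the GTS group and in every DN group must be fully synchronized, and must have stayed stably synchronized for a long period beforehand.
  • The upgrade must not run at the same time as other OM operations such as host replacement, scale-out, or node replacement.
  • During the upgrade, do not operate on the cluster directly with cm commands such as switchover.
  • Before upgrading, run the pre-installation script gs_preinstall from the script directory extracted from the target package.
  • During an offline upgrade, rollback is no longer possible once the process has reached the cluster start step (offline binary upgrade) or the step where the DNs are started normally (offline minor-version upgrade).
  • The XML configuration file passed to the upgrade command must have exactly the same configuration structure as the currently running cluster.
  • After a binary upgrade, rollback is still possible as long as commit-upgrade has not been run. Once you have verified that the upgrade succeeded, run commit-upgrade to delete the temporary upgrade files.
  • At the end of a minor-version upgrade, rollback is possible if all CNs and DNs are in READ ONLY mode; if any CN or primary DN is in READ WRITE mode, rollback is not possible. The mode of a CN or DN can be obtained from the OPEN_STATUS column of DV_DATABASE. Once you have verified that the upgrade succeeded, run commit-upgrade to delete the temporary upgrade files.
  • After the upgrade command succeeds, if commit-upgrade has been run, you can no longer return to the old version through the rollback interfaces auto-rollback, binary-rollback, or systable-rollback.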
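
The rollback rule for a minor-version upgrade in the notes above can be written down as a small decision function. This is only an illustration of the rule as stated, not the logic gs_upgradectl itself uses: the function name is hypothetical, and the mode strings "READ ONLY" and "READ WRITE" are assumed to be the values read from the OPEN_STATUS column of DV_DATABASE on each instance.

```python
def can_rollback_systable_upgrade(cn_modes, primary_dn_modes, committed=False):
    """Apply the rollback rule for an offline minor-version upgrade.

    cn_modes and primary_dn_modes are lists of OPEN_STATUS values
    (assumed to be the strings "READ ONLY" or "READ WRITE").
    """
    if committed:
        # After commit-upgrade the old version can no longer be restored.
        return False
    modes = list(cn_modes) + list(primary_dn_modes)
    # Rollback is allowed only while every CN and primary DN is still READ ONLY.
    return bool(modes) and all(mode == "READ ONLY" for mode in modes)


if __name__ == "__main__":
    # Four CNs and three primary DNs, as in the cluster deployed above.
    print(can_rollback_systable_upgrade(["READ ONLY"] * 4, ["READ ONLY"] * 3))
    print(can_rollback_systable_upgrade(["READ WRITE"] * 4, ["READ WRITE"] * 3))
```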

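2.1 Install Python 3

GaussDB T 1.0.2 requires Python 3.7.2 or later. If the system has no Python 3 environment, install and configure one before continuing. The installation must be performed on every cluster node.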

[root@hwd08 ~]# tar -zxvf Python-3.8.1.tgz
[root@hwd08 Python-3.8.1]# ./configure
[root@hwd08 Python-3.8.1]# make && make install
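
After the build, it is worth confirming on each node that the interpreter found as python3 really meets the 3.7.2 minimum. A minimal sketch follows; the helper name meets_requirement is made up for illustration.

```python
import sys

REQUIRED = (3, 7, 2)  # minimum Python version stated for GaussDB T 1.0.2


def meets_requirement(version_info=sys.version_info, required=REQUIRED):
    """Compare (major, minor, micro) against the required minimum."""
    return tuple(version_info[:3]) >= required


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    verdict = "OK" if meets_requirement() else "too old"
    print("Python " + current + ": " + verdict)
```

Saving this as a script and running it with python3 on every node shows whether any host still resolves python3 to an older interpreter.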

2.2 Check the database version

[omm@hwd08 ~]$ rlwrap zsql omm/<password>@<cn_ip>:8000 -q
SQL> select *from dv_version;

VERSION
----------------------------------------------------------------
GaussDB_100_1.0.1T1.B002 Release 3d95f6d
ZENGINE
3d95f6d                                                         

3 rows fetched.
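
Since only upgrades from a lower to a higher version are supported, the version banner can be compared against the target release before going further. The sketch below assumes the first dotted triple in each string is the product version, which holds for the two strings used here; parse_version and is_upgrade are hypothetical helper names.

```python
import re


def parse_version(banner):
    """Return the first x.y.z triple in the string as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("no version found in: " + banner)
    return tuple(int(part) for part in match.groups())


def is_upgrade(source_banner, target_banner):
    """True when the target version is strictly higher than the source."""
    return parse_version(source_banner) < parse_version(target_banner)


if __name__ == "__main__":
    source = "GaussDB_100_1.0.1T1.B002 Release 3d95f6d"  # from dv_version above
    target = "GaussDB_T_1.0.2"  # version string of the target release
    print(parse_version(source), parse_version(target), is_upgrade(source, target))
```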

2.3 Prepare the software package

[root@hwd08 ~]# mkdir /opt/software/newversion;cd /opt/software/newversion
[root@hwd08 newversion]# tar -xzf GaussDB_T_1.0.2-REDHAT7.5-X86.tar.gz
[root@hwd08 newversion]# tar -xzf GaussDB_T_1.0.2-CLUSTER-REDHAT-64bit.tar.gz

2.4 Pre-upgrade check

[root@hwd08 newversion]# cd script/
[root@hwd08 script]# ./gs_preinstall -U omm -G dbgrp -X /mnt/Huawei/db/clusterconfig.xml --alarm-type=1 --operation=upgrade
Parsing the configuration file.
Successfully parsed the configuration file.
Do preinstall for upgrade.
Check environment for upgrade preinstall.
Successfully check environment for upgrade preinstall.
Installing the tools on the local node.
Successfully installed the tools on the local node.
Distributing package.
Successfully distributed package.
Check old environment on all nodes.
Successfully check old environment on all nodes.
Installing the tools in the cluster.
Successfully installed the tools in the cluster.
Creating conninfo directory.
Successfully created conninfo directory.
Fixing server package owner.
Successfully fixed server package owner.
Add sudo permission for omm.
Successfully add sudo permission for omm
Preinstallation succeeded.

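2.5 Check the upgrade type

As the omm user, run gs_upgradectl to check the upgrade type. systable-upgrade is an offline minor-version upgrade, and binary-upgrade is an offline binary upgrade. This upgrade is an offline minor-version upgrade.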

[omm@hwd08 ~]$ gs_upgradectl -t upgrade-type -X /mnt/Huawei/db/clusterconfig.xml
Checking upgrade type.
Successfully checked upgrade type.
Upgrade type: systable-upgrade.
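
In a scripted upgrade, the reported type can be read from the output and mapped to the two meanings described above. This is a small sketch that assumes only the "Upgrade type: <name>." line format shown here; describe_upgrade_type is a made-up helper name.

```python
import re

# The two offline upgrade types named in this section.
UPGRADE_TYPES = {
    "systable-upgrade": "offline minor-version upgrade",
    "binary-upgrade": "offline binary upgrade",
}


def describe_upgrade_type(output):
    """Return (type name, description) from gs_upgradectl -t upgrade-type output."""
    match = re.search(r"Upgrade type: ([a-z-]+)\.", output)
    if match is None:
        raise ValueError("upgrade type line not found")
    name = match.group(1)
    return name, UPGRADE_TYPES.get(name, "unknown")


if __name__ == "__main__":
    sample = (
        "Checking upgrade type.\n"
        "Successfully checked upgrade type.\n"
        "Upgrade type: systable-upgrade.\n"
    )
    print(describe_upgrade_type(sample))
```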

2.6 Run the upgrade

[omm@hwd08 ~]$ gs_upgradectl -t offline-upgrade -X /mnt/Huawei/db/clusterconfig.xml
Performing systable-upgrade.
Checking zengine parameters.
Successfully check zengine parameters.
Checking cluster health.
Successfully checked cluster health.
Checking database status.
Successfully checked database status.
Checking space for backup files.
Check need size of [/opt/huawei/gaussdb/temp/binary_upgrade] in [hwd08] is [61329113088 Byte].
Check need size of [/opt/huawei/gaussdb/temp/binary_upgrade] in [hwd09] is [61329113088 Byte].
Check need size of [/opt/huawei/gaussdb/temp/binary_upgrade] in [hwd10] is [61329113088 Byte].
Check need size of [/opt/huawei/gaussdb/temp/binary_upgrade] in [hwd11] is [46265270272 Byte].
Successfully checked space for backup files.
Change the primary dn and cn to read only status.
Successfully changed the primary dn and cn to read only status.
Checking database read only status.
Successfully checked database read only status.
Checking sync info for dns.
Successfully checking sync info for dns.
Generating upgrade sql file.
Successfully generated upgrade sql file.
Generating combined upgrade sql file.
Successfully generated combined upgrade sql file.
Backing up current application and configurations.
Checking ztools path in each host.
Successfully checked ztools path in each host.
Successfully backed up current application and configurations.
Successfully record protection mode
Saving system tabls path.
Successfully saved system tables path.
Saving redo log file path.
Successfully saved redolog file path.
Saving undo log file path.
Successfully saved redolog file path.
Stopping the cluster.
Successfully stopped the cluster.
Stopping the etcd and agent.
Successfully stopped the etcd and agent.
Update etcd keys.
Successfully update etcd keys.
Starting all dns to open status for backuping system cntl and redolog.
Successfully started all dns to open status for backuping system cntl and redolog.
Backing up current system tables.
Successfully backed up system tables.
Backing up current cntl files.
Successfully backed up cntl files.
Backing up current redolog files.
Successfully backed up redolog files.
Backing up current undolog files.
Successfully backed up undolog files.
Shutdowning all dns for backuping system cntl and redolog.
Successfully shutdown all dns for backuping system cntl and redolog.
Upgrading application.
Successfully upgraded application.
Starting the restrict mode cluster.
Successfully started the restrict mode cluster.
Upgrading the system table
Successfully upgraded the system table
Shutting down the restrict mode cluster
Successfully shut down the restrict mode cluster
Starting the cns, dns to open status.
Successfully started the cns, dns to open status.
Converting the cns, primary dns to read write status.
Successfully converted the cns, primary dns to read write status.
Shutting down the open status cns, dns.
Successfully shutted down the open status cns, dns.
Starting the etcd.
Successfully started the etcd.
Loading the json.
Successfully loaded the json.
Starting cm agent.
Successfully started cm agent.
Starting the cluster.
Successfully started the cluster.
Commit systable-upgrade succeeded.
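
The "Check need size" lines in this log show how much backup space the upgrade wants on each host, in bytes. As a sketch that assumes only that line format, the following extracts the figures, converts them to GiB, and compares a requirement with the free space of a local path using the standard shutil.disk_usage call. The helper names are made up for illustration.

```python
import re
import shutil

# Two of the lines from the upgrade log above.
SAMPLE_LOG = """\
Check need size of [/opt/huawei/gaussdb/temp/binary_upgrade] in [hwd08] is [61329113088 Byte].
Check need size of [/opt/huawei/gaussdb/temp/binary_upgrade] in [hwd11] is [46265270272 Byte].
"""

SIZE_RE = re.compile(
    r"Check need size of \[(?P<path>[^\]]+)\] in \[(?P<host>[^\]]+)\] "
    r"is \[(?P<bytes>\d+) Byte\]\."
)


def required_space(output):
    """Map each host to the number of bytes the upgrade backup needs."""
    return {m.group("host"): int(m.group("bytes")) for m in SIZE_RE.finditer(output)}


def to_gib(num_bytes):
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / 2**30


def has_room(path, needed_bytes):
    """True when the filesystem holding path has at least needed_bytes free."""
    return shutil.disk_usage(path).free >= needed_bytes


if __name__ == "__main__":
    for host, size in required_space(SAMPLE_LOG).items():
        print(host, round(to_gib(size), 2), "GiB")
```

Note that has_room only inspects the local filesystem, so it would have to run on each node separately.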

2.7 Confirm the version and check the cluster status after the upgrade

[omm@hwd08 gaussdb]$ gs_om -V
gs_om GaussDB_T_1.0.2 build XXXX compiled at 2020-02-22 08:17:40
[omm@hwd08 ~]$ gs_om -t status
Set output to terminal.
--------------------------------------------------------------------Cluster Status--------------------------------------------------------------------
az_state :      single_az
cluster_state : Normal
balanced :      true
----------------------------------------------------------------------AZ Status-----------------------------------------------------------------------
AZ:AZ1                ROLE:primary            STATUS:ONLINE
---------------------------------------------------------------------Host Status----------------------------------------------------------------------
HOST:hwd08            AZ:AZ1                  STATUS:ONLINE       IP:192.168.120.29
HOST:hwd09            AZ:AZ1                  STATUS:ONLINE       IP:192.168.120.30
HOST:hwd10            AZ:AZ1                  STATUS:ONLINE       IP:192.168.120.31
HOST:hwd11            AZ:AZ1                  STATUS:ONLINE       IP:192.168.120.49
----------------------------------------------------------------Cluster Manager Status----------------------------------------------------------------
INSTANCE:CM1          ROLE:slave              STATUS:ONLINE       HOST:hwd08            ID:601
INSTANCE:CM2          ROLE:slave              STATUS:ONLINE       HOST:hwd09            ID:602
INSTANCE:CM3          ROLE:primary            STATUS:ONLINE       HOST:hwd10            ID:603
---------------------------------------------------------------------ETCD Status----------------------------------------------------------------------
INSTANCE:ETCD1        ROLE:follower           STATUS:ONLINE       HOST:hwd08            ID:701      PORT:2379         DataDir:/opt/huawei/gaussdb/data_etcd1/data
INSTANCE:ETCD2        ROLE:leader             STATUS:ONLINE       HOST:hwd09            ID:702      PORT:2379         DataDir:/opt/huawei/gaussdb/data_etcd1/data
INSTANCE:ETCD3        ROLE:follower           STATUS:ONLINE       HOST:hwd10            ID:703      PORT:2379         DataDir:/opt/huawei/gaussdb/data_etcd1/data
----------------------------------------------------------------------CN Status-----------------------------------------------------------------------
INSTANCE:cn_401       ROLE:no role            STATUS:ONLINE       HOST:hwd08            ID:401      PORT:8000         DataDir:/opt/huawei/gaussdb/data/data_cn
INSTANCE:cn_402       ROLE:no role            STATUS:ONLINE       HOST:hwd09            ID:402      PORT:8000         DataDir:/opt/huawei/gaussdb/data/data_cn
INSTANCE:cn_403       ROLE:no role            STATUS:ONLINE       HOST:hwd10            ID:403      PORT:8000         DataDir:/opt/huawei/gaussdb/data/data_cn
INSTANCE:cn_404       ROLE:no role            STATUS:ONLINE       HOST:hwd11            ID:404      PORT:8000         DataDir:/opt/huawei/gaussdb/data/data_cn
----------------------------------------------------------------------GTS Status----------------------------------------------------------------------
INSTANCE:GTS1         ROLE:primary            STATUS:ONLINE       HOST:hwd08            ID:441      PORT:7000         DataDir:/opt/huawei/gaussdb/data/gts
INSTANCE:GTS2         ROLE:standby            STATUS:ONLINE       HOST:hwd09            ID:442      PORT:7000         DataDir:/opt/huawei/gaussdb/data/gts
---------------------------------------------------------Instances Status in Group (group_1)----------------------------------------------------------
INSTANCE:DB1_1        ROLE:primary            STATUS:ONLINE       HOST:hwd08            ID:1        PORT:40000        DataDir:/opt/huawei/gaussdb/data_db/dn1
INSTANCE:DB1_2        ROLE:standby            STATUS:ONLINE       HOST:hwd09            ID:2        PORT:40021        DataDir:/opt/huawei/gaussdb/data_db/dn1
INSTANCE:DB1_3        ROLE:standby            STATUS:ONLINE       HOST:hwd10            ID:3        PORT:40021        DataDir:/opt/huawei/gaussdb/data_db/dn1
---------------------------------------------------------Instances Status in Group (group_2)----------------------------------------------------------
INSTANCE:DB2_4        ROLE:primary            STATUS:ONLINE       HOST:hwd09            ID:4        PORT:40000        DataDir:/opt/huawei/gaussdb/data_db/dn2
INSTANCE:DB2_5        ROLE:standby            STATUS:ONLINE       HOST:hwd10            ID:5        PORT:40042        DataDir:/opt/huawei/gaussdb/data_db/dn2
INSTANCE:DB2_6        ROLE:standby            STATUS:ONLINE       HOST:hwd11            ID:6        PORT:40000        DataDir:/opt/huawei/gaussdb/data_db/dn2
---------------------------------------------------------Instances Status in Group (group_3)----------------------------------------------------------
INSTANCE:DB3_9        ROLE:standby            STATUS:ONLINE       HOST:hwd08            ID:9        PORT:40021        DataDir:/opt/huawei/gaussdb/data_db/dn3
INSTANCE:DB3_7        ROLE:primary            STATUS:ONLINE       HOST:hwd10            ID:7        PORT:40000        DataDir:/opt/huawei/gaussdb/data_db/dn3
INSTANCE:DB3_8        ROLE:standby            STATUS:ONLINE       HOST:hwd11            ID:8        PORT:40021        DataDir:/opt/huawei/gaussdb/data_db/dn3
-----------------------------------------------------------------------Manage IP----------------------------------------------------------------------
HOST:hwd08            IP:192.168.120.29
HOST:hwd09            IP:192.168.120.30
HOST:hwd10            IP:192.168.120.31
HOST:hwd11            IP:192.168.120.49
-------------------------------------------------------------------Query Action Info------------------------------------------------------------------
HOSTNAME: hwd08     TIME: 2020-03-20 08:24:57.429450
------------------------------------------------------------------------Float Ip------------------------------------------------------------------
HOST:hwd08    DB1_1:192.168.120.29    IP:
HOST:hwd09    DB2_4:192.168.120.30    IP:
HOST:hwd10    DB3_7:192.168.120.31    IP:

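This completes the upgrade of the GaussDB T 1.0.1 distributed cluster.

2.8 Post-upgrade health check

Run the following command to perform a health check on the cluster: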

[omm@hwd08 gaussdb]$ gs_upgradectl -t postcheck -X /mnt/Huawei/db/clusterconfig.xml --upgrade-type=offline-upgrade
Starting check.
Checking cluster health.
Successfully checked cluster health.
Warning: REPL_AUTH is not TRUE for all instances, please use gs_gucZenith tool to set it.
Finished check.
Check result: OK. All check items is normal.
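
The post-check can end with "Check result: OK" and still print warnings, as the REPL_AUTH message above shows, so a script should surface both. The sketch below assumes only the "Warning:" and "Check result:" line prefixes seen in this output; summarize_postcheck is a hypothetical helper name.

```python
# Output copied from the postcheck run above.
SAMPLE_POSTCHECK = """\
Starting check.
Checking cluster health.
Successfully checked cluster health.
Warning: REPL_AUTH is not TRUE for all instances, please use gs_gucZenith tool to set it.
Finished check.
Check result: OK. All check items is normal.
"""


def summarize_postcheck(output):
    """Return (overall_ok, list of warning messages)."""
    lines = [line.strip() for line in output.splitlines()]
    warnings = [
        line[len("Warning:"):].strip()
        for line in lines
        if line.startswith("Warning:")
    ]
    overall_ok = any(line.startswith("Check result: OK") for line in lines)
    return overall_ok, warnings


if __name__ == "__main__":
    ok, warnings = summarize_postcheck(SAMPLE_POSTCHECK)
    print("result OK:", ok)
    for message in warnings:
        print("warning:", message)
```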

2.9 Roll back a failed upgrade

If the upgrade fails partway and has to be rolled back, run gs_upgradectl as the omm user:

[omm@hwd08 gaussdb]$ gs_upgradectl -t offline-rollback -X /mnt/Huawei/db/clusterconfig.xml

Original article: https://blog.51cto.com/candon123/2485288

Time: 2024-11-09 00:44:40
