Greenplum Big Data Platform: Recovering Failed Segments

1, Checking for the problem

[gpadmin@greenplum01 conf]$ psql -c "select * from gp_segment_configuration where status='d'"
 dbid | content | role | preferred_role | mode | status | port  |  hostname   |   address   | replication_port
------+---------+------+----------------+------+--------+-------+-------------+-------------+------------------
   12 |       2 | m    | m              | s    | d      | 43002 | greenplum03 | greenplum03 |            44002
    7 |       5 | m    | p              | s    | d      |  6001 | greenplum03 | greenplum03 |            34001
(2 rows)

The query shows two segments on greenplum03 with status d (down). The second one (content 5) is a preferred primary that is currently in the mirror role.
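
To make the result easier to act on, the rows can be read into a small script. The following is a minimal sketch, assuming the query output is captured as text in psql's default aligned format; `parse_psql_table` and `describe` are hypothetical helper names, not part of Greenplum.

```python
def parse_psql_table(text):
    """Turn psql's default aligned output into a list of dicts keyed by column name."""
    # Keep only the header and data rows; the dashed separator and the
    # "(N rows)" footer contain no "|" and are dropped here.
    lines = [ln for ln in text.splitlines() if "|" in ln]
    if not lines:
        return []
    header = [h.strip() for h in lines[0].split("|")]
    return [dict(zip(header, (c.strip() for c in ln.split("|")))) for ln in lines[1:]]


def describe(row):
    """One-line summary of a down segment (hypothetical helper)."""
    kind = "mirror" if row["preferred_role"] == "m" else "primary"
    return "content {}: preferred {} on {}:{} is down".format(
        row["content"], kind, row["hostname"], row["port"])


# The two rows returned by the query in this article.
SAMPLE = """\
 dbid | content | role | preferred_role | mode | status | port  |  hostname   |   address   | replication_port
------+---------+------+----------------+------+--------+-------+-------------+-------------+------------------
   12 |       2 | m    | m              | s    | d      | 43002 | greenplum03 | greenplum03 |            44002
    7 |       5 | m    | p              | s    | d      |  6001 | greenplum03 | greenplum03 |            34001
(2 rows)
"""

if __name__ == "__main__":
    for row in parse_psql_table(SAMPLE):
        print(describe(row))
```

The preferred_role column is what distinguishes a failed mirror (content 2) from a failed primary whose mirror has taken over (content 5).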
[gpadmin@greenplum01 conf]$ gpstate -m
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:-Starting gpstate with args: -m
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:-Obtaining Segment details from master...
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:--------------------------------------------------------------
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:--Current GPDB mirror list and status
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:--Type = Group
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:--------------------------------------------------------------
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:-   Mirror        Datadir  Port    Status              Data Status
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum03   /greenplum/data/mirror/gpseg0  43000   Passive             Synchronized
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum03   /greenplum/data/mirror/gpseg1  43001   Passive             Synchronized
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[WARNING]:-greenplum03   /greenplum/data2/mirror/gpseg2  43002   Failed                                <<<<<<<< this mirror has failed
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum03   /greenplum/data2/mirror/gpseg3  43003   Passive             Synchronized
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data/mirror/gpseg4  43000   Passive             Synchronized
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data/mirror/gpseg5  43001   Acting as Primary   Change Tracking
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data2/mirror/gpseg6  43002   Passive             Synchronized
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data2/mirror/gpseg7  43003   Passive             Synchronized
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:--------------------------------------------------------------
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[WARNING]:-1 segment(s) configured as mirror(s) are acting as primaries
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[WARNING]:-1 segment(s) configured as mirror(s) have failed        <<<<<<<< look here
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[WARNING]:-1 mirror segment(s) acting as primaries are in change tracking
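
Scanning this listing by eye gets tedious on larger clusters. The sketch below extracts the mirror rows from captured `gpstate -m` output, assuming the row layout shown above (host, data directory, port, status, data status); the regular expression and the `parse_mirrors` name are illustrative and may need adjusting for other Greenplum versions.

```python
import re

# One mirror row: host, data directory, port, status, optional data status.
MIRROR_ROW = re.compile(
    r"\]:-\s*(?P<host>\S+)\s+(?P<datadir>/\S+)\s+(?P<port>\d+)\s+"
    r"(?P<status>Passive|Acting as Primary|Failed)\s*(?P<data_status>.*)$"
)


def parse_mirrors(output):
    """Return one dict per mirror row found in `gpstate -m` output."""
    mirrors = []
    for line in output.splitlines():
        match = MIRROR_ROW.search(line)
        if match:
            row = match.groupdict()
            # A failed mirror has no data status column.
            row["data_status"] = row["data_status"].strip() or None
            mirrors.append(row)
    return mirrors


# Three rows taken from the output above.
SAMPLE = """\
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:-   Mirror        Datadir  Port    Status              Data Status
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum03   /greenplum/data/mirror/gpseg1  43001   Passive             Synchronized
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[WARNING]:-greenplum03   /greenplum/data2/mirror/gpseg2  43002   Failed
20190711:17:06:51:025238 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data/mirror/gpseg5  43001   Acting as Primary   Change Tracking
"""

if __name__ == "__main__":
    for mirror in parse_mirrors(SAMPLE):
        if mirror["status"] != "Passive":
            print(mirror["host"], mirror["datadir"], mirror["status"], mirror["data_status"])
```

Any row whose status is not Passive deserves attention: Failed means the mirror itself is down, while Acting as Primary means the mirror has taken over for a failed primary.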

01, Connectivity

First rule out a network problem: ping the affected host and confirm that it responds.

ping greenplum03
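
When several hosts are involved, the same check can be scripted. This is a rough sketch, assuming a Linux `ping` that accepts `-c` (packet count) and `-W` (reply timeout in seconds); `is_reachable` and `unreachable` are hypothetical helpers, and the `run` parameter exists only so the command runner can be swapped out.

```python
import subprocess


def is_reachable(host, count=3, timeout_s=2, run=subprocess.run):
    """Return True if `ping` exits with status 0 for the given host."""
    cmd = ["ping", "-c", str(count), "-W", str(timeout_s), host]
    result = run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def unreachable(hosts, **kwargs):
    """Return the subset of hosts that did not answer ping."""
    return [h for h in hosts if not is_reachable(h, **kwargs)]
```

For the cluster in this article, `unreachable(["greenplum02", "greenplum03"])` would return an empty list once both segment hosts respond. A successful ping only proves the host is up, not that the segment processes are running.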

02, Reactivating the failed segments

gprecoverseg

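The recovery process brings the failed segments back up and identifies the changed files that need to be synchronized.
Once gprecoverseg completes, the system enters Resynchronizing mode and starts copying the changed files. This runs in the background while the system stays online and continues to accept database requests.
When resynchronization finishes, the system state is Synchronized.

Two segments need to be recovered here.

Log: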

[gpadmin@greenplum01 conf]$ gprecoverseg
20190711:17:10:44:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Starting gprecoverseg with args:
20190711:17:10:44:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190711:17:10:44:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190711:17:10:44:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Checking if segments are ready to connect
20190711:17:10:44:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Obtaining Segment details from master...
20190711:17:10:44:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Obtaining Segment details from master...
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Heap checksum setting is consistent between master and the segments that are candidates for recoverseg
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Greenplum instance recovery parameters
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:----------------------------------------------------------
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Recovery type              = Standard
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:----------------------------------------------------------
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Recovery 1 of 2
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:----------------------------------------------------------
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Synchronization mode                        = Incremental
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Failed instance host                        = greenplum03
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Failed instance address                     = greenplum03
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Failed instance directory                   = /greenplum/data2/mirror/gpseg2
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Failed instance port                        = 43002
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Failed instance replication port            = 44002
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Recovery Source instance host               = greenplum02
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Recovery Source instance address            = greenplum02
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Recovery Source instance directory          = /greenplum/data2/primary/gpseg2
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Recovery Source instance port               = 6002
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Recovery Source instance replication port   = 34002
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Recovery Target                             = in-place
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:----------------------------------------------------------
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Recovery 2 of 2
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:----------------------------------------------------------
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Synchronization mode                        = Incremental
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Failed instance host                        = greenplum03
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Failed instance address                     = greenplum03
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Failed instance directory                   = /greenplum/data/primary/gpseg5
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Failed instance port                        = 6001
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Failed instance replication port            = 34001
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Recovery Source instance host               = greenplum02
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Recovery Source instance address            = greenplum02
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Recovery Source instance directory          = /greenplum/data/mirror/gpseg5
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Recovery Source instance port               = 43001
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Recovery Source instance replication port   = 44001
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Recovery Target                             = in-place
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:----------------------------------------------------------

Continue with segment recovery procedure Yy|Nn (default=N):
> Y
20190711:17:11:31:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-2 segment(s) to recover
20190711:17:11:31:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Ensuring 2 failed segment(s) are stopped

20190711:17:11:32:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Ensuring that shared memory is cleaned up for stopped segments
updating flat files
20190711:17:11:32:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Updating configuration with new mirrors
20190711:17:11:33:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Updating mirrors
.
20190711:17:11:34:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Starting mirrors
20190711:17:11:34:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-era is 24a58010f9c5a05a_190711113124
20190711:17:11:34:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait...
..
20190711:17:11:36:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Process results...
20190711:17:11:36:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Updating configuration to mark mirrors up
20190711:17:11:36:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Updating primaries
20190711:17:11:36:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Commencing parallel primary conversion of 2 segments, please wait...
.
20190711:17:11:37:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Process results...
20190711:17:11:37:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Done updating primaries
20190711:17:11:37:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-******************************************************************
20190711:17:11:37:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Updating segments for resynchronization is completed.
20190711:17:11:37:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-For segments updated successfully, resynchronization will continue in the background.
20190711:17:11:37:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-
20190711:17:11:37:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Use  gpstate -s  to check the resynchronization progress.
20190711:17:11:37:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-******************************************************************
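
Before answering Y at the prompt, it is worth reading the recovery plan carefully: each "Recovery N of M" block names the failed instance and the instance it will be rebuilt from. A possible way to pull those blocks out of a saved log is sketched below; it assumes the `key = value` layout shown in this log, and `parse_recovery_plan` is a hypothetical name.

```python
import re


def parse_recovery_plan(log_text):
    """Collect the key/value pairs of every 'Recovery N of M' block in the log."""
    plans, current = [], None
    for line in log_text.splitlines():
        if "]:-" not in line:
            continue  # prompts, progress dots, blank lines
        message = line.split("]:-", 1)[1].strip()
        start = re.fullmatch(r"Recovery (\d+) of (\d+)", message)
        if start:
            current = {"index": int(start.group(1))}
            plans.append(current)
        elif current is not None and " = " in message:
            key, value = message.split(" = ", 1)
            current[key.strip()] = value.strip()
    return plans


# A shortened excerpt of the log above.
SAMPLE = """\
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Recovery type              = Standard
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Recovery 1 of 2
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Synchronization mode                        = Incremental
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Failed instance directory                   = /greenplum/data2/mirror/gpseg2
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Recovery Source instance port               = 6002
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-Recovery 2 of 2
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Synchronization mode                        = Incremental
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Failed instance directory                   = /greenplum/data/primary/gpseg5
20190711:17:10:45:025375 gprecoverseg:greenplum01:gpadmin-[INFO]:-   Recovery Source instance port               = 43001
"""

if __name__ == "__main__":
    for plan in parse_recovery_plan(SAMPLE):
        print(plan["index"], plan["Failed instance directory"], plan["Synchronization mode"])
```

In this run both recoveries are Incremental and in-place: the failed mirror gpseg2 is rebuilt from its primary, and the failed primary gpseg5 is rebuilt from the mirror that took over for it.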

03, Checking synchronization

gpstate -m
[gpadmin@greenplum01 conf]$ gpstate -m
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:-Starting gpstate with args: -m
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:-Obtaining Segment details from master...
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:--------------------------------------------------------------
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:--Current GPDB mirror list and status
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:--Type = Group
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:--------------------------------------------------------------
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:-   Mirror        Datadir                          Port    Status              Data Status
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum03   /greenplum/data/mirror/gpseg0    43000   Passive             Synchronized
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum03   /greenplum/data/mirror/gpseg1    43001   Passive             Synchronized
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum03   /greenplum/data2/mirror/gpseg2   43002   Passive             Synchronized
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum03   /greenplum/data2/mirror/gpseg3   43003   Passive             Synchronized
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data/mirror/gpseg4    43000   Passive             Synchronized
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data/mirror/gpseg5    43001   Acting as Primary   Synchronized
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data2/mirror/gpseg6   43002   Passive             Synchronized
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data2/mirror/gpseg7   43003   Passive             Synchronized
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[INFO]:--------------------------------------------------------------
20190711:17:12:10:025484 gpstate:greenplum01:gpadmin-[WARNING]:-1 segment(s) configured as mirror(s) are acting as primaries

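The failed mirror is back and every mirror reports Synchronized. Note that gpseg5 on greenplum02 is still acting as primary.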

04, Restoring the original roles

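  When a primary segment goes down, its mirror is activated and takes over as the primary. Running gprecoverseg does not change that: the recovered segment comes back in the mirror role instead of its original primary role. To return every segment to the role it had when the cluster was initialized, the system has to be rebalanced.

Check the status of the segments with gpstate -e.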

  

Run gpstate -m to make sure all mirrors are Synchronized.

gpstate -m

All segments are up and running.

If any segment is still in Resynchronizing mode, wait for it to finish before going further.
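
The waiting can be automated with a simple polling loop. The sketch below is one way to do it, assuming mirror rows in `gpstate -m` output end with the data status as in the listings above; `all_synchronized` and `wait_for_sync` are hypothetical names, and `fetch` is any callable that returns the current `gpstate -m` output as text.

```python
import re
import time

# A mirror row contains a data directory followed by a port number.
ROW = re.compile(r"\s(/\S+)\s+(\d+)\s+(.*)$")


def all_synchronized(gpstate_output):
    """True when every mirror row ends with the data status 'Synchronized'."""
    rows = [m.group(3).rstrip() for m in map(ROW.search, gpstate_output.splitlines()) if m]
    return bool(rows) and all(row.endswith("Synchronized") for row in rows)


def wait_for_sync(fetch, interval=30.0, timeout=3600.0, sleep=time.sleep):
    """Poll fetch() until all mirrors are synchronized; return False on timeout."""
    waited = 0.0
    while True:
        if all_synchronized(fetch()):
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

On the master host, `fetch` could be something like `lambda: subprocess.run(["gpstate", "-m"], capture_output=True, text=True).stdout`, which is an assumption about how the utility is invoked in a given environment. The interval and timeout defaults are arbitrary and should be sized to the amount of changed data.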

Run gprecoverseg with the -r option to return the segments to their preferred roles.
gprecoverseg -r

After rebalancing, run gpstate -e to confirm that all segments are in their preferred roles.
gpstate -e

At this point the cluster is back to normal.
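
The ordering of these last steps matters: rebalance only once everything is up and synchronized. The decision can be sketched as below, working on rows from gp_segment_configuration represented as dicts. It assumes the Greenplum 5 codes status `u` (up) and mode `s` (synchronized); the function names and the sample rows are illustrative only.

```python
def out_of_role(rows):
    """Segments whose current role differs from their preferred role."""
    return [r for r in rows if r["role"] != r["preferred_role"]]


def ready_to_rebalance(rows):
    """True when every segment is up and synchronized (assumed codes 'u' and 's')."""
    return all(r["status"] == "u" and r["mode"] == "s" for r in rows)


def next_step(rows):
    """Suggest what to do next for the given segment configuration."""
    if not out_of_role(rows):
        return "balanced"
    if not ready_to_rebalance(rows):
        return "wait"
    return "gprecoverseg -r"


# Illustrative rows for content 5 after recovery: roles are swapped.
AFTER_RECOVERY = [
    {"content": "5", "role": "m", "preferred_role": "p", "mode": "s", "status": "u"},
    {"content": "5", "role": "p", "preferred_role": "m", "mode": "s", "status": "u"},
]

if __name__ == "__main__":
    print(next_step(AFTER_RECOVERY))
```

Once `gprecoverseg -r` has finished, the same check should report "balanced", which corresponds to gpstate -e showing no segments outside their preferred roles.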

Original article: https://www.cnblogs.com/kingle-study/p/11171027.html

Time: 2024-11-11 07:03:19
