HBase HMaster failure analysis and solution: Timedout 300000ms waiting for namespace table to be assigned

Recently, after our production HBase cluster was stopped, the HMaster would no longer start. The master log reported the exception "Timedout 300000ms waiting for namespace table to be assigned", and the whole cluster failed to come up.

2016-12-12 18:04:12,243 FATAL [adfnn2:16020.activeMasterManager] master.HMaster: Failed to become active master
java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned
        at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98)
        at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:868)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:719)
        at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:165)
        at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1425)

The first suspicion was that a regionserver had hit an error while opening the namespace region, causing the namespace table to fail to load. But checking every regionserver log turned up no exceptions at all.

So the focus shifted to the master log. I set log4j to DEBUG to look for clues. One possibility was that the cluster simply had too many regions, so that assignment could not finish within five minutes and the namespace table had not been reached yet. The assignment log (excerpted below) rules that out: the last region assignment entry is at 2016-12-12 17:59:36,571, while the exception above fires at 18:04:12,243. No region was assigned between 17:59:36,571 and 18:04:12,243, which confirms that all regions had already been fully assigned.
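The DEBUG switch itself is a one-line change (a hedged sketch assuming the stock logging config; restart the master for it to take effect):

echo 'log4j.logger.org.apache.hadoop.hbase=DEBUG' >> $HBASE_HOME/conf/log4j.properties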

2016-12-12 17:59:36,492 INFO  [AM.ZK.Worker-pool2-t8] master.RegionStates: Onlined 11d3cd3c7166f4f81286d9d7ac854421 on adfdn14,16020,1481536735081
2016-12-12 17:59:36,492 INFO  [AM.ZK.Worker-pool2-t8] master.RegionStates: Offlined 11d3cd3c7166f4f81286d9d7ac854421 from adfdn20,16020,1481525544193
2016-12-12 17:59:36,532 DEBUG [main-EventThread] zookeeper.ZooKeeperWatcher: master:16020-0x258dd1d8ea30c00, quorum=adfdn02:2181,adfdn04:2181,adfdn03:2181, baseZNode=/hbase Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/region-in-transition/2df9448afa2470e488b92feb50f65e20
2016-12-12 17:59:36,533 DEBUG [AM.ZK.Worker-pool2-t2] master.AssignmentManager: Handling RS_ZK_REGION_OPENED, server=adfdn14,16020,1481536735081, region=2df9448afa2470e488b92feb50f65e20, current_state={2df9448afa2470e488b92feb50f65e20 state=OPENING, ts=1481536776150, server=adfdn14,16020,1481536735081}
2016-12-12 17:59:36,533 INFO  [AM.ZK.Worker-pool2-t2] master.RegionStates: Transition {2df9448afa2470e488b92feb50f65e20 state=OPENING, ts=1481536776150, server=adfdn14,16020,1481536735081} to {2df9448afa2470e488b92feb50f65e20 state=OPEN, ts=1481536776533, server=adfdn14,16020,1481536735081}
2016-12-12 17:59:36,533 DEBUG [AM.ZK.Worker-pool2-t2] coordination.ZkOpenRegionCoordination: Handling OPENED of 2df9448afa2470e488b92feb50f65e20 from adfnn2,16020,1481536732124; deleting unassigned node
2016-12-12 17:59:36,535 DEBUG [main-EventThread] zookeeper.ZooKeeperWatcher: master:16020-0x258dd1d8ea30c00, quorum=adfdn02:2181,adfdn04:2181,adfdn03:2181, baseZNode=/hbase Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, path=/hbase/region-in-transition/2df9448afa2470e488b92feb50f65e20
2016-12-12 17:59:36,535 DEBUG [main-EventThread] zookeeper.ZooKeeperWatcher: master:16020-0x258dd1d8ea30c00, quorum=adfdn02:2181,adfdn04:2181,adfdn03:2181, baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/region-in-transition
2016-12-12 17:59:36,535 DEBUG [AM.ZK.Worker-pool2-t2] zookeeper.ZKAssign: master:16020-0x258dd1d8ea30c00, quorum=adfdn02:2181,adfdn04:2181,adfdn03:2181, baseZNode=/hbase Deleted unassigned node 2df9448afa2470e488b92feb50f65e20 in expected state RS_ZK_REGION_OPENED
2016-12-12 17:59:36,536 DEBUG [AM.ZK.Worker-pool2-t2] master.AssignmentManager: Znode THB_GPRS_FLOW_201610,b2bb,1476080598693.2df9448afa2470e488b92feb50f65e20. deleted, state: {2df9448afa2470e488b92feb50f65e20 state=OPEN, ts=1481536776533, server=adfdn14,16020,1481536735081}
2016-12-12 17:59:36,536 INFO  [AM.ZK.Worker-pool2-t2] master.RegionStates: Onlined 2df9448afa2470e488b92feb50f65e20 on adfdn14,16020,1481536735081
2016-12-12 17:59:36,536 INFO  [AM.ZK.Worker-pool2-t2] master.RegionStates: Offlined 2df9448afa2470e488b92feb50f65e20 from adfdn20,16020,1481525544193
2016-12-12 17:59:36,568 DEBUG [main-EventThread] zookeeper.ZooKeeperWatcher: master:16020-0x258dd1d8ea30c00, quorum=adfdn02:2181,adfdn04:2181,adfdn03:2181, baseZNode=/hbase Received ZooKeeper Event, type=NodeDataChanged, state=SyncConnected, path=/hbase/region-in-transition/07d3e0689ce6b0f62ad9cf7084d82588
2016-12-12 17:59:36,569 DEBUG [AM.ZK.Worker-pool2-t5] master.AssignmentManager: Handling RS_ZK_REGION_OPENED, server=adfdn14,16020,1481536735081, region=07d3e0689ce6b0f62ad9cf7084d82588, current_state={07d3e0689ce6b0f62ad9cf7084d82588 state=OPENING, ts=1481536776166, server=adfdn14,16020,1481536735081}
2016-12-12 17:59:36,569 INFO  [AM.ZK.Worker-pool2-t5] master.RegionStates: Transition {07d3e0689ce6b0f62ad9cf7084d82588 state=OPENING, ts=1481536776166, server=adfdn14,16020,1481536735081} to {07d3e0689ce6b0f62ad9cf7084d82588 state=OPEN, ts=1481536776569, server=adfdn14,16020,1481536735081}
2016-12-12 17:59:36,569 DEBUG [AM.ZK.Worker-pool2-t5] coordination.ZkOpenRegionCoordination: Handling OPENED of 07d3e0689ce6b0f62ad9cf7084d82588 from adfnn2,16020,1481536732124; deleting unassigned node
2016-12-12 17:59:36,571 DEBUG [main-EventThread] zookeeper.ZooKeeperWatcher: master:16020-0x258dd1d8ea30c00, quorum=adfdn02:2181,adfdn04:2181,adfdn03:2181, baseZNode=/hbase Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, path=/hbase/region-in-transition/07d3e0689ce6b0f62ad9cf7084d82588
2016-12-12 17:59:36,571 DEBUG [main-EventThread] zookeeper.ZooKeeperWatcher: master:16020-0x258dd1d8ea30c00, quorum=adfdn02:2181,adfdn04:2181,adfdn03:2181, baseZNode=/hbase Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/region-in-transition
2016-12-12 17:59:36,571 DEBUG [AM.ZK.Worker-pool2-t5] zookeeper.ZKAssign: master:16020-0x258dd1d8ea30c00, quorum=adfdn02:2181,adfdn04:2181,adfdn03:2181, baseZNode=/hbase Deleted unassigned node 07d3e0689ce6b0f62ad9cf7084d82588 in expected state RS_ZK_REGION_OPENED
2016-12-12 17:59:36,571 DEBUG [AM.ZK.Worker-pool2-t5] master.AssignmentManager: Znode THB_GPRS_WAP_201612,b24e,1480493880237.07d3e0689ce6b0f62ad9cf7084d82588. deleted, state: {07d3e0689ce6b0f62ad9cf7084d82588 state=OPEN, ts=1481536776569, server=adfdn14,16020,1481536735081}
2016-12-12 17:59:36,571 INFO  [AM.ZK.Worker-pool2-t5] master.RegionStates: Onlined 07d3e0689ce6b0f62ad9cf7084d82588 on adfdn14,16020,1481536735081
2016-12-12 17:59:36,571 INFO  [AM.ZK.Worker-pool2-t5] master.RegionStates: Offlined 07d3e0689ce6b0f62ad9cf7084d82588 from adfdn20,16020,1481525544193
2016-12-12 17:59:55,765 DEBUG [adfnn2:16020.archivedHFileCleaner] cleaner.CleanerChore: Removing: hdfs://qhcm/hbase/archive/data/default/THB_USER_FLOW_201610/00afb5e7f30354607a5433ce66c7ea94/F/93527541ef45469f98bb728470d7dc86 from archive
2016-12-12 17:59:55,831 DEBUG [adfnn2:16020.archivedHFileCleaner] cleaner.CleanerChore: Removing: hdfs://qhcm/hbase/archive/data/default/THB_USER_FLOW_201610/00afb5e7f30354607a5433ce66c7ea94/recovered.edits/8509.seqid from archive
2016-12-12 17:59:55,853 DEBUG [adfnn2:16020.archivedHFileCleaner] cleaner.CleanerChore: Removing: hdfs://qhcm/hbase/archive/data/default/THB_USER_FLOW_201610/06cc7d526ab24268202c6cf3bfd1a820/F/9a12d8109b8a44a3bebd743eacb9c071 from archive
2016-12-12 17:59:55,862 DEBUG [adfnn2:16020.archivedHFileCleaner] cleaner.CleanerChore: Removing: hdfs://qhcm/hbase/archive/data/default/THB_USER_FLOW_201610/06cc7d526ab24268202c6cf3bfd1a820/recovered.edits/8503.seqid from archive

Next, extract the assignment records from the master log to see which tables had been assigned:

cat hbase-ocdc-master-adfnn2.log|grep "deleted, state:"|awk '{print $7}'|awk -F ',' '{print $1}'|sort|uniq

SYSTEM.SEQUENCE
THB_GPRS_CHARGE_201610
THB_GPRS_CHARGE_201611
THB_GPRS_CHARGE_201612
THB_GPRS_FLOW_201610
THB_GPRS_FLOW_201611
THB_GPRS_FLOW_201612
THB_GPRS_WAP_201610
THB_GPRS_WAP_201612
THB_USER_FLOW_201610
THB_USER_FLOW_201611
THB_USER_FLOW_201612
THB_USER_INFO_201609
THB_USER_INFO_201610
THB_USER_INFO_201611
THB_USER_INFO_201612
THB_USER_INFO_DAY
aidata:user_phone_info_201508
aidata:user_phone_info_201511
aidata:user_phone_info_201512
hbase:meta

hbase:namespace is missing from the list. Why was it never assigned? Could the namespace table itself be broken in a way that prevents assignment? I asked a colleague to pull the production namespace table directory down into my development environment. The directory turned out to be quite a mess, with a namespace directory nested inside namespace. I doubt any operator did that by hand, and in any case the nesting alone should not stop the namespace region from loading.

wangkai8@wangkai8dembp $ hadoop fs -lsr /namespace
lsr: DEPRECATED: Please use 'ls -R' instead.
-rw-r--r-- 1 wangkai8 supergroup 6148 2016-12-12 15:33 /namespace/.DS_Store
drwxr-xr-x - wangkai8 supergroup 0 2016-12-12 15:33 /namespace/.tabledesc
-rw-r--r-- 1 wangkai8 supergroup 312 2016-12-12 15:33 /namespace/.tabledesc/.tableinfo.0000000001
drwxr-xr-x - wangkai8 supergroup 0 2016-12-12 15:33 /namespace/.tmp
drwxr-xr-x - wangkai8 supergroup 0 2016-12-12 15:33 /namespace/83739a75edfb427d8501a43c553caad8
-rw-r--r-- 1 wangkai8 supergroup 6148 2016-12-12 15:33 /namespace/83739a75edfb427d8501a43c553caad8/.DS_Store
-rw-r--r-- 1 wangkai8 supergroup 42 2016-12-12 15:33 /namespace/83739a75edfb427d8501a43c553caad8/.regioninfo
drwxr-xr-x - wangkai8 supergroup 0 2016-12-12 15:33 /namespace/83739a75edfb427d8501a43c553caad8/info
-rw-r--r-- 1 wangkai8 supergroup 4989 2016-12-12 15:33 /namespace/83739a75edfb427d8501a43c553caad8/info/5b15bf56f4ea443ba6b7e212760f8644
-rw-r--r-- 1 wangkai8 supergroup 4971 2016-12-12 15:33 /namespace/83739a75edfb427d8501a43c553caad8/info/f1c4c097645344269e475ad32cf79913
drwxr-xr-x - wangkai8 supergroup 0 2016-12-12 15:33 /namespace/83739a75edfb427d8501a43c553caad8/recovered.edits
-rw-r--r-- 1 wangkai8 supergroup 341 2016-12-12 15:33 /namespace/83739a75edfb427d8501a43c553caad8/recovered.edits/0000000000000000480
-rw-r--r-- 1 wangkai8 supergroup 0 2016-12-12 15:33 /namespace/83739a75edfb427d8501a43c553caad8/recovered.edits/11.seqid
-rw-r--r-- 1 wangkai8 supergroup 0 2016-12-12 15:33 /namespace/83739a75edfb427d8501a43c553caad8/recovered.edits/2.seqid
-rw-r--r-- 1 wangkai8 supergroup 0 2016-12-12 15:33 /namespace/83739a75edfb427d8501a43c553caad8/recovered.edits/479.seqid
drwxr-xr-x - wangkai8 supergroup 0 2016-12-12 15:33 /namespace/namespace
-rw-r--r-- 1 wangkai8 supergroup 6148 2016-12-12 15:33 /namespace/namespace/.DS_Store
drwxr-xr-x - wangkai8 supergroup 0 2016-12-12 15:33 /namespace/namespace/.tabledesc
-rw-r--r-- 1 wangkai8 supergroup 312 2016-12-12 15:33 /namespace/namespace/.tabledesc/.tableinfo.0000000001
drwxr-xr-x - wangkai8 supergroup 0 2016-12-12 15:33 /namespace/namespace/.tmp
drwxr-xr-x - wangkai8 supergroup 0 2016-12-12 15:33 /namespace/namespace/51c319c701fe6fdf6b77da884541a4dd
-rw-r--r-- 1 wangkai8 supergroup 42 2016-12-12 15:33 /namespace/namespace/51c319c701fe6fdf6b77da884541a4dd/.regioninfo

Deleting the nested production directory /hbase1.0.0/data/hbase/namespace/namespace and restarting the cluster gave exactly the same exception. Fine, then delete /hbase1.0.0/data/hbase/namespace/83739a75edfb427d8501a43c553caad8/recovered* and restart: still the same.

To see what records the namespace table actually held, I copied the two HFiles under /hbase1.0.0/data/hbase/namespace/83739a75edfb427d8501a43c553caad8/info into the namespace directory of my development cluster and restarted hbase there, which made the production namespace records visible:

wangkai8@wangkai8dembp ~$ hadoop fs -lsr /hbase1.0.0/data/hbase/namespace
lsr: DEPRECATED: Please use 'ls -R' instead.
drwxr-xr-x   - wangkai8 supergroup          0 2016-12-12 19:16 /hbase1.0.0/data/hbase/namespace/.tabledesc
-rw-r--r--   1 wangkai8 supergroup        312 2016-12-12 19:16 /hbase1.0.0/data/hbase/namespace/.tabledesc/.tableinfo.0000000001
drwxr-xr-x   - wangkai8 supergroup          0 2016-12-12 19:16 /hbase1.0.0/data/hbase/namespace/.tmp
drwxr-xr-x   - wangkai8 supergroup          0 2016-12-12 20:30 /hbase1.0.0/data/hbase/namespace/be6712b3ac638d312c7a4544514a6f79
-rw-r--r--   1 wangkai8 supergroup         42 2016-12-12 19:16 /hbase1.0.0/data/hbase/namespace/be6712b3ac638d312c7a4544514a6f79/.regioninfo
drwxr-xr-x   - wangkai8 supergroup          0 2016-12-12 20:30 /hbase1.0.0/data/hbase/namespace/be6712b3ac638d312c7a4544514a6f79/info
-rw-r--r--   3 wangkai8 supergroup       5041 2016-12-12 20:30 /hbase1.0.0/data/hbase/namespace/be6712b3ac638d312c7a4544514a6f79/info/89830776b101405e9cb2e13d8a770ca6
drwxr-xr-x   - wangkai8 supergroup          0 2016-12-15 15:58 /hbase1.0.0/data/hbase/namespace/be6712b3ac638d312c7a4544514a6f79/recovered.edits
-rw-r--r--   3 wangkai8 supergroup          0 2016-12-15 15:58 /hbase1.0.0/data/hbase/namespace/be6712b3ac638d312c7a4544514a6f79/recovered.edits/22.seqid
wangkai8@wangkai8dembp ~$ cd Downloads/
wangkai8@wangkai8dembp Downloads$ cd namespace
wangkai8@wangkai8dembp namespace$ cd 83739a75edfb427d8501a43c553caad8/
wangkai8@wangkai8dembp 83739a75edfb427d8501a43c553caad8$ ls
info            recovered.edits
wangkai8@wangkai8dembp 83739a75edfb427d8501a43c553caad8$ hadoop fs -put info/* /hbase1.0.0/data/hbase/namespace/be6712b3ac638d312c7a4544514a6f79/info/
wangkai8@wangkai8dembp 83739a75edfb427d8501a43c553caad8$ hadoop fs -lsr /hbase1.0.0/data/hbase/namespace
lsr: DEPRECATED: Please use 'ls -R' instead.
drwxr-xr-x   - wangkai8 supergroup          0 2016-12-12 19:16 /hbase1.0.0/data/hbase/namespace/.tabledesc
-rw-r--r--   1 wangkai8 supergroup        312 2016-12-12 19:16 /hbase1.0.0/data/hbase/namespace/.tabledesc/.tableinfo.0000000001
drwxr-xr-x   - wangkai8 supergroup          0 2016-12-12 19:16 /hbase1.0.0/data/hbase/namespace/.tmp
drwxr-xr-x   - wangkai8 supergroup          0 2016-12-12 20:30 /hbase1.0.0/data/hbase/namespace/be6712b3ac638d312c7a4544514a6f79
-rw-r--r--   1 wangkai8 supergroup         42 2016-12-12 19:16 /hbase1.0.0/data/hbase/namespace/be6712b3ac638d312c7a4544514a6f79/.regioninfo
drwxr-xr-x   - wangkai8 supergroup          0 2016-12-15 16:01 /hbase1.0.0/data/hbase/namespace/be6712b3ac638d312c7a4544514a6f79/info
-rw-r--r--   1 wangkai8 supergroup       4989 2016-12-15 16:01 /hbase1.0.0/data/hbase/namespace/be6712b3ac638d312c7a4544514a6f79/info/5b15bf56f4ea443ba6b7e212760f8644
-rw-r--r--   3 wangkai8 supergroup       5041 2016-12-12 20:30 /hbase1.0.0/data/hbase/namespace/be6712b3ac638d312c7a4544514a6f79/info/89830776b101405e9cb2e13d8a770ca6
-rw-r--r--   1 wangkai8 supergroup       4971 2016-12-15 16:01 /hbase1.0.0/data/hbase/namespace/be6712b3ac638d312c7a4544514a6f79/info/f1c4c097645344269e475ad32cf79913
drwxr-xr-x   - wangkai8 supergroup          0 2016-12-15 15:58 /hbase1.0.0/data/hbase/namespace/be6712b3ac638d312c7a4544514a6f79/recovered.edits
-rw-r--r--   3 wangkai8 supergroup          0 2016-12-15 15:58 /hbase1.0.0/data/hbase/namespace/be6712b3ac638d312c7a4544514a6f79/recovered.edits/22.seqid

hbase(main):002:0> scan 'hbase:namespace'
ROW                                                 COLUMN+CELL
 default                                            column=info:d, timestamp=1481541361405, value=\x0A\x07default
 hbase                                              column=info:d, timestamp=1481541361412, value=\x0A\x05hbase
 ocnosql                                            column=info:d, timestamp=1481541506852, value=\x0A\x07ocnosql
3 row(s) in 0.0980 seconds

hbase(main):003:0> scan 'hbase:namespace'
ROW                                                 COLUMN+CELL
 aidata                                             column=info:d, timestamp=1441764317452, value=\x0A\x06aidata
 default                                            column=info:d, timestamp=1481541361405, value=\x0A\x07default
 hbase                                              column=info:d, timestamp=1481541361412, value=\x0A\x05hbase
 ocnosql                                            column=info:d, timestamp=1481541506852, value=\x0A\x07ocnosql
4 row(s) in 0.0180 seconds

The second scan shows one more namespace record, so the namespace HFiles themselves are fine as well. Then where is the problem? Could the hbase:meta table be broken? I had the ops colleague pull down the meta directory too. meta's HFiles cannot simply be dropped into the development cluster's meta directory, because the production regions do not exist in the development environment and startup would fail. The HFile tool can print an HFile's contents directly instead:

wangkai8@wangkai8dembp info$ hbase org.apache.hadoop.hbase.io.hfile.HFile -e -p -f /meta/1588230740/info/8e498a36f195450d91e07eb8ad4208c1|grep 'hbase:namespace'
K: hbase:namespace,,1481192140855.83739a75edfb427d8501a43c553caad8./info:regioninfo/1481247939963/Put/vlen=41/seqid=10742 V: PBUF\x08\xB7\xE0\xB9\xEF\x8D+\x12\x12\x0A\x05hbase\x12\x09namespace\x1A\x00"\x00(\x000\x008\x00
wangkai8@wangkai8dembp info$ hbase org.apache.hadoop.hbase.io.hfile.HFile -e -p -f /meta/1588230740/info/bc669ae58e5142d69b6fb38979fcd20c|grep 'hbase:namespace'
wangkai8@wangkai8dembp info$

Across the two meta HFiles there is exactly one hbase:namespace row, and it carries only the info:regioninfo column. A healthy namespace entry in meta looks like this:

wangkai8@wangkai8dembp ~$ echo 'scan "hbase:meta"'|hbase shell|grep "hbase:namespace"
 hbase:namespace,,1481791022097.bd5c5e4af06be7540ce00e24c8949399. column=info:regioninfo, timestamp=1481791022263, value={ENCODED => bd5c5e4af06be7540ce00e24c8949399, NAME => 'hbase:namespace,,1481791022097.bd5c5e4af06be7540ce00e24c8949399.', STARTKEY => '', ENDKEY => ''}
 hbase:namespace,,1481791022097.bd5c5e4af06be7540ce00e24c8949399. column=info:seqnumDuringOpen, timestamp=1481791022325, value=\x00\x00\x00\x00\x00\x00\x00\x02
 hbase:namespace,,1481791022097.bd5c5e4af06be7540ce00e24c8949399. column=info:server, timestamp=1481791022325, value=wangkai8dembp:61020
 hbase:namespace,,1481791022097.bd5c5e4af06be7540ce00e24c8949399. column=info:serverstartcode, timestamp=1481791022325, value=1481791009329

In other words, the production namespace entry in meta is missing the critical info:server information, and that is what makes the AssignmentManager skip the namespace region during assignment. Going back to the hbase source, the region assignment path is roughly:

org.apache.hadoop.hbase.master.HMaster#finishActiveMasterInitialization->

org.apache.hadoop.hbase.master.AssignmentManager#joinCluster->

org.apache.hadoop.hbase.master.AssignmentManager#rebuildUserRegions->

org.apache.hadoop.hbase.master.AssignmentManager#processDeadServersAndRegionsInTransition->

org.apache.hadoop.hbase.master.AssignmentManager#assignAllUserRegions->

org.apache.hadoop.hbase.master.AssignmentManager#assign(java.util.Map<org.apache.hadoop.hbase.HRegionInfo,org.apache.hadoop.hbase.ServerName>)->

# build the region bulk assignment plan

org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer#retainAssignment->

# execute the actual region assignment according to the plan

org.apache.hadoop.hbase.master.AssignmentManager#assign(int, int, String, Map<ServerName,List<HRegionInfo>>)->

org.apache.hadoop.hbase.master.AssignmentManager#assign(ServerName, List<HRegionInfo>)->

// create znodes for the regions awaiting assignment under /hbase1.0.0/region-in-transition in ZooKeeper

org.apache.hadoop.hbase.master.AssignmentManager#asyncSetOfflineInZooKeeper->

org.apache.hadoop.hbase.zookeeper.ZKAssign#asyncCreateNodeOffline->

org.apache.hadoop.hbase.zookeeper.ZKUtil#asyncCreate

The production failure comes down to org.apache.hadoop.hbase.master.AssignmentManager#rebuildUserRegions. Here is what that method does:

/**
   * Rebuild the list of user regions and assignment information.
   * <p>
   * Returns a set of servers that are not found to be online that hosted
   * some regions.
   * @return set of servers not online that hosted some regions per meta
   * @throws IOException
   */
  Set<ServerName> rebuildUserRegions() throws
      IOException, KeeperException, CoordinatedStateException {
    Set<TableName> disabledOrEnablingTables = tableStateManager.getTablesInStates(
      ZooKeeperProtos.Table.State.DISABLED, ZooKeeperProtos.Table.State.ENABLING);

    Set<TableName> disabledOrDisablingOrEnabling = tableStateManager.getTablesInStates(
      ZooKeeperProtos.Table.State.DISABLED,
      ZooKeeperProtos.Table.State.DISABLING,
      ZooKeeperProtos.Table.State.ENABLING);

    // Region assignment from META
    List<Result> results = MetaTableAccessor.fullScanOfMeta(server.getConnection());
    // Get any new but slow to checkin region server that joined the cluster
    Set<ServerName> onlineServers = serverManager.getOnlineServers().keySet();
    // Set of offline servers to be returned
    Set<ServerName> offlineServers = new HashSet<ServerName>();
    // Iterate regions in META
    for (Result result : results) {
      if (result == null && LOG.isDebugEnabled()){
        LOG.debug("null result from meta - ignoring but this is strange.");
        continue;
      }
      RegionLocations rl =  MetaTableAccessor.getRegionLocations(result);
      if (rl == null) continue;
      HRegionLocation[] locations = rl.getRegionLocations();
      if (locations == null) continue;
      for (HRegionLocation hrl : locations) {
        HRegionInfo regionInfo = hrl.getRegionInfo();
        if (regionInfo == null) continue;
        int replicaId = regionInfo.getReplicaId();
        State state = RegionStateStore.getRegionState(result, replicaId);
        ServerName lastHost = hrl.getServerName();  // info:server is null for the namespace entry, so lastHost is also null
        ServerName regionLocation = RegionStateStore.getRegionServer(result, replicaId);
        regionStates.createRegionState(regionInfo, state, regionLocation, lastHost);  // hence the namespace region ends up in State.OFFLINE and the master never assigns it
        if (!regionStates.isRegionInState(regionInfo, State.OPEN)) {
          // Region is not open (either offline or in transition), skip
          continue;
        }
        TableName tableName = regionInfo.getTable();
        if (!onlineServers.contains(regionLocation)) {
          // Region is located on a server that isn't online
          offlineServers.add(regionLocation);
          if (useZKForAssignment) {
            regionStates.regionOffline(regionInfo);
          }
        } else if (!disabledOrEnablingTables.contains(tableName)) {
          // Region is being served and on an active server
          // add only if region not in disabled or enabling table
          regionStates.regionOnline(regionInfo, regionLocation);
          balancer.regionOnline(regionInfo, regionLocation);
        } else if (useZKForAssignment) {
          regionStates.regionOffline(regionInfo);
        }
        // need to enable the table if not disabled or disabling or enabling
        // this will be used in rolling restarts
        if (!disabledOrDisablingOrEnabling.contains(tableName)
          && !getTableStateManager().isTableState(tableName,
            ZooKeeperProtos.Table.State.ENABLED)) {
          setEnabledTable(tableName);
        }
      }
    }
    return offlineServers;
  }

  /**
   * Add a region to RegionStates with the specified state.
   * If the region is already in RegionStates, this call has
   * no effect, and the original state is returned.
   *
   * @param hri the region info to create a state for
   * @param newState the state to the region in set to
   * @param serverName the server the region is transitioning on
   * @param lastHost the last server that hosts the region
   * @return the current state
   */
  public synchronized RegionState createRegionState(final HRegionInfo hri,
      State newState, ServerName serverName, ServerName lastHost) {
    if (newState == null || (newState == State.OPEN && serverName == null)) {  // a region that is OPEN but has a null server is forced straight to State.OFFLINE
      newState =  State.OFFLINE;
    }
    if (hri.isOffline() && hri.isSplit()) {
      newState = State.SPLIT;
      serverName = null;
    }
    String encodedName = hri.getEncodedName();
    RegionState regionState = regionStates.get(encodedName);
    if (regionState != null) {
      LOG.warn("Tried to create a state for a region already in RegionStates, "
        + "used existing: " + regionState + ", ignored new: " + newState);
    } else {
      regionState = new RegionState(hri, newState, serverName);
      regionStates.put(encodedName, regionState);
      if (newState == State.OPEN) {
        if (!serverName.equals(lastHost)) {
          LOG.warn("Open region‘s last host " + lastHost
            + " should be the same as the current one " + serverName
            + ", ignored the last and used the current one");
          lastHost = serverName;
        }
        lastAssignments.put(encodedName, lastHost);
        regionAssignments.put(hri, lastHost);
      } else if (!regionState.isUnassignable()) {
        regionsInTransition.put(encodedName, regionState);
      }
      if (lastHost != null && newState != State.SPLIT) {
        addToServerHoldings(lastHost, hri);
        if (newState != State.OPEN) {
          oldAssignments.put(encodedName, lastHost);
        }
      }
    }
    return regionState;
  }

At this point the root cause is clear: the meta table lost data, possibly because operators force-killed hbase while a split was in progress. Whatever the trigger, this also reflects on hbase's own robustness; no matter how or when an operator stops or kills hbase, meta should not end up corrupted. meta itself can normally be repaired with hbase hbck -repair, but awkwardly, hbck cannot be used while the cluster is down.

Having found the problem, the next step was a fix. The snag is that with the cluster down, meta cannot be repaired in place. There were two candidate approaches:

Option 1) Patch the hbase source: if a region belongs to the namespace table and its server info is empty, hand it a regionserver explicitly so the master can assign it and the cluster can come up. Once the cluster is running the rest is easy: repair meta with hbck, or edit the meta table from the hbase shell. After meta is repaired, stop the cluster, swap the original hbase jar back in, and start it again.
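For the repair step, once the patched master has the cluster up, something like the following should do it (a hedged sketch; verify the flags against your hbck version):

hbase hbck -details                  # inspect the inconsistencies first
hbase hbck -fixMeta -fixAssignments  # repair bad meta entries and reassign regions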

Option 2) In the development environment, put a namespace info:server record into the meta table pointing at a regionserver address, force a flush 'hbase:meta' to generate an HFile containing the namespace server info, then copy that HFile into the production meta directory. After a restart the namespace server info is restored.
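In the hbase shell, option 2 would look roughly like this (a hedged sketch: the row key is the production namespace region name taken from the meta dump above; the server address adfdn14:16020 and the startcode are illustrative values matching the patch below; info:serverstartcode must be written as an 8-byte big-endian long, hence the \x escapes):

put 'hbase:meta', 'hbase:namespace,,1481192140855.83739a75edfb427d8501a43c553caad8.', 'info:server', 'adfdn14:16020'
put 'hbase:meta', 'hbase:namespace,,1481192140855.83739a75edfb427d8501a43c553caad8.', 'info:serverstartcode', "\x00\x00\x01\x58\xF2\x52\xA0\xAA"
flush 'hbase:meta'

The flush writes a new HFile under the meta region directory (data/hbase/meta/1588230740/info); that file is what would be copied into the production meta directory.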

We went with option 1 and patched the source. Option 2 was not verified, but in principle it should work.

  /**
   * org.apache.hadoop.hbase.MetaTableAccessor#getServerName
   * Patched to hand the namespace region a hard-coded regionserver address.
   * Returns a {@link ServerName} from catalog table {@link Result}.
   * @param r Result to pull from
   * @return A ServerName instance or null if necessary fields not found or empty.
   */
  private static ServerName getServerName(final Result r, final int replicaId) {
    byte[] serverColumn = getServerColumn(replicaId);
    Cell cell = r.getColumnLatestCell(getFamily(), serverColumn);

      //---------------------------------------------------------------------------
      HRegionInfo regionInfo = getHRegionInfo(r, getRegionInfoColumn());
      String tableName = regionInfo.getTable().getNameAsString();
      if((cell == null || cell.getValueLength() == 0) && tableName.equals("hbase:namespace")) {
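          // hard-coded host:port and startcode of a live regionserver in this cluster (adfdn14)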
          return ServerName.valueOf("adfdn14:16020", 1481534251178L);
      }
      //---------------------------------------------------------------------------

    if (cell == null || cell.getValueLength() == 0) return null;
    String hostAndPort = Bytes.toString(
      cell.getValueArray(), cell.getValueOffset(), cell.getValueLength());
    byte[] startcodeColumn = getStartCodeColumn(replicaId);
    cell = r.getColumnLatestCell(getFamily(), startcodeColumn);
    if (cell == null || cell.getValueLength() == 0) return null;
    return ServerName.valueOf(hostAndPort,
      Bytes.toLong(cell.getValueArray(), cell.getValueOffset(), cell.getValueLength()));
  }
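Deployment is the usual patched-jar swap (a hedged sketch; the Maven module and jar version are assumptions based on this cluster's HBase 1.0.0, where MetaTableAccessor lives in hbase-client):

mvn -pl hbase-client -am -DskipTests package
cp hbase-client/target/hbase-client-1.0.0.jar $HBASE_HOME/lib/
# start the cluster, repair meta, then swap the original jar back and restart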

With that patch the cluster started normally. All told, the incident took a day to resolve.

Now a few words on hbase's design. At startup the master assigns the meta table first and then everything else, but the hbase:namespace system table is treated exactly like a user table; there is no pass that assigns system tables before user tables. On a cluster with a huge number of regions, the default five minutes may not be enough for namespace to get assigned, and you are forced to raise the hbase.master.namespace.init.timeout setting. That has always struck me as somewhat unreasonable.
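Raising the timeout is an hbase-site.xml change (a hedged example; 300000 ms is the default):

<property>
  <name>hbase.master.namespace.init.timeout</name>
  <value>600000</value>
</property>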

The hbase hbck tool is genuinely useful, but when the cluster cannot start there is no way to fix meta data with it. It would help if hbase shipped an offline tool for generating HFiles, or a miniCluster that could be brought up on a crippled cluster just to manage the meta and namespace tables.

One last thought: in production, try not to kill the master and regionservers outright, or you may hit surprises like this one, enough to give you a headache for a good while.
