HiveServer2 fails to start because Hadoop is stuck in safe mode

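Commands

Safe mode is controlled through dfsadmin, not through hadoop fs:

hdfs dfsadmin -safemode get      show the current safe mode state
hdfs dfsadmin -safemode enter    enter safe mode
hdfs dfsadmin -safemode leave    leave safe mode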
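
To check the state from a script, the text printed by the get command can be classified. The sketch below assumes the output contains the phrase "Safe mode is ON" or "Safe mode is OFF", which is what recent Hadoop releases print as far as I know; the function name safemode_state is made up for this example.

```shell
#!/usr/bin/env bash
# Classify the text printed by `hdfs dfsadmin -safemode get`.
# Reads the command output on stdin and prints ON, OFF, or UNKNOWN.
# Assumption: the output contains "Safe mode is ON" or "Safe mode is OFF".
safemode_state() {
  local text
  text=$(cat)
  case "$text" in
    *"Safe mode is ON"*)  echo "ON" ;;
    *"Safe mode is OFF"*) echo "OFF" ;;
    *)                    echo "UNKNOWN" ;;
  esac
}

# Usage on a cluster node (not executed here):
#   hdfs dfsadmin -safemode get | safemode_state
```

If any NameNode in the output reports ON, the function prints ON, which is the conservative answer when deciding whether HiveServer2 can start.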

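Use hadoop fsck to locate damaged or missing files

hadoop  fsck

Usage: DFSck <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]
        <path>          check whether the files under this path are intact
        -move           move corrupted files to /lost+found
        -delete         delete corrupted files
        -openforwrite   print files that are currently open for write
        -files          print the name of each file being checked
        -blocks         print the block report (use together with -files)
        -locations      print the location of every block (use together with -files -blocks)
        -racks          print the network topology of the block locations (use together with -files -blocks)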

Step 1: Check the Hadoop file system with hadoop fsck /

# hadoop fsck /
....................................................................................................
.............Status: CORRUPT                    # cluster status: not healthy
 Total size:    273821489 B
 Total dirs:    403
 Total files:   213
 Total symlinks:        0
 Total blocks (validated):  201 (avg. block size 1362295 B)
  ********************************
  UNDER MIN REPL'D BLOCKS:  2 (0.99502486 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:    2                           # two files are corrupt
  MISSING BLOCKS:   2                           # two blocks are missing
  MISSING SIZE:     6174 B
  CORRUPT BLOCKS:   2
  ********************************
 Minimally replicated blocks:   199 (99.004974 %)
 Over-replicated blocks:    0 (0.0 %)
 Under-replicated blocks:   0 (0.0 %)
 Mis-replicated blocks:     0 (0.0 %)
 Default replication factor:    3
 Average block replication: 2.8208954
 Corrupt blocks:        2
 Missing replicas:      0 (0.0 %)
 Number of data-nodes:      3
 Number of racks:       1
FSCK ended at Fri Aug 23 10:43:11 CST 2019 in 12 milliseconds

These lines show that the cluster is unhealthy and that file data is missing:

.............Status: CORRUPT    # cluster status: not healthy

CORRUPT FILES: 2     # two files are corrupt
MISSING BLOCKS: 2    # two blocks are missing
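
For monitoring, the same three values can be pulled out of the report automatically. This is a minimal sketch that assumes the report layout shown above; fsck_summary is a hypothetical helper name, and the key=value output format is my own choice.

```shell
#!/usr/bin/env bash
# Print the status and the corrupt/missing counters from an fsck report on stdin.
# Assumption: the report uses the "Status:", "CORRUPT FILES:" and
# "MISSING BLOCKS:" labels exactly as in the output above.
fsck_summary() {
  awk '
    /Status:/         { sub(/.*Status:[ \t]*/, ""); print "status=" $1 }
    /CORRUPT FILES:/  { print "corrupt_files=" $3 }
    /MISSING BLOCKS:/ { print "missing_blocks=" $3 }
  '
}

# Usage on a cluster node (not executed here):
#   hadoop fsck / | fsck_summary
```

On a healthy cluster only the status line is printed, because the report omits the corrupt and missing counters in that case.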

Step 2: Write the per-file status report to a file

The output is long, so only an excerpt is shown here. The following command writes the full check result to /export/missingFile.txt:

# hadoop fsck / -files -blocks -locations -racks > /export/missingFile.txt

/flink-checkpoint/11748bc079799f330078967fbf018a48/chk-74/_metadata 452 bytes, 1 block(s):  OK
0. BP-2135962035-192.168.52.100-1562110398602:blk_1073742825_2005 len=452 Live_repl=1 [/default-rack/192.168.52.110:50010]

/flink-checkpoint/11748bc079799f330078967fbf018a48/shared <dir>
/flink-checkpoint/11748bc079799f330078967fbf018a48/taskowned <dir>
/flink-checkpoint/42d81db182771fe71932120fa8933612 <dir>
/flink-checkpoint/42d81db182771fe71932120fa8933612/chk-950 <dir>
/flink-checkpoint/42d81db182771fe71932120fa8933612/chk-950/_metadata 337 bytes, 1 block(s):  OK
0. BP-2135962035-192.168.52.100-1562110398602:blk_1073745657_4837 len=337 Live_repl=1 [/default-rack/192.168.52.120:50010]

/flink-checkpoint/42d81db182771fe71932120fa8933612/chk-950/f59c63a0-a35d-4d4b-8e73-72c2aa1dd383 5657 bytes, 1 block(s):  OK
0. BP-2135962035-192.168.52.100-1562110398602:blk_1073745656_4836 len=5657 Live_repl=1 [/default-rack/192.168.52.100:50010]

/flink-checkpoint/42d81db182771fe71932120fa8933612/shared <dir>
/flink-checkpoint/42d81db182771fe71932120fa8933612/taskowned <dir>
/flink-checkpoint/50aebc9e7aac85fd33bff905972a6e01 <dir>
/flink-checkpoint/50aebc9e7aac85fd33bff905972a6e01/chk-9 <dir>
/flink-checkpoint/50aebc9e7aac85fd33bff905972a6e01/chk-9/_metadata 451 bytes, 1 block(s):  OK
0. BP-2135962035-192.168.52.100-1562110398602:blk_1073742843_2023 len=451 Live_repl=1 [/default-rack/192.168.52.100:50010]

/flink-checkpoint/50aebc9e7aac85fd33bff905972a6e01/chk-9/c58c8c49-8782-41b4-a3df-2fa7ff1d1eba 5663 bytes, 1 block(s):  OK
0. BP-2135962035-192.168.52.100-1562110398602:blk_1073742842_2022 len=5663 Live_repl=1 [/default-rack/192.168.52.120:50010]

/flink-checkpoint/50aebc9e7aac85fd33bff905972a6e01/shared <dir>
/flink-checkpoint/50aebc9e7aac85fd33bff905972a6e01/taskowned <dir>
/flink-checkpoint/626ea65de810a2ec3b1799b605a6a995 <dir>
/flink-checkpoint/626ea65de810a2ec3b1799b605a6a995/chk-175 <dir>
/flink-checkpoint/626ea65de810a2ec3b1799b605a6a995/chk-175/19195239-a205-4462-921d-09e0483a4080 5663 bytes, 1 block(s):
/flink-checkpoint/626ea65de810a2ec3b1799b605a6a995/chk-175/19195239-a205-4462-921d-09e0483a4080: CORRUPT blockpool BP-2135962035-192.168.52.100-1562110398602 block blk_1073743749
 MISSING 1 blocks of total size 5663 B
0. BP-2135962035-192.168.52.100-1562110398602:blk_1073743749_2929 len=5663 MISSING!

/flink-checkpoint/626ea65de810a2ec3b1799b605a6a995/chk-175/_metadata 511 bytes, 1 block(s):
/flink-checkpoint/626ea65de810a2ec3b1799b605a6a995/chk-175/_metadata: CORRUPT blockpool BP-2135962035-192.168.52.100-1562110398602 block blk_1073743750
 MISSING 1 blocks of total size 511 B
0. BP-2135962035-192.168.52.100-1562110398602:blk_1073743750_2930 len=511 MISSING!

Healthy files end with OK. Entries marked MISSING! are blocks that have been lost, and the files they belong to are reported as CORRUPT.

/flink-checkpoint/626ea65de810a2ec3b1799b605a6a995/chk-175/19195239-a205-4462-921d-09e0483a4080: CORRUPT blockpool BP-2135962035-192.168.52.100-1562110398602 block blk_1073743749
MISSING 1 blocks of total size 5663 B

/flink-checkpoint/626ea65de810a2ec3b1799b605a6a995/chk-175/_metadata: CORRUPT blockpool BP-2135962035-192.168.52.100-1562110398602 block blk_1073743750
MISSING 1 blocks of total size 511 B

With these paths you can locate the corresponding files in the HDFS web UI file browser.
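
When the report is large, the damaged paths can be extracted from the saved file instead of being read by eye. The sketch below assumes each damaged file produces a line of the form "<path>: CORRUPT blockpool ...", as in the excerpt above; corrupt_paths is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the unique HDFS paths that an fsck report (on stdin) flags as CORRUPT.
# Assumption: damaged files appear as "<path>: CORRUPT blockpool ..." lines.
corrupt_paths() {
  sed -n 's|^\(/.*\): CORRUPT blockpool .*$|\1|p' | sort -u
}

# Usage with the report file written in step 2 (not executed here):
#   corrupt_paths < /export/missingFile.txt
```

Recent Hadoop releases also provide hdfs fsck / -list-corruptfileblocks, which lists the affected files directly; check that your version supports it before relying on it.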

Step 3: Repair the two missing or corrupt files

# hdfs debug recoverLease -path /flink-checkpoint/626ea65de810a2ec3b1799b605a6a995/chk-175/19195239-a205-4462-921d-09e0483a4080 -retries 10
recoverLease SUCCEEDED on /flink-checkpoint/626ea65de810a2ec3b1799b605a6a995/chk-175/19195239-a205-4462-921d-09e0483a4080

# hdfs debug recoverLease -path /flink-checkpoint/626ea65de810a2ec3b1799b605a6a995/chk-175/_metadata -retries 10
recoverLease SUCCEEDED on /flink-checkpoint/626ea65de810a2ec3b1799b605a6a995/chk-175/_metadata
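
With more than a couple of damaged files, the same command can be run in a loop. This is a sketch rather than a tested recovery tool: recover_leases is a hypothetical helper, and HDFS_CMD is an override I introduce only so that the loop can be dry-run with echo before touching the cluster.

```shell
#!/usr/bin/env bash
# Run lease recovery for every HDFS path given on stdin, one path per line.
# HDFS_CMD is a hypothetical override (default: hdfs) used for dry runs.
recover_leases() {
  local path
  while IFS= read -r path; do
    [ -n "$path" ] || continue          # skip blank lines
    # </dev/null keeps the command from swallowing the remaining input lines.
    "${HDFS_CMD:-hdfs}" debug recoverLease -path "$path" -retries 10 </dev/null
  done
}

# Dry run, printing the arguments instead of calling hdfs (not executed here):
#   HDFS_CMD=echo recover_leases < paths.txt
```

Here paths.txt stands for any file that lists one damaged HDFS path per line.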

Running hadoop fsck / again now reports:

...........Status: HEALTHY
 Total size:    273815315 B
 Total dirs:    403
 Total files:   211
 Total symlinks:        0
 Total blocks (validated):  199 (avg. block size 1375956 B)
 Minimally replicated blocks:   199 (100.0 %)
 Over-replicated blocks:    0 (0.0 %)
 Under-replicated blocks:   0 (0.0 %)
 Mis-replicated blocks:     0 (0.0 %)
 Default replication factor:    3
 Average block replication: 2.8492463
 Corrupt blocks:        0
 Missing replicas:      0 (0.0 %)
 Number of data-nodes:      3
 Number of racks:       1
FSCK ended at Fri Aug 23 11:15:01 CST 2019 in 11 milliseconds

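...........Status: HEALTHY means the cluster is healthy. The report now counts 211 files and 199 blocks, down from 213 and 201, so the two damaged files are no longer part of the check.

After restarting Hadoop, it no longer stays in safe mode, and HiveServer2 starts normally.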

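Step 4: If the repair does not work

If the files cannot be repaired, or the repair reports success but the cluster status still looks like this:

.............Status: CORRUPT                    # cluster status: not healthy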
 Total size:    273821489 B
 Total dirs:    403
 Total files:   213
 Total symlinks:        0
 Total blocks (validated):  201 (avg. block size 1362295 B)
  ********************************
  UNDER MIN REPL'D BLOCKS:  2 (0.99502486 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:    2                           # two files are corrupt
  MISSING BLOCKS:   2                           # two blocks are missing
  MISSING SIZE:     6174 B
  CORRUPT BLOCKS:   2
  ********************************
 Minimally replicated blocks:   199 (99.004974 %)
 Over-replicated blocks:    0 (0.0 %)
 Under-replicated blocks:   0 (0.0 %)
 Mis-replicated blocks:     0 (0.0 %)
 Default replication factor:    3
 Average block replication: 2.8208954
 Corrupt blocks:        2
 Missing replicas:      0 (0.0 %)
 Number of data-nodes:      3
 Number of racks:       1
FSCK ended at Fri Aug 23 10:43:11 CST 2019 in 12 milliseconds

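1. If the damaged files are not important

First, back up the damaged files you found.

Then delete the damaged files with hadoop fsck / -delete:

# hadoop fsck / -delete

If the command does not succeed the first time, run it again. Only do this if the missing or corrupt files are not important!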

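2. If the damaged files are important and must not be lost

You can first force HDFS out of safe mode:

# hdfs dfsadmin -safemode leave

This does not fix the underlying problem; it only lets the cluster work for the time being.

You will also have to run this command every time the Hadoop cluster starts, until the problem is fully resolved.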
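
Because the command has to be repeated after every start, it can be wrapped in a small startup helper that only acts when safe mode is actually on. The sketch below is an assumption-laden example: leave_safemode_if_on and the HDFS_CMD override are names I made up, and the check relies on the get command printing "Safe mode is ON".

```shell
#!/usr/bin/env bash
# Leave safe mode only when the NameNode reports that it is on.
# HDFS_CMD is a hypothetical override (default: hdfs) so the logic can be
# exercised without a cluster.
leave_safemode_if_on() {
  local hdfs_cmd="${HDFS_CMD:-hdfs}"
  if "$hdfs_cmd" dfsadmin -safemode get | grep -q "Safe mode is ON"; then
    "$hdfs_cmd" dfsadmin -safemode leave
  else
    echo "safe mode is not on; nothing to do"
  fi
}

# Usage after starting the cluster (not executed here):
#   leave_safemode_if_on
```

This remains a workaround: the corrupt files are still reported by fsck until they are repaired or deleted.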

Original article: https://www.cnblogs.com/-xiaoyu-/p/11399287.html

Time: 2024-10-10 03:20:27
