SecondaryNameNode fails to checkpoint properly: ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint

With Hadoop's default settings (Hadoop 1.2.1), the SecondaryNameNode originally checkpointed fine, periodically copying the image file from the NameNode to the SNN. But the checkpoint interval and the edit-log size threshold could not be tuned that way, so I changed the SNN settings: fs.checkpoint.period to 3600 seconds and fs.checkpoint.size to 64 MB. After adding these two parameters to core-site.xml, however, the SNN stopped checkpointing altogether. Some searching showed the configuration was simply incomplete; the problem went away after updating both core-site.xml and hdfs-site.xml.
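Once the two parameters are in place, a quick way to check that they actually take effect is to ask the SecondaryNameNode for the current edit-log size and, if needed, force a checkpoint. This is only a minimal sketch for Hadoop 1.x, run on the SNN host; it assumes the secondarynamenode command options -geteditsize and -checkpoint force, and the fs.checkpoint.dir path from the files below:

# Print the current size of the NameNode edit log in bytes; once it
# exceeds fs.checkpoint.size (67108864 here), a checkpoint fires even
# before fs.checkpoint.period (3600 s) has elapsed.
bin/hadoop secondarynamenode -geteditsize

# Force a checkpoint right away instead of waiting for the period/size
# trigger (stop the running SNN daemon first so port 50090 is free),
# then verify a fresh fsimage shows up under fs.checkpoint.dir.
bin/hadoop secondarynamenode -checkpoint force
ls -l /bigdata/hadoop/namesecondary/current/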

Here are the contents of the two files:

core-site.xml:

<!-- ****************************************************************************************-->
<!-- This file is only used on secondnamenode! -->
<!-- ****************************************************************************************-->

<configuration>

<property>
        <name>hadoop.tmp.dir</name>
        <value>/bigdata/hadoop/tmp/</value>
        <description>A base for other temporary directories.</description>
</property>

<property>
        <name>fs.default.name</name>
        <value>hdfs://namenode:54310</value>
</property>

<property>
        <name>fs.checkpoint.period</name>
        <value>3600</value>
        <description>The number of seconds between two periodic checkpoints.</description>
</property>

<property>
        <name>fs.checkpoint.size</name>
        <value>67108864</value>
        <description>The size of the current edit log (in bytes) that triggers a periodic checkpoint even if fs.checkpoint.period hasn't expired.</description>
</property>

<property>
        <name>fs.checkpoint.dir</name>
        <value>/bigdata/hadoop/namesecondary/</value>
</property>
</configuration>
hdfs-site.xml:

<!-- ****************************************************************************************-->
<!-- This file is only used on secondnamenode! -->
<!-- ****************************************************************************************-->

<configuration>

<property>
        <name>fs.checkpoint.period</name>
        <value>3600</value>
        <description>The number of seconds between two periodic checkpoints.</description>
</property>

<property>
        <name>dfs.secondary.http.address</name>
        <value>secondnamenode:50090</value>
</property>

<property>
        <name>dfs.http.address</name>
        <value>namenode:50070</value>
        <final>true</final>
</property>

<property>
        <name>dfs.replication</name>
        <value>2</value>
</property>

<property>
        <name>dfs.name.dir</name>
        <value>/bigdata/hadoop/secondnamenodelogs/</value>
</property>
......

The key parameters are the ones discussed below (they were highlighted in red in the original post). I initially assumed hdfs-site.xml did not need any changes, but it turned out this file was exactly where the problem was. Quite a trap!

In hdfs-site.xml you also need to add fs.checkpoint.period or fs.checkpoint.size from core-site.xml. dfs.http.address gives the NameNode's HTTP address, which the SNN uses to fetch the image saved by the NN. dfs.secondary.http.address is the SNN's own web interface address, and it must be configured explicitly: because I had left it out, the SNN kept reporting the error below (note the machine=0.0.0.0 in the request URL, which matches the unconfigured default):

2014-06-25 14:17:40,408 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint:
2014-06-25 14:17:40,408 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.io.FileNotFoundException: http://namenode:50070/getimage?putimage=1&port=50090&machine=0.0.0.0&token=-41:620270652:0:1403579817000:1403578915285&newChecksum=7fcdd4793ce44f017d290e7db78870e7
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1434)
        at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.getFileClient(TransferFsImage.java:177)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.putFSImage(SecondaryNameNode.java:462)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:525)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:396)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:360)
        at java.lang.Thread.run(Thread.java:662)
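If doCheckpoint still fails after the configuration change, a quick sanity check is to confirm the two HTTP addresses are reachable from both sides. The sketch below is just an illustration for this particular setup (hostnames namenode / secondnamenode and ports 50070 / 50090 are the ones configured above); any HTTP status code back, even a 404, at least proves the port is open and the hostname resolves:

# From the SNN host: the SNN pulls fsimage/edits from the NameNode's
# HTTP port (dfs.http.address), so this must answer.
curl -s -o /dev/null -w "%{http_code}\n" http://namenode:50070/

# From the NameNode host: during putimage the NN connects back to the
# address the SNN advertises (dfs.secondary.http.address). If that
# parameter is unset, the SNN advertises 0.0.0.0, which is exactly the
# machine=0.0.0.0 seen in the request URL of the exception above.
curl -s -o /dev/null -w "%{http_code}\n" http://secondnamenode:50090/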
