Resolving Hadoop's AlreadyBeingCreatedException

I ran into a problem at work today. The error was as follows:

org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /apps/icps/data/collect/tmp/10100345.10011301..IVPN_F.20160214.5834.ICPS.COLLECT.201602000000.0000.NORMAL.TMP for DFSClient_attempt_201601231122_96889_m_000004_0_1149914572_1 on client 132.121.94.29, because this file is already being created by NN_Recovery on 132.121.94.29

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1826)

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1649)

at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1595)

at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:712)

at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:691)

at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

at java.lang.reflect.Method.invoke(Method.java:597)

at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1444)

at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1440)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:396)

at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)

at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1438)

at org.apache.hadoop.ipc.Client.call(Client.java:1118)

at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)

at com.sun.proxy.$Proxy7.create(Unknown Source)

at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

at java.lang.reflect.Method.invoke(Method.java:597)

at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)

at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)

at com.sun.proxy.$Proxy7.create(Unknown Source)

at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3660)

at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:930)

at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:207)

Checking the NameNode logs:

2016-02-15 00:02:28,392 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: startFile: recover lease [Lease.  Holder: NN_Recovery, pendingcreates: 182], src=/apps/icps/data/collect/tmp/10100345.10011301..IVPN_F.20160214.5834.ICPS.COLLECT.201602000000.0000.NORMAL.TMP from client NN_Recovery

2016-02-15 00:02:28,392 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease.  Holder: NN_Recovery, pendingcreates: 182], src=/apps/icps/data/collect/tmp/10100345.10011301..IVPN_F.20160214.5834.ICPS.COLLECT.201602000000.0000.NORMAL.TMP

2016-02-15 00:02:28,392 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* blk_-3441330245353537374_1007295680 recovery started, primary=132.121.94.29:1010

2016-02-15 00:02:28,407 WARN org.apache.hadoop.hdfs.StateChange: DIR* startFile: failed to create file /apps/icps/data/collect/tmp/10100345.10011301..IVPN_F.20160214.5834.ICPS.COLLECT.201602000000.0000.NORMAL.TMP for DFSClient_attempt_201601231122_95299_m_000003_0_-559831283_1 on client 132.121.94.7, because this file is already being created by NN_Recovery on 132.121.94.29

2016-02-15 00:02:28,407 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:icps/[email protected] cause:org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /apps/icps/data/collect/tmp/10100345.10011301..IVPN_F.20160214.5834.ICPS.COLLECT.201602000000.0000.NORMAL.TMP for DFSClient_attempt_201601231122_95299_m_000003_0_-559831283_1 on client 132.121.94.7, because this file is already being created by NN_Recovery on 132.121.94.29

2016-02-15 00:02:28,407 INFO org.apache.hadoop.ipc.Server: IPC Server handler 34 on 8020, call create(/apps/icps/data/collect/tmp/10100345.10011301..IVPN_F.20160214.5834.ICPS.COLLECT.201602000000.0000.NORMAL.TMP, rwxr-xr-x, DFSClient_attempt_201601231122_95299_m_000003_0_-559831283_1, true, 3, 67108864) from 132.121.94.7:48867: error: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /apps/icps/data/collect/tmp/10100345.10011301..IVPN_F.20160214.5834.ICPS.COLLECT.201602000000.0000.NORMAL.TMP for DFSClient_attempt_201601231122_95299_m_000003_0_-559831283_1 on client 132.121.94.7, because this file is already being created by NN_Recovery on 132.121.94.29
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /apps/icps/data/collect/tmp/10100345.10011301..IVPN_F.20160214.5834.ICPS.COLLECT.201602000000.0000.NORMAL.TMP for DFSClient_attempt_201601231122_95299_m_000003_0_-559831283_1 on client 132.121.94.7, because this file is already being created by NN_Recovery on 132.121.94.29
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1826)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1649)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1595)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:712)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:691)

These three log messages kept repeating over and over.

Yet my code does nothing more than create a file on HDFS:

FSDataOutputStream fsdout = fs.create(tmpPath);

By default, create() overwrites the file if it already exists. This was puzzling at first, and I suspected a bug in Hadoop itself.
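For reference, the single-argument FileSystem.create(Path) is equivalent to requesting overwrite=true. The fragment below is a minimal sketch, reusing fs and tmpPath from the code above, with the flag spelled out:

// Minimal sketch; assumes fs and tmpPath are set up as in the line above.
// The one-argument create() defaults to overwrite=true:
FSDataOutputStream fsdout = fs.create(tmpPath);
// the same request with the overwrite flag written out explicitly:
// FSDataOutputStream fsdout = fs.create(tmpPath, true);
fsdout.close();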

So I searched for this exception on the Apache Hadoop issue tracker and found this:

Creating an already-open-for-write file with overwrite=true fails

Details

  • Type: Bug
  • Status: CLOSED
  • Priority: Major
  • Resolution: Fixed
  • Affects Version/s: 2.0.0-alpha
  • Fix Version/s: 2.0.2-alpha
  • Component/s: namenode
  • Labels: None
  • Hadoop Flags: Reviewed
  • Release Note: This is an incompatible change: Before this change, if a file is already open for write by one client, and another client calls fs.create() with overwrite=true, an AlreadyBeingCreatedException is thrown. After this change, the file will be deleted and the new file will be created successfully.

Description

If a file is already open for write by one client, and another client calls fs.create() with overwrite=true, the file should be deleted and the new file successfully created. Instead, it is currently throwing AlreadyBeingCreatedException.

This is a regression since branch-1.

(For details, see https://issues.apache.org/jira/browse/HDFS-3755)
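The behavior described in the release note can be illustrated with a two-client sketch. This is a hypothetical reproduction, assuming Hadoop 2.x (where FileSystem.newInstance is available); conf, path, and the variable names are placeholders I added, not part of the original post:

// Hypothetical reproduction of HDFS-3755; all names are placeholders.
// Client A opens the file for write and holds the HDFS lease:
FileSystem fsA = FileSystem.newInstance(conf);
FSDataOutputStream out = fsA.create(path);

// Client B now calls create() on the same path with overwrite=true:
FileSystem fsB = FileSystem.newInstance(conf);
FSDataOutputStream out2 = fsB.create(path, true);
// before the fix (e.g. 2.0.0-alpha): throws AlreadyBeingCreatedException
// after the fix (2.0.2-alpha and later): the old file is deleted and the create succeeds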

Looking at the patch attached to this bug:

    --- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java	(revision 1370568)
    +++ hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java	(working copy)
    @@ -1755,8 +1755,6 @@
    
         try {
           INodeFile myFile = dir.getFileINode(src);
    -      recoverLeaseInternal(myFile, src, holder, clientMachine, false);
    -
           try {
             blockManager.verifyReplication(src, replication, clientMachine);
      } catch(IOException e) {

The call recoverLeaseInternal(myFile, src, holder, clientMachine, false); was removed by the patch, leaving blockManager.verifyReplication(src, replication, clientMachine); in place. I decompiled the dependency library we were using and found that our code still contains the removed call:

......

    try {
      INode myFile = dir.getFileINode(src);
      recoverLeaseInternal(myFile, src, holder, clientMachine, false);
      try {
        verifyReplication(src, replication, clientMachine);

......

Conclusion: trying to create a file that already exists can trigger an AlreadyBeingCreatedException. Given that the NameNode server kept logging lease-recovery messages for this file, the likely cause is a failed lease recovery.

Solution: change the code

FSDataOutputStream fsdout = fs.create(tmpPath);

to:

if (fs.exists(tmpPath)) {
    fs.delete(tmpPath, false);
}
FSDataOutputStream fsdout = fs.create(tmpPath);
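Putting it together, here is a self-contained sketch of the workaround. The class name, configuration setup, and example path are placeholders I added, not part of the original code:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Self-contained sketch of the workaround described above; names are placeholders.
public class SafeCreate {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path tmpPath = new Path("/apps/icps/data/collect/tmp/example.TMP"); // placeholder path

        // Delete any stale file first, so create() never has to go through
        // the NameNode's lease-recovery path for a writer that died.
        if (fs.exists(tmpPath)) {
            fs.delete(tmpPath, false); // false = non-recursive; tmpPath is a plain file
        }
        FSDataOutputStream fsdout = fs.create(tmpPath);
        fsdout.close();
    }
}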
