Developing a MapReduce Program on Windows for Remote Submission to a Hadoop Cluster: YARN Scheduler Exception

Why I'm sharing this: devoting an entire post to a single problem may seem extravagant, but searching Baidu turned up very few relevant articles, and I only found the solution after combing through the logs.

The problem: a MapReduce program developed on Windows is submitted to the cluster but runs indefinitely without ever producing a result.

The MapReduce program

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Test {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the remote HDFS NameNode.
        conf.set("fs.defaultFS", "hdfs://master:9000/");
        // Ship the locally built job jar so the cluster can load our classes.
        conf.set("mapreduce.job.jar", "D:/intelij-workspace/aaron-bigdata/aaorn-mapreduce/target/aaorn-mapreduce-1.0-SNAPSHOT.jar".trim());
        // Submit to YARN rather than running in the local job runner.
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.hostname", "master");
        // Required when submitting from Windows to a Linux cluster.
        conf.set("mapreduce.app-submission.cross-platform", "true");

        Job job = Job.getInstance(conf);
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        FileInputFormat.setInputPaths(job, "hdfs://master:9000/input/");
        FileOutputFormat.setOutputPath(job, new Path("hdfs://master:9000/output3/"));

        job.waitForCompletion(true);
    }
}
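The WordCountMapper and WordCountReducer classes referenced above are not shown in the post. For completeness, here is a minimal word-count sketch consistent with the key/value classes set on the job (my reconstruction, not the author's original code; each class goes in its own source file):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    private static final LongWritable ONE = new LongWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Emit (word, 1) for every whitespace-separated token in the line.
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
}

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException {
        // Sum the per-word counts emitted by the mappers.
        long sum = 0;
        for (LongWritable v : values) {
            sum += v.get();
        }
        context.write(key, new LongWritable(sum));
    }
}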

Client output (the log stops at the last line below; the job never makes progress)

[QC] INFO [main] org.apache.hadoop.yarn.client.RMProxy.createRMProxy(98) | Connecting to ResourceManager at master/192.168.56.100:8032
[QC] WARN [main] org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(64) | Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
[QC] INFO [main] org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(283) | Total input paths to process : 2
[QC] INFO [main] org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(198) | number of splits:2
[QC] INFO [main] org.apache.hadoop.mapreduce.JobSubmitter.printTokens(287) | Submitting tokens for job: job_1496627557122_0004
[QC] INFO [main] org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(273) | Submitted application application_1496627557122_0004
[QC] INFO [main] org.apache.hadoop.mapreduce.Job.submit(1294) | The url to track the job: http://master:8088/proxy/application_1496627557122_0004/
[QC] INFO [main] org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(1339) | Running job: job_1496627557122_0004
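The submission itself succeeds: the job receives an application ID and a tracking URL, but it then sits at "Running job" forever, because the ResourceManager has accepted the application while no NodeManager is registered to actually run it. A quick way to confirm this (my suggestion, not part of the original logs) is to ask the ResourceManager which nodes it knows about:

# Run on the master. A healthy cluster lists one RUNNING NodeManager per
# slave; here the list was empty, so the application could never start.
yarn node -list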

Master (NameNode) log

java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
        at org.apache.hadoop.ipc.Server.channelRead(Server.java:2603)
        at org.apache.hadoop.ipc.Server.access$2800(Server.java:136)
        at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1481)
        at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:771)
        at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:637)
        at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:608)

Slave log exception (from the NodeManager on each slave, which is the daemon that registers on port 8031)

2017-06-05 09:49:40,464 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:41,464 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:42,465 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:43,467 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:44,468 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:45,470 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:46,472 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2017-06-05 09:49:47,474 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
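The retries against 0.0.0.0:8031 are the telltale symptom: 8031 is the ResourceManager's resource-tracker port, which NodeManagers use to register. In yarn-default.xml that address is derived from yarn.resourcemanager.hostname, which itself defaults to 0.0.0.0, so a slave whose yarn-site.xml does not override the hostname effectively tries to register with itself:

<!-- Relevant defaults from yarn-default.xml, shown for reference. -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>0.0.0.0</value>
</property>
<property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>${yarn.resourcemanager.hostname}:8031</value>
</property>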

Notes

My Hadoop cluster consists of Master (NameNode) plus Slave1, Slave2, and Slave3.

Solution

Add the following to yarn-site.xml on every Slave machine. I had previously added it only on the Master, so the NodeManagers on the slaves never learned where the ResourceManager was:

<configuration>
  <!-- Tell the NodeManager where the ResourceManager lives; without this,
       it falls back to the default 0.0.0.0 and never registers. -->
  <property>
      <name>yarn.resourcemanager.hostname</name>
      <value>master</value>
  </property>
  <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
  </property>
  <property>
      <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
      <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
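The edited file then has to reach every slave, and YARN has to be restarted so the NodeManagers re-read it. A sketch of that step, assuming the standard $HADOOP_HOME layout and passwordless SSH from the master (adjust paths and hostnames to your installation):

# Push the updated yarn-site.xml to each slave.
for host in slave1 slave2 slave3; do
    scp $HADOOP_HOME/etc/hadoop/yarn-site.xml $host:$HADOOP_HOME/etc/hadoop/
done

# Restart YARN; the slave NodeManagers should now register with master:8031.
$HADOOP_HOME/sbin/stop-yarn.sh
$HADOOP_HOME/sbin/start-yarn.sh

Once yarn node -list shows the slaves as RUNNING, the job submitted from Windows completes normally.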