Compiling and Running MapReduce with Eclipse on Windows -- Local WordCount Debugging

1. Prerequisites

Operating system: Windows 10

IDE: Eclipse 4.5

JDK: 1.8 (jdk-8u91-windows-x64.exe), official download: http://download.oracle.com/otn-pub/java/jdk/8u91-b14/jdk-8u91-windows-x64.exe

Hadoop: 2.6 (hadoop-2.6.4.tar.gz), official download: http://apache.fayea.com/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz

Hadoop Eclipse plugin: hadoop-eclipse-plugin-2.6.0 is a Hadoop plugin for Eclipse. It can be built against the Hadoop version you use; here the prebuilt hadoop-eclipse-plugin-2.6.0.jar is used.

Hadoop 2.6 native binaries: under hadoop-common-project\hadoop-common\src\main\winutils in the Hadoop 2.6.0 source tree there is a Visual Studio project; building it produces a set of files, of which hadoop.dll and winutils.exe are the two that matter (they prevent the plugin from failing with obscure errors such as null pointer exceptions).

Note: if you would rather not build them yourself, you can download the prebuilt package hadoop2.6(x64)V0.2.zip.

2. Installation

2.1 JDK

a. Installation

b. Environment variables

JAVA_HOME=C:\Program Files\Java\jdk1.8.0_91

CLASSPATH=.;%JAVA_HOME%\lib\dt.jar;%JAVA_HOME%\lib\tools.jar

Path=%JAVA_HOME%\bin;%JAVA_HOME%\jre\bin (appended to the existing Path value)
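To confirm the variables are visible to the JVM, a quick check can be run from any Java main. This is a throwaway sketch; the class and helper names are ours, not part of the tutorial:

```java
public class JdkEnvCheck {
    // Append an entry to a Windows-style Path value (';'-separated).
    static String appendToPath(String path, String entry) {
        if (path == null || path.isEmpty()) return entry;
        return path.endsWith(";") ? path + entry : path + ";" + entry;
    }

    public static void main(String[] args) {
        // Both should show sensible JDK locations after the setup above.
        System.out.println("JAVA_HOME    = " + System.getenv("JAVA_HOME"));
        System.out.println("java.version = " + System.getProperty("java.version"));
    }
}
```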

2.2 Hadoop

a. Installation

Extract hadoop-2.6.4.tar.gz to D:\hadoop, giving D:\hadoop\hadoop-2.6.4.

b. Environment variables

HADOOP_HOME=D:\hadoop\hadoop-2.6.4

Path=%HADOOP_HOME%\bin (appended to the existing Path value)

2.3 Hadoop Eclipse plugin

Copy the downloaded hadoop-eclipse-plugin-2.6.0.jar into Eclipse's plugins directory, then restart Eclipse.

2.4 Hadoop 2.6 native binaries

Copy winutils.exe into %HADOOP_HOME%\bin, and copy hadoop.dll into %windir%\system32.
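If Hadoop still fails to find winutils.exe at runtime (the well-known "Could not locate executable ... winutils.exe" error), a common fallback is to set the hadoop.home.dir system property at the very start of main, before the Job is created. A minimal sketch, assuming the install path used in this guide:

```java
public class HadoopHomeSetup {
    // Point Hadoop at the directory whose bin\ subfolder holds winutils.exe.
    // Only sets the property if it was not already supplied (e.g. via -D).
    static void configureHadoopHome(String home) {
        if (System.getProperty("hadoop.home.dir") == null) {
            System.setProperty("hadoop.home.dir", home);
        }
    }

    public static void main(String[] args) {
        configureHadoopHome("D:\\hadoop\\hadoop-2.6.4");
        System.out.println("hadoop.home.dir = " + System.getProperty("hadoop.home.dir"));
    }
}
```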

3. Eclipse Remote Configuration

After restarting Eclipse, a DFS Locations node appears on the left, and a Map/Reduce Locations view at the bottom.

Configure the Hadoop path: Window > Preferences, select Hadoop Map/Reduce, and enter the Hadoop installation directory, as shown below.

In the Map/Reduce Locations view, click "New Hadoop location...", enter the NameNode's IP and port, and choose a location name, e.g. "namenode".

If the configuration succeeds, the view looks as follows; otherwise a connection failure is reported. If it fails, check that the IP and port are correct.

4. Creating and Running a MapReduce Project -- WordCount Test

1. Create a MapReduce project: File > New > Other > MapReduce, named "mr-project".

2. Under src, create the package org.apache.hadoop.examples.

3. Copy the MapReduce example WordCount.java into org.apache.hadoop.examples.
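Stripped of the Hadoop types, the logic that WordCount's Mapper and Reducer implement is simply tokenize-then-sum. The following dependency-free sketch (our own class, not the shipped example) shows what the job computes:

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class WordCountCore {
    // Same logic as WordCount's map + reduce steps, without the Hadoop types:
    // tokenize each input line (map), then sum the counts per word (reduce).
    static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // StringTokenizer is what the shipped example uses to split words.
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                counts.merge(itr.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count(java.util.Arrays.asList("hello world", "hello hadoop")));
    }
}
```

In the real job, the per-word sums are computed once by the combiner on each mapper's output and again by the reducer, which is why the Combine and Reduce counters in the log below both appear.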

4. Under src, create a log4j.properties file with the following contents.

log4j.properties:

log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] - %m%n
log4j.appender.logfile=org.apache.log4j.FileAppender
log4j.appender.logfile.File=target/spring.log
log4j.appender.logfile.layout=org.apache.log4j.PatternLayout
log4j.appender.logfile.layout.ConversionPattern=%d %p [%c] - %m%n

5. Right-click WordCount.java > Run As > Run Configurations, and set the input and output directory paths (note: the input path must already exist and contain files, while the output path must not exist yet). Click Apply. As shown:
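The rule in step 5 can be checked mechanically before launching: the input directory must exist and contain at least one file, and the output directory must not exist, because Hadoop refuses to overwrite an existing output directory. A local-filesystem sketch (helper names are ours; for HDFS paths the same check would go through the Hadoop FileSystem API instead):

```java
import java.io.File;

public class JobPathCheck {
    // Mirrors the run-configuration rule: input dir exists and is non-empty,
    // output dir does not exist yet.
    static boolean pathsOk(File input, File output) {
        String[] children = input.isDirectory() ? input.list() : null;
        boolean inputOk = children != null && children.length > 0;
        return inputOk && !output.exists();
    }

    public static void main(String[] args) {
        // "input"/"output" are placeholder paths for illustration only.
        System.out.println("paths ok: " + pathsOk(new File("input"), new File("output")));
    }
}
```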

6. Right-click WordCount.java > Run As > Run on Hadoop. The console prints the following:

2016-05-04 09:42:55,326 INFO [org.apache.hadoop.conf.Configuration.deprecation] - session.id is deprecated. Instead, use dfs.metrics.session-id
  2016-05-04 09:42:55,328 INFO [org.apache.hadoop.metrics.jvm.JvmMetrics] - Initializing JVM Metrics with processName=JobTracker, sessionId=
  2016-05-04 09:42:56,050 WARN [org.apache.hadoop.mapreduce.JobResourceUploader] - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
  2016-05-04 09:42:56,125 INFO [org.apache.hadoop.mapreduce.lib.input.FileInputFormat] - Total input paths to process : 1
  2016-05-04 09:42:56,267 INFO [org.apache.hadoop.mapreduce.JobSubmitter] - number of splits:1
  2016-05-04 09:42:56,351 INFO [org.apache.hadoop.mapreduce.JobSubmitter] - Submitting tokens for job: job_local384499348_0001
  2016-05-04 09:42:56,571 INFO [org.apache.hadoop.mapreduce.Job] - The url to track the job: http://localhost:8080/
  2016-05-04 09:42:56,572 INFO [org.apache.hadoop.mapreduce.Job] - Running job: job_local384499348_0001
  2016-05-04 09:42:56,573 INFO [org.apache.hadoop.mapred.LocalJobRunner] - OutputCommitter set in config null
  2016-05-04 09:42:56,581 INFO [org.apache.hadoop.mapred.LocalJobRunner] - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
  2016-05-04 09:42:56,688 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Waiting for map tasks
  2016-05-04 09:42:56,689 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Starting task: attempt_local384499348_0001_m_000000_0
  2016-05-04 09:42:56,730 INFO [org.apache.hadoop.yarn.util.ProcfsBasedProcessTree] - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-05-04 09:42:56,780 INFO [org.apache.hadoop.mapred.Task] -  Using ResourceCalculatorProcessTree : [email protected]
  2016-05-04 09:42:56,786 INFO [org.apache.hadoop.mapred.MapTask] - Processing split: hdfs://192.168.5.97:8020/tmp/htb/mr/input/testcount.txt:0+168
  2016-05-04 09:42:56,833 INFO [org.apache.hadoop.mapred.MapTask] - (EQUATOR) 0 kvi 26214396(104857584)
  2016-05-04 09:42:56,833 INFO [org.apache.hadoop.mapred.MapTask] - mapreduce.task.io.sort.mb: 100
  2016-05-04 09:42:56,833 INFO [org.apache.hadoop.mapred.MapTask] - soft limit at 83886080
  2016-05-04 09:42:56,833 INFO [org.apache.hadoop.mapred.MapTask] - bufstart = 0; bufvoid = 104857600
  2016-05-04 09:42:56,833 INFO [org.apache.hadoop.mapred.MapTask] - kvstart = 26214396; length = 6553600
  2016-05-04 09:42:56,837 INFO [org.apache.hadoop.mapred.MapTask] - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
  2016-05-04 09:42:57,188 INFO [org.apache.hadoop.mapred.LocalJobRunner] -
  2016-05-04 09:42:57,191 INFO [org.apache.hadoop.mapred.MapTask] - Starting flush of map output
  2016-05-04 09:42:57,191 INFO [org.apache.hadoop.mapred.MapTask] - Spilling map output
  2016-05-04 09:42:57,191 INFO [org.apache.hadoop.mapred.MapTask] - bufstart = 0; bufend = 295; bufvoid = 104857600
  2016-05-04 09:42:57,191 INFO [org.apache.hadoop.mapred.MapTask] - kvstart = 26214396(104857584); kvend = 26214272(104857088); length = 125/6553600
  2016-05-04 09:42:57,212 INFO [org.apache.hadoop.mapred.MapTask] - Finished spill 0
  2016-05-04 09:42:57,219 INFO [org.apache.hadoop.mapred.Task] - Task:attempt_local384499348_0001_m_000000_0 is done. And is in the process of committing
  2016-05-04 09:42:57,370 INFO [org.apache.hadoop.mapred.LocalJobRunner] - map
  2016-05-04 09:42:57,370 INFO [org.apache.hadoop.mapred.Task] - Task 'attempt_local384499348_0001_m_000000_0' done.
  2016-05-04 09:42:57,370 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Finishing task: attempt_local384499348_0001_m_000000_0
  2016-05-04 09:42:57,370 INFO [org.apache.hadoop.mapred.LocalJobRunner] - map task executor complete.
  2016-05-04 09:42:57,373 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Waiting for reduce tasks
  2016-05-04 09:42:57,373 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Starting task: attempt_local384499348_0001_r_000000_0
  2016-05-04 09:42:57,382 INFO [org.apache.hadoop.yarn.util.ProcfsBasedProcessTree] - ProcfsBasedProcessTree currently is supported only on Linux.
  2016-05-04 09:42:57,437 INFO [org.apache.hadoop.mapred.Task] -  Using ResourceCalculatorProcessTree : [email protected]
  2016-05-04 09:42:57,441 INFO [org.apache.hadoop.mapred.ReduceTask] - Using ShuffleConsumerPlugin: [email protected]
  2016-05-04 09:42:57,454 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - MergerManager: memoryLimit=1310195712, maxSingleShuffleLimit=327548928, mergeThreshold=864729216, ioSortFactor=10, memToMemMergeOutputsThreshold=10
  2016-05-04 09:42:57,457 INFO [org.apache.hadoop.mapreduce.task.reduce.EventFetcher] - attempt_local384499348_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
  2016-05-04 09:42:57,490 INFO [org.apache.hadoop.mapreduce.task.reduce.LocalFetcher] - localfetcher#1 about to shuffle output of map attempt_local384499348_0001_m_000000_0 decomp: 325 len: 329 to MEMORY
  2016-05-04 09:42:57,497 INFO [org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput] - Read 325 bytes from map-output for attempt_local384499348_0001_m_000000_0
  2016-05-04 09:42:57,500 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - closeInMemoryFile -> map-output of size: 325, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->325
  2016-05-04 09:42:57,503 INFO [org.apache.hadoop.mapreduce.task.reduce.EventFetcher] - EventFetcher is interrupted.. Returning
  2016-05-04 09:42:57,504 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 1 / 1 copied.
  2016-05-04 09:42:57,505 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
  2016-05-04 09:42:57,521 INFO [org.apache.hadoop.mapred.Merger] - Merging 1 sorted segments
  2016-05-04 09:42:57,522 INFO [org.apache.hadoop.mapred.Merger] - Down to the last merge-pass, with 1 segments left of total size: 321 bytes
  2016-05-04 09:42:57,525 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - Merged 1 segments, 325 bytes to disk to satisfy reduce memory limit
  2016-05-04 09:42:57,526 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - Merging 1 files, 329 bytes from disk
  2016-05-04 09:42:57,527 INFO [org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl] - Merging 0 segments, 0 bytes from memory into reduce
  2016-05-04 09:42:57,527 INFO [org.apache.hadoop.mapred.Merger] - Merging 1 sorted segments
  2016-05-04 09:42:57,529 INFO [org.apache.hadoop.mapred.Merger] - Down to the last merge-pass, with 1 segments left of total size: 321 bytes
  2016-05-04 09:42:57,530 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 1 / 1 copied.
  2016-05-04 09:42:57,576 INFO [org.apache.hadoop.mapreduce.Job] - Job job_local384499348_0001 running in uber mode : false
  2016-05-04 09:42:57,577 INFO [org.apache.hadoop.mapreduce.Job] -  map 100% reduce 0%
  2016-05-04 09:42:57,616 INFO [org.apache.hadoop.conf.Configuration.deprecation] - mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
  2016-05-04 09:42:58,053 INFO [org.apache.hadoop.mapred.Task] - Task:attempt_local384499348_0001_r_000000_0 is done. And is in the process of committing
  2016-05-04 09:42:58,095 INFO [org.apache.hadoop.mapred.LocalJobRunner] - 1 / 1 copied.
  2016-05-04 09:42:58,095 INFO [org.apache.hadoop.mapred.Task] - Task attempt_local384499348_0001_r_000000_0 is allowed to commit now
  2016-05-04 09:42:58,254 INFO [org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter] - Saved output of task 'attempt_local384499348_0001_r_000000_0' to hdfs://192.168.5.97:8020/tmp/htb/mr/ouput/_temporary/0/task_local384499348_0001_r_000000
  2016-05-04 09:42:58,255 INFO [org.apache.hadoop.mapred.LocalJobRunner] - reduce > reduce
  2016-05-04 09:42:58,255 INFO [org.apache.hadoop.mapred.Task] - Task 'attempt_local384499348_0001_r_000000_0' done.
  2016-05-04 09:42:58,255 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Finishing task: attempt_local384499348_0001_r_000000_0
  2016-05-04 09:42:58,256 INFO [org.apache.hadoop.mapred.LocalJobRunner] - reduce task executor complete.
  2016-05-04 09:42:58,579 INFO [org.apache.hadoop.mapreduce.Job] -  map 100% reduce 100%
  2016-05-04 09:42:59,580 INFO [org.apache.hadoop.mapreduce.Job] - Job job_local384499348_0001 completed successfully
  2016-05-04 09:42:59,592 INFO [org.apache.hadoop.mapreduce.Job] - Counters: 38
	File System Counters
		FILE: Number of bytes read=1104
		FILE: Number of bytes written=509445
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=336
		HDFS: Number of bytes written=211
		HDFS: Number of read operations=13
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=4
	Map-Reduce Framework
		Map input records=2
		Map output records=32
		Map output bytes=295
		Map output materialized bytes=329
		Input split bytes=120
		Combine input records=32
		Combine output records=28
		Reduce input groups=28
		Reduce shuffle bytes=329
		Reduce input records=28
		Reduce output records=28
		Spilled Records=56
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=5
		CPU time spent (ms)=0
		Physical memory (bytes) snapshot=0
		Virtual memory (bytes) snapshot=0
		Total committed heap usage (bytes)=503840768
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=168
	File Output Format Counters
		Bytes Written=211

Check the output directory; it looks like this:

With this test passing, remote Hadoop debugging from Eclipse is configured successfully; you can also set breakpoints and debug here.

The log4j.properties file mainly fixes the problem that the console shows no progress information when running a MapReduce program from Eclipse. Without it, the console only prints:


log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

A few problems may come up during configuration; see http://my.oschina.net/muou/blog/408543 for solutions.

Reference: http://www.cnblogs.com/yjmyzz/p/how-to-remote-debug-hadoop-with-eclipse-and-intellij-idea.html

Reference: http://blog.csdn.net/hipercomer/article/details/27063577

Date: 2024-11-07 20:34:36
