Running wordcount on Hadoop 2.6


1. Start Hadoop

[hadoop@master hadoop-2.6.0]$ ./sbin/start-all.sh

jps on the master node should show the NameNode, SecondaryNameNode, and ResourceManager daemons:

[hadoop@master hadoop-2.6.0]$ jps

21444 ResourceManager

21301 SecondaryNameNode

22072 Jps

21117 NameNode

On a worker node (shown here as slave), jps should show the DataNode and NodeManager:

[hadoop@slave current]$ jps

5505 NodeManager

5397 DataNode

6102 Jps
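Note: in Hadoop 2.x, start-all.sh still works but is deprecated; the equivalent (and recommended) form is to bring up HDFS and YARN separately from the same hadoop-2.6.0 directory:

[hadoop@master hadoop-2.6.0]$ ./sbin/start-dfs.sh

[hadoop@master hadoop-2.6.0]$ ./sbin/start-yarn.sh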

2. Create a file directory in the hadoop user's home directory (the location doesn't really matter; the files just need to end up in the HDFS input directory later)

[hadoop@master ~]$ mkdir file

[hadoop@master ~]$ cd file

Create two files inside the file directory and write some content to them:

[hadoop@master file]$ echo "Hello World" > file1.txt

[hadoop@master file]$ echo "Hello World" > file2.txt

[hadoop@master file]$ ls

file1.txt  file2.txt

[hadoop@master file]$ cat file1.txt

Hello World

[hadoop@master file]$ cat file2.txt

Hello World

3. Create the input directory /input on HDFS

[hadoop@master hadoop-2.6.0]$ bin/hadoop fs -mkdir /input

[hadoop@master hadoop-2.6.0]$ hadoop fs -ls /

Found 1 items

drwxr-xr-x   - hadoop supergroup          0 2016-02-28 15:51 /input

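As an aside, bin/hadoop fs and bin/hdfs dfs are interchangeable when the target is HDFS (both forms appear in this walkthrough), so the directory could just as well have been created with:

[hadoop@master hadoop-2.6.0]$ bin/hdfs dfs -mkdir /input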

4. Upload the local files to /input on HDFS

[hadoop@master hadoop-2.6.0]$ bin/hadoop fs -put ~/file/file* /input

[hadoop@master hadoop-2.6.0]$ bin/hadoop fs -ls /input

Found 2 items

-rw-r--r--   2 hadoop supergroup         12 2016-02-28 15:55 /input/file1.txt

-rw-r--r--   2 hadoop supergroup         12 2016-02-28 15:55 /input/file2.txt
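Before running the job, it's worth a quick (optional) sanity check that the files made it up intact:

[hadoop@master hadoop-2.6.0]$ bin/hadoop fs -cat /input/file1.txt

Hello World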

5. Run the wordcount program (using the example JAR that ships with Hadoop)

[hadoop@master hadoop-2.6.0]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /input/ /output/wordcount1

16/02/28 15:58:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

16/02/28 15:58:16 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.101.230:8032

16/02/28 15:58:17 INFO input.FileInputFormat: Total input paths to process : 2

16/02/28 15:58:17 INFO mapreduce.JobSubmitter: number of splits:2

16/02/28 15:58:18 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1456645810248_0001

16/02/28 15:58:19 INFO impl.YarnClientImpl: Submitted application application_1456645810248_0001

16/02/28 15:58:19 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1456645810248_0001/

16/02/28 15:58:19 INFO mapreduce.Job: Running job: job_1456645810248_0001

16/02/28 15:58:32 INFO mapreduce.Job: Job job_1456645810248_0001 running in uber mode : false

16/02/28 15:58:32 INFO mapreduce.Job:  map 0% reduce 0%

16/02/28 15:58:43 INFO mapreduce.Job:  map 100% reduce 0%

16/02/28 15:58:56 INFO mapreduce.Job:  map 100% reduce 100%

16/02/28 15:58:56 INFO mapreduce.Job: Job job_1456645810248_0001 completed successfully

16/02/28 15:58:56 INFO mapreduce.Job: Counters: 49

File System Counters

FILE: Number of bytes read=54

FILE: Number of bytes written=317807

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=222

HDFS: Number of bytes written=16

HDFS: Number of read operations=9

HDFS: Number of large read operations=0

HDFS: Number of write operations=2

Job Counters

Launched map tasks=2

Launched reduce tasks=1

Data-local map tasks=2

Total time spent by all maps in occupied slots (ms)=19118

Total time spent by all reduces in occupied slots (ms)=8889

Total time spent by all map tasks (ms)=19118

Total time spent by all reduce tasks (ms)=8889

Total vcore-seconds taken by all map tasks=19118

Total vcore-seconds taken by all reduce tasks=8889

Total megabyte-seconds taken by all map tasks=19576832

Total megabyte-seconds taken by all reduce tasks=9102336

Map-Reduce Framework

Map input records=2

Map output records=4

Map output bytes=40

Map output materialized bytes=60

Input split bytes=198

Combine input records=4

Combine output records=4

Reduce input groups=2

Reduce shuffle bytes=60

Reduce input records=4

Reduce output records=2

Spilled Records=8

Shuffled Maps =2

Failed Shuffles=0

Merged Map outputs=2

GC time elapsed (ms)=394

CPU time spent (ms)=3450

Physical memory (bytes) snapshot=368005120

Virtual memory (bytes) snapshot=959819776

Total committed heap usage (bytes)=247578624

Shuffle Errors

BAD_ID=0

CONNECTION=0

IO_ERROR=0

WRONG_LENGTH=0

WRONG_MAP=0

WRONG_REDUCE=0

File Input Format Counters

Bytes Read=24

File Output Format Counters

Bytes Written=16
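Note that MapReduce refuses to overwrite an existing output directory: re-running the job with the same /output/wordcount1 path fails with a FileAlreadyExistsException, so delete the old output first if you want to run it again:

[hadoop@master hadoop-2.6.0]$ bin/hadoop fs -rm -r /output/wordcount1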

6. View the output; the words were counted correctly

[hadoop@master hadoop-2.6.0]$ bin/hdfs dfs -cat /output/wordcount1/*

16/02/28 16:00:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Hello   2

World   2
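Listing the output directory shows what a finished job leaves behind: a _SUCCESS marker plus one part file per reducer; with the single reducer used here, all counts land in part-r-00000. Roughly what to expect (sizes and timestamps will vary):

[hadoop@master hadoop-2.6.0]$ bin/hdfs dfs -ls /output/wordcount1

Found 2 items

-rw-r--r--   2 hadoop supergroup          0 2016-02-28 15:58 /output/wordcount1/_SUCCESS

-rw-r--r--   2 hadoop supergroup         16 2016-02-28 15:58 /output/wordcount1/part-r-00000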

The wordcount job and its result can also be viewed in the YARN web UI at http://master:8088 (the tracking URL printed in the job log above).
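When you are finished, the cluster can be shut down the same way it was started, e.g.:

[hadoop@master hadoop-2.6.0]$ ./sbin/stop-all.sh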

