Hadoop 2.7 Configuration, Deployment, and Testing

1. Prepare the environment:

Install the CentOS 6.5 operating system.

Download Hadoop 2.7:

wget http://124.205.69.132/files/224400000162626A/mirrors.hust.edu.cn/apache/hadoop/common/stable/hadoop-2.7.1.tar.gz

Download JDK 8 (8u60; note that the AuthParam token in this Oracle URL is session-bound and will have expired):

wget http://download.oracle.com/otn-pub/java/jdk/8u60-b27/jdk-8u60-linux-x64.tar.gz?AuthParam=1443446776_174368b9ab1a6a92468aba5cd4d092d0

2. Edit /etc/hosts and set up SSH trust:

Add the following entries to /etc/hosts on every node:

192.168.1.61 host61

192.168.1.62 host62

192.168.1.63 host63

Set up passwordless (key-based) SSH trust between all of the servers.
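A minimal sketch of how that trust is usually established, printed as a dry run so nothing is pushed by accident (drop the echo to actually copy the keys; the scratch directory stands in for the hadoop user's real ~/.ssh):

```shell
# Generate a passphrase-less key pair for the hadoop user. A scratch
# directory is used here so the sketch is safe to run anywhere; on the
# real nodes the target is ~/.ssh/id_rsa.
KEYDIR=$(mktemp -d)
ssh-keygen -q -t rsa -N '' -f "$KEYDIR/id_rsa"

# Push the public key to every node, including the local one.
for h in host61 host62 host63; do
    echo ssh-copy-id -i "$KEYDIR/id_rsa.pub" "hadoop@$h"
done
```

Run on each node in turn (or generate once and append the same public key to every node's ~/.ssh/authorized_keys), then confirm that `ssh host62` from host61 no longer prompts for a password.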

3. Create the user, extract the archives, and set environment variables:

useradd hadoop

passwd hadoop

tar -zxvf hadoop-2.7.1.tar.gz

mv hadoop-2.7.1 /usr/local

ln -s /usr/local/hadoop-2.7.1 /usr/local/hadoop

chown -R hadoop:hadoop /usr/local/hadoop-2.7.1

tar -zxvf jdk-8u60-linux-x64.tar.gz

mv jdk1.8.0_60 /usr/local

ln -s /usr/local/jdk1.8.0_60 /usr/local/jdk

chown -R root:root /usr/local/jdk1.8.0_60

echo 'export JAVA_HOME=/usr/local/jdk' >> /etc/profile.d/java.sh

echo 'export PATH=/usr/local/jdk/bin:$PATH' >> /etc/profile.d/java.sh
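The quoting in those two lines matters: $PATH must stay literal when written and expand only when the file is sourced. A quick sanity check of the same pattern against a scratch file (on the real host the target is the profile script above):

```shell
# Write the exports to a scratch file, then source it and confirm
# JAVA_HOME landed. $PATH is single-quoted so it is expanded when
# the file is sourced, not when it is written.
f=$(mktemp)
echo 'export JAVA_HOME=/usr/local/jdk' >> "$f"
echo 'export PATH=/usr/local/jdk/bin:$PATH' >> "$f"
. "$f"
echo "$JAVA_HOME"    # prints /usr/local/jdk
```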

4. Edit the Hadoop configuration files:

1) Edit hadoop-env.sh:

cd /usr/local/hadoop/etc/hadoop

sed -i 's%#export JAVA_HOME=${JAVA_HOME}%export JAVA_HOME=/usr/local/jdk%g' hadoop-env.sh

2) Edit core-site.xml, placing the following inside the <configuration> element:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://host61:9000/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/temp</value>
  </property>
</configuration>
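fs.default.name still works in 2.7 but is deprecated (the job logs in section 11 show similar deprecation notices for other keys); the current name for the same setting is fs.defaultFS, so a forward-compatible property would be:

```xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://host61:9000/</value>
</property>
```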

3) Edit hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>

4) Edit mapred-site.xml (in 2.7 this file ships as mapred-site.xml.template and must be copied into place first):

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>host61:9001</value>
  </property>
</configuration>
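mapred.job.tracker is a Hadoop 1.x JobTracker property that Hadoop 2 ignores. With YARN, the setting that matters is mapreduce.framework.name; left unset it defaults to local, which is why the wordcount run in section 11 executes under the LocalJobRunner (note the job_local* IDs). To submit jobs to YARN instead, mapred-site.xml would contain:

```xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```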

5) Configure the masters file:

host61

6) Configure the slaves file:

host62

host63

5. Configure host62 and host63 in the same way.
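Rather than repeating the edits by hand, the finished configuration directory can simply be pushed from host61 to the other nodes. A minimal sketch, printed as a dry run so it is safe to execute anywhere (drop the echo to actually copy):

```shell
# Push the configured etc/hadoop directory to the remaining nodes.
for h in host62 host63; do
    echo scp -r /usr/local/hadoop/etc/hadoop "hadoop@$h:/usr/local/hadoop/etc/"
done
```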

6. Format the distributed filesystem:

/usr/local/hadoop/bin/hadoop namenode -format

(The hadoop namenode entry point is deprecated in 2.x; /usr/local/hadoop/bin/hdfs namenode -format is the current form.)

7. Replace the Hadoop native libraries:

mv /usr/local/hadoop/lib/native /usr/local/hadoop/lib/native_old

Copy the lib/native directory from a Hadoop build compiled for this platform into its place (typically done to silence the "Unable to load native-hadoop library" warning).

8. Start Hadoop:

1)/usr/local/hadoop/sbin/start-dfs.sh

2)/usr/local/hadoop/sbin/start-yarn.sh

9. Check the daemons with jps:

[root@host61 sbin]# jps

4532 ResourceManager

4197 NameNode

4793 Jps

4364 SecondaryNameNode

[root@host62 ~]# jps

32052 DataNode

32133 NodeManager

32265 Jps

[root@host63 local]# jps

6802 NodeManager

6963 Jps

6717 DataNode

10. Inspect Hadoop through the web interfaces:

NameNode:

http://192.168.1.61:50070/

SecondaryNameNode:

http://192.168.1.61:50090/

DataNode:

http://192.168.1.62:50075/

11. Testing:

echo "this is the first file" >/tmp/mytest1.txt

echo "this is the second file" >/tmp/mytest2.txt

cd /usr/local/hadoop/bin

[hadoop@host61 bin]$ ./hadoop fs -mkdir /in

[hadoop@host61 bin]$ ./hadoop fs -put /tmp/mytest*.txt /in

[hadoop@host61 bin]$ ./hadoop fs -ls /in

Found 2 items

-rw-r--r--   3 hadoop supergroup         23 2015-10-02 18:45 /in/mytest1.txt

-rw-r--r--   3 hadoop supergroup         24 2015-10-02 18:45 /in/mytest2.txt

[hadoop@host61 hadoop]$ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /in /out

15/10/02 18:53:30 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id

15/10/02 18:53:30 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=

15/10/02 18:53:34 INFO input.FileInputFormat: Total input paths to process : 2

15/10/02 18:53:35 INFO mapreduce.JobSubmitter: number of splits:2

15/10/02 18:53:38 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1954603964_0001

15/10/02 18:53:40 INFO mapreduce.Job: The url to track the job: http://localhost:8080/

15/10/02 18:53:40 INFO mapreduce.Job: Running job: job_local1954603964_0001

15/10/02 18:53:40 INFO mapred.LocalJobRunner: OutputCommitter set in config null

15/10/02 18:53:40 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1

15/10/02 18:53:40 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter

15/10/02 18:53:41 INFO mapred.LocalJobRunner: Waiting for map tasks

15/10/02 18:53:41 INFO mapred.LocalJobRunner: Starting task: attempt_local1954603964_0001_m_000000_0

15/10/02 18:53:41 INFO mapreduce.Job: Job job_local1954603964_0001 running in uber mode : false

15/10/02 18:53:41 INFO mapreduce.Job:  map 0% reduce 0%

15/10/02 18:53:41 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1

15/10/02 18:53:41 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]

15/10/02 18:53:41 INFO mapred.MapTask: Processing split: hdfs://host61:9000/in/mytest2.txt:0+24

15/10/02 18:53:51 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)

15/10/02 18:53:51 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100

15/10/02 18:53:51 INFO mapred.MapTask: soft limit at 83886080

15/10/02 18:53:51 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600

15/10/02 18:53:51 INFO mapred.MapTask: kvstart = 26214396; length = 6553600

15/10/02 18:53:51 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer

15/10/02 18:53:52 INFO mapred.LocalJobRunner:

15/10/02 18:53:52 INFO mapred.MapTask: Starting flush of map output

15/10/02 18:53:52 INFO mapred.MapTask: Spilling map output

15/10/02 18:53:52 INFO mapred.MapTask: bufstart = 0; bufend = 44; bufvoid = 104857600

15/10/02 18:53:52 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214380(104857520); length = 17/6553600

15/10/02 18:53:52 INFO mapred.MapTask: Finished spill 0

15/10/02 18:53:52 INFO mapred.Task: Task:attempt_local1954603964_0001_m_000000_0 is done. And is in the process of committing

15/10/02 18:53:53 INFO mapred.LocalJobRunner: map

15/10/02 18:53:53 INFO mapred.Task: Task 'attempt_local1954603964_0001_m_000000_0' done.

15/10/02 18:53:53 INFO mapred.LocalJobRunner: Finishing task: attempt_local1954603964_0001_m_000000_0

15/10/02 18:53:53 INFO mapred.LocalJobRunner: Starting task: attempt_local1954603964_0001_m_000001_0

15/10/02 18:53:53 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1

15/10/02 18:53:53 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]

15/10/02 18:53:53 INFO mapred.MapTask: Processing split: hdfs://host61:9000/in/mytest1.txt:0+23

15/10/02 18:53:53 INFO mapreduce.Job:  map 100% reduce 0%

15/10/02 18:53:53 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)

15/10/02 18:53:53 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100

15/10/02 18:53:53 INFO mapred.MapTask: soft limit at 83886080

15/10/02 18:53:53 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600

15/10/02 18:53:53 INFO mapred.MapTask: kvstart = 26214396; length = 6553600

15/10/02 18:53:53 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer

15/10/02 18:53:54 INFO mapred.LocalJobRunner:

15/10/02 18:53:54 INFO mapred.MapTask: Starting flush of map output

15/10/02 18:53:54 INFO mapred.MapTask: Spilling map output

15/10/02 18:53:54 INFO mapred.MapTask: bufstart = 0; bufend = 43; bufvoid = 104857600

15/10/02 18:53:54 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214380(104857520); length = 17/6553600

15/10/02 18:53:54 INFO mapred.MapTask: Finished spill 0

15/10/02 18:53:54 INFO mapred.Task: Task:attempt_local1954603964_0001_m_000001_0 is done. And is in the process of committing

15/10/02 18:53:54 INFO mapreduce.Job:  map 50% reduce 0%

15/10/02 18:53:54 INFO mapred.LocalJobRunner: map

15/10/02 18:53:54 INFO mapred.Task: Task 'attempt_local1954603964_0001_m_000001_0' done.

15/10/02 18:53:54 INFO mapred.LocalJobRunner: Finishing task: attempt_local1954603964_0001_m_000001_0

15/10/02 18:53:54 INFO mapred.LocalJobRunner: map task executor complete.

15/10/02 18:53:54 INFO mapred.LocalJobRunner: Waiting for reduce tasks

15/10/02 18:53:54 INFO mapred.LocalJobRunner: Starting task: attempt_local1954603964_0001_r_000000_0

15/10/02 18:53:54 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 1

15/10/02 18:53:54 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]

15/10/02 18:53:54 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@…

15/10/02 18:53:55 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=10

15/10/02 18:53:55 INFO reduce.EventFetcher: attempt_local1954603964_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events

15/10/02 18:53:55 INFO mapreduce.Job:  map 100% reduce 0%

15/10/02 18:53:56 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1954603964_0001_m_000001_0 decomp: 55 len: 59 to MEMORY

15/10/02 18:53:56 INFO reduce.InMemoryMapOutput: Read 55 bytes from map-output for attempt_local1954603964_0001_m_000001_0

15/10/02 18:53:56 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 55, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->55

15/10/02 18:53:56 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1954603964_0001_m_000000_0 decomp: 56 len: 60 to MEMORY

15/10/02 18:53:56 INFO reduce.InMemoryMapOutput: Read 56 bytes from map-output for attempt_local1954603964_0001_m_000000_0

15/10/02 18:53:56 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 56, inMemoryMapOutputs.size() -> 2, commitMemory -> 55, usedMemory ->111

15/10/02 18:53:56 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning

15/10/02 18:53:56 INFO mapred.LocalJobRunner: 2 / 2 copied.

15/10/02 18:53:56 INFO reduce.MergeManagerImpl: finalMerge called with 2 in-memory map-outputs and 0 on-disk map-outputs

15/10/02 18:53:57 INFO mapred.Merger: Merging 2 sorted segments

15/10/02 18:53:57 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 97 bytes

15/10/02 18:53:57 INFO reduce.MergeManagerImpl: Merged 2 segments, 111 bytes to disk to satisfy reduce memory limit

15/10/02 18:53:57 INFO reduce.MergeManagerImpl: Merging 1 files, 113 bytes from disk

15/10/02 18:53:57 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce

15/10/02 18:53:57 INFO mapred.Merger: Merging 1 sorted segments

15/10/02 18:53:57 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 102 bytes

15/10/02 18:53:57 INFO mapred.LocalJobRunner: 2 / 2 copied.

15/10/02 18:53:57 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords

15/10/02 18:53:59 INFO mapred.Task: Task:attempt_local1954603964_0001_r_000000_0 is done. And is in the process of committing

15/10/02 18:53:59 INFO mapred.LocalJobRunner: 2 / 2 copied.

15/10/02 18:53:59 INFO mapred.Task: Task attempt_local1954603964_0001_r_000000_0 is allowed to commit now

15/10/02 18:53:59 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1954603964_0001_r_000000_0' to hdfs://host61:9000/out/_temporary/0/task_local1954603964_0001_r_000000

15/10/02 18:53:59 INFO mapred.LocalJobRunner: reduce > reduce

15/10/02 18:53:59 INFO mapred.Task: Task 'attempt_local1954603964_0001_r_000000_0' done.

15/10/02 18:53:59 INFO mapred.LocalJobRunner: Finishing task: attempt_local1954603964_0001_r_000000_0

15/10/02 18:53:59 INFO mapred.LocalJobRunner: reduce task executor complete.

15/10/02 18:53:59 INFO mapreduce.Job:  map 100% reduce 100%

15/10/02 18:53:59 INFO mapreduce.Job: Job job_local1954603964_0001 completed successfully

15/10/02 18:54:00 INFO mapreduce.Job: Counters: 35

File System Counters

FILE: Number of bytes read=821850

FILE: Number of bytes written=1655956

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=118

HDFS: Number of bytes written=42

HDFS: Number of read operations=22

HDFS: Number of large read operations=0

HDFS: Number of write operations=5

Map-Reduce Framework

Map input records=2

Map output records=10

Map output bytes=87

Map output materialized bytes=119

Input split bytes=196

Combine input records=10

Combine output records=10

Reduce input groups=6

Reduce shuffle bytes=119

Reduce input records=10

Reduce output records=6

Spilled Records=20

Shuffled Maps =2

Failed Shuffles=0

Merged Map outputs=2

GC time elapsed (ms)=352

Total committed heap usage (bytes)=457912320

Shuffle Errors

BAD_ID=0

CONNECTION=0

IO_ERROR=0

WRONG_LENGTH=0

WRONG_MAP=0

WRONG_REDUCE=0

File Input Format Counters

Bytes Read=47

File Output Format Counters

Bytes Written=42

[hadoop@host61 hadoop]$

[hadoop@host61 hadoop]$ ./bin/hadoop fs -ls /out

Found 2 items

-rw-r--r--   3 hadoop supergroup          0 2015-10-02 18:53 /out/_SUCCESS

-rw-r--r--   3 hadoop supergroup         42 2015-10-02 18:53 /out/part-r-00000

[hadoop@host61 hadoop]$ ./bin/hadoop fs -cat /out/_SUCCESS

[hadoop@host61 hadoop]$ ./bin/hadoop fs -cat /out/part-r-00000

file 2

first 1

is 2

second 1

the 2

this 2

[hadoop@host61 hadoop]$
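The counts in part-r-00000 can be cross-checked locally with plain Unix tools on the same two input lines:

```shell
# Recreate the two test lines and count words the Unix way; the result
# should match wordcount's part-r-00000 above (file 2, first 1, is 2,
# second 1, the 2, this 2).
printf '%s\n' "this is the first file" "this is the second file" \
    | tr -s ' ' '\n' | sort | uniq -c | sort -k2
```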

12. At this point, the Hadoop configuration and deployment is complete.

Date: 2024-10-14 19:19:39
