【Hadoop】Installing and Testing Hadoop 3.2.0

Preface: A while ago I crashed the hadoop01 virtual machine and had no backup, so I re-cloned it from the hadoop02 VM. After that, the hadoop-eclipse plugin, compiled exactly the same way as before, simply refused to work. I spent three days hunting for the cause and never solved it, so I fell back on Hadoop shell commands for testing. The impact is minor; it is just less convenient.

Mentally exhausted...

Main text:

Extract and install Hadoop

[hadoop@hadoop01 ~]$ cp /home/hadoop/Resources/hadoop-3.2.0.tar.gz ~/
[hadoop@hadoop01 ~]$ tar -zxvf ~/hadoop-3.2.0.tar.gz
[hadoop@hadoop01 ~]$ cd hadoop-3.2.0
[hadoop@hadoop01 hadoop-3.2.0]$ ls -l
total 184
drwxr-xr-x. 2 hadoop hadoop    203 Jan  8  2019 bin
drwxr-xr-x. 3 hadoop hadoop     20 Jan  8  2019 etc
drwxr-xr-x. 2 hadoop hadoop    106 Jan  8  2019 include
drwxr-xr-x. 3 hadoop hadoop     20 Jan  8  2019 lib
drwxr-xr-x. 4 hadoop hadoop   4096 Jan  8  2019 libexec
-rw-rw-r--. 1 hadoop hadoop 150569 Oct 19  2018 LICENSE.txt
-rw-rw-r--. 1 hadoop hadoop  22125 Oct 19  2018 NOTICE.txt
-rw-rw-r--. 1 hadoop hadoop   1361 Oct 19  2018 README.txt
drwxr-xr-x. 3 hadoop hadoop   4096 Jan  8  2019 sbin
drwxr-xr-x. 4 hadoop hadoop     31 Jan  8  2019 share

Configure the Hadoop environment variables

[hadoop@hadoop01 hadoop-3.2.0]$ gedit /home/hadoop/hadoop-3.2.0/etc/hadoop/hadoop-env.sh
Edit the file and save:
# The java implementation to use. By default, this environment
# variable is REQUIRED on ALL platforms except OS X!
# export JAVA_HOME=
export JAVA_HOME=/usr/java/jdk1.8.0_11/
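Before saving, it is worth confirming that the JDK path you point JAVA_HOME at actually exists on this node; a quick sanity check (assuming the same /usr/java/jdk1.8.0_11/ path used above) looks like this:

# Verify the JDK path before committing it to hadoop-env.sh
ls /usr/java/jdk1.8.0_11/bin/java
/usr/java/jdk1.8.0_11/bin/java -version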

Configure the YARN environment variables

[hadoop@hadoop01 hadoop-3.2.0]$ gedit /home/hadoop/hadoop-3.2.0/etc/hadoop/yarn-env.sh
Edit the file and save:
export JAVA_HOME=/usr/java/jdk1.8.0_11/

Configure the core component file (core-site.xml)

[hadoop@hadoop01 hadoop-3.2.0]$ gedit /home/hadoop/hadoop-3.2.0/etc/hadoop/core-site.xml
Edit the file and save:
<configuration>
    <!-- Default HDFS address and port (the filesystem URI clients connect to) -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop01:9802</value>
    </property>
    <!-- Hadoop temporary directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/hadoopdata</value>
    </property>
</configuration>
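You can confirm later that this file is actually being read with hdfs getconf (it only parses the configuration, so the cluster does not need to be running). A minimal check, assuming the environment variables set up further down in this post:

# Should print hdfs://hadoop01:9802 and /home/hadoop/hadoopdata respectively
hdfs getconf -confKey fs.defaultFS
hdfs getconf -confKey hadoop.tmp.dir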

Configure the file system (hdfs-site.xml)

[hadoop@hadoop01 hadoop-3.2.0]$ gedit /home/hadoop/hadoop-3.2.0/etc/hadoop/hdfs-site.xml
Edit the file and save:
<configuration>
    <!-- HDFS web UI address -->
    <property>
        <name>dfs.namenode.http-address</name>
        <value>hadoop01:50070</value>
    </property>
    <!-- Replication factor -->
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <!-- Whether HDFS permission checking is enabled; false disables it -->
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
    <!-- Block size, in bytes by default; suffixes k, m, g, t, p, e are also accepted -->
    <property>
        <name>dfs.blocksize</name>
        <!-- 128 MB -->
        <value>134217728</value>
    </property>
    <!-- NameNode and DataNode directory paths -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hadoop/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hadoop/hdfs/data</value>
    </property>
</configuration>
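A couple of quick checks for this file: hdfs getconf confirms the values were picked up, and once the cluster is running, hdfs dfsadmin -report shows whether all three DataNodes registered with the NameNode.

# Confirm the configured replication factor and block size were picked up
hdfs getconf -confKey dfs.replication
hdfs getconf -confKey dfs.blocksize
# After start-all.sh, list the DataNodes that have registered
hdfs dfsadmin -report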

Configure the yarn-site.xml file

[hadoop@hadoop01 hadoop-3.2.0]$ gedit /home/hadoop/hadoop-3.2.0/etc/hadoop/yarn-site.xml
Edit the file and save:
<configuration>
    <!-- Site specific YARN configuration properties -->
    <!-- Cluster master: the ResourceManager host -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop01</value>
    </property>
    <!-- Auxiliary service that runs on the NodeManagers -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- Environment variables containers are allowed to inherit, overriding the NodeManager defaults -->
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ</value>
    </property>
    <!-- Disable virtual-memory checking; needed on virtual machines, otherwise containers get killed with memory errors -->
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
</configuration>
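Once YARN is up, you can check that every NodeManager registered with the ResourceManager on hadoop01; for example:

# Run after start-all.sh; should list one RUNNING node per worker
yarn node -list -all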

Configure the MapReduce framework file (mapred-site.xml)
[hadoop@hadoop01 hadoop-3.2.0]$ gedit /home/hadoop/hadoop-3.2.0/etc/hadoop/mapred-site.xml
Edit the file and save:

<configuration>
    <!-- local means run locally, classic means the classic MapReduce framework, yarn means the new framework -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <!-- If map and reduce tasks access native libraries (compression, etc.), the original value must be kept.
         When this value is empty, the command that sets up the execution environment depends on the OS:
         Linux:   LD_LIBRARY_PATH=$HADOOP_COMMON_HOME/lib/native.
         Windows: PATH=%PATH%;%HADOOP_COMMON_HOME%\\bin.
    -->
    <property>
        <name>mapreduce.admin.user.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
    </property>
    <!-- Sets environment variables on the AM (ApplicationMaster) side.
         Without this setting, MapReduce jobs may fail. -->
    <property>
        <name>yarn.app.mapreduce.am.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
    </property>
</configuration>
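If you want to double-check that the MapReduce jars end up on the job classpath with this HADOOP_MAPRED_HOME setting, the hadoop classpath command prints the classpath the framework will use; a quick check:

# The output should include the share/hadoop/mapreduce entries
hadoop classpath | tr ':' '\n' | grep mapreduce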

【Optional】Configure the slaves file (for Hadoop 2.x, edit slaves)
[hadoop@hadoop01 hadoop-3.2.0]$ gedit /home/hadoop/hadoop-3.2.0/etc/hadoop/slaves
Edit the file and save:

hadoop02
hadoop03

【Optional】Configure the workers file (for Hadoop 3.x, edit workers)
[hadoop@hadoop01 hadoop-3.2.0]$ gedit /home/hadoop/hadoop-3.2.0/etc/hadoop/workers
Edit the file and save:

hadoop01
hadoop02
hadoop03
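start-all.sh brings up the DataNodes and NodeManagers by SSHing into every host listed in workers, so passwordless SSH from hadoop01 to each of these nodes has to work already. A quick loop to confirm, assuming the hadoop user exists on every node:

# Each hostname should be printed without a password prompt
for h in hadoop01 hadoop02 hadoop03; do ssh hadoop@$h hostname; done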

Copy Hadoop from hadoop01 to the hadoop02 and hadoop03 nodes
scp -r /home/hadoop/hadoop-3.2.0 hadoop@hadoop02:~/
scp -r /home/hadoop/hadoop-3.2.0 hadoop@hadoop03:~/

Configure the operating-system environment variables (required on every node, as the regular user)
gedit ~/.bash_profile
Edit the file, save it, and then run source ~/.bash_profile to apply the changes:

# The lines below are newly added
export JAVA_HOME=/usr/java/jdk1.8.0_11/
export PATH=$JAVA_HOME/bin:$PATH
#hadoop
export HADOOP_HOME=/home/hadoop/hadoop-3.2.0
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
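After reloading the profile, two quick checks confirm that both the JDK and the Hadoop binaries are found through the new PATH:

# Both commands should resolve without typing a full path
java -version
hadoop version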

Create the Hadoop data directory (on all nodes)
mkdir /home/hadoop/hadoopdata
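Since the directory is needed on every node, it can be created remotely from hadoop01 instead of logging in to each machine; a small sketch, assuming the passwordless SSH set up earlier:

# Create the data directory on the worker nodes from hadoop01
for h in hadoop02 hadoop03; do ssh hadoop@$h "mkdir -p /home/hadoop/hadoopdata"; done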

Format the file system (on the master node)
hdfs namenode -format
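If the format succeeds, the NameNode metadata directory configured in hdfs-site.xml is populated; checking for the VERSION file is an easy way to confirm this (the path below is taken from dfs.namenode.name.dir above):

# A freshly formatted NameNode has a current/ directory containing a VERSION file
ls /home/hadoop/hdfs/name/current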

Start and stop Hadoop
cd ~/hadoop-3.2.0
sbin/start-all.sh
sbin/stop-all.sh
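start-all.sh is just a wrapper; if you prefer to bring up HDFS and YARN separately (handy when troubleshooting), the equivalent is:

# Start/stop HDFS and YARN independently instead of using start-all.sh
sbin/start-dfs.sh
sbin/start-yarn.sh
sbin/stop-yarn.sh
sbin/stop-dfs.sh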

Result of a successful start:

[hadoop@hadoop01 hadoop-3.2.0]$ jps
20848 DataNode
21808 Jps
21076 SecondaryNameNode
21322 ResourceManager
20668 NameNode
21468 NodeManager
[hadoop@hadoop01 hadoop-3.2.0]$
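The listing above is from the master node; the workers should only show a DataNode and a NodeManager. A quick remote check from hadoop01 (using the full jps path, since ~/.bash_profile is not sourced for non-interactive SSH commands; the JDK path is assumed to be the same on every node):

# Each worker should report a DataNode and a NodeManager
ssh hadoop@hadoop02 /usr/java/jdk1.8.0_11/bin/jps
ssh hadoop@hadoop03 /usr/java/jdk1.8.0_11/bin/jps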

【Final test】Run a program on the Hadoop cluster
Run the bundled example jar that estimates pi

[hadoop@hadoop01 hadoop-3.2.0]$ cd ~/hadoop-3.2.0/share/hadoop/mapreduce
[hadoop@hadoop01 mapreduce]$ ls
hadoop-mapreduce-client-app-3.2.0.jar     hadoop-mapreduce-client-hs-plugins-3.2.0.jar       hadoop-mapreduce-client-shuffle-3.2.0.jar   lib
hadoop-mapreduce-client-common-3.2.0.jar  hadoop-mapreduce-client-jobclient-3.2.0.jar        hadoop-mapreduce-client-uploader-3.2.0.jar  lib-examples
hadoop-mapreduce-client-core-3.2.0.jar    hadoop-mapreduce-client-jobclient-3.2.0-tests.jar  hadoop-mapreduce-examples-3.2.0.jar         sources
hadoop-mapreduce-client-hs-3.2.0.jar      hadoop-mapreduce-client-nativetask-3.2.0.jar       jdiff
[hadoop@hadoop01 mapreduce]$ hadoop jar hadoop-mapreduce-examples-3.2.0.jar
An example program must be given as the first argument.
Valid program names are:
  aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
  aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
  bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
  dbcount: An example job that count the pageview counts from a database.
  distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
  grep: A map/reduce program that counts the matches of a regex in the input.
  join: A job that effects a join over sorted, equally partitioned datasets
  multifilewc: A job that counts words from several files.
  pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
  pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
  randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
  randomwriter: A map/reduce program that writes 10GB of random data per node.
  secondarysort: An example defining a secondary sort to the reduce.
  sort: A map/reduce program that sorts the data written by the random writer.
  sudoku: A sudoku solver.
  teragen: Generate data for the terasort
  terasort: Run the terasort
  teravalidate: Checking results of terasort
  wordcount: A map/reduce program that counts the words in the input files.
  wordmean: A map/reduce program that counts the average length of the words in the input files.
  wordmedian: A map/reduce program that counts the median length of the words in the input files.
  wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
[hadoop@hadoop01 mapreduce]$ hadoop jar hadoop-mapreduce-examples-3.2.0.jar pi 10 10
Number of Maps  = 10
Samples per Map = 10
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
2019-08-27 13:47:11,866 INFO client.RMProxy: Connecting to ResourceManager at hadoop01/192.168.1.100:8032
2019-08-27 13:47:12,179 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hadoop/.staging/job_1566884685380_0001
2019-08-27 13:47:12,285 INFO input.FileInputFormat: Total input files to process : 10
2019-08-27 13:47:12,341 INFO mapreduce.JobSubmitter: number of splits:10
2019-08-27 13:47:12,372 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2019-08-27 13:47:12,479 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1566884685380_0001
2019-08-27 13:47:12,480 INFO mapreduce.JobSubmitter: Executing with tokens: []
2019-08-27 13:47:12,645 INFO conf.Configuration: resource-types.xml not found
2019-08-27 13:47:12,645 INFO resource.ResourceUtils: Unable to find ‘resource-types.xml‘.
2019-08-27 13:47:13,018 INFO impl.YarnClientImpl: Submitted application application_1566884685380_0001
2019-08-27 13:47:13,099 INFO mapreduce.Job: The url to track the job: http://hadoop01:8088/proxy/application_1566884685380_0001/
2019-08-27 13:47:13,099 INFO mapreduce.Job: Running job: job_1566884685380_0001
2019-08-27 13:47:20,205 INFO mapreduce.Job: Job job_1566884685380_0001 running in uber mode : false
2019-08-27 13:47:20,209 INFO mapreduce.Job:  map 0% reduce 0%
2019-08-27 13:47:27,371 INFO mapreduce.Job:  map 20% reduce 0%
2019-08-27 13:47:46,535 INFO mapreduce.Job:  map 20% reduce 7%
2019-08-27 13:47:50,559 INFO mapreduce.Job:  map 40% reduce 7%
2019-08-27 13:47:51,570 INFO mapreduce.Job:  map 50% reduce 7%
2019-08-27 13:47:53,586 INFO mapreduce.Job:  map 60% reduce 7%
2019-08-27 13:47:58,631 INFO mapreduce.Job:  map 60% reduce 20%
2019-08-27 13:47:59,641 INFO mapreduce.Job:  map 80% reduce 20%
2019-08-27 13:48:00,665 INFO mapreduce.Job:  map 100% reduce 20%
2019-08-27 13:48:01,682 INFO mapreduce.Job:  map 100% reduce 100%
2019-08-27 13:48:01,708 INFO mapreduce.Job: Job job_1566884685380_0001 completed successfully
2019-08-27 13:48:01,780 INFO mapreduce.Job: Counters: 54
    File System Counters
        FILE: Number of bytes read=226
        FILE: Number of bytes written=2443397
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=2640
        HDFS: Number of bytes written=215
        HDFS: Number of read operations=45
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=3
        HDFS: Number of bytes read erasure-coded=0
    Job Counters
        Launched map tasks=10
        Launched reduce tasks=1
        Data-local map tasks=10
        Total time spent by all maps in occupied slots (ms)=270199
        Total time spent by all reduces in occupied slots (ms)=31653
        Total time spent by all map tasks (ms)=270199
        Total time spent by all reduce tasks (ms)=31653
        Total vcore-milliseconds taken by all map tasks=270199
        Total vcore-milliseconds taken by all reduce tasks=31653
        Total megabyte-milliseconds taken by all map tasks=276683776
        Total megabyte-milliseconds taken by all reduce tasks=32412672
    Map-Reduce Framework
        Map input records=10
        Map output records=20
        Map output bytes=180
        Map output materialized bytes=280
        Input split bytes=1460
        Combine input records=0
        Combine output records=0
        Reduce input groups=2
        Reduce shuffle bytes=280
        Reduce input records=20
        Reduce output records=0
        Spilled Records=40
        Shuffled Maps =10
        Failed Shuffles=0
        Merged Map outputs=10
        GC time elapsed (ms)=67681
        CPU time spent (ms)=63700
        Physical memory (bytes) snapshot=2417147904
        Virtual memory (bytes) snapshot=30882955264
        Total committed heap usage (bytes)=2966421504
        Peak Map Physical memory (bytes)=382750720
        Peak Map Virtual memory (bytes)=2810384384
        Peak Reduce Physical memory (bytes)=181923840
        Peak Reduce Virtual memory (bytes)=2815541248
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=1180
    File Output Format Counters
        Bytes Written=97
Job Finished in 49.977 seconds
Estimated value of Pi is 3.20000000000000000000
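Since the preface mentioned falling back on Hadoop shell commands for testing, a short HDFS smoke test rounds things off. The sketch below uses the bundled README.txt as throwaway input and assumes it is run from the same mapreduce directory as above:

# Simple round trip: upload a file to HDFS, run wordcount on it, read the result back
hadoop fs -mkdir -p /user/hadoop/input
hadoop fs -put ~/hadoop-3.2.0/README.txt /user/hadoop/input/
hadoop jar hadoop-mapreduce-examples-3.2.0.jar wordcount /user/hadoop/input /user/hadoop/output
hadoop fs -cat /user/hadoop/output/part-r-00000 | head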

Original article: https://www.cnblogs.com/CQ-LQJ/p/11602927.html
