Hadoop 2.6 installation and configuration, with Eclipse development environment integration

Installing the Java and Hadoop environments on Ubuntu 14.04

Java is installed to /usr/lib/jvm/jdk1.7.0_72.

1. Download the JDK and Hadoop archives.

2. Create the jvm folder with sudo, and cp the archives into place.

3. Extract them with tar -zxvf.

4. Fix ownership: sudo chown -R castle:castle hadoop-2.6.0 (steps 1-4 are sketched as commands below).

5. Configure the environment variables.
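A minimal command sketch of steps 1-4; the archive names and the ~/Downloads location are assumptions, so substitute your own:

$ cd ~/Downloads
$ sudo mkdir -p /usr/lib/jvm /usr/local/hadoop
$ sudo cp jdk-7u72-linux-x64.tar.gz /usr/lib/jvm/
$ sudo cp hadoop-2.6.0.tar.gz /usr/local/hadoop/
$ cd /usr/lib/jvm && sudo tar -zxvf jdk-7u72-linux-x64.tar.gz
$ cd /usr/local/hadoop && sudo tar -zxvf hadoop-2.6.0.tar.gz
$ sudo chown -R castle:castle /usr/local/hadoop/hadoop-2.6.0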

Add the following to ~/.profile (it can also go in ~/.bashrc):

# set java env
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_72
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH

# set hadoop env
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.6.0
export PATH=$PATH:$HADOOP_HOME/bin

$ source ~/.profile

This makes the changes take effect without having to log out and back in.
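To verify that the variables took effect, both of these standard commands should print version information:

$ java -version
$ hadoop version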

Hadoop goes in /usr/local/hadoop/hadoop-2.6.0.

The preliminary steps are much the same as above.

1. Configure etc/hadoop/hadoop-env.sh

# set to the root of your Java installation
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_72

# hadoop
export HADOOP_PREFIX=/usr/local/hadoop/hadoop-2.6.0
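A quick sanity check that the settings are picked up; bin/hadoop version prints the Hadoop build information:

$ cd /usr/local/hadoop/hadoop-2.6.0
$ bin/hadoop version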

2. Pseudo-distributed configuration

etc/hadoop/core-site.xml:

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/hadoop/hadoop-2.6.0/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

etc/hadoop/hdfs-site.xml:

<configuration>
   <property>
       <name>dfs.replication</name>
       <value>1</value>
   </property>
   <property>
       <name>dfs.namenode.name.dir</name>
       <value>file:/usr/local/hadoop/hadoop-2.6.0/dfs/name</value>
   </property>
   <property>
       <name>dfs.datanode.data.dir</name>
       <value>file:/usr/local/hadoop/hadoop-2.6.0/dfs/data</value>
   </property>
   <property>
       <name>dfs.permissions</name>
       <value>false</value>
       <!-- This property prevents the permission-denied errors Eclipse would otherwise hit when reading and writing HDFS later. -->
   </property>
</configuration>
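The configuration above references a tmp directory plus dfs/name and dfs/data directories; creating them up front avoids surprises at format time (a small sketch, paths copied from the config):

$ cd /usr/local/hadoop/hadoop-2.6.0
$ mkdir -p tmp dfs/name dfs/data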

etc/hadoop/mapred-site.xml:
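Hadoop 2.6 ships only a template for this file, so create it from the template first:

$ cd /usr/local/hadoop/hadoop-2.6.0
$ cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml

Then add: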

<!-- MapReduce parameters -->

<!-- The new framework supports third-party MapReduce frameworks, i.e. non-YARN architectures such as SmartTalk/DGSG. Normally this value is set to yarn; if it is left unconfigured, submitted jobs run only in local mode rather than distributed mode. -->

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Note: the old MapReduce required the following to be configured here:

<property>
    <name>mapred.job.tracker</name>
    <value>http://192.168.1.2:9001</value>
</property>

In the new framework this has been replaced by the concrete resourceManager and nodeManager settings in yarn-site.xml, and historical job lookup has been split off from the JobTracker into the separate mapreduce.jobtracker.jobhistory configuration. So this option no longer needs to be set here; just configure the relevant properties in yarn-site.xml.

etc/hadoop/yarn-site.xml:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

For the differences between the old and new MapReduce, see:

http://www.ibm.com/developerworks/cn/opensource/os-cn-hadoop-yarn/

Xiapi's classic cluster-configuration guide: http://www.cnblogs.com/xia520pi/archive/2012/05/16/2503949.html

Other posts:

http://www.cnblogs.com/kinglau/p/3802705.html

http://blog.csdn.net/ggz631047367/article/details/42497557

3. Configure passwordless SSH login

If the SSH packages are not yet installed on Ubuntu:

$ sudo apt-get install ssh
$ sudo apt-get install rsync

Set up passphraseless ssh

Now check that you can ssh to the localhost without a passphrase:

$ ssh localhost

If you cannot ssh to localhost without a passphrase, execute the following commands:

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
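Note: recent OpenSSH releases disable DSA keys by default; on such systems the equivalent RSA commands work the same way:

$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys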


ssh-keygen generates the key pair. Even so, ssh localhost still failed to connect:

ssh: connect to host localhost port 22: Connection refused

The fix, found online:

1. First check whether an sshd process is running: ps -e | grep ssh
2. If there is none, start it: /etc/init.d/ssh start. If it will not start, the server needs to be installed.
3. Install it: sudo apt-get install openssh-server
4. Restart the service.
5. Check again; now it is there:

1695  ?        00:00:00 ssh-agent
12407 ?        00:00:00 sshd

$ ssh localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is ae:23:4a:95:bc:37:dd:1a:5b:48:4f:66:e2:87:12:1c.
Are you sure you want to continue connecting (yes/no)? y
Please type 'yes' or 'no': yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-43-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

$ bin/hdfs namenode -format

bin/hdfs namenode -format needs to be executed only once. If it is run a second time, each format creates a new namenode ID and empties /usr/local/hadoop/hadoop-2.6.0/tmp/dfs/name, while the datanode directory is not emptied, so you end up with the datanode's clusterID not matching the namenode's clusterID. The way out is to fix the ID under .../tmp/dfs/name. (Why did I run format every single time on hadoop 0.20.2? Probably because none of those formats ever succeeded.)
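When the mismatch does happen, the two clusterIDs can be compared directly; in Hadoop 2.x they live in the current/VERSION files under the configured name and data directories (paths below follow the hdfs-site.xml above):

$ grep clusterID /usr/local/hadoop/hadoop-2.6.0/dfs/name/current/VERSION
$ grep clusterID /usr/local/hadoop/hadoop-2.6.0/dfs/data/current/VERSION
# If they differ, edit the datanode's VERSION so its clusterID matches the namenode's.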

hdfs dfs -mkdir /user creates a directory in HDFS.
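For the WordCount run whose log appears further down, the input files were uploaded first; a sketch assuming two local files named input1 and input2 (the names match the split paths in the log):

$ hdfs dfs -mkdir -p /user/castle/wordcount_input
$ hdfs dfs -put input1 input2 /user/castle/wordcount_input/
$ hdfs dfs -ls /user/castle/wordcount_input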


$ sbin/start-dfs.sh

Check with the jps command:

2855 org.eclipse.equinox.launcher_1.3.0.v20140415-2008.jar
11127 DataNode
10975 NameNode
11432 Jps
11284 SecondaryNameNode

Seeing the NameNode, DataNode, and SecondaryNameNode processes means it started successfully.

$ sbin/start-yarn.sh

$ sbin/stop-dfs.sh

$ sbin/stop-yarn.sh



If the console does not print the job's progress when you run the hello-world example in Eclipse, copy etc/hadoop/log4j.properties from the Hadoop installation folder into the src folder of the Eclipse project.
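The copy itself is one command (the project path is an assumption; use your own workspace):

$ cp /usr/local/hadoop/hadoop-2.6.0/etc/hadoop/log4j.properties ~/workspace/your-project/src/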

15/01/24 10:30:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/01/24 10:30:13 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
15/01/24 10:30:13 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
15/01/24 10:30:13 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
15/01/24 10:30:13 INFO input.FileInputFormat: Total input paths to process : 2
15/01/24 10:30:14 INFO mapreduce.JobSubmitter: number of splits:2
15/01/24 10:30:14 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local632218717_0001
15/01/24 10:30:14 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
15/01/24 10:30:14 INFO mapreduce.Job: Running job: job_local632218717_0001
15/01/24 10:30:14 INFO mapred.LocalJobRunner: OutputCommitter set in config null
15/01/24 10:30:14 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
15/01/24 10:30:15 INFO mapred.LocalJobRunner: Waiting for map tasks
15/01/24 10:30:15 INFO mapred.LocalJobRunner: Starting task: attempt_local632218717_0001_m_000000_0
15/01/24 10:30:15 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
15/01/24 10:30:15 INFO mapred.MapTask: Processing split: hdfs://localhost:9000/user/castle/wordcount_input/input1:0+32
15/01/24 10:30:15 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
15/01/24 10:30:15 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
15/01/24 10:30:15 INFO mapred.MapTask: soft limit at 83886080
15/01/24 10:30:15 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
15/01/24 10:30:15 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
15/01/24 10:30:15 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
15/01/24 10:30:15 INFO mapred.LocalJobRunner:
15/01/24 10:30:15 INFO mapred.MapTask: Starting flush of map output
15/01/24 10:30:15 INFO mapred.MapTask: Spilling map output
15/01/24 10:30:15 INFO mapred.MapTask: bufstart = 0; bufend = 52; bufvoid = 104857600
15/01/24 10:30:15 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214380(104857520); length = 17/6553600
15/01/24 10:30:15 INFO mapred.MapTask: Finished spill 0
15/01/24 10:30:15 INFO mapred.Task: Task:attempt_local632218717_0001_m_000000_0 is done. And is in the process of committing
15/01/24 10:30:15 INFO mapred.LocalJobRunner: map
15/01/24 10:30:15 INFO mapred.Task: Task 'attempt_local632218717_0001_m_000000_0' done.
15/01/24 10:30:15 INFO mapred.LocalJobRunner: Finishing task: attempt_local632218717_0001_m_000000_0
15/01/24 10:30:15 INFO mapred.LocalJobRunner: Starting task: attempt_local632218717_0001_m_000001_0
15/01/24 10:30:15 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
15/01/24 10:30:15 INFO mapred.MapTask: Processing split: hdfs://localhost:9000/user/castle/wordcount_input/input2:0+29
15/01/24 10:30:15 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
15/01/24 10:30:15 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
15/01/24 10:30:15 INFO mapred.MapTask: soft limit at 83886080
15/01/24 10:30:15 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
15/01/24 10:30:15 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
15/01/24 10:30:15 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
15/01/24 10:30:15 INFO mapred.LocalJobRunner:
15/01/24 10:30:15 INFO mapred.MapTask: Starting flush of map output
15/01/24 10:30:15 INFO mapred.MapTask: Spilling map output
15/01/24 10:30:15 INFO mapred.MapTask: bufstart = 0; bufend = 49; bufvoid = 104857600
15/01/24 10:30:15 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214380(104857520); length = 17/6553600
15/01/24 10:30:15 INFO mapred.MapTask: Finished spill 0
15/01/24 10:30:15 INFO mapred.Task: Task:attempt_local632218717_0001_m_000001_0 is done. And is in the process of committing
15/01/24 10:30:15 INFO mapred.LocalJobRunner: map
15/01/24 10:30:15 INFO mapred.Task: Task 'attempt_local632218717_0001_m_000001_0' done.
15/01/24 10:30:15 INFO mapred.LocalJobRunner: Finishing task: attempt_local632218717_0001_m_000001_0
15/01/24 10:30:15 INFO mapred.LocalJobRunner: map task executor complete.
15/01/24 10:30:15 INFO mapred.LocalJobRunner: Waiting for reduce tasks
15/01/24 10:30:15 INFO mapred.LocalJobRunner: Starting task: attempt_local632218717_0001_r_000000_0
15/01/24 10:30:15 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
15/01/24 10:30:15 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@...
15/01/24 10:30:15 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=626471744, maxSingleShuffleLimit=156617936, mergeThreshold=413471360, ioSortFactor=10, memToMemMergeOutputsThreshold=10
15/01/24 10:30:15 INFO reduce.EventFetcher: attempt_local632218717_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
15/01/24 10:30:15 INFO mapreduce.Job: Job job_local632218717_0001 running in uber mode : false
15/01/24 10:30:15 INFO mapreduce.Job:  map 100% reduce 0%
15/01/24 10:30:16 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local632218717_0001_m_000000_0 decomp: 40 len: 44 to MEMORY
15/01/24 10:30:16 INFO reduce.InMemoryMapOutput: Read 40 bytes from map-output for attempt_local632218717_0001_m_000000_0
15/01/24 10:30:16 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 40, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory -> 40
15/01/24 10:30:16 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local632218717_0001_m_000001_0 decomp: 51 len: 55 to MEMORY
15/01/24 10:30:16 INFO reduce.InMemoryMapOutput: Read 51 bytes from map-output for attempt_local632218717_0001_m_000001_0
15/01/24 10:30:16 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 51, inMemoryMapOutputs.size() -> 2, commitMemory -> 40, usedMemory -> 91
15/01/24 10:30:16 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
15/01/24 10:30:16 INFO mapred.LocalJobRunner: 2 / 2 copied.
15/01/24 10:30:16 INFO reduce.MergeManagerImpl: finalMerge called with 2 in-memory map-outputs and 0 on-disk map-outputs
15/01/24 10:30:16 INFO mapred.Merger: Merging 2 sorted segments
15/01/24 10:30:16 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 71 bytes
15/01/24 10:30:16 INFO reduce.MergeManagerImpl: Merged 2 segments, 91 bytes to disk to satisfy reduce memory limit
15/01/24 10:30:16 INFO reduce.MergeManagerImpl: Merging 1 files, 93 bytes from disk
15/01/24 10:30:16 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
15/01/24 10:30:16 INFO mapred.Merger: Merging 1 sorted segments
15/01/24 10:30:16 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 79 bytes
15/01/24 10:30:16 INFO mapred.LocalJobRunner: 2 / 2 copied.
15/01/24 10:30:16 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
15/01/24 10:30:16 INFO mapred.Task: Task:attempt_local632218717_0001_r_000000_0 is done. And is in the process of committing
15/01/24 10:30:16 INFO mapred.LocalJobRunner: 2 / 2 copied.
15/01/24 10:30:16 INFO mapred.Task: Task attempt_local632218717_0001_r_000000_0 is allowed to commit now
15/01/24 10:30:16 INFO output.FileOutputCommitter: Saved output of task 'attempt_local632218717_0001_r_000000_0' to hdfs://localhost:9000/user/castle/wordcount_output/_temporary/0/task_local632218717_0001_r_000000
15/01/24 10:30:16 INFO mapred.LocalJobRunner: reduce > reduce
15/01/24 10:30:16 INFO mapred.Task: Task 'attempt_local632218717_0001_r_000000_0' done.
15/01/24 10:30:16 INFO mapred.LocalJobRunner: Finishing task: attempt_local632218717_0001_r_000000_0
15/01/24 10:30:16 INFO mapred.LocalJobRunner: reduce task executor complete.
15/01/24 10:30:16 INFO mapreduce.Job:  map 100% reduce 100%
15/01/24 10:30:16 INFO mapreduce.Job: Job job_local632218717_0001 completed successfully
15/01/24 10:30:16 INFO mapreduce.Job: Counters: 38
    File System Counters
        FILE: Number of bytes read=1732
        FILE: Number of bytes written=754881
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=154
        HDFS: Number of bytes written=42
        HDFS: Number of read operations=25
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=5
    Map-Reduce Framework
        Map input records=10
        Map output records=10
        Map output bytes=101
        Map output materialized bytes=99
        Input split bytes=242
        Combine input records=10
        Combine output records=7
        Reduce input groups=5
        Reduce shuffle bytes=99
        Reduce input records=7
        Reduce output records=5
        Spilled Records=14
        Shuffled Maps =2
        Failed Shuffles=0
        Merged Map outputs=2
        GC time elapsed (ms)=0
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
        Total committed heap usage (bytes)=855638016
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=61
    File Output Format Counters
        Bytes Written=42

Hadoop 2.6 and Eclipse integrated development configuration

Compile the Hadoop Eclipse plugin:

$ git clone https://github.com/winghc/hadoop2x-eclipse-plugin.git

Then build it with ant:

$ cd src/contrib/eclipse-plugin
$ ant jar -Dversion=2.6.0 -Declipse.home=/usr/local/eclipse -Dhadoop.home=/usr/local/hadoop-2.6.0

The Eclipse must be a manually installed one; a copy installed through the command-line one-click installer will not work. Set eclipse.home and hadoop.home to the paths of your own environment.

The jar is generated at /home/hunter/hadoop2x-eclipse-plugin/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-2.6.0.jar.

I have to admit this did not work for me: the build just hung there without reporting any error. In the end I used the prebuilt hadoop 2.2.0 plugin in the release directory of that git repository; that one works, and the others do not.
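Whichever plugin jar you end up with, installing it is just a copy into Eclipse's plugins directory followed by a restart with the standard -clean flag so that Eclipse rescans its bundles (the jar name below is a placeholder for the build you chose):

$ cp hadoop-eclipse-plugin-*.jar /usr/local/eclipse/plugins/
$ /usr/local/eclipse/eclipse -clean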

(Screenshot not preserved here: the Eclipse Hadoop location configuration dialog.)
The settings on the right-hand side of the dialog must match core-site.xml. The left-hand side can be left unconfigured; under the old MapReduce it was configured to match mapred-site.xml.
