Setting up a pseudo-distributed Hadoop 2.6.0 + Spark 1.4.0 cluster on Ubuntu 14.10 (tested and working)





  (2) Extract jdk-7u25-linux-i586.tar.gz and move it to /opt/java/jdk/
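The extraction step here, like the matching steps for the Hadoop and Spark archives, can be scripted. The following is a minimal sketch, assuming a tar that supports -z and -C (GNU and BSD tar both do). The helper name extract_to is made up for illustration and is not a standard tool.

```shell
# extract_to ARCHIVE DEST
# Create DEST if it does not exist, then unpack a gzip-compressed tar
# archive into it. (extract_to is a hypothetical helper name.)
extract_to() {
    mkdir -p "$2" || return 1
    tar -xzf "$1" -C "$2"
}
```

Run as a user who can write to the destination, a call such as `extract_to jdk-7u25-linux-i586.tar.gz /opt/java/jdk` should leave the JDK under /opt/java/jdk/jdk1.7.0_25, the path used for JAVA_HOME.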


    Append the following to /etc/profile:

  #set java env
  export JAVA_HOME=/opt/java/jdk/jdk1.7.0_25
  export JRE_HOME=${JAVA_HOME}/jre
  export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
  export PATH=${JAVA_HOME}/bin:$PATH


hadoop@ubuntu:~/installs$ java -version
java version "1.7.0_25"
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) Client VM (build 23.25-b01, mixed mode)

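  Important: I first installed the JDK as root, and java -version then failed after switching to the hadoop user. The cause turned out to be that the Java environment variables had been set in ~/.bashrc (a per-user file). After moving them to /etc/profile, the problem went away.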
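When java -version works for one user but not another, it helps to see which startup file actually exports JAVA_HOME. The following is a small sketch. The helper name find_java_home_exports is hypothetical, and the list of files to inspect is an assumption about a typical bash setup.

```shell
# find_java_home_exports FILE...
# Print every readable file in the argument list that contains an
# "export JAVA_HOME=" line. (Hypothetical helper for illustration.)
find_java_home_exports() {
    for f in "$@"; do
        if [ -r "$f" ] && grep -q '^[[:space:]]*export[[:space:]]\{1,\}JAVA_HOME=' "$f"; then
            echo "$f"
        fi
    done
}
```

For example, `find_java_home_exports /etc/profile ~/.bashrc ~/.profile` run as each user shows whether the variable is set system-wide or only in one account's files.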




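   “Search for openssh on launchpad.net/Ubuntu/ and, from the results, pick the version listed under the code name of your release. This article's installation was done on Ubuntu 12.10, whose code name is Quantal Quetzal, on an i386 system, so the following three files were downloaded: openssh-client_6.0p1-3ubuntu1_i386.deb, openssh-server_6.0p1-3ubuntu1_i386.deb, and ssh_6.0p1-3ubuntu1_all.deb.”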



  sudo dpkg -i openssh-client_6.0p1-3ubuntu1_i386.deb
  sudo dpkg -i openssh-server_6.0p1-3ubuntu1_i386.deb
  sudo dpkg -i ssh_6.0p1-3ubuntu1_all.deb

  (3) Verify: if ssh localhost logs you in, the installation succeeded.


  ssh-keygen -t rsa -P ""    # then just press Enter at every prompt
  cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
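If key-based login still asks for a password after these two commands, overly permissive file modes are a common cause, because sshd with its default StrictModes setting ignores keys in files that other users can write to. The sketch below tightens the modes and then tests a non-interactive login. The helper names fix_ssh_perms and can_ssh_without_password are hypothetical.

```shell
# fix_ssh_perms DIR
# Restrict an .ssh directory to its owner (700) and, if present, its
# authorized_keys file as well (600). (Hypothetical helper.)
fix_ssh_perms() {
    chmod 700 "$1" || return 1
    if [ -f "$1/authorized_keys" ]; then
        chmod 600 "$1/authorized_keys" || return 1
    fi
}

# can_ssh_without_password HOST
# Succeeds only if ssh can log in to HOST without prompting.
# BatchMode=yes makes ssh fail instead of asking for a password.
can_ssh_without_password() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}
```

A typical use would be `fix_ssh_perms ~/.ssh && can_ssh_without_password localhost && echo ok`, run as the user that will start the Hadoop daemons.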



  Extract hadoop-2.6.0.tar.gz to /opt/hadoop/.



  #java env
  export JAVA_HOME=/opt/java/jdk/jdk1.7.0_25


        <description>A base for other temporary directories.</description>











<!-- Site specific YARN configuration properties -->



  bin/hdfs namenode -format
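After formatting, the daemons are normally started with sbin/start-dfs.sh and sbin/start-yarn.sh from the Hadoop directory, and jps lists the running Java processes. As a sketch under those assumptions, the hypothetical helper below reads jps output and reports which of the five daemons usually expected on a pseudo-distributed node are missing.

```shell
# missing_daemons
# Read `jps` output on stdin and print the name of each expected Hadoop
# daemon that is not listed. The second column of jps output is the
# process name; an exact match is used so that "NameNode" is not
# satisfied by "SecondaryNameNode". (Hypothetical helper.)
missing_daemons() {
    jps_output=$(cat)
    for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        if ! printf '%s\n' "$jps_output" |
            awk -v name="$d" '$2 == name { found = 1 } END { exit !found }'; then
            echo "$d"
        fi
    done
}
```

Running `jps | missing_daemons` should print nothing when all five daemons are up.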


  To check that the cluster started correctly, open the web UIs at localhost:50070 and localhost:8088, or run bin/hadoop dfsadmin -report, as shown below:

hadoop@ubuntu:/opt/hadoop/hadoop-2.6.0$ bin/hadoop dfsadmin -report
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

15/10/22 01:34:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 19945680896 (18.58 GB)
Present Capacity: 13635391488 (12.70 GB)
DFS Remaining: 13635178496 (12.70 GB)
DFS Used: 212992 (208 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

Live datanodes (1):

Name: (localhost)
Hostname: ubuntu
Decommission Status : Normal
Configured Capacity: 19945680896 (18.58 GB)
DFS Used: 212992 (208 KB)
Non DFS Used: 6310289408 (5.88 GB)
DFS Remaining: 13635178496 (12.70 GB)
DFS Used%: 0.00%
DFS Remaining%: 68.36%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Oct 22 01:34:25 PDT 2015
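For a scripted check, the number of live DataNodes can be pulled out of this report. The sketch below assumes the "Live datanodes (N):" line format shown above. The helper name live_datanodes is hypothetical.

```shell
# live_datanodes
# Read a dfsadmin report on stdin and print the number N from the
# "Live datanodes (N):" line. Prints nothing if the line is absent.
# (Hypothetical helper.)
live_datanodes() {
    sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p'
}
```

Following the deprecation notice in the output, the report can be produced with the hdfs command instead: `bin/hdfs dfsadmin -report 2>/dev/null | live_datanodes`. On this single-node setup the expected value is 1.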


$ bin/hadoop fs -mkdir /input
$ bin/hadoop fs -copyFromLocal /home/test.txt /input
$ cd /opt/hadoop/hadoop-2.6.0/share/hadoop/mapreduce
$ /opt/hadoop/hadoop-2.6.0/bin/hadoop jar hadoop-mapreduce-examples-2.6.0.jar wordcount /input /output
$ /opt/hadoop/hadoop-2.6.0/bin/hadoop fs -cat /output/*
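MapReduce refuses to start a job whose output directory already exists, so running the wordcount example a second time fails unless /output is removed first. The following sketch wraps the steps above for repeated runs. The helper name rerun_wordcount and its argument order are assumptions made for illustration.

```shell
# rerun_wordcount HADOOP_HOME INPUT OUTPUT
# Delete OUTPUT in HDFS if it exists (-f suppresses the error when it
# does not), run the bundled wordcount example, then print the result.
# The glob is quoted so that HDFS, not the local shell, expands it.
# (Hypothetical helper.)
rerun_wordcount() {
    hadoop_bin=$1/bin/hadoop
    examples_jar=$1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar
    "$hadoop_bin" fs -rm -r -f "$3" || return 1
    "$hadoop_bin" jar "$examples_jar" wordcount "$2" "$3" || return 1
    "$hadoop_bin" fs -cat "$3/*"
}
```

With the layout used in this article, the call would be `rerun_wordcount /opt/hadoop/hadoop-2.6.0 /input /output`.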


  Extract spark-1.4.0-bin-hadoop2.6.tgz to /opt/spark/.

  Verify: open the web UI at localhost:4040, or run a bundled example (bin/run-example SparkPi 10).
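The SparkPi example prints a line starting with "Pi is roughly" among a large amount of log output. As a sketch, the hypothetical helper below extracts just the number, which makes the check easier to script.

```shell
# pi_estimate
# Read SparkPi output on stdin and print only the numeric estimate from
# the "Pi is roughly ..." line. Prints nothing if the line is absent.
# (Hypothetical helper.)
pi_estimate() {
    sed -n 's/^Pi is roughly \([0-9.][0-9.]*\).*/\1/p'
}
```

Spark writes its log messages to stderr, so from the Spark directory `bin/run-example SparkPi 10 2>/dev/null | pi_estimate` should print a single value close to 3.14 if the installation works.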


hadoop@ubuntu:/opt/spark$ bin/spark-shell
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/10/22 01:44:26 INFO SecurityManager: Changing view acls to: hadoop
15/10/22 01:44:26 INFO SecurityManager: Changing modify acls to: hadoop
15/10/22 01:44:26 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
15/10/22 01:44:26 INFO HttpServer: Starting HTTP Server
15/10/22 01:44:27 INFO Utils: Successfully started service 'HTTP class server' on port 51327.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.4.0

Using Scala version 2.10.4 (Java HotSpot(TM) Client VM, Java 1.7.0_25)
Type in expressions to have them evaluated.
Type :help for more information.
15/10/22 01:44:36 WARN Utils: Your hostname, ubuntu resolves to a loopback address:; using instead (on interface eth0)
15/10/22 01:44:36 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/10/22 01:44:36 INFO SparkContext: Running Spark version 1.4.0
15/10/22 01:44:36 INFO SecurityManager: Changing view acls to: hadoop
15/10/22 01:44:36 INFO SecurityManager: Changing modify acls to: hadoop
15/10/22 01:44:36 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
15/10/22 01:44:37 INFO Slf4jLogger: Slf4jLogger started
15/10/22 01:44:37 INFO Remoting: Starting remoting
15/10/22 01:44:38 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:35977]
15/10/22 01:44:38 INFO Utils: Successfully started service 'sparkDriver' on port 35977.
15/10/22 01:44:38 INFO SparkEnv: Registering MapOutputTracker
15/10/22 01:44:38 INFO SparkEnv: Registering BlockManagerMaster
15/10/22 01:44:38 INFO DiskBlockManager: Created local directory at /tmp/spark-08e380aa-a102-48a2-91e3-b358cb2a6a35/blockmgr-d25aa3bd-b1af-4746-9d1a-edd7e8f1e08c
15/10/22 01:44:38 INFO MemoryStore: MemoryStore started with capacity 267.3 MB
15/10/22 01:44:39 INFO HttpFileServer: HTTP File server directory is /tmp/spark-08e380aa-a102-48a2-91e3-b358cb2a6a35/httpd-4113cef7-2865-4efd-890a-19fcbde49bcb
15/10/22 01:44:39 INFO HttpServer: Starting HTTP Server
15/10/22 01:44:39 INFO Utils: Successfully started service 'HTTP file server' on port 33633.
15/10/22 01:44:39 INFO SparkEnv: Registering OutputCommitCoordinator
15/10/22 01:44:41 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/10/22 01:44:41 INFO SparkUI: Started SparkUI at
15/10/22 01:44:42 INFO Executor: Starting executor ID driver on host localhost
15/10/22 01:44:42 INFO Executor: Using REPL class URI:
15/10/22 01:44:45 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 37625.
15/10/22 01:44:45 INFO NettyBlockTransferService: Server created on 37625
15/10/22 01:44:45 INFO BlockManagerMaster: Trying to register BlockManager
15/10/22 01:44:45 INFO BlockManagerMasterEndpoint: Registering block manager localhost:37625 with 267.3 MB RAM, BlockManagerId(driver, localhost, 37625)
15/10/22 01:44:45 INFO BlockManagerMaster: Registered BlockManager
15/10/22 01:44:45 INFO SparkILoop: Created spark context..
Spark context available as sc.
15/10/22 01:44:48 INFO HiveContext: Initializing execution hive, version 0.13.1
15/10/22 01:44:49 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
15/10/22 01:44:49 INFO ObjectStore: ObjectStore, initialize called
15/10/22 01:44:50 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
15/10/22 01:44:50 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
15/10/22 01:44:50 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
Thu Oct 22 01:44:51 PDT 2015 Thread[main,5,main] java.io.FileNotFoundException: derby.log (Permission denied)
15/10/22 01:44:51 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
Thu Oct 22 01:44:51 PDT 2015:
Booting Derby version The Apache Software Foundation - Apache Derby - - (1458268): instance a816c00e-0150-8eb8-dd90-0000186374f8
on database directory /tmp/spark-ea20e824-5489-4ead-a2d7-c8b14434dc51/metastore with class loader [email protected]
Loaded from file:/opt/spark/lib/spark-assembly-1.4.0-hadoop2.6.0.jar
java.vendor=Oracle Corporation
Database Class Loader started - derby.database.classpath=''
15/10/22 01:44:53 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
15/10/22 01:44:53 INFO MetaStoreDirectSql: MySQL check failed, assuming we are not on mysql: Lexical error at line 1, column 5.  Encountered: "@" (64), after : "".
15/10/22 01:44:54 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/10/22 01:44:54 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/10/22 01:44:55 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/10/22 01:44:55 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/10/22 01:44:55 INFO ObjectStore: Initialized ObjectStore
15/10/22 01:44:56 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.13.1aa
15/10/22 01:44:56 INFO HiveMetaStore: Added admin role in metastore
15/10/22 01:44:56 INFO HiveMetaStore: Added public role in metastore
15/10/22 01:44:56 INFO HiveMetaStore: No user is added in admin role, since config is empty
15/10/22 01:44:57 INFO SessionState: No Tez session required at this point. hive.execution.engine=mr.
15/10/22 01:44:57 INFO SparkILoop: Created sql context (with Hive support)..
SQL context available as sqlContext.
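Most of the startup output above is INFO-level logging from Spark's default log4j profile. One way to reduce it is to create conf/log4j.properties from the template shipped with Spark and raise the root level to WARN. The sketch below assumes the template contains the line `log4j.rootCategory=INFO, console`, as the Spark 1.x templates do. The helper name quiet_spark_logs is hypothetical.

```shell
# quiet_spark_logs SPARK_HOME
# Write SPARK_HOME/conf/log4j.properties as a copy of the bundled
# template with the root log level changed from INFO to WARN.
# Fails if the template is missing. (Hypothetical helper.)
quiet_spark_logs() {
    [ -f "$1/conf/log4j.properties.template" ] || return 1
    sed 's/^log4j\.rootCategory=INFO/log4j.rootCategory=WARN/' \
        "$1/conf/log4j.properties.template" > "$1/conf/log4j.properties"
}
```

With the layout used here, that would be `quiet_spark_logs /opt/spark`, run as a user who can write to /opt/spark/conf. Note that this overwrites any existing conf/log4j.properties.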






Time: 2024-08-02 06:59:21

