1. Download Scala
wget http://www.scala-lang.org/files/archive/scala-2.10.3.tgz
tar xvzf scala-2.10.3.tgz -C /usr/local
2. Download Spark
wget http://www.apache.org/dist/incubator/spark/spark-0.9.0-incubating/spark-0.9.0-incubating-bin-hadoop2.tgz
tar -zxvf spark-0.9.0-incubating-bin-hadoop2.tgz
3. Edit the Spark configuration file (spark-env.sh, in the Spark conf directory)
mv spark-env.sh.template spark-env.sh
cat >> spark-env.sh
export JAVA_HOME=/usr/java/jdk1.6.0_26
export SCALA_HOME=/usr/local/scala-2.10.3
export HADOOP_HOME=/root/hadoop-2.2.0
SPARK_LOCAL_DIR="/data/spark/tmp"
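Typing the settings into `cat >>` interactively is easy to get wrong, so the same append can be done with a here-document. The following is a minimal sketch, assuming the paths listed in this step; `append_spark_env` is a hypothetical helper name, not part of Spark.

```shell
#!/bin/sh
# append_spark_env FILE: append the settings from this step to FILE.
# The helper name is hypothetical; the values are the ones used in this article.
append_spark_env() {
    target="$1"
    # The quoted 'EOF' stops the shell from expanding anything inside the block.
    cat >> "$target" <<'EOF'
export JAVA_HOME=/usr/java/jdk1.6.0_26
export SCALA_HOME=/usr/local/scala-2.10.3
export HADOOP_HOME=/root/hadoop-2.2.0
SPARK_LOCAL_DIR="/data/spark/tmp"
EOF
}

# Example call, run from the Spark conf directory:
#   append_spark_env spark-env.sh
```

Because the function appends, running it twice duplicates the lines; check the file before running it again.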
4. Set environment variables
cat >> /etc/profile
export SCALA_HOME=/usr/local/scala-2.10.3
export SPARK_HOME=/root/spark
export PATH=$SCALA_HOME/bin:$PATH
source /etc/profile
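After sourcing the profile, it can help to confirm that the Scala bin directory really ended up on PATH. This is a small sketch; `path_has` is a hypothetical helper name introduced only for illustration.

```shell
#!/bin/sh
# path_has DIR: succeed if DIR is exactly one of the colon-separated entries of $PATH.
path_has() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example call:
#   path_has "$SCALA_HOME/bin" && echo "Scala is on PATH"
```

Wrapping PATH in colons lets the first and last entries match the same pattern as the ones in the middle.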
5. Start the Hadoop daemons
start-all.sh
6. Start the Spark Master and Worker
sbin/start-all.sh
Running jps at this point shows that the following processes have started:
21624 Worker
10664 MainGenericRunner
20515 SecondaryNameNode
21057 Master
20311 NameNode
20689 ResourceManager
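The jps listing can also be checked by script instead of by eye. The sketch below reads jps output on standard input and reports which of the daemons named in the listing above are missing; `check_daemons` is a hypothetical helper name, and the list of expected names is an assumption taken from that listing.

```shell
#!/bin/sh
# check_daemons: read `jps` output on stdin and print each expected
# process name that does not appear in it.
check_daemons() {
    listing=$(cat)
    for name in Master Worker NameNode SecondaryNameNode ResourceManager; do
        # Compare against the second field so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$listing" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "missing: $name"
        fi
    done
}

# Example call:
#   jps | check_daemons
```

No output means every expected daemon was found.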
7. Test
~/spark/conf# ./run-example org.apache.spark.examples.SparkPi local
Output:
...
[main] INFO org.apache.spark.SparkContext - Job finished: reduce at SparkPi.scala:39, took 5.0273654 s
Pi is roughly 3.13986
....
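The example prints a lot of log lines, so the result is easy to miss. A minimal sketch for pulling out just the estimate, assuming the output format shown above; `extract_pi` is a hypothetical helper name.

```shell
#!/bin/sh
# extract_pi: read the example's output on stdin and print only the
# number from the "Pi is roughly ..." line.
extract_pi() {
    sed -n 's/^Pi is roughly \([0-9.]*\)$/\1/p'
}

# Example call:
#   ./run-example org.apache.spark.examples.SparkPi local 2>&1 | extract_pi
```

The estimate varies from run to run, so the value will not match the one shown above exactly.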
Installing Spark 0.9