Hadoop ports
----------------
1.namenode 50070
http://namenode:50070/
2.resourcemanager:8088
http://localhost:8088/
3.historyServer 19888
http://hs:19888/
4.namenode rpc (remote procedure call): 8020
hdfs://namenode:8020/
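A quick sketch for verifying these endpoints from the shell (host names are the ones used above; adjust to your cluster):
$>curl -s -o /dev/null -w "%{http_code}\n" http://namenode:50070/    #200 when the namenode web UI is up
$>curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8088/    #resourcemanager web UI
$>hdfs getconf -confKey fs.defaultFS                                 #should print hdfs://namenode:8020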
Run a command remotely over ssh
---------------------
$>ssh s300 rm -rf /xx/x/x
Remote copy via scp
--------------------
$>scp -r /xxx/x ubuntu@s300:/path
Write a script that remotely copies a file or directory to all nodes.
xcopy.sh
--------------------
scp -r path ubuntu@s200:/path
Delete
------
xrm.sh a.txt
ssh s200 rm -rf path
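A minimal sketch of xrm.sh, assuming the same s200..s500 node names used by xcp.sh below:
[/usr/local/sbin/xrm.sh]
#!/bin/bash
if [ $# -lt 1 ] ;then
    echo no args
    exit;
fi
#delete the given path on every node
for (( i=200;i<=500;i=i+100)) ; do
    echo ----- deleting $1 on s$i ------
    ssh s$i rm -rf $1
done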
Remote file copy
--------------------
[/usr/local/sbin/xcp.sh]
#!/bin/bash
if [ $# -lt 1 ] ;then
    echo no args
    exit;
fi

#get first argument
arg1=$1;
cuser=`whoami`
fname=`basename $arg1`
dir=`dirname $arg1`
if [ "$dir" = "." ]; then
    dir=`pwd`
fi
#copy to hosts s200..s500 as the current user
for (( i=200;i<=500;i=i+100)) ; do
    echo ----- copying $arg1 to s$i ------;
    if [ -d $arg1 ] ;then
        scp -r $arg1 $cuser@s$i:$dir
    else
        scp $arg1 $cuser@s$i:$dir
    fi
    echo
done
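Usage example (the file path is only an illustration):
$>xcp.sh /home/ubuntu/a.txt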
slaves
----------
master
masters
hadoop 2.7.2 source code handling
-----------------------
1.Download and extract the hadoop-2.7.2.tar.gz file
2.Sort the jar files into categories such as CONF, LIB, SOURCES, TEST, etc. (see the sketch below)
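One possible way to do the sorting from a Unix-like shell (the directory names are assumptions):
$>cd D:/downloads/bigdata/hadoop-2.7.2/_libs
$>mkdir -p SOURCES TEST LIB
$>mv *-sources.jar SOURCES/     #source jars
$>mv *-tests.jar TEST/          #test jars
$>mv *.jar LIB/                 #remaining binary jars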
Extract all default configuration entries from the jars
------------------------
1.core-default.xml
D:\downloads\bigdata\hadoop-2.7.2\_libs\hadoop-common-2.7.2.jar
2.hdfs-default.xml
D:\downloads\bigdata\hadoop-2.7.2\_libs\hadoop-hdfs-2.7.2.jar
3.mapred-default.xml
D:\downloads\bigdata\hadoop-2.7.2\_libs\hadoop-mapreduce-client-core-2.7.2.jar
4.yarn-default.xml
D:\downloads\bigdata\hadoop-2.7.2\_libs\hadoop-yarn-common-2.7.2.jar
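The default files can be pulled straight out of the jars with the JDK jar tool, for example:
$>cd D:/downloads/bigdata/hadoop-2.7.2/_libs
$>jar xf hadoop-common-2.7.2.jar core-default.xml
$>jar xf hadoop-hdfs-2.7.2.jar hdfs-default.xml
$>jar xf hadoop-mapreduce-client-core-2.7.2.jar mapred-default.xml
$>jar xf hadoop-yarn-common-2.7.2.jar yarn-default.xml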
master node == NameNode
------------------------
{hadoop}/sbin/start-all.sh
--------------------------------------
1.{hadoop}/libexec/hadoop-config.sh
HADOOP_CONF_DIR=...	//the --config option
2./sbin/start-dfs.sh --config $HADOOP_CONF_DIR
3./sbin/start-yarn.sh --config $HADOOP_CONF_DIR
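For example, to start everything against a non-default configuration directory (the directory path is only an illustration):
$>{hadoop}/sbin/start-all.sh --config /home/ubuntu/hadoop_cluster_conf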
{hadoop_home}/sbin/start-dfs.sh
--------------------------------
1.{hadoop}/libexec/hadoop-config.sh
HADOOP_CONF_DIR=...	//the --config option
2.NAMENODES=$({hadoop_home}/bin/hdfs getconf -namenodes)	//get the namenode hostnames
3.{hadoop_home}/sbin/hadoop-daemons.sh --config ... --hostnames ... --script "{hadoop_home}/bin/hdfs" start namenode $nameStartOpt
4.{hadoop_home}/sbin/hadoop-daemons.sh --config ... --hostnames ... --script "{hadoop_home}/bin/hdfs" start datanode $dataStartOpt
5.{hadoop_home}/sbin/hadoop-daemons.sh --config ... --hostnames ... --script "{hadoop_home}/bin/hdfs" start secondarynamenode
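The getconf calls used by start-dfs.sh can also be run by hand to see which hosts will be targeted:
$>hdfs getconf -namenodes              #hostnames that will run the namenode
$>hdfs getconf -secondarynamenodes     #hostnames that will run the secondary namenode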
{hadoop_home}/sbin/hadoop-daemons.sh
---------------------------------------
1.{hadoop}/libexec/hadoop-config.sh
HADOOP_CONF_DIR=...	//the --config option
2.exec "$bin/slaves.sh" --config $HADOOP_CONF_DIR cd "$HADOOP_PREFIX" \; "$bin/hadoop-daemon.sh" --config $HADOOP_CONF_DIR "$@"
{hadoop_home}/sbin/slaves.sh
-----------------------------
1.{hadoop}/libexec/hadoop-config.sh
HADOOP_CONF_DIR=...	//the --config option
2.source "${HADOOP_CONF_DIR}/hadoop-env.sh"
3.extract all hostnames from the slaves file --> SLAVE_NAMES
4.for each host in SLAVE_NAMES --> ssh @hostname ... (see the simplified loop below)
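The loop in step 4 is roughly the following (a simplified sketch; the real script also honors HADOOP_SSH_OPTS and HADOOP_SLAVE_SLEEP):
for slave in $SLAVE_NAMES ; do
    ssh $slave $"${@// /\\ }" 2>&1 | sed "s/^/$slave: /" &
done
wait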
"$bin/hadoop-daemon.sh"
-----------------------------
1.{hadoop}/libexec/hadoop-config.sh
HADOOP_CONF_DIR=...	//the --config option
2.namenode|datanode|2namenode|..
	bin/hdfs xxxx
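So a single daemon can also be started or stopped directly on one host, for example:
$>hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
$>hadoop-daemon.sh --config $HADOOP_CONF_DIR stop datanode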
Configure the 2NN (secondary namenode) on a dedicated host
--------------------
[hdfs-site.xml]
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>0.0.0.0:50090</value>
    <description>
        The secondary namenode http server address and port.
    </description>
</property>
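To actually move the 2NN to a dedicated host, replace 0.0.0.0 with that host's name (the host name s500 below is only an illustration), then check which host is picked up:
#in hdfs-site.xml: <value>s500:50090</value>
$>hdfs getconf -secondarynamenodes                             #should print s500
$>hdfs getconf -confKey dfs.namenode.secondary.http-address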
Change the default hadoop temp directory
-------------------------
[core-site.xml]
hadoop.tmp.dir=/home/ubuntu/hadoop/
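The default namenode/datanode storage directories live under hadoop.tmp.dir, so on a test cluster the new directory has to exist (on every node) and the namenode usually has to be re-formatted, which wipes existing HDFS data:
$>mkdir -p /home/ubuntu/hadoop
$>hdfs namenode -format        #re-initialize metadata under the new location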
Change the blocksize (the default is 128M)
-----------------------------
[hdfs-site.xml]
dfs.blocksize=8m
1.How to test
put a file > 8m and check the block size through the web UI (see the sketch below)
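A minimal sketch of that test (file name and size are assumptions):
$>dd if=/dev/zero of=/tmp/test.dat bs=1M count=20     #create a 20 MB local file, i.e. > 8m
$>hdfs dfs -put /tmp/test.dat /
$>hdfs fsck /test.dat -files -blocks                  #each block should be at most 8 MB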