Automating Hadoop Decommission: Shell Script Version

Introduction

An earlier post of mine showed how to automate Hadoop decommissioning with an Ansible playbook; this post does the same job with shell scripts.

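All scripts live on the jump host. Each remote server executes the local scripts that are fed to it over ssh, so no files have to be copied to the remote servers.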
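The mechanism behind "run a local script on a remote server without copying it" is `bash -s`, which reads the script from standard input and takes any following words as positional parameters. The sketch below shows that behaviour locally, without ssh, so it can be tried on any machine; `run_from_stdin` and the throwaway demo script are hypothetical names used only for illustration.

```shell
#!/bin/bash
# run_from_stdin SCRIPT [ARGS...]: feed SCRIPT to "bash -s" on stdin.
# The words after "--" become $1, $2, ... inside the script.
# Over ssh the scripts in this post use the same idea:
#   ssh user@host "bash -s" < script.sh arg1 arg2
run_from_stdin() {
    local script=$1
    shift
    bash -s -- "$@" < "$script"
}

# Throwaway demo script (hypothetical, for illustration only).
demo_script=$(mktemp)
printf '%s\n' 'echo "host=$1 file=$2"' > "$demo_script"

run_from_stdin "$demo_script" 10.9.214.160 hdfs-exclude

rm -f "$demo_script"
```

One caveat when the same pattern goes through ssh: ssh joins the command words with spaces and the remote shell parses them again, so arguments containing spaces or shell metacharacters would need extra quoting. Plain IP addresses and file names, as used here, are unaffected.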

Main content

Main script: decom.sh

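#!/bin/bash
# decom.sh: run on the jump host, from the directory that holds the helper scripts.
# NOTE: "[email protected]" is how the user@host targets appear in this copy of the
# script (the real addresses are masked); put in the user@host of each node that
# holds the exclude files.
iplist=/home/hadoop/shell_scripts/iplist
# 1. Append every host in iplist to the exclude files by calling append.sh.
#    The words after the script name become $1 and $2 of the remote bash.
for exclude_host in $(cat "$iplist"); do
    ssh [email protected] "bash -s" < append.sh "$exclude_host" hdfs-exclude
    ssh [email protected] "bash -s" < append.sh "$exclude_host" mapred-exclude
    ssh [email protected] "bash -s" < append.sh "$exclude_host" hdfs-exclude
    ssh [email protected] "bash -s" < append.sh "$exclude_host" mapred-exclude
done
# 2. Run refreshnodes.sh so that YARN and HDFS re-read the exclude files.
ssh [email protected] "bash -s" < refreshnodes.sh
ssh [email protected] "bash -s" < refreshnodes.sh
# 3. Stop the NodeManager and DataNode services on every excluded host
#    (possibly the RegionServer service as well).
for client in $(cat "$iplist"); do
    ssh [email protected]"${client}" "bash -s" < stopservice.sh
done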

Sub-script: append.sh

#!/bin/bash
# append.sh HOST EXCLUDE_FILE: back up EXCLUDE_FILE, then add HOST to it once.
conf_dir=/opt/hadoop-2.6.0/etc/hadoop/
backup_dir=/opt/hadoop-2.6.0/etc/hadoop/BAK/
exclude_host=$1
exclude_file=$2
function usage() {
    echo -e "usage: $0 <exclude_host> <exclude_file>\nexclude_file must be mapred-exclude or hdfs-exclude"
}
if [ $# -ne 2 ]; then
    usage
    exit 1
elif [ "$exclude_file" != "mapred-exclude" ] && [ "$exclude_file" != "hdfs-exclude" ]; then
    usage
    exit 1
fi
#if [ -d /apache/hadoop/conf ] ;then
#    cd /apache/hadoop/conf
#else
#    echo "dir /apache/hadoop/conf does not exist, please check!"
#    exit 3
#fi
[ -d "${backup_dir}" ] || mkdir -p "${backup_dir}"
# back up the exclude file with a timestamp suffix
cp "${conf_dir}${exclude_file}" "${backup_dir}${exclude_file}-$(date +%F.%H.%M.%S)"
# append the host to the exclude file unless it is already listed
# (-x whole line, -F fixed string, so 10.9.214.16 does not match 10.9.214.160)
if grep -qxF -- "${exclude_host}" "${conf_dir}${exclude_file}" 2>/dev/null; then
    echo "duplicated host: ${exclude_host}"
else
    echo "${exclude_host}" >> "${conf_dir}${exclude_file}"
fi
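A duplicate check like the one at the end of append.sh depends on how grep matches. A plain `grep HOST FILE` treats the host as a regular expression and accepts substrings, so `10.9.214.16` would be reported as already present when only `10.9.214.160` is listed. The sketch below isolates the check against a temporary file; `append_host_once` is a hypothetical helper name, not part of the original scripts.

```shell
#!/bin/bash
# append_host_once FILE HOST: append HOST to FILE unless an identical line exists.
# grep -F: fixed string (dots are literal), -x: whole-line match, -q: no output.
append_host_once() {
    local file=$1 host=$2
    if grep -qxF -- "$host" "$file" 2>/dev/null; then
        echo "duplicated host: ${host}"
    else
        echo "$host" >> "$file"
    fi
}

# Illustration against a temporary file, not a real exclude file.
tmp_exclude=$(mktemp)
append_host_once "$tmp_exclude" 10.9.214.160   # appended
append_host_once "$tmp_exclude" 10.9.214.16    # appended: not the same line as ...160
append_host_once "$tmp_exclude" 10.9.214.160   # reported as duplicate
cat "$tmp_exclude"
rm -f "$tmp_exclude"
```

Because the function only appends when no identical line exists, running the whole decommission script twice with the same iplist leaves the exclude file unchanged the second time.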

Sub-script: refreshnodes.sh

#!/bin/bash
hadoop_bin_dir=/opt/hadoop-2.6.0/bin/
# make YARN re-read its exclude file
if ! "${hadoop_bin_dir}yarn" rmadmin -refreshNodes 2>/dev/null; then
    echo "command yarn rmadmin -refreshNodes failed on $(hostname)!!!"
    exit 2
fi
# wait a moment so that MapReduce can move jobs to other nodes
sleep 2
# make HDFS re-read its exclude file
# ("hdfs dfsadmin" is the non-deprecated form of this command in Hadoop 2.x)
if ! "${hadoop_bin_dir}hadoop" dfsadmin -refreshNodes 2>/dev/null; then
    echo "command hadoop dfsadmin -refreshNodes failed on $(hostname)!!!"
    exit 3
fi
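`-refreshNodes` only starts the decommission; HDFS still has to re-replicate the blocks held by the excluded DataNodes, so it can be worth checking the status before the services are stopped. The sketch below is a minimal helper for that, assuming the DataNode report lists each node as a `Name: <ip>:<port> ...` line followed by a `Decommission Status : ...` line. `decom_status` is a hypothetical name, and the sample text is hand-written in that assumed layout, so verify the layout against the report of your own cluster.

```shell
#!/bin/bash
# decom_status HOST: read DataNode report text on stdin and print the
# decommission status recorded for HOST (prints nothing if HOST is absent).
decom_status() {
    local host=$1
    awk -v host="$host" '
        # A "Name: ip:port ..." line starts a DataNode entry; keep the IP only.
        $1 == "Name:" { current = $2; sub(/:.*/, "", current) }
        /^Decommission Status/ && current == host {
            sub(/^Decommission Status *: */, "")
            print
            exit
        }
    '
}

# Hand-written sample in the assumed layout, not captured cluster output.
sample_report='Name: 10.9.214.160:50010 (dn-a)
Decommission Status : Decommission in progress
Name: 10.9.214.149:50010 (dn-b)
Decommission Status : Normal'

printf '%s\n' "$sample_report" | decom_status 10.9.214.160
```

On a cluster the helper could be fed with `hdfs dfsadmin -report | decom_status 10.9.214.160` and polled until it prints `Decommissioned`, at which point stopping the DataNode no longer risks under-replicated blocks.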

Sub-script: stopservice.sh

#!/bin/bash
# the daemon control scripts live in sbin/, not bin/
hadoop_sbin_dir=/opt/hadoop-2.6.0/sbin/
#svc -d /service/nodemanager
#svc -d /service/datanode
"${hadoop_sbin_dir}yarn-daemon.sh" stop nodemanager
"${hadoop_sbin_dir}hadoop-daemon.sh" stop datanode

File: iplist

10.9.214.160
10.9.214.149
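Every line of iplist ends up in the exclude files and in an ssh target, so a typo in it spreads to several hosts. A small pre-flight check can catch that before decom.sh runs. The sketch below assumes iplist holds one IPv4 address per line, as in the sample above; `is_ipv4` and `check_iplist` are hypothetical helper names.

```shell
#!/bin/bash
# is_ipv4 STRING: succeed only for dotted-quad addresses with octets 0..255.
is_ipv4() {
    local ip=$1 octet
    [[ $ip =~ ^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$ ]] || return 1
    for octet in "${BASH_REMATCH[@]:1}"; do
        # 10# forces base 10 so that an octet such as 08 is not read as octal
        (( 10#$octet <= 255 )) || return 1
    done
}

# check_iplist FILE: report every non-empty line that is not an IPv4 address.
check_iplist() {
    local file=$1 line bad=0
    while IFS= read -r line || [ -n "$line" ]; do
        [ -z "$line" ] && continue
        if ! is_ipv4 "$line"; then
            echo "invalid entry: $line" >&2
            bad=1
        fi
    done < "$file"
    return "$bad"
}

# Illustration with a temporary copy of the sample list.
sample_list=$(mktemp)
printf '%s\n' 10.9.214.160 10.9.214.149 > "$sample_list"
check_iplist "$sample_list" && echo "iplist looks valid"
rm -f "$sample_list"
```

If the list were to hold hostnames instead of addresses, the check would have to be relaxed accordingly.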

Usage:

bash decom.sh

Time: 2025-01-05 16:49:00
