This article draws on the following posts:
http://wangzan18.blog.51cto.com/8021085/1628984
http://linuxnote.blog.51cto.com/9876511/1637773
1. Nagios Overview
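Nagios has a plugin-based architecture. The core has no monitoring capability of its own; every check is carried out by a plugin, which makes the system highly modular and flexible. Nagios monitors two kinds of objects: hosts and services. A host is usually a physical device such as a server, router, workstation, or printer, but it can also be a virtual one, such as a Linux guest running under Xen. A service is a specific function, such as the httpd process that provides HTTP service. For easier management, hosts and services can be organized into host groups and service groups.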
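Nagios itself does not interpret specific numeric metrics (such as the number of processes on a system). It describes the state of a monitored object with only four abstract states: OK, WARNING, CRITICAL, and UNKNOWN. The administrator therefore only needs to define the thresholds for the WARNING and CRITICAL states of each monitored object. Nagios passes the WARNING and CRITICAL thresholds to the plugin, the plugin performs the actual check and evaluates the result, and its output consists of the state (OK, WARNING, CRITICAL, or UNKNOWN) plus some additional detail text.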
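The state model above can be sketched as a small plugin-style shell function. This is only an illustration, not one of the real plugins: the name check_value and its argument order are hypothetical. The exit codes 0, 1, 2, and 3 for OK, WARNING, CRITICAL, and UNKNOWN follow the usual Nagios plugin convention.

```shell
# check_value VALUE WARN CRIT  (hypothetical helper, not a real Nagios plugin)
# Prints a one-line status and returns the matching plugin exit code:
#   0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
check_value() {
    local value=$1 warn=$2 crit=$3 v
    # Any argument that is not a non-negative integer cannot be judged.
    for v in "$value" "$warn" "$crit"; do
        case "$v" in
            ''|*[!0-9]*) echo "UNKNOWN - invalid input"; return 3 ;;
        esac
    done
    # Test the more severe threshold first.
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - value is $value"; return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - value is $value"; return 1
    fi
    echo "OK - value is $value"; return 0
}
```

For example, check_value 7 5 10 prints "WARNING - value is 7" and returns 1, because 7 has reached the warning threshold of 5 but not the critical threshold of 10.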
2. Environment
OS: CentOS 6.4 x86_64, minimal install
nagios-server: 192.168.3.71
nagios-client: 192.168.3.72
3. Installing Nagios
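A Nagios deployment usually consists of the main program (Nagios), a plugin package (Nagios-plugins), and four optional addons (NRPE, NSCA, NSClient++, and NDOUtils). Because all monitoring is done through plugins, Nagios and Nagios-plugins are both required on the server. Of the four addons, NRPE runs plugin scripts on remote Linux/Unix hosts so that their local resources can be monitored. NSCA lets monitored Linux/Unix hosts actively push check results to the Nagios server, which is particularly useful in redundant monitoring setups. NSClient++ is the component installed on Windows hosts so that they can be monitored. NDOUtils stores Nagios configuration and event data in a database so that it can be searched and processed quickly. NRPE and NSClient++ run on the client side, NDOUtils runs on the server side, and NSCA must be installed on both the server and the clients.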
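Nagios runs only on Linux and other Unix-like hosts, and compiling it requires gcc. If you also want to use the web interface, you need the Apache server and the GD graphics library as well.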
Install the packages Nagios depends on
    # yum -y install httpd gcc glibc glibc-common gd gd-devel php php-mysql mysql mysql-devel mysql-server ntp
Synchronize the time
    # ntpdate asia.pool.ntp.org
    25 May 10:45:35 ntpdate[22419]: step time server 118.67.201.10 offset 140.952903 sec
    # hwclock -w
    # crontab -l
    MAILTO=""
    */10 * * * * /usr/sbin/ntpdate asia.pool.ntp.org
Add the user and group that Nagios runs as
    # groupadd nagcmd
    # useradd -G nagcmd nagios
Add the apache user to the nagcmd group
    # usermod -a -G nagcmd apache
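A quick way to confirm the group setup is to check each account's membership. The sketch below assumes the nagios and apache accounts and the nagcmd group created above; the helper name in_group is made up for this example.

```shell
# in_group USER GROUP: succeed if USER belongs to GROUP (hypothetical helper).
in_group() {
    # id -nG prints the user's group names separated by spaces;
    # put one name per line so grep -x can match the whole name exactly.
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx -- "$2"
}

for u in nagios apache; do
    if in_group "$u" nagcmd; then
        echo "$u is in nagcmd"
    else
        echo "$u is NOT in nagcmd"
    fi
done
```

Both accounts need to be in nagcmd so that the web interface can submit external commands to Nagios.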
Download and install Nagios
    # wget -O nagios-3.3.1.tar.gz http://sourceforge.net/projects/nagios/files/nagios-3.x/nagios-3.3.1/nagios-3.3.1.tar.gz/download
    # tar xf nagios-3.3.1.tar.gz
    # cd nagios
    # ./configure --with-command-group=nagcmd --enable-event-broker
    # make all                    # compile the main program and the CGIs
    # make install                # install the main program, CGIs, and HTML files
    # make install-init           # install the init script
    # make install-commandmode    # set permissions on the external command directory
    # make install-config         # install the sample configuration files
    # make install-webconf        # install the Apache configuration for the Nagios web interface
The -O option is used because the download URL ends in /download, and wget would otherwise save the file under that name.
Set the email address for Nagios alert notifications
Back up the file before editing it:
    # cp /usr/local/nagios/etc/objects/contacts.cfg /usr/local/nagios/etc/objects/contacts.cfg.$(date +%F).bak
Change the address to the one that should receive alerts (in the sed expression, the old address comes first and the new one second):
    # sed -i 's#[email protected]#[email protected]#' /usr/local/nagios/etc/objects/contacts.cfg
    # grep [email protected] /usr/local/nagios/etc/objects/contacts.cfg
    email [email protected] ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
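The backup-then-replace step can be wrapped in a small function and tried on a scratch file first. This is a sketch under stated assumptions: replace_contact_email is a hypothetical helper, and nagios@localhost and admin@example.com are placeholder addresses, not values taken from this setup.

```shell
# replace_contact_email FILE OLD NEW  (hypothetical helper)
# Keeps a dated backup of FILE, then rewrites FILE with OLD replaced by NEW.
replace_contact_email() {
    local file=$1 old=$2 new=$3
    local backup="$file.$(date +%F).bak"
    cp "$file" "$backup" || return 1
    # '#' is the sed delimiter, so the addresses must not contain '#'.
    # OLD is treated as a regular expression, so '.' matches any character.
    sed "s#${old}#${new}#" "$backup" > "$file"
}

# Try it on a scratch file rather than the real contacts.cfg.
tmp=$(mktemp)
echo "    email    nagios@localhost ; <<***** CHANGE THIS" > "$tmp"
replace_contact_email "$tmp" 'nagios@localhost' 'admin@example.com'
grep email "$tmp"
rm -f "$tmp" "$tmp".*.bak
```

Reading from the backup and writing to the original avoids depending on sed -i, and the straight single quotes matter: typographic quotes copied from a web page are not understood by the shell.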
Create a user and password for logging in to the web interface
The password entered at the prompt here is weyee2014:
    # htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Stop iptables and start httpd
    # service iptables stop
    # service httpd start
Install nagios-plugins
Download and build nagios-plugins:
    # wget http://nagios-plugins.org/download/nagios-plugins-2.0.3.tar.gz
    # tar xf nagios-plugins-2.0.3.tar.gz
    # cd nagios-plugins-2.0.3
    # ./configure --with-nagios-user=nagios --with-nagios-group=nagios
    # make && make install
Configure Nagios to start at boot
Add nagios to the services started at boot:
    # chkconfig --add nagios
    # chkconfig nagios on
Check the configuration files, then start the service:
    # /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
    # service nagios start
    Starting nagios: done.
    # ps aux |grep nagios
    nagios 45840 0.0 0.1 28048 1396 ? Ssl 11:25 0:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
    root 45852 0.0 0.0 103248 832 pts/0 S+ 11:25 0:00 grep nagios
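Since a broken configuration can stop Nagios from starting, it can help to tie the check and the restart together. The following is a minimal sketch: restart_if_valid and the three overridable variables are made up for this example, and the default paths are the ones used in this installation.

```shell
# Defaults match this installation; each can be overridden from the environment.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
RESTART_CMD=${RESTART_CMD:-service nagios restart}

# restart_if_valid (hypothetical helper): restart Nagios only when
# "nagios -v" exits successfully for the configuration file.
restart_if_valid() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        # Left unquoted on purpose so the command string splits into words.
        $RESTART_CMD
    else
        echo "configuration check failed; not restarting" >&2
        return 1
    fi
}

# Usage on nagios-server, after editing any .cfg file:
#   restart_if_valid
```

When the check fails, running the nagios -v command by hand shows which file and line caused the error.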
Browse to http://192.168.3.71/nagios and log in with the user name and password set above (nagiosadmin / weyee2014).
4. Installing NRPE
NRPE depends on nagios-plugins.
    # wget -O nrpe-2.15.tar.gz http://sourceforge.net/projects/nagios/files/nrpe-2.x/nrpe-2.15/nrpe-2.15.tar.gz/download
    # tar xf nrpe-2.15.tar.gz
    # cd nrpe-2.15
    # ./configure --with-nrpe-user=nagios --with-nrpe-group=nagios --with-nagios-user=nagios --with-nagios-group=nagios --enable-command-args --enable-ssl
    # make all
    # make install-plugin
    # make install-daemon
    # make install-daemon-config
Configure NRPE
Edit the NRPE configuration file. The active settings after editing are:
    # egrep -v "^$|^#" /usr/local/nagios/etc/nrpe.cfg
    log_facility=daemon
    pid_file=/var/run/nrpe.pid
    server_port=5666
    server_address=192.168.3.71         # change this
    nrpe_user=nagios
    nrpe_group=nagios
    allowed_hosts=192.168.3.0/24        # change this
    dont_blame_nrpe=0
    allow_bash_command_substitution=0
    debug=0
    command_timeout=60
    connection_timeout=300
    command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
    command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
    command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
    command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
    command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
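Each command[NAME]=... line in nrpe.cfg defines a check that a remote caller can request by NAME. As a rough illustration, the names can be pulled out of a configuration file with sed. The helper list_nrpe_commands is hypothetical, and the demo runs against a scratch file rather than the real nrpe.cfg.

```shell
# list_nrpe_commands FILE: print NAME for every "command[NAME]=..." line
# (hypothetical helper).
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Demo on a scratch file containing two of the definitions shown above.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
allowed_hosts=192.168.3.0/24
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
EOF
list_nrpe_commands "$cfg"
rm -f "$cfg"
```

On a real host, list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg lists the names that can be passed to check_nrpe with its -c option.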
Start NRPE
    # /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
    # netstat -anpt |grep 5666
    tcp 0 0 192.168.3.71:5666 0.0.0.0:* LISTEN 48946/nrpe
Write an init script for NRPE and place it under /etc/init.d
    # cat /etc/init.d/nrped
    #!/bin/bash
    # chkconfig: 2345 88 12
    # description: NRPE DAEMON
    NRPE=/usr/local/nagios/bin/nrpe
    NRPECONF=/usr/local/nagios/etc/nrpe.cfg
    case "$1" in
        start)
            echo -n "Starting NRPE daemon..."
            $NRPE -c $NRPECONF -d
            echo " done."
            ;;
        stop)
            echo -n "Stopping NRPE daemon..."
            pkill -u nagios nrpe
            echo " done."
            ;;
        restart)
            $0 stop
            sleep 2
            $0 start
            ;;
        *)
            echo "Usage: $0 start|stop|restart"
            ;;
    esac
    exit 0
Make the script executable, register it, and test it:
    # chmod +x /etc/init.d/nrped
    # chkconfig --add nrped
    # chkconfig nrped on
    # service nrped stop
    Stopping NRPE daemon... done.
    # netstat -anpt |grep nrpe
    # pidof nrpe
Start NRPE again:
    # service nrped start
    Starting NRPE daemon... done.
    # netstat -anpt |grep nrpe
    tcp 0 0 192.168.3.71:5666 0.0.0.0:* LISTEN 49019/nrpe
    # pidof nrpe
    49019
    # ps aux |grep nrpe |grep -v grep
    nagios 49019 0.0 0.1 39240 1312 ? Ss 11:53 0:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
5. Installing NRPE on the Client
NRPE on nagios-client requires nagios-plugins, so install that first.
    # yum install ntpdate gcc* openssl openssl-devel -y
    # ntpdate asia.pool.ntp.org
    25 May 13:52:18 ntpdate[22302]: step time server 212.26.18.41 offset 2955.354923 sec
    # hwclock -w
    # crontab -l
    MAILTO=""
    */10 * * * * /usr/sbin/ntpdate asia.pool.ntp.org
    # useradd -s /sbin/nologin nagios
    # wget http://nagios-plugins.org/download/nagios-plugins-2.0.3.tar.gz
    # tar xf nagios-plugins-2.0.3.tar.gz
    # cd nagios-plugins-2.0.3
    # ./configure --with-nagios-user=nagios --with-nagios-group=nagios
    # make all
    # make install
Install NRPE
    # wget -O nrpe-2.15.tar.gz http://sourceforge.net/projects/nagios/files/nrpe-2.x/nrpe-2.15/nrpe-2.15.tar.gz/download
    # tar xf nrpe-2.15.tar.gz
    # cd nrpe-2.15
    # ./configure --with-nrpe-user=nagios --with-nrpe-group=nagios --with-nagios-user=nagios --with-nagios-group=nagios --enable-command-args --enable-ssl
    # make all
    # make install-plugin
    # make install-daemon
    # make install-daemon-config
Edit the configuration file:
    # egrep -v "^#|^$" /usr/local/nagios/etc/nrpe.cfg
    log_facility=daemon
    pid_file=/var/run/nrpe.pid
    server_port=5666
    server_address=192.168.3.72     # local IP address that NRPE listens on
    nrpe_user=nagios
    nrpe_group=nagios
    allowed_hosts=192.168.3.71      # IP address allowed to query this host
Write the NRPE init script:
    # cat /etc/init.d/nrped
    #!/bin/bash
    # chkconfig: 2345 88 12
    # description: NRPE DAEMON
    NRPE=/usr/local/nagios/bin/nrpe
    NRPECONF=/usr/local/nagios/etc/nrpe.cfg
    case "$1" in
        start)
            echo -n "Starting NRPE daemon..."
            $NRPE -c $NRPECONF -d
            echo " done."
            ;;
        stop)
            echo -n "Stopping NRPE daemon..."
            pkill -u nagios nrpe
            echo " done."
            ;;
        restart)
            $0 stop
            sleep 2
            $0 start
            ;;
        *)
            echo "Usage: $0 start|stop|restart"
            ;;
    esac
    exit 0
Make it executable, register it, and start it:
    # chmod +x /etc/init.d/nrped
    # chkconfig --add nrped
    # chkconfig nrped on
    # service nrped start
    Starting NRPE daemon... done.
    # netstat -anpt |grep nrpe
    tcp 0 0 192.168.3.72:5666 0.0.0.0:* LISTEN 46882/nrpe
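One way to confirm that the server can reach NRPE on the client is to call the check_nrpe plugin from nagios-server. The sketch below assumes that make install-plugin placed check_nrpe under /usr/local/nagios/libexec and that the client's nrpe.cfg still defines check_load, as the sample configuration does. The helper state_name is made up for this example.

```shell
CHECK_NRPE=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}
CLIENT=${CLIENT:-192.168.3.72}

# state_name CODE: map a plugin exit code to its Nagios state name
# (hypothetical helper).
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

if [ -x "$CHECK_NRPE" ]; then
    rc=0
    # Ask the NRPE daemon on the client to run its check_load command.
    "$CHECK_NRPE" -H "$CLIENT" -c check_load || rc=$?
    echo "check_load on $CLIENT: $(state_name "$rc")"
else
    echo "check_nrpe not found at $CHECK_NRPE; run this on nagios-server" >&2
fi
```

Called with only -H, check_nrpe reports the NRPE version running on the client, which makes a simpler connectivity test.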
This completes the installation of Nagios on the server and NRPE on the client.