Reposted from: http://blog.csdn.net/zyz511919766/article/details/43967793
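supervisor: a client/server process control system that lets users monitor and manage processes on UNIX-like systems. It is commonly used to manage the processes that belong to a particular user or project.
Components
supervisord: the server daemon
supervisorctl: the command-line client
Web Server: a web interface offering functionality comparable to supervisorctl
XML-RPC Interface: an interface for controlling supervisor programmatically over XML-RPC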
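As a rough illustration of the XML-RPC interface listed above, the following Python sketch queries process states. It assumes a Supervisor 3.x or later instance with the inet HTTP server enabled; the URL `http://127.0.0.1:9001/RPC2` and the helper names `connect` and `process_states` are assumptions made for this example, not part of the original setup.

```python
import xmlrpc.client


def connect(url="http://127.0.0.1:9001/RPC2"):
    """Build an XML-RPC proxy. The URL is an assumed example address;
    no network traffic happens until a method is actually called."""
    return xmlrpc.client.ServerProxy(url)


def process_states(proxy):
    """Map each managed process name to its state name (e.g. RUNNING).

    Relies on supervisor.getAllProcessInfo(), which returns one dict per
    process with keys such as 'name', 'statename' and 'pid'.
    """
    return {
        info["name"]: info["statename"]
        for info in proxy.supervisor.getAllProcessInfo()
    }
```

With a reachable server, `process_states(connect())` would return something like a dictionary of process names to `RUNNING` or `STOPPED`. Passing the proxy in as a parameter keeps the helper easy to exercise against a stand-in object.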
Installation
On CentOS it can be installed directly from the YUM repositories:
yum info supervisor
sudo yum install supervisor
sudo chkconfig supervisord on
Starting and stopping the service
sudo /etc/init.d/supervisord {start|stop|status|restart|reload|force-reload|condrestart}
Logs
/var/log/supervisor/supervisord.log
Configuration file
sudo vim /etc/supervisord.conf
The part that deserves the most attention is the following:
[program:x] sections configure the processes to be monitored.
Sample configuration
[supervisord]
http_port=/var/tmp/supervisor.sock ; (default is to run a UNIX domain socket server)
;http_port=127.0.0.1:9001 ; (alternately, ip_address:port specifies AF_INET)
;sockchmod=0700 ; AF_UNIX socketmode (AF_INET ignore, default 0700)
;sockchown=nobody.nogroup ; AF_UNIX socket uid.gid owner (AF_INET ignores)
;umask=022 ; (process file creation umask;default 022)
logfile=/var/log/supervisor/supervisord.log ; (main log file;default $CWD/supervisord.log)
logfile_maxbytes=50MB ; (max main logfile bytes b4 rotation;default 50MB)
logfile_backups=10 ; (num of main logfile rotation backups;default 10)
loglevel=info ; (logging level;default info; others: debug,warn)
pidfile=/var/run/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
nodaemon=false ; (start in foreground if true;default false)
minfds=1024 ; (min. avail startup file descriptors;default 1024)
minprocs=200 ; (min. avail process descriptors;default 200)
;nocleanup=true ; (don't clean up tempfiles at start;default false)
;http_username=user ; (default is no username (open system))
;http_password=123 ; (default is no password (open system))
;childlogdir=/tmp ; ('AUTO' child log dir, default $TEMP)
;user=chrism ; (default is current user, required if root)
;directory=/tmp ; (default is not to cd during start)
;environment=KEY=value ; (key value pairs to add to environment)
[supervisorctl]
serverurl=unix:///var/tmp/supervisor.sock ; use a unix:// URL for a unix socket
;serverurl=http://127.0.0.1:9001 ; use an http:// url to specify an inet socket
;username=chris ; should be same as http_username if set
;password=123 ; should be same as http_password if set
;prompt=mysupervisor ; cmd line prompt (default "supervisor")
; The below sample program section shows all possible program subsection values,
; create one or more 'real' program: sections to be able to control them under
; supervisor.
;[program:example]
;command=/bin/echo ; the program (relative uses PATH, can take args)
;priority=999 ; the relative start priority (default 999)
;autostart=true ; start at supervisord start (default: true)
;autorestart=true ; restart at unexpected quit (default: true)
;startsecs=10 ; number of secs prog must stay running (def. 10)
;startretries=3 ; max # of serial start failures (default 3)
;exitcodes=0,2 ; 'expected' exit codes for process (default 0,2)
;stopsignal=QUIT ; signal used to kill process (default TERM)
;stopwaitsecs=10 ; max num secs to wait before SIGKILL (default 10)
;user=chrism ; setuid to this UNIX account to run the program
;log_stdout=true ; if true, log program stdout (default true)
;log_stderr=true ; if true, log program stderr (def false)
;logfile=/var/log/supervisor.log ; child log path, use NONE for none; default AUTO
;logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB)
;logfile_backups=10 ; # of logfile backups (default 10)
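";" starts a comment. The meaning of each parameter is fairly self-explanatory; the official manual, combined with some experimentation, gives a deeper understanding. A few of the parameters in [program:example] deserve a closer look: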
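;command=/bin/echo ; The program supervisord launches when it starts. The path may be relative or absolute; a relative path is looked up in supervisord's $PATH. The command may take arguments.
;priority=999 ; Controls the order in which processes are started and stopped. A lower priority value means the process starts earlier and stops later; a higher value means it starts later and stops earlier.
;autostart=true ; Whether the process starts automatically when supervisord starts.
;autorestart=true ; Whether the process is restarted automatically after it exits unexpectedly.
;startsecs=10 ; How many seconds the process must keep running before the start is considered successful.
;startretries=3 ; The number of consecutive failed start attempts allowed before supervisord gives up.
;exitcodes=0,2 ; When autorestart is set to unexpected and the monitored process exits for a reason other than being stopped through supervisord, supervisord restarts it if its exit code is not in the exitcodes list.
;stopsignal=QUIT ; The signal used to stop the process.
;stopwaitsecs=10 ; How long to wait, after sending stopsignal, for the OS to deliver SIGCHLD to supervisord. If this times out, supervisord kills the process with SIGKILL.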
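The interaction between autorestart and exitcodes described above can be sketched as a small decision function. This is a simplified model, not supervisord's actual code: the function name `should_restart` and its parameters are hypothetical, the three autorestart values (`true`, `false`, `unexpected`) follow the Supervisor 3.x documentation as I understand it, and the startsecs and startretries handling is left out.

```python
def should_restart(autorestart, exit_code, exitcodes=(0, 2),
                   stopped_by_request=False):
    """Return True if a supervisor-like manager would restart the process.

    autorestart        -- "true", "false" or "unexpected" (assumed 3.x values)
    exit_code          -- exit status of the process that just terminated
    exitcodes          -- exit codes regarded as expected (default 0,2)
    stopped_by_request -- True if the process was stopped on purpose,
                          e.g. with "stop" in supervisorctl
    """
    if stopped_by_request:
        # A deliberate stop is never followed by an automatic restart.
        return False
    if autorestart == "true":
        # Restart regardless of the exit code.
        return True
    if autorestart == "unexpected":
        # Restart only when the exit code is not one of the expected ones.
        return exit_code not in exitcodes
    return False
```

For example, `should_restart("unexpected", 1)` is true because exit code 1 is not in the default list, while `should_restart("unexpected", 0)` is false.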
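The following shows how the producer and consumer processes of a RabbitMQ project are configured for monitoring by supervisor (the rest of the configuration is omitted):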
[program:worker_for_summary]
command=/home/op1/scripts/rabbitmqclient/worker_for_summary.py
priority=1
log_stderr=true ; if true, log program stderr (def false)
[program:worker_for_detail_all]
command=/home/op1/scripts/rabbitmqclient/worker_for_detail_all.py
priority=1
log_stderr=true ; if true, log program stderr (def false)
[program:worker_for_detail_recent_list]
command=/home/op1/scripts/rabbitmqclient/worker_for_detail_recent_list.py
priority=1
log_stderr=true ; if true, log program stderr (def false)
[program:worker_for_detail_recent_sset]
command=/home/op1/scripts/rabbitmqclient/worker_for_detail_recent_sset.py
priority=1
log_stderr=true ; if true, log program stderr (def false)
[program:publisher_for_summary]
command=/home/op1/scripts/rabbitmqclient/publisher_for_summary.py
priority=999
log_stderr=true ; if true, log program stderr (def false)
[program:publisher_for_summary_nt]
command=/home/op1/scripts/rabbitmqclient/publisher_for_summary_nt.py
priority=999
log_stderr=true ; if true, log program stderr (def false)
[program:publisher_for_detail]
command=/home/op1/scripts/rabbitmqclient/publisher_for_detail.py
priority=999
log_stderr=true ; if true, log program stderr (def false)
[program:publisher_for_detail_nt]
command=/home/op1/scripts/rabbitmqclient/publisher_for_detail_nt.py
priority=999
log_stderr=true ; if true, log program stderr (def false)
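In the configuration above the workers have priority 1 and the publishers priority 999, so the consumers come up before the producers and go down after them. The Python sketch below derives that order from a shortened copy of the configuration using the standard `configparser` module. The helper names `start_order` and `stop_order` are hypothetical, and the relative order of programs that share the same priority is just file order here, which may differ from what supervisord actually does.

```python
import configparser

# Shortened copy of the configuration, with the publisher listed first
# on purpose so that the sort has something to do.
SAMPLE = """\
[program:publisher_for_summary]
command=/home/op1/scripts/rabbitmqclient/publisher_for_summary.py
priority=999
log_stderr=true ; if true, log program stderr (def false)

[program:worker_for_summary]
command=/home/op1/scripts/rabbitmqclient/worker_for_summary.py
priority=1
log_stderr=true ; if true, log program stderr (def false)
"""


def start_order(config_text, default_priority=999):
    """Return program names sorted by ascending priority value."""
    parser = configparser.ConfigParser(
        interpolation=None,              # treat '%' in values literally
        inline_comment_prefixes=(";",),  # allow "value ; comment"
    )
    parser.read_string(config_text)
    programs = []
    for section in parser.sections():
        if section.startswith("program:"):
            priority = parser.getint(section, "priority",
                                     fallback=default_priority)
            programs.append((priority, section.split(":", 1)[1]))
    # sorted() is stable, so equal priorities keep their file order.
    return [name for _, name in sorted(programs, key=lambda p: p[0])]


def stop_order(config_text):
    """Stopping happens in reverse priority order."""
    return list(reversed(start_order(config_text)))
```

Calling `start_order(SAMPLE)` puts `worker_for_summary` ahead of `publisher_for_summary`, and `stop_order(SAMPLE)` reverses that.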
Once the configuration is complete, start supervisord:
sudo /etc/init.d/supervisord start
The configured processes can now be seen running in the background.
If one of them is killed, supervisor restarts it right away.
Stopping supervisor
sudo /etc/init.d/supervisord stop
All of the configured processes can be seen to have stopped.
supervisorctl can be used to inspect and manage the monitored processes:
$ sudo supervisorctl
publisher_for_detail       RUNNING    pid 27557, uptime 0:00:45
publisher_for_detail_nt    RUNNING    pid 27567, uptime 0:00:45
publisher_for_summary      RUNNING    pid 27566, uptime 0:00:45
publisher_for_summary_nt   RUNNING    pid 27568, uptime 0:00:45
worker_for_detail_all      RUNNING    pid 27581, uptime 0:00:45
worker_for_detail_recent   RUNNING    pid 27582, uptime 0:00:45
worker_for_summary         RUNNING    pid 27559, uptime 0:00:45
# Use help to learn more about the available commands
supervisor> help
Documented commands (type help <topic>):
========================================
EOF    exit  maintail  quit    restart   start   stop
clear  help  open      reload  shutdown  status  tail
supervisor> help stop
stop <processname>                Stop a process.
stop <processname> <processname>  Stop multiple processes
stop all                          Stop all processes
  When all processes are stopped, they are stopped in
  reverse priority order (see config file)
supervisor> help status
status                  Get all process status info.
status <name>           Get status on a single process by name.
status <name> <name>    Get status on multiple named processes.
# Stop a single process
supervisor> stop publisher_for_summary
publisher_for_summary: stopped
# Check the current status
supervisor> status
publisher_for_detail       RUNNING    pid 27557, uptime 0:05:41
publisher_for_detail_nt    RUNNING    pid 27567, uptime 0:05:41
publisher_for_summary      STOPPED    Feb 27 02:48 PM
publisher_for_summary_nt   RUNNING    pid 27568, uptime 0:05:41
worker_for_detail_all      RUNNING    pid 27581, uptime 0:05:41
worker_for_detail_recent   RUNNING    pid 27582, uptime 0:05:41
worker_for_summary         RUNNING    pid 27559, uptime 0:05:41
# Note that a process stopped through supervisorctl is not restarted automatically
# Start the process that was just stopped
supervisor> start publisher_for_summary
publisher_for_summary: started
supervisor> status
publisher_for_detail       RUNNING    pid 27557, uptime 0:08:02
publisher_for_detail_nt    RUNNING    pid 27567, uptime 0:08:02
publisher_for_summary      RUNNING    pid 3035, uptime 0:00:04
publisher_for_summary_nt   RUNNING    pid 27568, uptime 0:08:02
worker_for_detail_all      RUNNING    pid 27581, uptime 0:08:02
worker_for_detail_recent   RUNNING    pid 27582, uptime 0:08:02
worker_for_summary         RUNNING    pid 27559, uptime 0:08:02
# Stop all processes
supervisor> stop all
worker_for_detail_recent: stopped
worker_for_detail_all: stopped
publisher_for_summary_nt: stopped
publisher_for_detail_nt: stopped
publisher_for_summary: stopped
worker_for_summary: stopped
publisher_for_detail: stopped
supervisor> status
publisher_for_detail       STOPPED    Feb 27 02:51 PM
publisher_for_detail_nt    STOPPED    Feb 27 02:51 PM
publisher_for_summary      STOPPED    Feb 27 02:51 PM
publisher_for_summary_nt   STOPPED    Feb 27 02:51 PM
worker_for_detail_all      STOPPED    Feb 27 02:51 PM
worker_for_detail_recent   STOPPED    Feb 27 02:51 PM
worker_for_summary         STOPPED    Feb 27 02:51 PM
# Start all processes
supervisor> start all
publisher_for_detail: started
worker_for_summary: started
publisher_for_summary: started
publisher_for_detail_nt: started
publisher_for_summary_nt: started
worker_for_detail_all: started
worker_for_detail_recent: started
supervisor> status
publisher_for_detail       RUNNING    pid 5111, uptime 0:00:15
publisher_for_detail_nt    RUNNING    pid 5141, uptime 0:00:15
publisher_for_summary      RUNNING    pid 5135, uptime 0:00:15
publisher_for_summary_nt   RUNNING    pid 5147, uptime 0:00:15
worker_for_detail_all      RUNNING    pid 5153, uptime 0:00:15
worker_for_detail_recent   RUNNING    pid 5159, uptime 0:00:14
worker_for_summary         RUNNING    pid 5112, uptime 0:00:15
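Status listings like the ones in this session are easy to post-process in a script, for instance to alert on anything that is not running. The Python sketch below parses such text; the function names `parse_status` and `not_running` are hypothetical, and the parser assumes the simple "name state details" layout seen here rather than any guaranteed output format.

```python
def parse_status(text):
    """Parse supervisorctl-style status lines into a dict keyed by name.

    Each line is expected to look like:
        <name> <STATE> <free-form details>
    Lines with fewer than two fields are ignored.
    """
    result = {}
    for line in text.splitlines():
        parts = line.split(None, 2)  # split on whitespace, at most 3 fields
        if len(parts) < 2:
            continue
        name, state = parts[0], parts[1]
        detail = parts[2].strip() if len(parts) > 2 else ""
        result[name] = {"state": state, "detail": detail}
    return result


def not_running(text):
    """Return the names of processes whose state is not RUNNING."""
    return [name for name, info in parse_status(text).items()
            if info["state"] != "RUNNING"]
```

Fed the status output captured right after `stop publisher_for_summary`, `not_running` would return a list containing only `publisher_for_summary`.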
For more details, see the official manual:
http://supervisord.org/