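In server operations we usually rely on Zabbix for monitoring. This article installs and configures monit so that a process is started again automatically when monit detects that it has stopped.
Installing monit from the rpmforge repository is recommended, because rpmforge carries a relatively new monit version.
Installing monit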
# cd /tmp
# rpm -ivh http://pkgs.repoforge.org/rpmforge-release/rpmforge-release-0.5.3-1.el7.rf.x86_64.rpm
# yum install --enablerepo=rpmforge monit
# rpm -q monit
monit-5.6-1.el7.x86_64
# rpm -ql monit
/etc/logrotate.d/monit
/etc/monit.d
/etc/monit.d/logging
/etc/monitrc
/usr/bin/monit
/usr/lib/systemd/system/monit.service
/usr/share/doc/monit-5.6
/usr/share/doc/monit-5.6/CHANGES
/usr/share/doc/monit-5.6/COPYING
/usr/share/doc/monit-5.6/PLATFORMS
/usr/share/doc/monit-5.6/README
/usr/share/man/man1/monit.1.gz
/var/log/monit.log
# monit -V
This is Monit version 5.6
Copyright (C) 2001-2013 Tildeslash Ltd. All Rights Reserved.
As of September 9, 2015, the latest version available this way is 5.6.
Configuring monit
Review the contents of the default configuration files:
# grep -v '^#' /etc/monitrc
set daemon 60 # check services at 1-minute intervals
set httpd port 2812 and
use address localhost # only accept connection from localhost
allow localhost # allow localhost to connect to the server and
allow admin:monit # require user 'admin' with password 'monit'
allow @monit # allow users of group 'monit' to connect (rw)
allow @users readonly # allow users of group 'users' to connect readonly
include /etc/monit.d/*
# cat /etc/logrotate.d/monit
/var/log/monit.log {
missingok
notifempty
size 100k
create 0644 root root
postrotate
/bin/systemctl reload monit.service > /dev/null 2>&1 || :
endscript
}
# cat /etc/monit.d/logging
# log to monit.log
set logfile /var/log/monit.log
The check interval is 60 seconds, and logging and log rotation are already configured.
Configuring monit checks
sshd
check process sshd with pidfile /var/run/sshd.pid
start program "/usr/bin/systemctl start sshd.service"
stop program "/usr/bin/systemctl stop sshd.service"
if failed port 22 protocol ssh then restart
if 5 restart within 5 cycles then timeout
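A check such as the sshd one only takes effect once it sits in a directory that `/etc/monitrc` includes and monit has re-read its configuration. The sketch below is one way to do that safely: it copies a check file into `/etc/monit.d`, runs the syntax check `monit -t`, and calls `monit reload` only if the check passes. The function name `apply_monit_check` and the source file name are hypothetical; the target directory is taken from the `include /etc/monit.d/*` line shown earlier.

```shell
#!/bin/bash
# Sketch: install one monit check file and reload monit only if the
# whole configuration still parses. Assumes /etc/monitrc contains
# "include /etc/monit.d/*" as in the default file shown above.
apply_monit_check() {
    local src=$1                    # check file to install (hypothetical name)
    local dir=${2:-/etc/monit.d}    # include directory from /etc/monitrc
    local dest
    dest="$dir/$(basename "$src")"

    cp "$src" "$dest" || return 1

    # "monit -t" only parses the control file; it changes nothing.
    if monit -t; then
        monit reload
    else
        # Remove the copy we just made so a broken check is not left behind.
        rm -f "$dest"
        echo "syntax check failed, not reloading" >&2
        return 1
    fi
}
```

For example, with the sshd check saved as `/tmp/sshd`, run `apply_monit_check /tmp/sshd` as root.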
Apache
This is the monit configuration for Apache on CentOS 6.5; compared with CentOS 7, only the start/stop commands differ.
check process apache with pidfile /var/run/httpd/httpd.pid
start program = "/etc/init.d/httpd start" with timeout 60 seconds
stop program = "/etc/init.d/httpd stop"
if failed host www.zabbix.cc port 80 protocol http
and request "/readme.html"
then restart
if 3 restarts within 5 cycles then timeout
group apache
Nginx
check process nginx with pidfile /var/run/nginx.pid
start program = "/usr/bin/systemctl start nginx.service"
stop program = "/usr/bin/systemctl stop nginx.service"
MySQL
This is the monit configuration for MySQL on CentOS 6.5; compared with CentOS 7, only the start/stop commands differ.
check process mysqld with pidfile "/var/run/mysqld/mysqld.pid"
start = "/etc/init.d/mysqld start"
stop = "/etc/init.d/mysqld stop"
if failed unixsocket /var/lib/mysql/mysql.sock with timeout 60 seconds then restart
if 5 restarts within 5 cycles then timeout
MariaDB
check process mariadb with pidfile "/var/run/mariadb/mariadb.pid"
start = "/usr/bin/systemctl start mariadb.service"
stop = "/usr/bin/systemctl stop mariadb.service"
if failed host 127.0.0.1 port 3306 protocol mysql then restart
if 5 restarts within 5 cycles then timeout
Checking monit status
# monit status
The Monit daemon 5.6 uptime: 8m
Process 'sshd'
status Running
monitoring status Monitored
pid 884
parent pid 1
uptime 19d 11h 57m
children 4
memory kilobytes 3016
memory kilobytes total 19420
memory percent 0.0%
memory percent total 0.5%
cpu percent 0.0%
cpu percent total 0.0%
port response time 0.008s to localhost:22 [SSH via TCP]
data collected Sun, 05 Apr 2015 21:41:18
Process 'nginx'
status Running
monitoring status Monitored
pid 13963
parent pid 1
uptime 6m
children 3
memory kilobytes 2428
memory kilobytes total 67520
memory percent 0.0%
memory percent total 1.8%
cpu percent 0.0%
cpu percent total 0.0%
data collected Sun, 05 Apr 2015 21:41:18
Process 'mariadb'
status Running
monitoring status Monitored
pid 24790
parent pid 24354
uptime 10d 4h 36m
children 0
memory kilobytes 216168
memory kilobytes total 216168
memory percent 5.9%
memory percent total 5.9%
cpu percent 0.0%
cpu percent total 0.0%
port response time 0.000s to 127.0.0.1:3306 [MYSQL via TCP]
data collected Sun, 05 Apr 2015 21:41:18
System 'zabbix.cc'
status Running
monitoring status Monitored
load average [0.00] [0.01] [0.05]
cpu 0.8%us 0.1%sy 0.1%wa
memory usage 1524496 kB [42.1%]
swap usage 0 kB [0.0%]
data collected Sun, 05 Apr 2015 21:41:18
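The status report gets long once several services are defined. As a rough sketch, the output can be filtered so that only entries whose `status` line is not `Running` are printed. The function name `monit_not_running` is hypothetical, and the filter assumes the layout shown above (a header line with the quoted service name, followed by a `status` line). Other check types, such as files, may report a different healthy status and would need their own handling.

```shell
#!/bin/bash
# Sketch: read "monit status" output on stdin and print every entry
# whose status line is something other than "Running".
monit_not_running() {
    awk -v q="'" '
        # Header line: the second field is the quoted service name.
        index($2, q) == 1 { name = $2; gsub(q, "", name); next }
        # Status line of the current entry.
        $1 == "status" && $2 != "Running" {
            sub(/^[ \t]*status[ \t]+/, "")
            print name ": " $0
        }
    '
}
```

Usage: `monit status | monit_not_running`. No output means every listed entry reports `Running`.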
Confirming that monit restarts a process automatically
Stop the nginx process, then watch the monit.log file.
# systemctl stop nginx.service
# tailf /var/log/monit.log
[CST Apr 5 21:35:18] error : 'nginx' process is not running
[CST Apr 5 21:35:18] info : 'nginx' trying to restart
[CST Apr 5 21:35:18] info : 'nginx' start: /usr/bin/systemctl
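Instead of watching the log by eye, the same test can be scripted. The sketch below polls `systemctl is-active` until the unit is active again or a timeout expires. The function name `wait_until_active`, the 5-second poll, and the 120-second default timeout are assumptions; with `set daemon 60`, monit should notice the stopped process within about one cycle.

```shell
#!/bin/bash
# Sketch: wait until a systemd unit is active again, or give up after
# a timeout. Intended to be run right after stopping the unit by hand.
wait_until_active() {
    local unit=$1
    local timeout=${2:-120}   # seconds; assumed default, about two monit cycles
    local waited=0

    while ! systemctl is-active --quiet "$unit"; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "$unit still not active after ${timeout}s" >&2
            return 1
        fi
        sleep 5
        waited=$((waited + 5))
    done
    echo "$unit is active again after about ${waited}s"
}
```

Usage: `systemctl stop nginx.service && wait_until_active nginx.service`.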
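Starting monit at boot
Configure monit to start automatically when the OS boots. The command differs by distribution and version; here we show how to do it on CentOS 7.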
# systemctl list-unit-files | grep monit.service
monit.service disabled
# systemctl enable monit.service
ln -s '/usr/lib/systemd/system/monit.service' '/etc/systemd/system/multi-user.target.wants/monit.service'
# systemctl list-unit-files | grep monit.service
monit.service enabled
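When the same setup is repeated on several hosts, the check-then-enable steps above can be wrapped so that they are safe to run more than once. This is only a sketch; the function name `ensure_enabled` is hypothetical, and it uses `systemctl is-enabled` and `systemctl enable` as on CentOS 7.

```shell
#!/bin/bash
# Sketch: enable a systemd unit only if it is not enabled yet.
ensure_enabled() {
    local unit=$1
    if systemctl is-enabled --quiet "$unit"; then
        echo "$unit already enabled"
    else
        systemctl enable "$unit" && echo "$unit enabled"
    fi
}
```

Usage: `ensure_enabled monit.service` (as root).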
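Monitoring monit with Zabbix
We now have an environment that automatically restarts a process when it is detected as stopped, but if monit itself stops, nothing will notice. So here we use Zabbix to monitor monit.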
Monitoring the monit process
Item
Field | Value |
---|---|
Name | Process monit daemon running |
Type | Zabbix agent |
Key | proc.num[monit] |
Type of information | Numeric (integer) |
Trigger
Field | Value |
---|---|
Name | Process monit daemon down |
Expression | {Zabbix server:proc.num[monit].last()}=0 |
Severity | Warning |
Monitoring the monit log
Item
Field | Value |
---|---|
Name | Process monit daemon running |
Type | Zabbix agent |
Key | log[/var/log/monit.log,error] |
Type of information | Log |
Trigger
Field | Value |
---|---|
Name | Process monit daemon error |
Expression | (({Zabbix server:log[/var/log/monit.log,error].regexp(error)})#0)&({Zabbix server:log[/var/log/monit.log,error].nodata(300)}=0) |
Severity | Warning |
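Before relying on the log item, it can help to check locally what a plain `error` match would pick up in the monit log. The sketch below just counts matching lines with `grep`. The function name `count_monit_errors` is hypothetical, and this only approximates what the Zabbix agent collects; it is not the agent's own matching logic.

```shell
#!/bin/bash
# Sketch: count lines containing "error" in the monit log, as a local
# sanity check of what an "error" pattern would match.
count_monit_errors() {
    local logfile=${1:-/var/log/monit.log}   # path from /etc/monit.d/logging
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps that from looking like a failure.
    grep -c 'error' "$logfile" || true
}
```

Usage: `count_monit_errors` or `count_monit_errors /path/to/copy-of-monit.log`.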