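Monitoring website status with a shell script
Preface: I haven't blogged in a long time. When I came back I cleaned out almost all of my old posts; from now on I'll just write something whenever I feel like it, which suits me fine.
The script monitors website/tomcat status by HTTP status code: a response of 200 or 302 counts as healthy, and anything else is taken to mean that tomcat/nginx/LB/Haproxy/apache is down. The implementation is as follows: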
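As a minimal sketch of that rule, the check below treats 200 and 302 as healthy and everything else, including an empty result when the server does not answer at all, as down. The function name `is_healthy` and the variable `OK_CODES` are made up for illustration and are not part of the script in step 2.

```shell
#!/bin/bash
# Status codes treated as "site is up" (taken from the rule above).
OK_CODES="200 302"

# is_healthy CODE: succeed if CODE is one of OK_CODES, fail otherwise.
# Hypothetical helper, used only for this sketch.
is_healthy() {
    local code=$1 ok
    for ok in $OK_CODES; do
        if [ "$code" = "$ok" ]; then
            return 0
        fi
    done
    return 1
}
```

Something like `is_healthy "$status" || echo "down"` would then be enough to drive the alert. Keeping the accepted codes in one variable makes it easy to add, say, 301 later if a site needs it.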
1. Create a site list file and put every address you want to monitor into http_site
vim http_site
### Nginx site begin ###
http://192.168.129.86:38020
http://192.168.129.86:38021
### Nginx site end ###
### LB site begin ###
http://192.168.2.30:38020
http://192.168.2.30:38024/38025task
### LB site end ###
### Web site begin ###
http://192.168.129.91:8030
http://192.168.129.93:8030
### Web site end ###
### Task site begin ###
http://192.168.129.95:8032/38023task
http://192.168.129.95:8033/38027task
### Task site end ###
### Mobile site begin ###
http://192.168.129.92:8030
http://192.168.129.92:8040
### Mobile site end ###
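The `### ... begin ###` and `### ... end ###` lines only group the addresses for readability; the script skips every line that starts with `#` as well as blank lines. A small sketch of that filtering, assuming a hypothetical helper named `list_sites` (this variant also tolerates leading spaces before the `#`):

```shell
#!/bin/bash
# list_sites FILE: print one URL per line, skipping comment lines
# (optionally indented) and blank lines. Hypothetical helper name.
list_sites() {
    grep -v -E '^[[:space:]]*(#|$)' "$1"
}
```

Running `list_sites http_site` before scheduling the check is a quick way to confirm that the file contains exactly the addresses you expect.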
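2. Write the shell script that does the monitoring. It requests each site with curl and uses the returned status code as the test condition; if any site returns something other than 200/302, it sends an alert mail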
vim check_site.sh
#!/bin/bash
# Check every URL listed in http_site and mail an alert if any of them
# does not answer with 200 or 302.
base=/root/script/check_http
mysite=$base/http_site
# Take the date and time once so both output files share the same stamp.
today=$(date +%Y-%m-%d)
now=$(date +%T)
historyfile=$base/history/$today/$now
failurefile=${historyfile}_failure
mkdir -p "$base/history/$today"

# Skip comment lines and blank lines in the site list.
for site in $(grep -v -E "^#|^$" "$mysite")
do
    # Request headers only and let curl print the status code itself.
    # This works for HTTP/1.0, HTTP/1.1 and HTTP/2; curl prints 000 on no response.
    status=$(curl -s -I -o /dev/null -w '%{http_code}' --connect-timeout 3 -m 5 "$site")
    echo "###########################" >>"$historyfile"
    if [[ $status == 200 || $status == 302 ]]
    then
        echo "http_site $site Access Successful" >>"$historyfile"
    else
        echo "http_site $site Access Failure" >>"$historyfile"
    fi
done

# Build the report and send the mail only if at least one site failed.
if grep -q "Access Failure" "$historyfile" 2>/dev/null
then
    {
        echo -e "\n\nThe following tomcat is not started !!!\n"
        echo -e "Please check the services !!!\n"
        echo -e "#############################################\n"
        grep "Access Failure" "$historyfile"
        echo -e "\n#############################################"
    } >> "$failurefile"
    mail -s "SFA_Liby_Tomcat_Check !!!" baiyongjie@winchannel.net misterbyj@163.com tangzhiyu@winchannel.net < "$failurefile"
fi
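The status-code extraction is the part most worth testing on its own, since it decides whether an alert goes out. A sketch, assuming the response headers arrive on standard input: the hypothetical helper `parse_status` accepts any `HTTP/` status line rather than only `HTTP/1.1`, and strips the carriage return that ends each header line.

```shell
#!/bin/bash
# parse_status: read HTTP response headers on stdin and print the status
# code of the last status line; prints an empty line if there is none.
# Hypothetical helper name, used only for this sketch.
parse_status() {
    tr -d '\r' | awk '/^HTTP\// { code = $2 } END { print code }'
}
```

It could be fed from the same curl call the script uses, for example `curl -s -I --connect-timeout 3 -m 5 "$site" | parse_status`, or from a saved header dump when debugging a false alarm.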
3. Configure the mailbox used for alerts
vim /etc/mail.rc
set hold
set append
set ask
set crt
set dot
set keep
set emptybox
set indentprefix="> "
set quote
set sendcharsets=iso-8859-1,utf-8
set showname
set showto
set newmail=nopoll
set autocollapse
ignore received in-reply-to message-id references
ignore mime-version content-transfer-encoding
fwdretain subject date from to
set bsdcompat
set from=15600970600@163.com
set smtp=smtp.163.com
set smtp-auth-user=15600970600@163.com smtp-auth-password=Password smtp-auth=login
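Of all these lines, only the last three (`from`, `smtp`, and the `smtp-auth-*` settings) matter for sending through an external SMTP server. If any of them is missing, the alert mail will not go out, so a quick sanity check can be useful. The sketch below assumes a hypothetical helper `check_mailrc` and only looks for the setting names on `set` lines; it does not verify that the credentials actually work.

```shell
#!/bin/bash
# check_mailrc FILE: report which SMTP-related settings are missing from
# FILE. Returns 0 if all are present, 1 otherwise.
# Hypothetical helper; it checks names only, not whether login succeeds.
check_mailrc() {
    local file=$1 key missing=0
    for key in from smtp smtp-auth-user smtp-auth-password; do
        if ! grep -q -E "^[[:space:]]*set[[:space:]](.*[[:space:]])?${key}=" "$file"; then
            echo "missing: $key"
            missing=1
        fi
    done
    return $missing
}
```

After `check_mailrc /etc/mail.rc` passes, sending one message by hand, for example `echo "test" | mail -s "test" you@example.com` (the address is a placeholder), confirms that the server accepts the login.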
4. Add a scheduled task that runs the script every 5 minutes
crontab -e
*/5 * * * * /bin/bash /root/script/check_http/check_site.sh
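If you would rather add the entry from a script than through the editor, one possible approach is to filter the existing crontab and append the line only when it is not already there. This is a sketch; the helper name `ensure_cron_line` is an assumption for illustration.

```shell
#!/bin/bash
# The entry from this step, unchanged.
CRON_LINE='*/5 * * * * /bin/bash /root/script/check_http/check_site.sh'

# ensure_cron_line: read a crontab on stdin and write it to stdout with
# CRON_LINE appended, unless an identical line is already present.
# Hypothetical helper name.
ensure_cron_line() {
    local current
    current=$(cat)
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    if ! printf '%s\n' "$current" | grep -q -F -x -- "$CRON_LINE"; then
        printf '%s\n' "$CRON_LINE"
    fi
}
```

It would be used as `crontab -l 2>/dev/null | ensure_cron_line | crontab -`; running that twice leaves a single entry.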
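5. Test the script. It has been running for several days since I finished it and works quite well, so I'm sharing it here
To verify it, I stopped a few tomcat instances at 23:12 on June 14. On its next run the script detected that they were not running, wrote a _failure file recording them, and sent the mail, so the alert works as intended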
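The alert path can also be exercised without stopping a real tomcat, by feeding it a hand-written history file. The sketch below pulls the report-building step into a hypothetical function `build_failure_report`, which prints the same mail body the script writes to the _failure file and returns non-zero when there is nothing to report.

```shell
#!/bin/bash
# build_failure_report HISTORYFILE: print the alert body to stdout.
# Prints nothing and returns 1 when the file contains no failures.
# Hypothetical helper name, used only for this sketch.
build_failure_report() {
    local history=$1
    grep -q "Access Failure" "$history" 2>/dev/null || return 1
    printf '\n\nThe following tomcat is not started !!!\n\n'
    printf 'Please check the services !!!\n\n'
    printf '#############################################\n\n'
    grep "Access Failure" "$history"
    printf '\n#############################################\n'
}
```

Pointing it at a test file that contains one "Access Failure" line shows exactly what the mail would say, without touching any service.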