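When using the masterha_manager script that ships with MHA for automatic MySQL master failover, we need a way to keep the masterha_manager monitoring process running at all times. supervisor solves this well: it turns an ordinary command-line process into a background daemon, monitors its state, and restarts it automatically if it exits unexpectedly.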
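Below are the key deployment steps and the management commands.
1. Installing supervisor:
sudo pip install supervisor
2. Configuring supervisor:
mkdir -p /etc/supervisor/conf.d/
Generate the configuration file:
# echo_supervisord_conf > /etc/supervisor/supervisord.conf
This step may fail with the following error: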
Traceback (most recent call last):
  File "/usr/bin/echo_supervisord_conf", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2655, in <module>
    working_set.require(__requires__)
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 648, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 546, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: meld3>=0.6.5
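A quick search suggested the cause is related to the Python or pip version. Installing meld3 once from source fixed it, in three simple steps:
git clone https://github.com/Supervisor/meld3
cd meld3
python setup.py install
View the configuration file: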
cat /etc/supervisor/supervisord.conf
[unix_http_server]
file=/tmp/supervisor.sock ; the path to the socket file
[supervisord]
logfile=/tmp/supervisord.log ; main log file; default $CWD/supervisord.log
logfile_maxbytes=50MB ; max main logfile bytes b4 rotation; default 50MB
logfile_backups=10 ; # of main logfile backups; 0 means none, default 10
loglevel=info ; log level; default info; others: debug,warn,trace
pidfile=/tmp/supervisord.pid ; supervisord pidfile; default supervisord.pid
nodaemon=false ; start in foreground if true; default false
minfds=1024 ; min. avail startup file descriptors; default 1024
minprocs=200 ; min. avail process descriptors;default 200
user=dbadmin ; default is current user, required if root
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
[include]
files = /etc/supervisor/conf.d/*.conf
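Configuration notes:
1. The generated defaults can be used for everything else, but user must be changed to the account set up for passwordless SSH login (dbadmin here). Otherwise masterha_manager fails to start, because all of MHA's passwordless logins use the dbadmin account.
2. The configuration for a managed process can go directly into a [program:xxx] section of the main supervisor configuration file, but it is better to give each process its own file for easier management, and to point the files option in the [include] section at the directory holding those files.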
cat /etc/supervisor/conf.d/masterha_manager_test.conf
[program:masterha_manager_test]
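command=masterha_manager --conf=/etc/mha/test.cnf --ignore_last_failover ; start command
stdout_logfile=/tmp/manager.log ; where stdout is logged
stderr_logfile=/tmp/manager.log ; where stderr is logged
autostart=true ; start automatically when supervisord starts
autorestart=true ; restart automatically when the program exits (true restarts on any exit, not only abnormal ones)
startsecs=10 ; treat the program as successfully started if it stays up for 10 seconds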
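Because each managed process gets its own file under conf.d, the file can be produced from a small template when several MHA clusters need the same settings. The following is only a minimal sketch: write_program_conf is a hypothetical helper, the per-program log path /tmp/<name>.log is an assumption (the example above uses /tmp/manager.log), and the demo writes to a throwaway directory instead of /etc/supervisor/conf.d.

```shell
#!/bin/sh
# Hypothetical helper: write one supervisor [program:x] file per MHA cluster.
# Arguments: <conf.d directory> <program name> <MHA config path>
write_program_conf() {
    conf_dir="$1"
    name="$2"
    mha_cnf="$3"
    mkdir -p "$conf_dir" || return 1
    # Same keys as the hand-written example; the log path is an assumption.
    cat > "$conf_dir/$name.conf" <<EOF
[program:$name]
command=masterha_manager --conf=$mha_cnf --ignore_last_failover
stdout_logfile=/tmp/$name.log
stderr_logfile=/tmp/$name.log
autostart=true
autorestart=true
startsecs=10
EOF
}

# Demo against a throwaway directory, not the real /etc/supervisor/conf.d.
demo_dir=$(mktemp -d)
write_program_conf "$demo_dir" masterha_manager_test /etc/mha/test.cnf
cat "$demo_dir/masterha_manager_test.conf"
```

Pointing the first argument at /etc/supervisor/conf.d would write the real file, which supervisord picks up through the [include] section.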
Start supervisord and confirm that both it and masterha_manager are running:
# supervisord -c /etc/supervisor/supervisord.conf
# ps -ef | grep super
dbadmin 11892 1 0 02:56 ? 00:00:00 /usr/bin/python /usr/bin/supervisord
root 13340 31610 0 02:56 pts/0 00:00:00 grep super
# supervisorctl status
masterha_manager_test RUNNING pid 11912, uptime 0:03:08
# ps -ef | grep master
root 1343 31610 0 02:59 pts/0 00:00:00 grep master
root 3228 1 0 2016 ? 00:01:33 /usr/libexec/postfix/master
dbadmin 11912 11892 0 02:56 ? 00:00:00 perl /usr/local/bin/masterha_manager --conf=/etc/mha/test.cnf --ignore_last_failover
As the output shows, masterha_manager is now up and running.
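The same check can be scripted instead of read by eye, for example from a cron job or a monitoring probe. This is a rough sketch, assuming the status line format shown above (program name in the first column, state in the second); is_running is a hypothetical helper name, and the demo feeds it the sample line rather than calling supervisorctl.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named program is RUNNING.
# Reads `supervisorctl status` output on stdin.
is_running() {
    awk -v name="$1" '$1 == name && $2 == "RUNNING" { found = 1 } END { exit !found }'
}

# Demo on the status line shown above. In real use the input would be:
#   supervisorctl status | is_running masterha_manager_test
sample='masterha_manager_test RUNNING pid 11912, uptime 0:03:08'
if printf '%s\n' "$sample" | is_running masterha_manager_test; then
    echo "masterha_manager_test is running"
fi
```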
Kill the masterha_manager process directly to simulate an abnormal exit:
# ps -ef | grep master
root 1343 31610 0 02:59 pts/0 00:00:00 grep master
root 3228 1 0 2016 ? 00:01:33 /usr/libexec/postfix/master
dbadmin 11912 11892 0 02:56 ? 00:00:00 perl /usr/local/bin/masterha_manager --conf=/etc/mha/test.cnf --ignore_last_failover
# kill -9 11912
# ps -ef | grep master
dbadmin 1707 11892 5 03:30 ? 00:00:00 perl /usr/local/bin/masterha_manager --conf=/etc/mha/test.cnf --ignore_last_failover
root 2054 31610 0 03:30 pts/0 00:00:00 grep master
root 3228 1 0 2016 ? 00:01:33 /usr/libexec/postfix/master
As the output shows, supervisor has started the masterha_manager monitoring process again.
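What confirms the restart is that the masterha_manager PID changed while its parent PID (11892, supervisord) stayed the same. The comparison could be sketched as follows, assuming `ps -ef` output in the format shown above; manager_pid is a hypothetical helper, and the two input lines are copied from the session rather than taken from a live system.

```shell
#!/bin/sh
# Hypothetical helper: print the PID (second column) of the masterha_manager
# perl process found in `ps -ef` output read from stdin.
manager_pid() {
    awk '/perl .*masterha_manager/ { print $2; exit }'
}

# The two ps lines from the session above, before and after kill -9.
before='dbadmin 11912 11892 0 02:56 ? 00:00:00 perl /usr/local/bin/masterha_manager --conf=/etc/mha/test.cnf --ignore_last_failover'
after='dbadmin 1707 11892 5 03:30 ? 00:00:00 perl /usr/local/bin/masterha_manager --conf=/etc/mha/test.cnf --ignore_last_failover'

old_pid=$(printf '%s\n' "$before" | manager_pid)
new_pid=$(printf '%s\n' "$after" | manager_pid)

# A non-empty PID that differs from the old one means the process was restarted.
if [ -n "$new_pid" ] && [ "$new_pid" != "$old_pid" ]; then
    echo "restarted: $old_pid -> $new_pid"
fi
```

On a live host the input would come from `ps -ef | manager_pid`, captured once before the kill and once after it.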
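Common management commands:
supervisord: starts Supervisord initially, then starts and manages the processes defined in the configuration;
supervisorctl stop (start, restart) xxx: stops (starts, restarts) a single process (xxx);
supervisorctl reread: only re-reads the latest configuration files; it does not restart any process;
supervisorctl reload: restarts supervisord with the latest configuration, stopping all existing processes and starting them again under the new configuration;
supervisorctl update: applies the latest configuration by starting new programs and restarting changed ones; processes whose configuration has not changed are left running.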
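After editing a file under conf.d, a cautious sequence is reread followed by update, so that unrelated processes keep running. A small sketch under stated assumptions: apply_config_changes is a hypothetical wrapper, and the SUPERVISORCTL variable is an invented override that lets the sequence be dry-run without a running supervisord.

```shell
#!/bin/sh
# Hypothetical wrapper around the two supervisorctl subcommands.
# SUPERVISORCTL is an assumed override for dry runs; it defaults to
# the real supervisorctl binary.
apply_config_changes() {
    ctl="${SUPERVISORCTL:-supervisorctl}"
    "$ctl" reread || return 1   # re-read config files, restart nothing
    "$ctl" update               # start or restart only added or changed programs
}

# Dry run: print the subcommands instead of executing them.
SUPERVISORCTL=echo apply_config_changes
```

Calling apply_config_changes without the override runs the real commands against the supervisord instance named in the [supervisorctl] section.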
Prepare the startup script supervisord.sh and register it as a service:
# chmod +x supervisord.sh
# mv supervisord.sh /etc/init.d/supervisord
# chkconfig --add supervisord
# chkconfig --level 345 supervisord on
cat /etc/rc.d/init.d/supervisord
#!/bin/sh
#
# /etc/rc.d/init.d/supervisord
#
# Supervisor is a client/server system that
# allows its users to monitor and control a
# number of processes on UNIX-like operating
# systems.
#
# chkconfig: - 64 36
# description: Supervisor Server
# processname: supervisord

# Source init functions
. /etc/rc.d/init.d/functions

prog="supervisord"
prog_bin="/usr/bin/supervisord"
PIDFILE="/tmp/supervisord.pid"
CONFILE="/etc/supervisor/supervisord.conf"

start()
{
    echo -n $"Starting $prog: "
    daemon $prog_bin -c $CONFILE --pidfile $PIDFILE
    [ -f $PIDFILE ] && success $"$prog startup" || failure $"$prog startup"
    echo
}

stop()
{
    echo -n $"Shutting down $prog: "
    [ -f $PIDFILE ] && killproc $prog || success $"$prog shutdown"
    echo
}

case "$1" in
  start)
    start
    ;;
  stop)
    stop
    ;;
  status)
    status $prog
    ;;
  restart)
    stop
    start
    ;;
  *)
    echo "Usage: $0 {start|stop|restart|status}"
    ;;
esac
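The script above treats the mere existence of the PID file as proof that supervisord is up, which can be misleading if a stale file is left behind after a crash. One possible refinement, shown here only as a sketch, is to also check that the recorded PID belongs to a live process; pidfile_alive is a hypothetical helper and is not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the PID file exists and names a live
# process. kill -0 sends no signal; it only tests whether the process can be
# signalled, so the caller must be root or the process owner.
pidfile_alive() {
    pidfile="$1"
    [ -f "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    [ -n "$pid" ] || return 1
    kill -0 "$pid" 2>/dev/null
}

# Demo with a throwaway PID file holding this shell's own PID.
demo_pidfile=$(mktemp)
echo "$$" > "$demo_pidfile"
if pidfile_alive "$demo_pidfile"; then
    echo "process recorded in $demo_pidfile is alive"
fi
rm -f "$demo_pidfile"
```

In the init script this would be called as pidfile_alive "$PIDFILE" in place of the bare [ -f $PIDFILE ] tests.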