1. Introduction to NRPE
Nagios can monitor remote hosts in several ways, including SNMP, NRPE, SSH and NSCA. This article covers monitoring remote Linux hosts through NRPE.
NRPE (Nagios Remote Plugin Executor) is a daemon that runs check commands on a remote server. It lets the Nagios monitoring host trigger check commands through an agent installed on the remote host and returns the results to the monitoring host. Its execution overhead is much lower than that of SSH-based checks, and because a check needs no system account or similar credentials on the remote host, it is also more secure than the SSH-based approach.
2. Installing and configuring the monitored host
1) Add the nagios user first
# useradd -s /sbin/nologin nagios
2) NRPE depends on nagios-plugins, so install that first
[root@node3 ~]# ls
nrpe-2.15.tar.gz
nagios-plugins-1.5.tar.gz
[root@node3 ~]# tar -xf nrpe-2.15.tar.gz
Install the build environment
[root@node3 ~]# yum install gcc make -y
# tar zxf nagios-plugins-1.5.tar.gz
# cd nagios-plugins-1.5
# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
# make all
# make install
3) Install NRPE
[root@node3 nrpe-2.15]# yum install openssl openssl-devel -y
# tar -zxvf nrpe-2.15.tar.gz
# cd nrpe-2.15
# ./configure --with-nrpe-user=nagios \
--with-nrpe-group=nagios \
--with-nagios-user=nagios \
--with-nagios-group=nagios \
--enable-command-args \
--enable-ssl # enabling SSL requires the openssl and openssl-devel packages
# make all
# make install-plugin
# make install-daemon
# make install-daemon-config
4) Configure NRPE
# vim /usr/local/nagios/etc/nrpe.cfg
log_facility=daemon
pid_file=/var/run/nrpe.pid
server_address=192.168.0.3
server_port=5666
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=192.168.0.4
command_timeout=60
connection_timeout=300
debug=0
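The directives above are largely self-explanatory, so adjust them as needed for your environment. One that deserves a specific note is allowed_hosts, which defines the IP addresses of the monitoring hosts that this machine accepts connections from.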
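Before starting the daemon it can help to read back what the file actually contains. The following is a minimal sketch, assuming nrpe.cfg uses plain key=value lines as shown above; the function name nrpe_cfg_get is hypothetical and is not part of NRPE.

```shell
# Print the value of one directive from an nrpe.cfg-style file.
# Usage: nrpe_cfg_get <file> <directive>
# nrpe_cfg_get is a hypothetical helper, not an NRPE tool.
nrpe_cfg_get() (
    file=$1
    key=$2
    # Skip comment lines, split each line at the first "=",
    # and print the value of the first matching directive.
    awk -v key="$key" '
        /^[ \t]*#/ { next }
        {
            pos = index($0, "=")
            if (pos == 0) next
            name = substr($0, 1, pos - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            if (name == key) { print substr($0, pos + 1); found = 1; exit }
        }
        END { exit found ? 0 : 1 }
    ' "$file"
)

# Example (path taken from the article):
#   nrpe_cfg_get /usr/local/nagios/etc/nrpe.cfg allowed_hosts
```

The function returns a non-zero status when the directive is missing, so it can be used in an if test before the service is started.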
5) Start NRPE
# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
To make the NRPE service easier to start, the following can be saved as the /etc/init.d/nrped script:
#!/bin/bash
# chkconfig: 2345 88 12
# description: NRPE DAEMON
NRPE=/usr/local/nagios/bin/nrpe
NRPECONF=/usr/local/nagios/etc/nrpe.cfg
case "$1" in
    start)
        echo -n "Starting NRPE daemon..."
        $NRPE -c $NRPECONF -d
        echo " done."
        ;;
    stop)
        echo -n "Stopping NRPE daemon..."
        pkill -u nagios nrpe
        echo " done."
        ;;
    restart)
        $0 stop
        sleep 2
        $0 start
        ;;
    *)
        echo "Usage: $0 start|stop|restart"
        ;;
esac
exit 0
Alternatively, create an nrpe file in the /etc/xinetd.d directory so that NRPE runs as a non-standalone service managed by xinetd. The file contents are as follows:
service nrpe
{
    flags           = REUSE
    socket_type     = stream
    wait            = no
    user            = nagios
    group           = nagios
    server          = /usr/local/nagios/bin/nrpe
    server_args     = -c /usr/local/nagios/etc/nrpe.cfg -i
    log_on_failure  += USERID
    disable         = no
}
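In this case, the NRPE process is started by restarting xinetd.
6) Define the objects that remote hosts are allowed to monitor
On the monitored host, each service or resource that can be checked through NRPE must be defined as a command in nrpe.cfg. The syntax for defining a command is: command[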
command[check_rootdisk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 40% -c 20%
# check_sensors requires the sensors tool to be installed
command[check_sensors]=/usr/local/nagios/libexec/check_sensors
command[check_users]=/usr/local/nagios/libexec/check_users -w 10 -c 20
command[check_load]=/usr/local/nagios/libexec/check_load -w 10,8,5 -c 20,18,15
command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1
command[check_sda2]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda2
command[check_zombies]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_all_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
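Because the monitoring host can only call commands that are defined here by name, it is worth listing those names before writing the service definitions. This is a small sketch under the assumption that every definition starts at the beginning of a line in the command[name]=... form used above; both function names are hypothetical.

```shell
# List the names of all active command[...] definitions in an
# nrpe.cfg-style file, one per line. Commented-out definitions are
# skipped because they do not start with "command[".
# list_nrpe_commands and nrpe_command_defined are hypothetical helpers.
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Succeed only if the given command name is defined in the file.
nrpe_command_defined() {
    list_nrpe_commands "$1" | grep -qxF -- "$2"
}

# Example (path taken from the article):
#   list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg
#   nrpe_command_defined /usr/local/nagios/etc/nrpe.cfg check_swap || echo "not defined"
```

Running the second function for each name used in a check_command on the Nagios server catches mismatched names before they show up as failed checks.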
7) Start the service
[root@node3 ~]# chkconfig --add nrped
[root@node3 ~]# chkconfig --list nrped
nrped           0:off   1:off   2:on    3:on    4:on    5:on    6:off
[root@node3 ~]# service nrped start
Starting NRPE daemon... done.
[root@node3 ~]# service nrped restart
Stopping NRPE daemon... done.
Starting NRPE daemon... done.
[root@node3 ~]# netstat -tlnp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address         Foreign Address       State    PID/Program name
tcp        0      0 0.0.0.0:22            0.0.0.0:*             LISTEN   1212/sshd
tcp        0      0 127.0.0.1:25          0.0.0.0:*             LISTEN   1296/master
tcp        0      0 192.168.0.3:5666      0.0.0.0:*             LISTEN   35058/nrpe
tcp        0      0 :::22                 :::*                  LISTEN   1212/sshd
tcp        0      0 ::1:25                :::*                  LISTEN   1296/master
3. Configuring the monitoring host (the Nagios server)
1) Install NRPE
# tar -zxvf nrpe-2.15.tar.gz
# cd nrpe-2.15
# ./configure --with-nrpe-user=nagios \
--with-nrpe-group=nagios \
--with-nagios-user=nagios \
--with-nagios-group=nagios \
--enable-command-args \
--enable-ssl
# make all
# make install-plugin
2) Define how to monitor the remote host and its services:
Remote Linux hosts are monitored through NRPE with the check_nrpe plugin, whose syntax is as follows:
check_nrpe -H
Test whether the remote host responds
[root@node4 libexec]# ./check_nrpe -H 192.168.0.3
NRPE v2.15
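Nagios plugins, check_nrpe included, report their state through the exit status: 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. The sketch below wraps an arbitrary plugin call and prints the state name next to its output; nagios_status_name and run_check are hypothetical helper names used only for illustration.

```shell
# Map a plugin exit status to its Nagios state name.
# nagios_status_name and run_check are hypothetical helpers.
nagios_status_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Run a plugin command, print "<STATE>: <plugin output>",
# and return the plugin's own exit status.
run_check() {
    output=$("$@") && code=0 || code=$?
    printf '%s: %s\n' "$(nagios_status_name "$code")" "$output"
    return "$code"
}

# Example (path and address taken from the article):
#   run_check /usr/local/nagios/libexec/check_nrpe -H 192.168.0.3
```

When testing by hand, this makes it easier to tell a connection problem apart from a check that ran and reported a warning.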
Usage example 1:
Define the command for monitoring swap on remote Linux hosts:
[root@node4 objects]# vim /usr/local/nagios/etc/objects/commands.cfg # add the following command
define command{
    command_name    check_swap_nrpe
    command_line    $USER1$/check_nrpe -H "$HOSTADDRESS$" -c "check_swap"
}
Define the swap service for the remote Linux hosts:
define service
{
    use                     generic-service
    host_name               linuxserver1,linuxserver2
    hostgroup_name          linux-servers
    service_description     SWAP
    check_command           check_swap_nrpe
    normal_check_interval   30
}
Usage example 2:
To make the command definition above more general, it can be rewritten as follows:
Define the command for monitoring remote Linux hosts:
define command
{
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H "$HOSTADDRESS$" -c $ARG1$
}
$USER1$/check_nrpe -H "$HOSTADDRESS$" -c $ARG1$ $ARG2$
Define the swap service for the remote Linux hosts:
define service
{
    use                     generic-service
    host_name               linuxserver1,linuxserver2
    hostgroup_name          linux-servers
    service_description     SWAP
    check_command           check_nrpe!check_swap
    normal_check_interval   30
}
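Usage example 3:
If you also want to pass arguments to the remote Linux host when monitoring it, an approach like the following can be used. For the arguments to be accepted, NRPE on the monitored host must be built with --enable-command-args (as above), nrpe.cfg must set dont_blame_nrpe=1, and the remote command definition must reference $ARG1$ and $ARG2$.
Define the command for monitoring swap on remote Linux hosts:
define command
{
    command_name    check_swap_nrpe
    command_line    $USER1$/check_nrpe -H "$HOSTADDRESS$" -c "check_swap" -a $ARG1$ $ARG2$
}
Define the swap service for the remote Linux hosts:
define service
{
    use                     generic-service
    host_name               linuxserver1,linuxserver2
    hostgroup_name          linux-servers
    service_description     SWAP
    check_command           check_swap_nrpe!20!10
    normal_check_interval   30
}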
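In a check_command value, Nagios treats the text before the first "!" as the command name and each following "!"-separated field as $ARG1$, $ARG2$ and so on. The sketch below only illustrates that splitting rule; split_check_command is a hypothetical helper, not something Nagios provides.

```shell
# Split a check_command value such as "check_swap_nrpe!20!10" into
# the command name and its $ARGn$ values, one per line.
# split_check_command is a hypothetical helper for illustration only.
split_check_command() {
    printf '%s\n' "$1" | awk -F'!' '{
        print "command_name=" $1
        for (i = 2; i <= NF; i++) print "ARG" (i - 1) "=" $i
    }'
}

# Example:
#   split_check_command 'check_swap_nrpe!20!10'
```

For the service above, this shows 20 arriving as $ARG1$ and 10 as $ARG2$, which command_line then passes to check_nrpe after -a.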
In practice:
[root@node4 objects]# vim /usr/local/nagios/etc/objects/commands.cfg # add the following command
define command{
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H "$HOSTADDRESS$" -c $ARG1$
}
[root@node4 ~]# cd /usr/local/nagios/etc/objects/
[root@node4 objects]# vim commands.cfg
[root@node4 objects]# cp -p windows.cfg linuxserver.cfg
[root@node4 objects]# ll linuxserver.cfg
-rw-rw-r--. 1 nagios nagios 4019 Feb 23 21:49 linuxserver.cfg
Define the host and services according to the command definitions on the monitored host node3:
[root@node4 objects]# vim linuxserver.cfg
define host{
    use         linux-server        ; Inherit default values from a template
    host_name   linuxserver         ; The name we're giving to this host
    alias       My linux Server     ; A longer name associated with the host
    address     192.168.0.3         ; IP address of the host
}

###############################################################################
#
# HOST GROUP DEFINITIONS
#
###############################################################################

# Define a hostgroup for Windows machines
# All hosts that use the windows-server template will automatically be a member of this group
#define hostgroup{
#    hostgroup_name  windows-servers ; The name of the hostgroup
#    alias           windows Servers ; Long name of the group
#    }

###############################################################################
#
# SERVICE DEFINITIONS
#
###############################################################################

# Logged-in users
define service{
    use                     generic-service
    host_name               linuxserver
    service_description     CHECK USERS
    check_command           check_nrpe!check_users
}
# System load
define service{
    use                     generic-service
    host_name               linuxserver
    service_description     Load
    check_command           check_nrpe!check_load
}
# Disk usage on /dev/sda1
define service{
    use                     generic-service
    host_name               linuxserver
    service_description     SDA1
    check_command           check_nrpe!check_sda1
}
# Disk usage on /dev/sda2
define service{
    use                     generic-service
    host_name               linuxserver
    service_description     SDA2
    check_command           check_nrpe!check_sda2
}
# Zombie processes
define service{
    use                     generic-service
    host_name               linuxserver
    service_description     Zombie
    check_command           check_nrpe!check_zombies
}
# Total number of processes
define service{
    use                     generic-service
    host_name               linuxserver
    service_description     Total_procs
    check_command           check_nrpe!check_all_procs
}
# Swap usage
define service{
    use                     generic-service
    host_name               linuxserver
    service_description     Swap
    check_command           check_nrpe!check_swap
}
# Disk usage on /
define service{
    use                     generic-service
    host_name               linuxserver
    service_description     Rootdisk
    check_command           check_nrpe!check_rootdisk
}
# Hardware sensors
define service{
    use                     generic-service
    host_name               linuxserver
    service_description     Sensor
    check_command           check_nrpe!check_sensors
}
Then edit nagios.cfg and add a cfg_file entry
[root@node4 ~]# vim /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/linuxserver.cfg
Check the syntax:
[root@node4 objects]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
………………………….
Checking timeperiods...
Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
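The two summary lines at the end of the output are the ones that matter before a restart. As a rough sketch, assuming the output keeps the "Total Warnings:" and "Total Errors:" format shown above, the check can be scripted like this; config_summary is a hypothetical helper name.

```shell
# Read `nagios -v` output on stdin and print a one-line summary.
# Exit status: 0 if no errors, 1 if errors were reported,
# 2 if no "Total Errors:" line was found at all.
# config_summary is a hypothetical helper, not part of Nagios.
config_summary() {
    awk -F': *' '
        /^Total Warnings:/ { warnings = $2 + 0 }
        /^Total Errors:/   { errors = $2 + 0; seen = 1 }
        END {
            if (!seen) exit 2
            printf "warnings=%d errors=%d\n", warnings, errors
            exit (errors > 0) ? 1 : 0
        }
    '
}

# Example (paths taken from the article):
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | config_summary \
#       && service nagios restart
```

With this arrangement the restart only happens when the verification output reports zero errors.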
Then restart the nagios service
[root@node4 ~]# clear
[root@node4 ~]# service nagios restart
Running configuration check...done.
Stopping nagios: .done.
Starting nagios: done.
[root@node4 ~]#
Then open the Nagios web interface. The newly added host appears there and will be checked after a short wait.
Once the checks have completed, every service should show OK.