This article walks through how to deploy monitoring for a Docker Swarm cluster. The approach is simple, quick to set up, and practical.
Docker
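Docker Swarm has decisively lost to Kubernetes (K8s), but K8s is still complex and harder to get started with than Docker Swarm. For a small team that needs container orchestration, Docker Swarm remains a reasonable choice.
Docker Swarm does have one long-standing unresolved problem: services deployed in Docker Swarm cannot obtain the client's request IP, so if your application needs that IP, Docker Swarm cannot meet the requirement.
For details, see the issue "Unable to retrieve user's IP address in docker swarm mode".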
The overall approach is InfluxDB + Grafana + cAdvisor. cAdvisor collects the data, with one cadvisor service instance deployed on every node; InfluxDB stores the data; Grafana visualizes it.
Host | IP |
---|---|
master(manager) | 192.168.1.60 |
node1(worker) | 192.168.1.61 |
node2(worker) | 192.168.1.62 |
For this demonstration the master node hosts both the monitoring data store and the visualization service; in practice a worker node usually takes on that job.
Initialize the cluster on the master machine by running docker swarm init --advertise-addr {MASTER-IP}
    [root@master ~]# docker swarm init --advertise-addr 192.168.1.60
    Swarm initialized: current node (138n5rwjz83y8goyzepp1cdo7) is now a manager.

    To add a worker to this swarm, run the following command:

        docker swarm join --token SWMTKN-1-67je7chylnpyt0s4k1ee63rhxgh0qijiah9gadvcr7i6uab909-535nf6qu6v7b8dscc0plghr9j 192.168.1.60:2377

    To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions
On each worker node, run the command printed by the manager to join the cluster: docker swarm join --token SWMTKN-1-67je7chylnpyt0s4k1ee63rhxgh0qijiah9gadvcr7i6uab909-535nf6qu6v7b8dscc0plghr9j 192.168.1.60:2377
A manager always prints a hint like this after initializing the cluster. The command here is only an example; run the one printed by your own initialization.
    [root@node1 ~]# docker swarm join --token SWMTKN-1-67je7chylnpyt0s4k1ee63rhxgh0qijiah9gadvcr7i6uab909-535nf6qu6v7b8dscc0plghr9j 192.168.1.60:2377
    This node joined a swarm as a worker.

    [root@node2 ~]# docker swarm join --token SWMTKN-1-67je7chylnpyt0s4k1ee63rhxgh0qijiah9gadvcr7i6uab909-535nf6qu6v7b8dscc0plghr9j 192.168.1.60:2377
    This node joined a swarm as a worker.
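If the join hint has scrolled away, the manager can print the token again. The following is a minimal sketch, assuming it runs on the manager; the function name and the MANAGER_ADDR variable are hypothetical, and the address defaults to the one used in this walkthrough.

```shell
# Address workers should connect to; 2377 is the port shown in the init output.
# MANAGER_ADDR is a hypothetical variable, not something Docker reads itself.
MANAGER_ADDR="${MANAGER_ADDR:-192.168.1.60:2377}"

# Print the full `docker swarm join` command for a new worker.
print_worker_join_command() {
    # `docker swarm join-token -q worker` prints only the token.
    token="$(docker swarm join-token -q worker)" || return 1
    printf 'docker swarm join --token %s %s\n' "$token" "$MANAGER_ADDR"
}

# Usage (on the manager):
#   print_worker_join_command
```

Running docker swarm join-token worker without -q prints the complete join command directly; the function form is only convenient when the address has to be overridden.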
docker node ls
    [root@master ~]# docker node ls
    ID                            HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
    138n5rwjz83y8goyzepp1cdo7 *   master     Ready    Active         Leader           18.09.8
    q03by75rqur63lx36cmordf11     node1      Ready    Active                          18.09.8
    6shdf5ej4b5u7x877bg9nyjk3     node2      Ready    Active
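The same check can be scripted rather than read by eye. This is a sketch under the assumption that it runs on a manager; the function name is hypothetical, and it relies on the .Hostname and .Status placeholders of docker node ls --format.

```shell
# List the hostnames of swarm nodes whose STATUS is not "Ready".
not_ready_nodes() {
    # --format takes a Go template; one "hostname status" pair per line.
    docker node ls --format '{{.Hostname}} {{.Status}}' |
        awk '$2 != "Ready" { print $1 }'
}

# Usage (on the manager):
#   bad="$(not_ready_nodes)"
#   if [ -z "$bad" ]; then echo "all nodes ready"; else echo "not ready: $bad"; fi
```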
The cluster is now set up. Next, deploy the services.
docker stack deploy -c docker-compose-monitor.yml monitor
    [root@master ~]# docker stack deploy -c docker-compose-monitor.yml monitor
    Creating network monitor_default
    Creating service monitor_influx
    Creating service monitor_grafana
    Creating service monitor_cadvisor
Contents of docker-compose-monitor.yml:
    version: '3'

    services:
      influx:
        image: influxdb
        volumes:
          - influx:/var/lib/influxdb
        deploy:
          replicas: 1
          placement:
            constraints:
              - node.role == manager

      grafana:
        image: grafana/grafana
        ports:
          - 0.0.0.0:80:3000
        volumes:
          - grafana:/var/lib/grafana
        depends_on:
          - influx
        deploy:
          replicas: 1
          placement:
            constraints:
              - node.role == manager

      cadvisor:
        image: google/cadvisor
        hostname: '{{.Node.Hostname}}'
        command: -logtostderr -docker_only -storage_driver=influxdb -storage_driver_db=cadvisor -storage_driver_host=influx:8086
        volumes:
          - /:/rootfs:ro
          - /var/run:/var/run:rw
          - /sys:/sys:ro
          - /var/lib/docker/:/var/lib/docker:ro
        depends_on:
          - influx
        deploy:
          mode: global

    volumes:
      influx:
        driver: local
      grafana:
        driver: local
docker service ls
    [root@master ~]# docker service ls
    ID             NAME               MODE         REPLICAS   IMAGE                    PORTS
    qth5tssf2sm1   monitor_cadvisor   global       3/3        google/cadvisor:latest
    p2vbxe7ic175   monitor_grafana    replicated   1/1        grafana/grafana:latest   *:80->3000/tcp
    von1rpeqq7vj   monitor_influx     replicated   1/1        influxdb:latest
At this point the services are deployed: each of the three machines runs one cadvisor, and grafana and influxdb run on the master node.
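To confirm from a script that every service has reached its desired replica count, something like the following could be used. It is a rough sketch, assuming it runs on a manager; the function name is hypothetical, and it parses the "running/desired" value that docker service ls reports in its REPLICAS column.

```shell
# Print the names of services whose running replica count differs from the desired count.
under_replicated_services() {
    # Each line looks like "monitor_cadvisor 3/3".
    docker service ls --format '{{.Name}} {{.Replicas}}' |
        awk '{ split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}

# Usage (on the manager):
#   under_replicated_services
```

No output means every listed service is running as many tasks as it asked for.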
Check the containers running on the master machine with docker ps
    [root@master ~]# docker ps
    CONTAINER ID   IMAGE                    COMMAND                  CREATED       STATUS       PORTS      NAMES
    55965fdf13a3   grafana/grafana:latest   "/run.sh"                3 hours ago   Up 3 hours   3000/tcp   monitor_grafana.1.l9uh0ov7ltk7q2yollmk4x1v9
    0bf544c7d81c   google/cadvisor:latest   "/usr/bin/cadvisor -…"   3 hours ago   Up 3 hours   8080/tcp   monitor_cadvisor.138n5rwjz83y8goyzepp1cdo7.l53vufoivp0oe8tyy14nh0jof
    3ce050f0483e   influxdb:latest          "/entrypoint.sh infl…"   3 hours ago   Up 3 hours   8086/tcp   monitor_influx.1.vraeh8ektium1j1jd27qvq1au
    [root@master ~]#
This matches expectations. Next, look at the cadvisor container's logs with docker logs -f 0bf544c7d81c
    [root@master ~]# docker logs -f 0bf544c7d81c
    W0209 09:32:15.730951 1 manager.go:349] Could not configure a source for OOM detection, disabling OOM events: open /dev/kmsg: no such file or directory
    E0209 09:33:15.783705 1 memory.go:94] failed to write stats to influxDb - {"error":"database not found: \"cadvisor\""}
    E0209 09:34:15.818661 1 memory.go:94] failed to write stats to influxDb - {"error":"database not found: \"cadvisor\""}
    E0209 09:35:16.009312 1 memory.go:94] failed to write stats to influxDb - {"error":"database not found: \"cadvisor\""}
    E0209 09:36:16.027113 1 memory.go:94] failed to write stats to influxDb - {"error":"database not found: \"cadvisor\""}
    E0209 09:37:16.107051 1 memory.go:94] failed to write stats to influxDb - {"error":"database not found: \"cadvisor\""}
    E0209 09:38:16.215684 1 memory.go:94] failed to write stats to influxDb - {"error":"database not found: \"cadvisor\""}
    E0209 09:39:16.305772 1 memory.go:94] failed to write stats to influxDb - {"error":"database not found: \"cadvisor\""}
The log shows an error on every write, because the influx container does not yet contain a database named cadvisor. The next step is to create that cadvisor database inside the influx container by running the following command on the master machine.
docker exec `docker ps | grep -i influx | awk '{print $1}'` influx -execute 'CREATE DATABASE cadvisor'
Of course, you can also do this step by step:
Find the influxdb container
Enter the influxdb container and start the influx client
Create the database
These steps are not demonstrated here.
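For readers who want the step-by-step variant spelled out, one possible sketch follows. It assumes the stack is named monitor, as in the deploy command in this article, and an InfluxDB 1.x image whose influx client accepts -execute; the function names are hypothetical.

```shell
# Step 1: find the ID of the running influx container on this machine.
# The name filter assumes the stack was deployed as "monitor".
find_influx_container() {
    docker ps -q -f name=monitor_influx | head -n 1
}

# Steps 2 and 3: run the influx client inside that container and create the database.
# Interactively this would be `docker exec -it <id> influx`, followed by
# typing CREATE DATABASE cadvisor at the prompt.
create_cadvisor_db() {
    cid="$(find_influx_container)"
    [ -n "$cid" ] || { echo "influx container not found" >&2; return 1; }
    docker exec "$cid" influx -execute 'CREATE DATABASE cadvisor'
}

# Usage (on the master machine):
#   create_cadvisor_db
```

Filtering by name instead of grepping the whole docker ps output avoids matching an unrelated container whose image or command happens to contain "influx".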
At this point data is being collected and stored in influxdb. Next, configure grafana to visualize it.
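Before moving on to grafana, it can be worth checking that cadvisor is really writing into the new database. A minimal sketch, assuming the monitor stack name and an InfluxDB 1.x influx client; the function name is hypothetical.

```shell
# List the measurements stored in the cadvisor database.
list_cadvisor_measurements() {
    cid="$(docker ps -q -f name=monitor_influx | head -n 1)"
    [ -n "$cid" ] || { echo "influx container not found" >&2; return 1; }
    # -database selects the database; -execute runs one query and exits.
    docker exec "$cid" influx -database cadvisor -execute 'SHOW MEASUREMENTS'
}

# Usage (on the master machine):
#   list_cadvisor_measurements
```

The cadvisor log shown earlier attempts a write roughly once a minute, so an empty result right after creating the database is not necessarily a problem; a non-empty list suggests the pipeline is working.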
Because docker-compose-monitor.yml publishes grafana on port 80, grafana is reachable directly at the master machine's IP: open 192.168.1.60 in a browser.
The default grafana account is admin and the default password is admin. The first login prompts for a password change; for this demo, setting the new password to admin again is fine.
After logging in, set up the data source.
Open the left-hand menu and go to the data source configuration page.
Add a new data source. I had already added one, so an influxdb data source is shown in the list.
Select the influxdb data source type.
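Fill in the influxdb details. Set Name to influx, because the grafana template used later expects a data source with that name. Set URL to http://influx:8086. This value is not fixed: in this docker-compose-monitor.yml the influxdb service is named influx and listens on port 8086 (the default), so the address is influx:8086. The Database field should be cadvisor, matching the database created earlier.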
The data source configuration is now complete.
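The same data source can in principle be created without the UI, through grafana's HTTP API. The sketch below assumes the default admin:admin credentials and the master IP used in this walkthrough; GRAFANA_URL, GRAFANA_AUTH and the function names are hypothetical, and the JSON field names follow grafana's data source API as I understand it, so check them against your grafana version.

```shell
# Hypothetical defaults taken from this walkthrough; adjust for your setup.
GRAFANA_URL="${GRAFANA_URL:-http://192.168.1.60}"
GRAFANA_AUTH="${GRAFANA_AUTH:-admin:admin}"

# Build the JSON body describing the influx data source.
datasource_payload() {
    printf '{"name":"%s","type":"influxdb","access":"proxy","url":"%s","database":"%s"}\n' \
        "influx" "http://influx:8086" "cadvisor"
}

# POST the payload to /api/datasources using basic auth.
create_datasource() {
    datasource_payload |
        curl -sS -X POST -u "$GRAFANA_AUTH" \
            -H 'Content-Type: application/json' \
            -d @- "$GRAFANA_URL/api/datasources"
}

# Usage:
#   create_datasource
```

Keeping the payload in its own function makes it easy to print and inspect before anything is sent to grafana.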
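A template is used here only to demonstrate the result. If its look does not suit you, explore grafana and adjust it yourself.
First, open the grafana dashboard marketplace and download the template: https://grafana.com/grafana/dashboards/4637/reviews
Select the dashboard menu, then choose import to import the template.
Open the dashboard and the template's panels are already visible.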
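That completes a basic Docker Swarm cluster monitoring setup.
More advanced topics may be covered in a later blog post, for example sending a DingTalk or Slack alert when a value such as CPU usage reaches a threshold.
Once the approach is clear, putting it into practice should pose few problems.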