Prometheus + Alertmanager Installation Handbook

Installing Prometheus


Official site: https://prometheus.io/download/

GitHub: https://github.com/prometheus/prometheus/releases

cd /tmp

wget https://github.com/prometheus/prometheus/releases/download/v2.10.0/prometheus-2.10.0.linux-amd64.tar.gz

tar zxvf prometheus-2.10.0.linux-amd64.tar.gz

cp -R prometheus-2.10.0.linux-amd64 /usr/local/prometheus

cd /usr/local/prometheus
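The download commands in this guide differ only in the component name and version, so the URL can be built from those two values. The following is a minimal sketch: the helper name `release_url` is hypothetical, and it assumes the linux-amd64 build and the GitHub release URL layout seen in the wget command above.

```shell
#!/bin/sh
# release_url COMPONENT VERSION
# Prints the release tarball URL, following the URL layout used in this guide.
# Hypothetical helper; assumes the linux-amd64 build.
release_url() {
    component=$1
    version=$2
    printf 'https://github.com/prometheus/%s/releases/download/v%s/%s-%s.linux-amd64.tar.gz\n' \
        "$component" "$version" "$component" "$version"
}

# Example (commented out so that sourcing this snippet has no side effects):
# wget "$(release_url prometheus 2.10.0)"
```

Calling `release_url node_exporter 0.18.1` yields the node_exporter URL used later in this guide.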

Default port: 9090

nohup ./prometheus --config.file=prometheus.yml > /var/log/prometheus.log 2>&1 &
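Because the process is started in the background, it is worth confirming that it actually came up. The sketch below polls an HTTP endpoint until it answers. The function name `wait_healthy` and its defaults are hypothetical; it assumes `curl` is installed and that Prometheus serves its `/-/healthy` endpoint on the default port.

```shell
#!/bin/sh
# wait_healthy URL [ATTEMPTS] [DELAY_SECONDS]
# Returns 0 as soon as URL answers with a success status, 1 if it never does.
# Hypothetical helper; requires curl.
wait_healthy() {
    url=$1
    attempts=${2:-10}
    delay=${3:-1}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if curl --silent --fail --max-time 2 --output /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example (commented out so that sourcing this snippet has no side effects):
# wait_healthy http://localhost:9090/-/healthy && echo "Prometheus is up"
```

If the check fails, /var/log/prometheus.log is the first place to look.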

Open the port in the firewall

firewall-cmd --zone=public --add-port=9090/tcp --permanent

firewall-cmd --reload

Installing node_exporter


GitHub: https://github.com/prometheus/node_exporter/releases

cd /tmp

wget https://github.com/prometheus/node_exporter/releases/download/v0.18.1/node_exporter-0.18.1.linux-amd64.tar.gz

tar zxvf node_exporter-0.18.1.linux-amd64.tar.gz

cp -R node_exporter-0.18.1.linux-amd64 /usr/local/node_exporter

cd /usr/local/node_exporter

Default port: 9100

nohup ./node_exporter  > /var/log/node_exporter.log 2>&1 &

Open the port in the firewall

firewall-cmd --zone=public --add-port=9100/tcp --permanent

firewall-cmd --reload

Edit the Prometheus configuration file prometheus.yml so that it scrapes both Prometheus itself and node_exporter, for example:

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['prometheusHostIp:9090']
  - job_name: 'server'
    static_configs:
      - targets: ['node_exporterHostIp:9100']
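When there are many hosts, the job stanzas can be generated rather than typed by hand. This is a minimal sketch: the function name `scrape_job` is hypothetical, and the addresses in the example are placeholders, not values from this guide.

```shell
#!/bin/sh
# scrape_job JOB_NAME TARGET...
# Prints one scrape_configs entry, indented to sit under "scrape_configs:".
# Hypothetical helper for illustration.
scrape_job() {
    job=$1
    shift
    targets=""
    for t in "$@"; do
        # Append each target as a single-quoted, comma-separated item.
        targets="${targets:+$targets, }'$t'"
    done
    printf "  - job_name: '%s'\n" "$job"
    printf "    static_configs:\n"
    printf "      - targets: [%s]\n" "$targets"
}

# Example (commented out so that sourcing this snippet has no side effects):
# { echo 'scrape_configs:'; scrape_job prometheus localhost:9090; \
#   scrape_job server 10.0.0.1:9100 10.0.0.2:9100; }
```

The output is meant to be pasted into prometheus.yml; Prometheus has to be restarted or reloaded before it picks up the change.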

Installing Alertmanager


GitHub: https://github.com/prometheus/alertmanager/releases

cd /tmp

wget https://github.com/prometheus/alertmanager/releases/download/v0.17.0/alertmanager-0.17.0.linux-amd64.tar.gz
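The Alertmanager steps stop after the download. By analogy with the Prometheus and node_exporter sections, and because the startup script later in this guide refers to /usr/local/alertmanager, the remaining steps are presumably to unpack the archive and copy it into place. The sketch below wraps those two steps; the function name `install_release` is hypothetical.

```shell
#!/bin/sh
# install_release TARBALL DEST
# Unpacks TARBALL next to itself and copies the extracted directory to DEST.
# Hypothetical helper; assumes the archive contains a single top-level
# directory named after the tarball, as the release archives above do.
install_release() {
    tarball=$1
    dest=$2
    workdir=$(dirname "$tarball")
    name=$(basename "$tarball" .tar.gz)
    tar -xzf "$tarball" -C "$workdir" || return 1
    cp -R "$workdir/$name" "$dest"
}

# Example (commented out so that sourcing this snippet has no side effects):
# install_release /tmp/alertmanager-0.17.0.linux-amd64.tar.gz /usr/local/alertmanager
# cd /usr/local/alertmanager
```

As with the plain `cp -R` used earlier, if the destination directory already exists the extracted directory is copied inside it rather than over it.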

Default port: 9093

nohup ./alertmanager  >> /var/log/alertmanager.log 2>&1 &

Open the port in the firewall

firewall-cmd --zone=public --add-port=9093/tcp --permanent

firewall-cmd --reload

Boot-time startup scripts


# Boot-time startup scripts and how to register them
# Place each script in /etc/rc.d/init.d

#================ node_exporter ==================
#!/bin/bash
# chkconfig: - 85 15
# description: start node_exporter at boot
nohup /usr/local/node_exporter/node_exporter > /var/log/node_exporter.log 2>&1 &
#=================== END =======================

#================ prometheus ==================
#!/bin/bash
# chkconfig: - 85 15
# description: start Prometheus at boot
nohup /usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml > /var/log/prometheus.log 2>&1 &
#=================== END =======================

#================ Alertmanager ==================
#!/bin/bash
# chkconfig: - 85 15
# description: start Alertmanager at boot
# The config path is given explicitly because an init script does not run from /usr/local/alertmanager.
nohup /usr/local/alertmanager/alertmanager --config.file=/usr/local/alertmanager/alertmanager.yml >> /var/log/alertmanager.log 2>&1 &
#=================== END =======================

#========== shell commands =================
chmod +x XXX.sh
chkconfig --add XXX.sh
chkconfig XXX.sh on
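Scripts registered with chkconfig are invoked with an argument such as `start` or `stop`, which the scripts above ignore. The following is a minimal sketch of how start and stop could be told apart using a PID file. The function names `svc_start` and `svc_stop` and the PID file location are hypothetical and not part of the original scripts.

```shell
#!/bin/sh
# svc_start PIDFILE LOGFILE COMMAND [ARGS...]
# Starts COMMAND in the background unless the recorded PID is still alive.
svc_start() {
    pidfile=$1
    logfile=$2
    shift 2
    if [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        echo "already running" >&2
        return 0
    fi
    nohup "$@" >> "$logfile" 2>&1 &
    echo $! > "$pidfile"
}

# svc_stop PIDFILE
# Sends SIGTERM to the recorded PID, if any, and removes the PID file.
svc_stop() {
    pidfile=$1
    [ -f "$pidfile" ] || return 0
    kill "$(cat "$pidfile")" 2>/dev/null || true
    rm -f "$pidfile"
}

# Dispatch as an init script would (commented out so that sourcing this
# snippet has no side effects; the PID file path is an assumption):
# case "$1" in
#   start) svc_start /var/run/node_exporter.pid /var/log/node_exporter.log \
#              /usr/local/node_exporter/node_exporter ;;
#   stop)  svc_stop /var/run/node_exporter.pid ;;
#   *)     echo "Usage: $0 {start|stop}" >&2; exit 1 ;;
# esac
```

The PID file keeps a second `start` from launching a duplicate process and gives `stop` something to act on.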

Basic Prometheus configuration template


#============= Basic Prometheus configuration template ===================
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  scrape_timeout: 10s # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - scheme: http
      static_configs:
        - targets:
            - "localhost:9093"

rule_files:
  - '/usr/local/prometheus/rules/hostBase.yml'

# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label job=<job_name> to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'server_ansible & Jenkins'
    static_configs:
      - targets: ['hostIp:9100']
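The template points `rule_files` at hostBase.yml, but this guide does not show that file. The sketch below writes an illustrative rule file. The alert name, expression, threshold and wording are assumptions, chosen only so that the `severity` label and the `summary` and `description` annotations read by the notification template are present. The function name `write_host_rules` is hypothetical.

```shell
#!/bin/sh
# write_host_rules PATH
# Writes an example alerting-rule file to PATH.
# The rule below is illustrative, not the author's actual hostBase.yml.
write_host_rules() {
    cat > "$1" <<'EOF'
groups:
  - name: hostBase
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.instance }} (job {{ $labels.job }}) has been unreachable for more than 1 minute."
EOF
}

# Example (commented out so that sourcing this snippet has no side effects):
# mkdir -p /usr/local/prometheus/rules
# write_host_rules /usr/local/prometheus/rules/hostBase.yml
```

The promtool binary shipped in the Prometheus archive can validate the result with `promtool check rules <file>`, and the main configuration with `promtool check config prometheus.yml`.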

Alertmanager configuration


alertmanager.yml

global:
  resolve_timeout: 5m
  wechat_api_url: "https://qyapi.weixin.qq.com/cgi-bin/" # no need to change this for now; copy it as is
  wechat_api_secret: "your api_secret"
  wechat_api_corp_id: "your corp_id"

templates:
  - '/usr/local/alertmanager/template/wechat.tmpl'

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 20m
  receiver: wechat

receivers:
  - name: "wechat"
    wechat_configs:
      - send_resolved: true
        to_user: "@all"
        corp_id: "same corp_id as above"
        #to_party: "1"
        agent_id: "1000002"
        api_secret: "same api_secret as above"
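Indentation mistakes in this file are easy to make, so it helps to validate it before restarting Alertmanager. The sketch below assumes the amtool binary from the Alertmanager release archive sits next to the configuration in the install directory. The function name `check_am_config` is hypothetical.

```shell
#!/bin/sh
# check_am_config [DIR]
# Runs "amtool check-config" on DIR/alertmanager.yml.
# Returns 2 if amtool is not found in DIR.
# Hypothetical helper; DIR defaults to the install path used in this guide.
check_am_config() {
    dir=${1:-/usr/local/alertmanager}
    if [ ! -x "$dir/amtool" ]; then
        echo "amtool not found in $dir" >&2
        return 2
    fi
    "$dir/amtool" check-config "$dir/alertmanager.yml"
}

# Example (commented out so that sourcing this snippet has no side effects):
# check_am_config && echo "alertmanager.yml looks valid"
```

The check also reports which template files were found, which catches a wrong path in the `templates` section.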

WeChat alert notification template
{{ define "wechat.default.message" }} {{ if gt (len .Alerts.Firing) 0 -}}
Alerts Firing:
{{ range .Alerts.Firing }}
========start==========
Alert source: prometheus_alert
Severity: {{ .Labels.severity }}
Alert name: {{ .Labels.alertname }}
Instance: {{ .Labels.instance }}
Summary: {{ .Annotations.summary }}
Description: {{ .Annotations.description }}
Started at: {{ .StartsAt.Format "2006-01-02 15:04:05" }}
=========end==========
{{- end }} {{- end }} {{ if gt (len .Alerts.Resolved) 0 -}}
Alerts Resolved:
{{ range .Alerts.Resolved }}
========start==========
Alert source: prometheus_alert
Severity: {{ .Labels.severity }}
Alert name: {{ .Labels.alertname }}
Instance: {{ .Labels.instance }}
Summary: {{ .Annotations.summary }}
Started at: {{ .StartsAt.Format "2006-01-02 15:04:05" }}
Resolved at: {{ .EndsAt.Format "2006-01-02 15:04:05" }}
=========end==========
{{- end }} {{- end }} {{- end }}
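To try the template without waiting for a real alert, a hand-made alert can be posted to Alertmanager. This is a sketch under assumptions: it uses the `/api/v2/alerts` endpoint, which should be checked against the installed Alertmanager version, and the alert name, label values and function names are made up for the test.

```shell
#!/bin/sh
# test_alert_payload
# Prints a JSON array with one alert carrying the labels and annotations
# that the template reads. All values are illustrative.
test_alert_payload() {
    cat <<'EOF'
[{"labels":{"alertname":"TestAlert","severity":"warning","instance":"localhost:9100"},
  "annotations":{"summary":"Test alert","description":"Manual test of the WeChat template"}}]
EOF
}

# send_test_alert [ALERTMANAGER_URL]
# Posts the payload to Alertmanager; requires curl.
send_test_alert() {
    am_url=${1:-http://localhost:9093}
    test_alert_payload | curl --silent --fail --max-time 5 -X POST \
        -H 'Content-Type: application/json' \
        --data-binary @- "$am_url/api/v2/alerts"
}

# Example (commented out so that sourcing this snippet has no side effects):
# send_test_alert
```

With the route settings shown earlier, the notification should be sent once the 10s `group_wait` has elapsed.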