Deploying Prometheus + AlertManager with Docker

g66x

Last modified: 2023-12-20 09:34:28

License: CC 4.0 BY-SA

Tags: cloud native, prometheus

First published: 2023-12-20 09:33:41

Article link: https://blog.csdn.net/weixin_42697921/article/details/135099864

Prometheus configuration

# Create the directory
mkdir -p /usr/local/soft/docker/prometheus
cd /usr/local/soft/docker/prometheus

# Create the configuration file
vi prometheus.yml

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).


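# Scrape configurations: one job for Prometheus itself and one for a node target.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['127.0.0.1:9091']

  - job_name: 'node'

    static_configs:
    # Replace `ip` with the address of the host that exposes node metrics on port 9100.
    - targets: ['ip:9100']

# Alertmanager instances that Prometheus sends firing alerts to.
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # Replace `ip` with the address of the host running Alertmanager.
      - ip:9093

# Alert rule files, read from the mounted configuration directory.
rule_files:
  - /etc/prometheus/rule/*.yml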

Start Prometheus

docker run -d -p 9091:9091 --name prom \
  -v /usr/local/soft/docker/prometheus/:/etc/prometheus/ \
  prom/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --web.enable-lifecycle \
  --web.listen-address="0.0.0.0:9091"

Open http://ip:9091 in a browser (replace ip with the host address) to reach the Prometheus web page.
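
Whether both scrape jobs are being collected can also be checked from a script rather than the web page. The following is only a sketch under stated assumptions: it assumes Prometheus is reachable at http://127.0.0.1:9091 (substitute your own address) and reads the `/api/v1/targets` HTTP API endpoint. The helper names `summarize_targets` and `fetch_targets` and the sample payload are invented for this illustration.

```python
import json
import urllib.request

# Assumption: Prometheus is reachable at this address; replace with your host.
PROMETHEUS_URL = "http://127.0.0.1:9091"


def summarize_targets(payload):
    """Return (job, instance, health) tuples from an /api/v1/targets response."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (t["labels"].get("job"), t["labels"].get("instance"), t.get("health"))
        for t in targets
    ]


def fetch_targets(base_url=PROMETHEUS_URL, timeout=5):
    """Query the Prometheus HTTP API for the currently active scrape targets."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=timeout) as resp:
        return json.load(resp)


# Hand-written sample shaped like the API response, so the parsing runs offline.
sample = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "prometheus", "instance": "127.0.0.1:9091"}, "health": "up"},
            {"labels": {"job": "node", "instance": "ip:9100"}, "health": "down"},
        ]
    },
}
for job, instance, health in summarize_targets(sample):
    print(f"{job:<12} {instance:<18} {health}")
```

Against a running container, `summarize_targets(fetch_targets())` should list the `prometheus` and `node` jobs defined in prometheus.yml together with their current health.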

alertmanager

mkdir -p /usr/local/soft/docker/alertmanager/
cd /usr/local/soft/docker/alertmanager/
vi alertmanager.yml

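# Global configuration: the resolve timeout, SMTP settings, API URLs for the notification channels, and so on.
global:
  # How long to wait without updates before an alert is declared resolved
  resolve_timeout: 5m
# Routing configuration: defines how alerts are dispatched. Routes form a tree that is matched depth-first, left to right.
route:
  # Labels used to group incoming alerts together.
  # Alerts that share the same values for the labels listed in group_by are merged into a single notification for the receiver.
  group_by: ['alertname']
  # How long to wait before sending the first notification for a new group
  group_wait: 30s
  # How long to wait before notifying about new alerts added to a group that has already been notified
  group_interval: 5m
  # How long to wait before re-sending a notification that has already been sent, usually about 3 hours or more.
  # The 30s used here is far shorter than that and only suits quick testing.
  repeat_interval: 30s
  # Name of the receiver
  receiver: 'web.hook'
# Receivers for alert notifications, for example email, wechat, slack, or webhook
receivers:
# Name of the receiver
- name: 'web.hook'
  # webhook URL
  webhook_configs:
  - url: 'http://ip:9111/alertmanager/hook'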
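
To make the effect of group_by more concrete, here is a small illustration of how alerts with the same alertname end up in one bucket and therefore in one notification. It is a simplified model for explanation only, not Alertmanager's actual implementation; the function name `group_alerts` and the sample alerts are hypothetical.

```python
from collections import defaultdict

GROUP_BY = ["alertname"]  # mirrors group_by in alertmanager.yml


def group_alerts(alerts, group_by=GROUP_BY):
    """Bucket alerts by the values of the group_by labels (illustration only)."""
    groups = defaultdict(list)
    for alert in alerts:
        # The group key is the tuple of values for the configured labels.
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


# Hypothetical alerts: two share an alertname, one does not.
alerts = [
    {"labels": {"alertname": "node-up", "instance": "host-a:9100"}},
    {"labels": {"alertname": "node-up", "instance": "host-b:9100"}},
    {"labels": {"alertname": "disk-full", "instance": "host-a:9100"}},
]
for key, members in group_alerts(alerts).items():
    print(key, len(members))
```

With group_by set to alertname only, the two node-up alerts from different instances fall into the same group, while disk-full forms its own group.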

Start Alertmanager

docker run --name alertmanager -d -p 9093:9093 \
 -v /usr/local/soft/docker/alertmanager/alertmanager.yml:/etc/alertmanager/alertmanager.yml \
  prom/alertmanager:v0.25.0
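
Once the container is up, the route and receiver can be exercised without waiting for Prometheus by pushing a hand-made alert to Alertmanager. The sketch below assumes Alertmanager is reachable at http://127.0.0.1:9093 (substitute your own address) and uses its `/api/v2/alerts` endpoint. The alert name `manual-test` and the helper names are made up for this example.

```python
import json
import urllib.request

# Assumption: Alertmanager is reachable here; replace with your host address.
ALERTMANAGER_URL = "http://127.0.0.1:9093"


def build_test_alert(alertname="manual-test", severity="1"):
    """Build the JSON body for POST /api/v2/alerts: a list of alert objects."""
    return [
        {
            "labels": {"alertname": alertname, "severity": severity},
            "annotations": {"content": "hello world"},
        }
    ]


def send_alerts(alerts, base_url=ALERTMANAGER_URL, timeout=5):
    """POST the alerts to Alertmanager and return the HTTP status code."""
    req = urllib.request.Request(
        f"{base_url}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


# Only build and show the body here; sending requires a running Alertmanager.
body = build_test_alert()
print(json.dumps(body, indent=2))
```

Calling `send_alerts(body)` while the container is running should make the alert appear on the Alertmanager page and, after group_wait, be forwarded to the web.hook receiver.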

Create an alert rule

mkdir -p /usr/local/soft/docker/prometheus/rule
vi /usr/local/soft/docker/prometheus/rule/node_up.yml

groups:
- name: node-up
  rules:
  - alert: node-up
    expr: up{job="node"} == 0
    for: 10s
    labels:
      severity: "1"
      team: node
    annotations:
      #summary: "{{ $labels.instance }} has been down for more than 15s"
      #description: hello world
      content: "hello world"
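
Prometheus reads its rule files at startup, so a rule file added afterwards takes effect only after a reload or a restart. Because the container was started with --web.enable-lifecycle, a reload can be triggered with an HTTP POST to `/-/reload`. The sketch below assumes Prometheus is reachable at http://127.0.0.1:9091 (substitute your own address); the helper names are made up for this example.

```python
import urllib.request

# Assumption: Prometheus is reachable here; replace with your host address.
PROMETHEUS_URL = "http://127.0.0.1:9091"


def build_reload_request(base_url=PROMETHEUS_URL):
    """Build the POST request for the lifecycle reload endpoint."""
    return urllib.request.Request(f"{base_url}/-/reload", data=b"", method="POST")


def reload_prometheus(base_url=PROMETHEUS_URL, timeout=10):
    """Ask Prometheus to re-read prometheus.yml and the rule files."""
    with urllib.request.urlopen(build_reload_request(base_url), timeout=timeout) as resp:
        return resp.status


# Only build and show the request here; sending requires a running Prometheus.
req = build_reload_request()
print(req.get_method(), req.full_url)
```

After `reload_prometheus()` succeeds, the node-up rule should be listed on the Alerts page of the Prometheus web UI.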

Simple webhook test

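from flask import Flask, request
import json

app = Flask(__name__)

# The path must match the webhook_configs url in alertmanager.yml
@app.route('/alertmanager/hook', methods=['POST'])
def webhook():
    # Alertmanager posts the notification as a JSON body
    data = request.get_json()
    result = json.dumps(data)
    print("Received alert:")
    print(result)
    return 'OK'

if __name__ == '__main__':
    # Listen on all interfaces on port 9111, the port used in alertmanager.yml
    app.run(debug=True,host='0.0.0.0', port=9111)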
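
The raw JSON printed by the test server is verbose. As a possible next step, the alerts in the payload can be reduced to one line each. The sketch below relies only on the `alerts` list and its `status`, `labels`, and `annotations` fields from the Alertmanager webhook payload; the function name `summarize_payload` and the sample payload are hypothetical, and real payloads carry more keys.

```python
def summarize_payload(payload):
    """Return one line per alert from an Alertmanager webhook payload."""
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        lines.append(
            f"[{alert.get('status', 'unknown')}] "
            f"{labels.get('alertname', '?')} on {labels.get('instance', '?')}: "
            f"{annotations.get('content', '')}"
        )
    return lines


# Hand-written sample with only the fields used above.
sample = {
    "status": "firing",
    "receiver": "web.hook",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "node-up", "instance": "ip:9100", "job": "node"},
            "annotations": {"content": "hello world"},
        }
    ],
}
for line in summarize_payload(sample):
    print(line)
```

Inside the Flask handler, `summarize_payload(data)` could replace the `json.dumps` call when a shorter log line is preferred.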