- Download Prometheus and Alertmanager from the official Prometheus website.
- Alertmanager configuration file: alertmanager.yml
route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  receiver: 'email.notice'
receivers:
  - name: 'web.hook'
    webhook_configs:
      - url: 'http://127.0.0.1:5001/'
  - name: 'email.notice'
    email_configs:
      - to: '[email protected]'
        smarthost: 'smtp.exmail.qq.com:465'
        from: '[email protected]'
        auth_username: '[email protected]'
        auth_password: 'xxxxxx'
        require_tls: false
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
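
As a rough illustration of what `group_by` and `inhibit_rules` in the configuration above do, here is a minimal Python sketch. It is not Alertmanager's implementation: the function names `group_alerts` and `is_inhibited` and the sample alerts are hypothetical, and the logic assumes only the single inhibit rule shown above (a firing `critical` alert suppresses a `warning` alert when the `equal` labels match, with a missing label treated as an empty value).

```python
from collections import defaultdict

# Hypothetical sample alerts, each represented as a plain dict of labels.
ALERTS = [
    {"alertname": "HighCPU", "severity": "critical", "instance": "node1"},
    {"alertname": "HighCPU", "severity": "warning", "instance": "node1"},
    {"alertname": "DiskFull", "severity": "warning", "instance": "node2"},
]


def group_alerts(alerts, group_by):
    """Bucket alerts by the values of the group_by labels.

    A label that is absent from an alert is treated as an empty string.
    """
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(alert.get(label, "") for label in group_by)
        groups[key].append(alert)
    return dict(groups)


def is_inhibited(target, alerts, equal):
    """Return True if a 'critical' alert suppresses this 'warning' alert.

    Mirrors the single rule in the config: source severity 'critical',
    target severity 'warning', and all `equal` labels must match.
    """
    if target.get("severity") != "warning":
        return False
    return any(
        source.get("severity") == "critical"
        and all(source.get(l, "") == target.get(l, "") for l in equal)
        for source in alerts
    )


groups = group_alerts(ALERTS, ["alertname"])
inhibited = [
    is_inhibited(a, ALERTS, ["alertname", "dev", "instance"]) for a in ALERTS
]
```

With these sample alerts, both `HighCPU` alerts land in one group and would be sent as one batch, while the `HighCPU` warning on `node1` is suppressed because a critical alert with the same `alertname` and `instance` is firing.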
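send_resolved: whether to also send a notification once the problem has been resolved.
route: the key section. Incoming alerts enter here and are matched to find the strategy that should be used to send them out.
receiver: the top-level receiver, i.e. the default receiver. When an incoming alert matches none of the child routes, this receiver is used.
group_by: the labels by which alerts are grouped.
group_wait: how long to wait before sending the first notification for a group, so that more alerts can go out in a single batch.
group_interval: how long to wait between successive batches of notifications for the same group.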
repeat_i