Prometheus AlertManager
Route alerts from Prometheus to a11ops
Connect Prometheus AlertManager to a11ops to centralize your metrics-based alerts. This integration preserves all labels, annotations, and alert metadata.
Prerequisites
- Prometheus with AlertManager installed and configured
- An a11ops workspace with an active API key
- Network access from AlertManager to api.a11ops.com (a quick reachability check is shown below)
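If you want to confirm reachability before wiring anything up, a quick check from the AlertManager host is enough. The exact HTTP status returned for a bare request doesn't matter here; what matters is that DNS resolves and a TLS connection to api.a11ops.com succeeds:
# Run on the AlertManager host. Any HTTP response (even a 404) proves the
# endpoint is reachable; a timeout or TLS error points to a firewall or proxy issue.
curl -sv -o /dev/null https://api.a11ops.com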
Setup Instructions
Step 1: Create Integration in a11ops
- Navigate to your workspace settings
- Go to “Integrations” → “Add Integration”
- Select “Prometheus AlertManager”
- Name it (e.g., “Production Prometheus”)
- Copy the generated webhook URL
Step 2: Configure AlertManager
Add a11ops as a webhook receiver in your AlertManager configuration:
# alertmanager.yml
global:
  resolve_timeout: 5m

route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 12h
  receiver: 'default'
  routes:
    - receiver: 'a11ops-critical'
      match:
        severity: 'critical'
    - receiver: 'a11ops-warning'
      match:
        severity: 'warning'

receivers:
  - name: 'default'
    webhook_configs:
      - url: 'https://api.a11ops.com/v1/webhooks/YOUR_INTEGRATION_ID'
        send_resolved: true
  - name: 'a11ops-critical'
    webhook_configs:
      - url: 'https://api.a11ops.com/v1/webhooks/YOUR_INTEGRATION_ID'
        send_resolved: true
  - name: 'a11ops-warning'
    webhook_configs:
      - url: 'https://api.a11ops.com/v1/webhooks/YOUR_INTEGRATION_ID'
        send_resolved: true
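Before reloading, it's worth validating the edited file. If amtool (which ships with AlertManager) is on your path, it can both check the syntax and show which receiver a given label set would be routed to. The file path below is an assumption; point it at wherever your alertmanager.yml actually lives.
# Validate the configuration file syntax
amtool check-config /etc/alertmanager/alertmanager.yml

# Confirm a critical alert would be routed to the a11ops-critical receiver
amtool config routes test \
  --config.file=/etc/alertmanager/alertmanager.yml \
  --verify.receivers=a11ops-critical \
  severity=critical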
Step 3: Reload AlertManager
Apply the configuration changes:
# Reload AlertManager configuration
curl -X POST http://localhost:9093/-/reload
# Or if using systemd
sudo systemctl reload alertmanager
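To confirm the running instance actually picked up the change, you can query AlertManager's status API, which includes the currently loaded configuration. Grepping for the receiver names is a quick sanity check; port 9093 and plain HTTP are assumptions based on a default local setup.
# The a11ops receivers should appear in the loaded config after a successful reload
curl -s http://localhost:9093/api/v2/status | grep -o 'a11ops-[a-z]*' | sort -u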
Example Alert Rules
High CPU Usage Alert
# prometheus-rules.yml
groups:
  - name: system
    interval: 30s
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[2m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
          team: infrastructure
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "CPU usage is above 80% (current value: {{ $value }}%)"
          runbook_url: "https://wiki.company.com/runbooks/high-cpu"
          dashboard_url: "https://grafana.company.com/d/node-exporter"

Service Down Alert
- alert: ServiceDown
  expr: up{job="api-service"} == 0
  for: 1m
  labels:
    severity: critical
    team: backend
  annotations:
    summary: "Service {{ $labels.job }} is down"
    description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."
    impact: "API requests will fail, affecting all users"
    action: "Check service logs and restart if necessary"

Disk Space Alert
- alert: DiskSpaceLow
  expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 < 15
  for: 10m
  labels:
    severity: warning
    team: infrastructure
  annotations:
    summary: "Low disk space on {{ $labels.instance }}"
    description: "Disk space on {{ $labels.instance }} is below 15% (current: {{ $value }}%)"
    filesystem: "{{ $labels.mountpoint }}"
    device: "{{ $labels.device }}"
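Prometheus only loads rule files that parse cleanly, so lint them before reloading. promtool (bundled with Prometheus) can do this; the file name below matches the snippet above.
# Lint the rule file; exits non-zero and prints the offending line on errors
promtool check rules prometheus-rules.yml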
Alert Data Mapping
a11ops automatically maps Prometheus alert data to our alert format:
| Prometheus Field | a11ops Field | Example |
|---|---|---|
| alertname | title | HighCPUUsage |
| annotations.summary | message | High CPU usage on server-01 |
| labels.severity | severity | critical |
| startsAt | timestamp | 2024-01-15T10:30:00Z |
| labels + annotations | metadata | All fields preserved |
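For reference, the body AlertManager POSTs to the webhook follows its standard webhook payload format (version 4). The trimmed example below uses illustrative label and annotation values; you can send it with curl to see how the fields above land in a11ops:
curl -X POST 'https://api.a11ops.com/v1/webhooks/YOUR_INTEGRATION_ID' \
  -H 'Content-Type: application/json' \
  -d '{
    "version": "4",
    "groupKey": "{}:{alertname=\"HighCPUUsage\"}",
    "status": "firing",
    "receiver": "a11ops-warning",
    "groupLabels": {"alertname": "HighCPUUsage"},
    "commonLabels": {"alertname": "HighCPUUsage", "severity": "warning"},
    "commonAnnotations": {"summary": "High CPU usage on server-01"},
    "externalURL": "http://alertmanager:9093",
    "alerts": [
      {
        "status": "firing",
        "labels": {"alertname": "HighCPUUsage", "severity": "warning", "instance": "server-01"},
        "annotations": {"summary": "High CPU usage on server-01", "description": "CPU usage is above 80% (current value: 87.3%)"},
        "startsAt": "2024-01-15T10:30:00Z",
        "endsAt": "0001-01-01T00:00:00Z",
        "generatorURL": "http://prometheus:9090/graph"
      }
    ]
  }'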
Advanced Configuration
Custom HTTP Headers
Add authentication or tracking headers:
webhook_configs:
  - url: 'https://api.a11ops.com/v1/webhooks/YOUR_INTEGRATION_ID'
    send_resolved: true
    http_config:
      headers:
        X-Environment: 'production'
        X-Source: 'prometheus-us-east-1'

TLS Configuration
For enhanced security with custom certificates:
webhook_configs:
  - url: 'https://api.a11ops.com/v1/webhooks/YOUR_INTEGRATION_ID'
    http_config:
      tls_config:
        insecure_skip_verify: false
        ca_file: /etc/ssl/certs/ca-certificates.crt

Retry Configuration
AlertManager retries failed webhook deliveries automatically; these settings tune how alerts are batched and how long each delivery attempt may take:
webhook_configs:
  - url: 'https://api.a11ops.com/v1/webhooks/YOUR_INTEGRATION_ID'
    max_alerts: 100  # Max alerts per webhook message
    timeout: 30s     # HTTP request timeout

Troubleshooting
Alerts not appearing in a11ops
- Check AlertManager logs:
journalctl -u alertmanager -f
- Verify the webhook URL is correct and includes your integration ID
- Ensure AlertManager can reach api.a11ops.com (check firewalls)
- Test with curl:
curl -X POST https://api.a11ops.com/v1/webhooks/YOUR_INTEGRATION_ID -d ''
Missing alert metadata
- Ensure labels and annotations are properly defined in alert rules
- Check that severity label matches expected values
- Verify template variables are resolving correctly (the amtool query below shows the labels and annotations AlertManager actually holds)
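One way to see exactly what AlertManager will forward is to dump its active alerts with amtool, including their full label and annotation sets; localhost:9093 is assumed here.
# List active alerts with all labels and annotations as JSON
amtool alert query -o json --alertmanager.url=http://localhost:9093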
Duplicate alerts
- Review group_by configuration to ensure proper deduplication
- Check repeat_interval settings
- Verify you're not sending to multiple receivers unintentionally
Successfully Integrated?
Learn more about managing alerts and reducing noise.