Datadog
Forward Datadog monitors to a11ops
Integrate Datadog monitors with a11ops to centralize alerts from your APM, infrastructure monitoring, and log management. The integration preserves all tags, priority levels, and monitor metadata.
Prerequisites
- Active Datadog account with monitor management permissions
- An a11ops workspace with an active API key
- Network access from Datadog to api.a11ops.com
Setup Instructions
Step 1: Create Integration in a11ops
- Navigate to your workspace settings
- Go to “Integrations” → “Add Integration”
- Select “Datadog”
- Name it (e.g., “Production Datadog”)
- Copy the generated webhook URL
Step 2: Create Webhook in Datadog
In Datadog, navigate to Integrations → Webhooks:
- Click “New” to create a webhook
- Name: “a11ops-alerts”
- URL: Paste your a11ops webhook URL
- Enable “Use custom payload”
- Add the custom payload template (see below)
- Save the webhook
Step 3: Configure Custom Payload
Use this payload template to ensure proper data mapping:
{
  "alert": "$EVENT_TITLE",
  "description": "$EVENT_MSG",
  "severity": "$ALERT_TRANSITION",
  "source": "datadog",
  "timestamp": "$LAST_UPDATED",
  "labels": {
    "monitor_id": "$MONITOR_ID",
    "alert_type": "$ALERT_TYPE",
    "priority": "$PRIORITY",
    "host": "$HOSTNAME",
    "environment": "$TAGS[env]",
    "service": "$TAGS[service]",
    "team": "$TAGS[team]"
  },
  "annotations": {
    "monitor_url": "$LINK",
    "event_url": "$EVENT_URL",
    "snapshot_url": "$SNAPSHOT",
    "aggregation_key": "$AGGREGATION_KEY",
    "alert_query": "$ALERT_QUERY",
    "metric_namespace": "$METRIC_NAMESPACE"
  }
}
Configuring Monitors
Add Webhook to Monitors
For each monitor you want to send to a11ops:
- Edit the monitor configuration
- In the “Notify your team” section, add @webhook-a11ops-alerts
- Configure the notification message with context
- Save the monitor
Example Monitor Message
Include helpful context in your monitor messages:
{{#is_alert}}
🚨 ALERT: {{monitor.name}}
**Issue:** {{value}} exceeds threshold of {{threshold}}
**Service:** {{service.name}}
**Environment:** {{env.name}}
**Time:** {{last_triggered_at}}
**Impact:** Customers may experience slow response times
**Action:** Check service logs and scale if necessary
📊 [View Dashboard](https://app.datadoghq.com/dashboard/abc-123)
📖 [Runbook](https://wiki.company.com/runbooks/high-latency)
{{/is_alert}}
{{#is_recovery}}
✅ RECOVERED: {{monitor.name}}
Service has returned to normal operation.
{{/is_recovery}}
@webhook-a11ops-alerts
Priority Mapping
Configure monitor priority in Datadog to map to a11ops severity levels:
| Datadog Priority | a11ops Severity | Use Case |
|---|---|---|
| P1 - Critical | critical | Service outages, data loss risk |
| P2 - High | high | Degraded performance, errors |
| P3 - Medium | medium | Warning conditions |
| P4 - Low | low | Minor issues |
| P5 - Info | info | Informational |
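The table above can be read as a plain lookup. The following is a minimal Python sketch of that mapping, not a11ops code: the function name `severity_for` and the fallback to `info` for unrecognized priorities are assumptions made for illustration.

```python
# Priority-to-severity pairs copied from the table above.
PRIORITY_TO_SEVERITY = {
    "P1": "critical",
    "P2": "high",
    "P3": "medium",
    "P4": "low",
    "P5": "info",
}


def severity_for(priority: str, default: str = "info") -> str:
    """Return the a11ops severity for a Datadog priority such as 'P2'.

    The default for unknown values is an assumption, not documented behavior.
    """
    return PRIORITY_TO_SEVERITY.get(priority.strip().upper(), default)
```

For example, `severity_for("p1")` returns `critical`, and a value outside P1-P5 falls back to the assumed default.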
Monitor Examples
APM Service Monitor
Monitor service latency and error rates:
Monitor Type: APM Metrics
Metric: trace.servlet.request.p95
Evaluation: avg(last_5m)
Alert Conditions:
- Alert threshold: > 1000ms
- Warning threshold: > 500ms
Tags:
- env:production
- service:api-gateway
- team:backend
- priority:P2
Message: Include @webhook-a11ops-alerts
Log Pattern Monitor
Alert on error log patterns:
Monitor Type: Log Alert
Query: service:payment-api status:error @error.kind:*
Alert Conditions:
- Alert when count > 100 in last 5 minutes
- Group by: @error.kind
Tags:
- env:production
- service:payment-api
- team:payments
- priority:P1
Alert includes error type and count
Composite Monitor
Combine multiple conditions:
Monitor Type: Composite
Logic: (A && B) || C
Where:
A = High CPU usage (> 80%)
B = High memory usage (> 90%)
C = Service health check failing
Tags:
- env:production
- priority:P1
- team:infrastructure
Triggers when resource exhaustion is detected
Tagging Best Practices
Use consistent tags across monitors for better organization in a11ops:
Required Tags
- env: Environment (prod, staging, dev)
- service: Service name
- team: Responsible team
- priority: P1-P5 priority level
Optional Tags
- component: Specific component
- region: Geographic region
- customer: For multi-tenant
- sla: SLA tier
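One way to keep tagging consistent is to check a monitor's tag list before saving it. The sketch below is a hypothetical helper, not part of Datadog or a11ops; it assumes tags are plain `key:value` strings and uses the required keys listed above.

```python
# Required tag keys, taken from the "Required Tags" list above.
REQUIRED_TAG_KEYS = ("env", "service", "team", "priority")


def missing_required_tags(tags):
    """Return the required tag keys that are absent from a list of 'key:value' tags."""
    present = set()
    for tag in tags:
        key, sep, value = tag.partition(":")
        # Count a key only when it has a separator and a non-empty value.
        if sep and value.strip():
            present.add(key.strip().lower())
    return [key for key in REQUIRED_TAG_KEYS if key not in present]
```

Calling `missing_required_tags(["env:production", "service:api-gateway", "team:backend"])` returns `["priority"]`, which flags the monitor as incomplete.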
Advanced Features
Include Metric Snapshots
Datadog can include graph snapshots in webhooks. The URL will be available in the $SNAPSHOT variable.
Multi-Alert Monitors
For monitors that trigger per host/tag combination, each alert instance will create a separate a11ops alert with specific metadata.
Anomaly Detection
Datadog's anomaly detection monitors work seamlessly with a11ops. The predicted bounds and deviation will be included in the alert metadata.
Troubleshooting
Test the webhook
In Datadog webhook settings, use the “Test” button to send a sample payload:
- Go to Integrations → Webhooks
- Find your a11ops webhook
- Click “Test”
- Verify the test alert appears in a11ops
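If the “Test” button is not enough to isolate a problem, a request can also be built by hand and sent to the webhook URL from Step 1. The following Python sketch uses only the standard library. The URL path, the sample field values, and the assumption that a11ops accepts this body without extra headers are all hypothetical; adjust them to your workspace.

```python
import json
import urllib.request

# Placeholder only: replace with the webhook URL copied in Step 1.
WEBHOOK_URL = "https://api.a11ops.com/REPLACE_ME"


def build_test_request(webhook_url: str) -> urllib.request.Request:
    """Build a POST request whose body mirrors part of the custom payload template."""
    payload = {
        "alert": "Test alert from Datadog setup",
        "description": "Manual test of the a11ops webhook",
        "severity": "Triggered",  # sample value, assumed for illustration
        "source": "datadog",
        "labels": {"environment": "staging", "service": "example-service"},
    }
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Running `send(build_test_request(WEBHOOK_URL))` with a real URL should produce a test alert in a11ops if the endpoint is reachable. A failure here points to a network or URL problem, not to the Datadog configuration.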
Check webhook history
View webhook delivery status in Datadog:
- Navigate to Events → Event Stream
- Filter by source:webhooks
- Look for delivery failures or errors
Variable substitution issues
If variables aren't substituting correctly, ensure you're using the correct syntax ($VARIABLE) and that the variable is available for your monitor type.
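A quick local check can catch malformed templates before they reach Datadog. The sketch below lists the `$VARIABLE` and `$TAGS[key]` placeholders in a template, fills them with sample values, and confirms the result is valid JSON. It only imitates substitution for illustration. It is not Datadog's rendering engine, and the sample values and helper names are assumptions.

```python
import json
import re

# Matches $NAME and $NAME[key], the two placeholder forms used in the payload template.
PLACEHOLDER = re.compile(r"\$([A-Z_]+)(?:\[([A-Za-z0-9_.-]+)\])?")


def placeholders(template: str) -> list:
    """Return the distinct placeholders found in the template, sorted."""
    return sorted({match.group(0) for match in PLACEHOLDER.finditer(template)})


def render(template: str, values: dict) -> str:
    """Replace each placeholder with a sample value; unknown ones become empty strings."""
    def substitute(match):
        name, key = match.group(1), match.group(2)
        value = values.get(name, "")
        if key is not None:
            value = value.get(key, "") if isinstance(value, dict) else ""
        # json.dumps escapes quotes and newlines; drop its surrounding quotes.
        return json.dumps(str(value))[1:-1]
    return PLACEHOLDER.sub(substitute, template)


# A shortened version of the payload template from Step 3.
TEMPLATE = '''{
  "alert": "$EVENT_TITLE",
  "severity": "$ALERT_TRANSITION",
  "labels": {"environment": "$TAGS[env]", "priority": "$PRIORITY"}
}'''

# Sample values, invented for this check.
SAMPLE = {
    "EVENT_TITLE": 'High latency on "api-gateway"',
    "ALERT_TRANSITION": "Triggered",
    "TAGS": {"env": "production"},
    "PRIORITY": "P2",
}

rendered = json.loads(render(TEMPLATE, SAMPLE))  # raises ValueError if the JSON is broken
```

If `json.loads` raises, the template itself is malformed (for example, a missing quote or comma), which is worth ruling out before investigating which variables a given monitor type provides.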
Integration Complete!
Start routing your Datadog monitors to a11ops for centralized alerting.