PagerDuty Webhook Delivery Failure — Alerts Not Reaching On-Call
PagerDuty webhook integrations for all monitoring tools have been failing for 6 hours due to an expired mutual TLS credential on the PagerDuty integration proxy. Alerts are being generated by monitoring systems but never reaching the on-call team. Multiple production incidents have gone unnoticed.
Pattern: UNKNOWN
Severity: CRITICAL
Confidence: 95%
Remediation: Auto-Heal
Test Results
Metric                Expected   Actual                                   Result
Pattern Recognition   UNKNOWN    UNKNOWN
Severity Assessment   CRITICAL   CRITICAL
Incident Correlation  Yes        18 linked
Cascade Escalation    N/A        No
Remediation           -          Auto-Heal: Corax resolves autonomously
Scenario Conditions
PagerDuty as primary alerting platform. Webhook proxy with expired mutual TLS credential. All integrations failing silently. 6 hours of missed alerts. 3 production incidents currently active and unacknowledged. On-call team unaware.
Injected Error Messages (2)
PagerDuty integration proxy failing — all outbound webhook deliveries to events.pagerduty.com failing with handshake error, mutual-TLS authentication rejected, integration proxy credential expired 2026-03-28T22:00:00Z (6 hours ago), 847 alert payloads queued and undeliverable, retry queue growing at 140 alerts/hour, PagerDuty receiving zero events from our monitoring infrastructure
alert delivery pipeline CRITICAL — zero alerts delivered to PagerDuty in last 6 hours, 3 active production incidents (database replication lag, memory issue on app-srv-03, VPN connectivity loss) not escalated to on-call team, on-call engineer has received zero pages/notifications, monitoring systems generating alerts normally but delivery endpoint returning handshake errors, 847 alerts in dead letter queue
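The figures in the two messages can be cross-checked with simple arithmetic. The sketch below is illustrative only: the constants are copied from the messages, while the function names and the observation time (inferred from "6 hours ago") are assumptions.

```python
from datetime import datetime, timezone

# Values copied from the injected error messages.
CREDENTIAL_EXPIRED_AT = datetime(2026, 3, 28, 22, 0, 0, tzinfo=timezone.utc)
QUEUED_ALERTS = 847
GROWTH_PER_HOUR = 140


def outage_hours(expired_at, now):
    """Hours elapsed between credential expiry and the observation time."""
    return (now - expired_at).total_seconds() / 3600.0


def expected_backlog(hours, growth_per_hour):
    """Backlog size if alerts accumulated at a constant rate for the whole outage."""
    return round(hours * growth_per_hour)


# Assumption: "6 hours ago" places the observation at 04:00 UTC on 2026-03-29.
observed_at = datetime(2026, 3, 29, 4, 0, 0, tzinfo=timezone.utc)
hours = outage_hours(CREDENTIAL_EXPIRED_AT, observed_at)
estimate = expected_backlog(hours, GROWTH_PER_HOUR)
print(f"outage: {hours:.1f} h, estimated backlog: {estimate}, reported: {QUEUED_ALERTS}")
```

A constant rate of 140 alerts/hour over 6 hours gives 840 alerts, within 7 of the reported 847, so the queue depth is consistent with delivery having stopped at the moment the credential expired.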
Neural Engine Root Cause Analysis
The PagerDuty integration proxy is failing because its mutual-TLS credential expired at 2026-03-28T22:00:00Z, 6 hours ago. The expired certificate causes handshake failures with events.pagerduty.com, which blocks all webhook deliveries: 847 alerts are queued and the backlog is growing by 140 alerts/hour. This is a certificate expiration failure that completely prevents the integration from delivering monitoring alerts to PagerDuty.
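One way to confirm this kind of root cause is to compare the certificate's notAfter date with the current time. The sketch below is a minimal illustration, not the proxy's actual tooling. It assumes the date is available in OpenSSL's notAfter text form (for example from `openssl x509 -enddate -noout`); the function names and the 30-day warning threshold are hypothetical.

```python
import ssl
from datetime import datetime, timezone


def days_remaining(not_after, now):
    """Days until the certificate expires; zero or negative once it has expired."""
    expires_at = ssl.cert_time_to_seconds(not_after)  # seconds since the epoch (UTC)
    return (expires_at - now.timestamp()) / 86400.0


def expiry_status(not_after, now, warn_days=30):
    """Classify a certificate as EXPIRED, WARNING (expires soon), or OK."""
    remaining = days_remaining(not_after, now)
    if remaining <= 0:
        return "EXPIRED"
    if remaining <= warn_days:  # warn_days is an assumed threshold
        return "WARNING"
    return "OK"


# The expiry from the error messages, written in OpenSSL's notAfter format.
NOT_AFTER = "Mar 28 22:00:00 2026 GMT"
# Assumption: the observation time is 6 hours after expiry.
observed_at = datetime(2026, 3, 29, 4, 0, 0, tzinfo=timezone.utc)
print(expiry_status(NOT_AFTER, observed_at))
```

Run on a schedule, the same check would report WARNING well before the expiry date, which is the window in which renewal avoids an outage.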
Remediation Plan
1. Immediately check and renew the expired mutual-TLS certificate for the PagerDuty integration proxy.
2. Restart the integration proxy service to reload the new certificate.
3. Verify handshake connectivity to events.pagerduty.com.
4. Process the queued alert backlog (847 alerts) to ensure no critical alerts are lost.
5. Implement certificate expiration monitoring to prevent future occurrences.
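Steps 3 and 4 could be sketched roughly as follows. This is an illustration under stated assumptions, not the proxy's real implementation: the certificate and key paths, the `send` callable, and both function names are hypothetical, and the queue is modeled as an in-memory deque rather than the proxy's actual retry queue.

```python
import socket
import ssl
from collections import deque

PAGERDUTY_HOST = "events.pagerduty.com"  # host named in the error messages


def verify_mtls_handshake(host, certfile, keyfile, timeout=5.0):
    """Step 3: try a TLS handshake that presents the renewed client certificate."""
    context = ssl.create_default_context()
    context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    try:
        with socket.create_connection((host, 443), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except OSError:  # ssl.SSLError and socket errors are both OSError subclasses
        return False


def replay_backlog(queue, send):
    """Step 4: deliver queued alerts oldest-first and keep any that fail."""
    delivered = 0
    failed = deque()
    while queue:
        alert = queue.popleft()
        try:
            send(alert)  # hypothetical delivery callable supplied by the caller
            delivered += 1
        except OSError:
            failed.append(alert)  # retained so no alert is silently dropped
    return delivered, failed


# Example usage (paths are placeholders, not taken from the report):
# if verify_mtls_handshake(PAGERDUTY_HOST, "/path/to/client.crt", "/path/to/client.key"):
#     delivered, failed = replay_backlog(queued_alerts, send_to_pagerduty)
```

A completed handshake is a necessary check but may not be sufficient: with TLS 1.3 a rejected client certificate can surface only on the first read or write, so sending a single test event before replaying the full backlog is a stronger confirmation. Returning the failed alerts rather than discarding them keeps the "no critical alerts are lost" requirement verifiable, since delivered plus failed should equal the original queue depth.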