PASSED | cloud / gcp_pubsub_dead_letter_full

GCP Pub/Sub Dead Letter Queue Full — Message Processing Failure

A GCP Pub/Sub subscription's dead letter topic accumulates millions of unprocessable messages after a schema change breaks the consumer application. The dead letter topic has no subscriber, so messages pile up there while the original subscription's backlog keeps growing.

Pattern: UNKNOWN
Severity: CRITICAL
Confidence: 85%
Remediation: Remote Hands

Test Results

Metric                Expected   Actual      Result
Pattern Recognition   UNKNOWN    UNKNOWN
Severity Assessment   CRITICAL   CRITICAL
Incident Correlation  Yes        21 linked
Cascade Escalation    N/A        No
Remediation           Remote Hands: Corax contacts on-site support via call, email, or API

Scenario Conditions

Pub/Sub topic receiving 50K messages/minute. Consumer app crashes on new message schema. Dead letter policy: 5 retries then forward to DLT. DLT has no subscriber. 12 million messages in DLT. Original subscription backlog: 3.2 million and growing.
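
As a rough consistency check on these figures, the sketch below works out how many minutes of traffic each count represents. It assumes a constant ingest rate of 50K messages/minute; the function and constant names are illustrative and not part of the scenario.

```python
# Figures taken from the scenario conditions above.
INGEST_PER_MINUTE = 50_000
DLT_MESSAGES = 12_000_000
BACKLOG_MESSAGES = 3_200_000


def minutes_of_traffic(message_count: int, rate_per_minute: int) -> float:
    """Minutes of ingest at a constant rate needed to produce message_count."""
    return message_count / rate_per_minute


dlt_minutes = minutes_of_traffic(DLT_MESSAGES, INGEST_PER_MINUTE)
backlog_minutes = minutes_of_traffic(BACKLOG_MESSAGES, INGEST_PER_MINUTE)

print(f"DLT holds {dlt_minutes / 60:.1f} hours of traffic")
print(f"Backlog holds {backlog_minutes:.0f} minutes of traffic")
```

Under that assumption the dead letter topic holds 4 hours of traffic, which matches the 4-hour outage reported in the injected error messages, and the backlog holds a little over one further hour.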

Injected Error Messages (2)

Pub/Sub subscription 'order-events-sub' backlog critical — 3.2 million unacked messages, oldest unacked message age: 4 hours, delivery rate dropped to near zero, dead letter topic 'order-events-dlt' has accumulated 12 million messages with no active subscriber, processing pipeline completely stalled, consumer application rejecting all messages due to schema incompatibility
order processing service failing on all incoming messages — deserialization error on new field 'shipping_method' not present in consumer schema v2.1, every message nacked and retried 5 times before being sent to dead letter topic, effective processing rate: 0 messages/second, order fulfillment pipeline halted for 4 hours, 3.2 million orders unprocessed
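
The failure mode in the second message, a consumer that rejects any payload containing a field it does not know, can be avoided by ignoring unknown fields during deserialization. The following is a minimal sketch only: it assumes JSON-encoded payloads, and the `OrderEventV21` class with its `order_id` and `amount_cents` fields is hypothetical, since the scenario does not describe the real schema or wire format.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class OrderEventV21:
    # Hypothetical stand-ins for the consumer's v2.1 fields.
    order_id: str
    amount_cents: int


def parse_order_event(payload: bytes) -> OrderEventV21:
    """Parse a JSON payload, dropping fields the consumer schema does not define."""
    raw = json.loads(payload)
    known = {f.name for f in fields(OrderEventV21)}
    # Unknown keys such as 'shipping_method' are discarded instead of raising.
    return OrderEventV21(**{k: v for k, v in raw.items() if k in known})


event = parse_order_event(
    b'{"order_id": "o-1", "amount_cents": 1999, "shipping_method": "express"}'
)
print(event)
```

A payload that lacks a field the consumer requires still raises `TypeError` here, so genuinely incompatible messages continue to fail loudly.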

Neural Engine Root Cause Analysis

The order-events-sub Pub/Sub subscription has suffered a complete processing pipeline failure due to schema incompatibility causing the consumer application to reject all messages. This has created a cascading failure where 3.2 million messages are backing up in the main subscription while 12 million messages have accumulated in the dead letter topic with no active subscriber. The HTTP 403 error indicates the monitoring system also lacks proper permissions to access the subscription, suggesting broader IAM or service account issues that may be related to a recent deployment or configuration change.

Remediation Plan

1. Immediately investigate and resolve schema incompatibility in the consumer application by checking recent schema changes or deployments.
2. Fix IAM permissions for both the consumer service and monitoring system (address HTTP 403).
3. Deploy corrected consumer application version that can handle the message schema.
4. Set up subscriber for dead letter topic to process accumulated messages.
5. Monitor subscription backlog reduction and processing rates.
6. Implement schema validation and backward compatibility checks in CI/CD pipeline to prevent recurrence.
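
The compatibility check in step 6 could look roughly like the sketch below, which compares a producer schema against a consumer schema before deployment. It is an illustration under assumptions: schemas are modeled as plain field-name-to-type dictionaries, and the field names other than `shipping_method` are hypothetical. A real pipeline would more likely rely on the schema tooling of whatever serialization format is in use.

```python
from typing import Dict, List

Schema = Dict[str, str]  # field name -> type name (simplified model)


def compatibility_problems(
    consumer: Schema, producer: Schema, consumer_is_strict: bool = True
) -> List[str]:
    """List reasons the consumer could fail on messages from the producer."""
    problems: List[str] = []
    for name, type_name in consumer.items():
        if name not in producer:
            problems.append(f"missing field required by consumer: {name}")
        elif producer[name] != type_name:
            problems.append(
                f"type changed for {name}: {type_name} -> {producer[name]}"
            )
    if consumer_is_strict:
        # A strict consumer rejects payloads carrying fields it does not know.
        for name in producer:
            if name not in consumer:
                problems.append(f"unknown field for strict consumer: {name}")
    return problems


consumer_v21 = {"order_id": "string", "amount_cents": "int"}
producer_new = {**consumer_v21, "shipping_method": "string"}

problems = compatibility_problems(consumer_v21, producer_new)
print(problems)
```

A CI job could fail the build whenever the returned list is non-empty, which would have flagged the `shipping_method` addition before it reached a strict consumer.
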
Tested: 2026-03-30 | Monitors: 2 | Incidents: 2 | Test ID: cmnckd6hi08mkobqe8h0jctbi