PASSED | cloud / ecs_task_placement_failure

AWS ECS Task Placement Failure — Insufficient Resources

The ECS service cannot place new tasks because every container instance in the cluster has exhausted its CPU and memory reservations. The Auto Scaling group is at maximum capacity, so deployments stay stuck with the running count never reaching the desired count.

Pattern: AWS_CLOUD
Severity: CRITICAL
Confidence: 90%
Remediation: Auto-Heal

Test Results

Metric                 Expected     Actual       Result
Pattern Recognition    AWS_CLOUD    AWS_CLOUD
Severity Assessment    CRITICAL     CRITICAL
Incident Correlation   Yes          21 linked
Cascade Escalation     N/A          No
Remediation            Auto-Heal: Corax resolves autonomously

Scenario Conditions

The ECS cluster has 10 container instances, which is the Auto Scaling group maximum. All instances are at 95% or higher memory reservation. The new deployment requires 2 GB of memory per task. Desired count: 8, running count: 3. The service is stuck in deployment.
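To make the placement arithmetic concrete, here is a minimal sketch that checks whether a 2 GB task fits on an instance at 95% memory reservation and estimates how many extra instances the five missing tasks would need. The scenario does not state the instance size, so the 16 GiB of registered memory per instance is purely an assumption, and the function names are hypothetical.

```python
import math

# Values taken from the scenario.
TASK_MEMORY_MIB = 2048        # 2 GB per task
RESERVED_PERCENT = 95         # memory reservation on every instance
DESIRED_COUNT = 8
RUNNING_COUNT = 3

# ASSUMPTION: not stated in the scenario; chosen only for illustration.
INSTANCE_MEMORY_MIB = 16384   # registered memory per container instance


def free_memory_mib(instance_memory_mib: int, reserved_percent: int) -> int:
    """Memory still available for reservation on one instance (rounded down)."""
    return instance_memory_mib * (100 - reserved_percent) // 100


def can_place(task_memory_mib: int, instance_memory_mib: int, reserved_percent: int) -> bool:
    """True if one more task of the given size fits on the instance."""
    return free_memory_mib(instance_memory_mib, reserved_percent) >= task_memory_mib


def extra_instances_needed(missing_tasks: int, task_memory_mib: int, instance_memory_mib: int) -> int:
    """Empty instances required to host the missing tasks (ignores CPU and agent overhead)."""
    tasks_per_instance = instance_memory_mib // task_memory_mib
    if tasks_per_instance == 0:
        raise ValueError("task does not fit on an empty instance of this size")
    return math.ceil(missing_tasks / tasks_per_instance)


missing = DESIRED_COUNT - RUNNING_COUNT
print("free per instance (MiB):", free_memory_mib(INSTANCE_MEMORY_MIB, RESERVED_PERCENT))
print("task fits on an existing instance:", can_place(TASK_MEMORY_MIB, INSTANCE_MEMORY_MIB, RESERVED_PERCENT))
print("extra instances needed:", extra_instances_needed(missing, TASK_MEMORY_MIB, INSTANCE_MEMORY_MIB))
```

Under this assumed instance size, each existing instance has well under 2 GB left to reserve, which is why the scheduler cannot place any of the five missing tasks without new capacity.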

Injected Error Messages (2)

aws ecs service 'order-service' unable to place tasks — cluster 'prod-cluster' has insufficient resources, 10 container instances at 95% memory reservation, task placement constraint cannot be satisfied, desired count 8 but only 3 tasks running, auto scaling group at maximum capacity (10 instances), deployment stuck for 45 minutes, service event: 'unable to place a task because no container instance met all of its requirements'
order-service capacity insufficient — only 3 of 8 desired tasks running, existing tasks overwhelmed with traffic, response latency increased from 120ms to 4200ms, connection queue depth growing, aws ecs deployment failing to reach steady state

Neural Engine Root Cause Analysis

The ECS cluster 'prod-cluster' is experiencing a resource exhaustion issue where all 10 container instances are at 95% memory reservation, preventing new tasks from being placed. The auto scaling group has reached its maximum capacity of 10 instances, creating a bottleneck. This is causing the order-service to run only 3 out of 8 desired tasks, with deployments stuck for 45 minutes due to insufficient memory resources to satisfy task placement constraints.

Remediation Plan

1. Immediately increase the auto scaling group maximum capacity to allow more EC2 instances to be launched
2. Scale out the ECS cluster by adding more container instances to provide additional memory capacity
3. Verify that new instances join the cluster and register properly
4. Monitor task placement to ensure the order-service can reach its desired count of 8 tasks
5. Consider rightsizing instance types or optimizing memory allocation for long-term capacity planning
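Steps 1 through 4 of the plan above could be sketched with boto3 roughly as follows. This is an illustration under stated assumptions, not how Corax performs its Auto-Heal: the helper names, the Auto Scaling group name, and the new capacity numbers are hypothetical. The helpers take the clients as arguments so they can be exercised with stubs before being pointed at a real account.

```python
def raise_asg_max(autoscaling, asg_name: str, new_max: int, new_desired: int) -> None:
    """Steps 1-2: lift the ASG ceiling and request additional instances."""
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName=asg_name,
        MaxSize=new_max,
        DesiredCapacity=new_desired,
    )


def registered_instance_count(ecs, cluster: str) -> int:
    """Step 3: count ACTIVE container instances registered with the cluster.

    Reads a single page of results only, which is enough for a small cluster;
    larger clusters would need pagination.
    """
    resp = ecs.list_container_instances(cluster=cluster, status="ACTIVE")
    return len(resp["containerInstanceArns"])


def service_is_steady(ecs, cluster: str, service: str) -> bool:
    """Step 4: True once the running count has caught up with the desired count."""
    svc = ecs.describe_services(cluster=cluster, services=[service])["services"][0]
    return svc["runningCount"] >= svc["desiredCount"]


# Example wiring (requires boto3 and AWS credentials; not executed here).
# "prod-cluster-asg", 14 and 12 are HYPOTHETICAL values, not from the scenario.
#
#   import boto3
#   autoscaling = boto3.client("autoscaling")
#   ecs = boto3.client("ecs")
#   raise_asg_max(autoscaling, "prod-cluster-asg", new_max=14, new_desired=12)
#   print(registered_instance_count(ecs, "prod-cluster"))
#   print(service_is_steady(ecs, "prod-cluster", "order-service"))
```

If the cluster uses a capacity provider with managed scaling, setting the desired capacity directly may conflict with it, so raising only the maximum might be the safer change in that setup.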
Tested: 2026-03-30 | Monitors: 2 | Incidents: 2 | Test ID: cmnckcbtn08fnobqe8yaoutkj