We test Corax against real-world infrastructure failures across vendors, platforms, and failure scenarios. Browse the results below.
A CloudWatch Log Group has 'never expire' retention and has accumulated 5TB of logs. Monthly cost jumped from $50 to $2,500.
An application's IAM access key was rotated but the old key was deactivated before the new one was deployed. All AWS API calls failing.
A Route 53 health check is failing, but traffic is not failing over because the failover record was misconfigured.
All targets in an ALB target group are failing health checks. ALB returning 502 to all clients. Health check path changed during deployment.
An ECS Fargate service cannot start any tasks because the task memory reservation exceeds the available capacity.
An EC2 instance with instance store volumes was stopped and started. All ephemeral data lost, including application state and temp files.
An EC2 instance became unreachable after a security group rule was accidentally removed. SSH access also blocked.
A GCP service account key used by the application expired. All GCP API calls from the application failing with authentication errors.
GKE detected an unhealthy node and triggered auto-repair, which involves draining and recreating the node. Disruption to pods during repair.
An Azure Cosmos DB container exceeded its provisioned RU/s. All writes being throttled with HTTP 429 responses.
An application is hitting AWS SSM Parameter Store rate limits during deployment. Config fetches failing across all services.
A DynamoDB table is being throttled because read capacity is set too low for a traffic spike. ProvisionedThroughputExceededException on 40% of reads.
An AWS ACM certificate auto-renewal failed because the DNS validation CNAME record was removed. Certificate expires in 48 hours.
An AWS WAF rule is incorrectly blocking legitimate API requests. False positive rate spiked to 15% after a managed rule group update.
An AWS NAT Gateway is experiencing ErrorPortAllocation events. Private subnet instances cannot make outbound connections.
A CloudFront cache invalidation request failed silently. Users seeing the old version of the website despite the deployment.
An Azure App Service deployment broke the application. All requests returning 502. The deployment slot swap was not tested.
A security group attached to production servers was modified to allow SSH (port 22) from any IP address. AWS Config rule flagged the violation.
An Azure VM is running but unresponsive. The Azure Guest Agent stopped, preventing Azure from communicating with the VM. RDP and SSH connections timing out.
An Azure SQL Database hit 100% DTU utilization. All queries slow or timing out. The service tier is too small for the current workload.
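To make one of the scenarios above concrete, here is a minimal sketch of the kind of check that could flag the open-SSH security group case. It is an illustration only, not how Corax or AWS Config detects the violation. It assumes rules are given as dictionaries shaped like the `IpPermissions` entries that the EC2 `DescribeSecurityGroups` API returns; the function names and the sample group IDs are hypothetical.

```python
from typing import Any

# CIDR blocks that mean "any address" for IPv4 and IPv6.
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}
SSH_PORT = 22


def allows_open_ssh(permission: dict[str, Any]) -> bool:
    """Return True if one ingress permission exposes port 22 to any address.

    Hypothetical helper: `permission` is assumed to follow the shape of an
    EC2 IpPermissions entry (IpProtocol, FromPort, ToPort, IpRanges, Ipv6Ranges).
    """
    protocol = permission.get("IpProtocol")
    if protocol == "-1":
        # "-1" means all protocols and all ports.
        covers_ssh = True
    elif protocol == "tcp":
        from_port = permission.get("FromPort", 0)
        to_port = permission.get("ToPort", 65535)
        covers_ssh = from_port <= SSH_PORT <= to_port
    else:
        covers_ssh = False

    cidrs = {r.get("CidrIp") for r in permission.get("IpRanges", [])}
    cidrs |= {r.get("CidrIpv6") for r in permission.get("Ipv6Ranges", [])}
    return covers_ssh and bool(cidrs & OPEN_CIDRS)


def find_open_ssh_groups(groups: list[dict[str, Any]]) -> list[str]:
    """Return the GroupId of every group with at least one open-SSH rule."""
    return [
        g["GroupId"]
        for g in groups
        if any(allows_open_ssh(p) for p in g.get("IpPermissions", []))
    ]


# Illustrative sample data (made-up group IDs, not real test results).
SAMPLE_GROUPS = [
    {
        "GroupId": "sg-example-open",
        "IpPermissions": [
            {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
             "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        ],
    },
    {
        "GroupId": "sg-example-restricted",
        "IpPermissions": [
            {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
             "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
            {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
             "IpRanges": [{"CidrIp": "10.0.0.0/8"}]},
        ],
    },
]
```

Calling `find_open_ssh_groups(SAMPLE_GROUPS)` returns only the first group, since the second one limits SSH to a private range. In practice the input would come from the AWS API rather than a hard-coded list.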
Every scenario is tested against Corax's Neural Engine in a production environment with AI-powered root cause analysis.
Tests run continuously as new infrastructure patterns are added.