Infrastructure Scenario Tests

We test Corax against real-world infrastructure failures across every vendor, platform, and scenario. Browse the results below.

276 Total Tests | 100.0% Pass Rate | 276 Passed | 0 Failed

Kubernetes Ingress Controller Crash — External Traffic Blocked

PASS

The NGINX Ingress Controller in a production Kubernetes cluster crashes after a malformed Ingress resource is applied. The controller enters a CrashLoopBackOff state. All external HTTP/HTTPS traffic to the cluster is blocked because no ingress controller pods are running to route traffic to backend services.

Cloud | Pattern: CONTAINER_EVENT | Severity: CRITICAL | Confidence: 95% | Auto-Heal | 28 correlated
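
A monitor for this failure mode only needs the container-state fields that `kubectl get pod -o json` exposes. The sketch below runs against hard-coded sample data rather than a live cluster API, and the pod names are made up:

```python
def crashlooping_pods(pods):
    """Names of pods with any container stuck in CrashLoopBackOff."""
    hits = []
    for pod in pods:
        for cs in pod.get("containerStatuses", []):
            waiting = cs.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                hits.append(pod["name"])
                break
    return hits

# Sample data shaped like the status block of `kubectl get pod -o json`.
pods = [
    {"name": "ingress-nginx-controller-7d9f", "containerStatuses": [
        {"state": {"waiting": {"reason": "CrashLoopBackOff"}}}]},
    {"name": "backend-api-5c6b", "containerStatuses": [
        {"state": {"running": {}}}]},
]
print(crashlooping_pods(pods))  # ['ingress-nginx-controller-7d9f']
```

When every replica of the controller shows up in this list, the cluster has no ingress data plane left, which matches the total external-traffic blackout described above.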

GCP Cloud Spanner Regional Outage

PASS

A Google Cloud Spanner regional instance experiences a zone-level failure in us-central1-a. The multi-zone configuration should provide automatic failover, but a configuration error in the instance's node count causes the remaining zones to be overloaded. Read and write latencies spike above SLA thresholds.

Cloud | Pattern: STORAGE_IO_LATENCY | Severity: CRITICAL | Confidence: 95% | Remote Hands | 42 correlated
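
The overload mechanism here is plain arithmetic: if one of N zones drops out and its traffic redistributes evenly, each survivor's utilization is multiplied by N/(N-1). A minimal sketch with illustrative numbers (the 70% baseline is an assumption, not from the scenario):

```python
def post_failure_utilization(zones, baseline_util):
    """Per-zone utilization after one of `zones` zones fails and its
    traffic redistributes evenly across the survivors."""
    return baseline_util * zones / (zones - 1)

# Illustrative: 3 zones each running at 70% utilization before the failure.
survivor_util = post_failure_utilization(3, 0.70)
print(round(survivor_util, 2))  # 1.05 -> survivors pushed past capacity
```

This is why an undersized node count turns a survivable zone loss into an outage: the headroom needed is not "above current load" but above load times N/(N-1).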

Azure Cosmos DB Throttling — RU/s Budget Exhausted

PASS

Azure Cosmos DB begins aggressively throttling requests after a marketing campaign drives 10x normal traffic. The provisioned RU/s budget is exhausted and autoscale max is reached. Applications receive HTTP 429 (Too Many Requests) responses. Retry storms amplify the problem as clients retry throttled requests.

Cloud | Pattern: AZURE_CLOUD | Severity: CRITICAL | Confidence: 95% | Remote Hands | 42 correlated
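
The retry storm is the part clients can control: retries should honor the server's retry-after hint, back off exponentially, and add jitter so throttled clients don't resynchronize. A minimal sketch against a stubbed request function (the stub and its response tuples are hypothetical, not the Cosmos SDK):

```python
import random
import time

def call_with_backoff(request, max_attempts=5, base_delay=0.1):
    """Retry throttled (HTTP 429) calls, honoring Retry-After when
    present and jittering so clients don't retry in lockstep."""
    for attempt in range(max_attempts):
        status, headers, body = request()
        if status != 429:
            return status, body
        retry_after = float(headers.get("Retry-After", 0))
        delay = max(retry_after, base_delay * 2 ** attempt)
        time.sleep(delay * random.uniform(0.5, 1.0))  # jittered wait
    return status, body

# Stubbed service: throttled twice, then succeeds.
responses = iter([(429, {"Retry-After": "0"}, None),
                  (429, {}, None),
                  (200, {}, "ok")])
print(call_with_backoff(lambda: next(responses), base_delay=0.01))
```

Without the jitter and the exponential term, every throttled client retries on the same schedule and the 429 load never drains, which is exactly the amplification the scenario describes.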

Azure Key Vault Unavailable — Secrets Rotation Blocked

PASS

Azure Key Vault becomes unreachable due to a misconfigured private endpoint and NSG rule change during a network security audit. All applications that fetch secrets, encryption keys, or certificates from Key Vault at startup or rotation time fail. Services that cache secrets continue working but cannot rotate credentials.

Cloud | Pattern: AZURE_CLOUD | Severity: CRITICAL | Confidence: 95% | Auto-Heal | 42 correlated
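
The "services that cache secrets keep working" behavior can be made deliberate: cache each secret with a TTL, and on a fetch failure serve the stale copy instead of failing hard. A sketch of that pattern — the `SecretCache` class and the stub fetcher are hypothetical, not the Azure SDK:

```python
import time

class SecretCache:
    """TTL cache over a secret fetcher; serves a stale value when the
    vault is unreachable rather than failing hard."""
    def __init__(self, fetch, ttl=300):
        self.fetch, self.ttl, self.store = fetch, ttl, {}

    def get(self, name, now=None):
        now = time.monotonic() if now is None else now
        cached = self.store.get(name)
        if cached and now - cached[1] < self.ttl:
            return cached[0]
        try:
            value = self.fetch(name)
        except ConnectionError:
            if cached:  # vault unreachable: fall back to the stale copy
                return cached[0]
            raise
        self.store[name] = (value, now)
        return value

def vault_fetch(name):  # stub: the misconfigured endpoint never answers
    raise ConnectionError("private endpoint unreachable")

cache = SecretCache(vault_fetch, ttl=1)
cache.store["db-password"] = ("s3cret", 0)  # value cached before the outage
print(cache.get("db-password", now=100))    # stale but usable: s3cret
```

Note the limits the scenario already points out: the fallback only helps services that fetched before the outage, and rotation remains blocked until the endpoint is fixed.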

AWS EC2 Instance Store Loss — Ephemeral Data Gone

PASS

An EC2 instance with instance store (ephemeral) volumes experiences a hardware failure on the underlying host. AWS stops and restarts the instance on new hardware, but all instance store data is lost. The application had been incorrectly storing session data and temporary processing files on instance store volumes instead of EBS.

Cloud | Pattern: AWS_CLOUD | Severity: CRITICAL | Confidence: 95% | Remote Hands | 20 correlated

AWS S3 Bucket Policy Lockout — Data Inaccessible

PASS

A misconfigured S3 bucket policy denies all access including the root account. The bucket contains 15TB of production assets (user uploads, documents, media). All applications that read from or write to the bucket receive AccessDenied errors. Even the AWS console shows access denied.

Cloud | Pattern: AWS_CLOUD | Severity: CRITICAL | Confidence: 95% | Remote Hands | 28 correlated
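
This class of lockout is catchable before the policy is applied: a statement that denies all principals every `s3:*` action will lock out the account itself. A hypothetical pre-apply lint, reduced to that single check:

```python
import json

def locks_out_everyone(policy):
    """Flag statements that deny all principals all S3 actions --
    the kind of statement that denies even the root account."""
    for stmt in policy.get("Statement", []):
        if (stmt.get("Effect") == "Deny"
                and stmt.get("Principal") in ("*", {"AWS": "*"})
                and "s3:*" in stmt.get("Action", [])):
            return True
    return False

policy = json.loads("""{
  "Version": "2012-10-17",
  "Statement": [{"Effect": "Deny", "Principal": "*",
                 "Action": ["s3:*"], "Resource": "*"}]
}""")
print(locks_out_everyone(policy))  # True
```

A real lint would also catch condition-scoped denies that happen to match everything, but even this one check blocks the exact mistake described above from reaching a 15TB production bucket.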

AWS Lambda Cold Start Timeout Spike

PASS

After a Lambda function deployment, cold start times spike from 2 seconds to 28 seconds due to a new heavy SDK dependency. API Gateway returns 504 Gateway Timeout for cold-start invocations. Provisioned concurrency had been removed a month earlier to save costs.

Cloud | Pattern: AWS_CLOUD | Severity: CRITICAL | Confidence: 85% | Auto-Heal | 16 correlated

AWS RDS Multi-AZ Failover

PASS

An AWS RDS PostgreSQL Multi-AZ instance experiences a hardware failure in the primary AZ. Automatic failover to the standby in the secondary AZ is triggered. Applications experience 60-120 seconds of downtime until the DNS endpoint resolves to the new primary.

Cloud | Pattern: AWS_CLOUD | Severity: CRITICAL | Confidence: 95% | Remote Hands | 34 correlated
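
During that 60-120 second window, well-behaved clients simply reconnect against the unchanged endpoint name until DNS points at the new primary. A sketch of that loop with a stubbed connection; the stub, deadline, and delay values are illustrative:

```python
import time

def connect_with_retry(connect, deadline_s=180, delay_s=2.0):
    """Keep reconnecting to the same endpoint name until DNS points
    at the new primary or the deadline passes."""
    start = time.monotonic()
    while True:
        try:
            return connect()
        except ConnectionError:
            if time.monotonic() - start > deadline_s:
                raise
            time.sleep(delay_s)

# Stub: the first two attempts land on the dead primary, then succeed.
attempts = iter([ConnectionError, ConnectionError, "connection"])
def fake_connect():
    step = next(attempts)
    if step is ConnectionError:
        raise ConnectionError("primary unreachable")
    return step

print(connect_with_retry(fake_connect, delay_s=0.01))  # connection
```

The loop only works if the client re-resolves DNS on each attempt; runtimes that cache lookups indefinitely (a known pitfall with JVM defaults) keep reconnecting to the dead primary past the failover.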

Azure AD Sync Failure — Users Can't Authenticate

PASS

Azure AD Connect sync fails due to an expired service account password. Password hash sync stops working. New users created in on-prem AD are not provisioned in Azure AD. Existing cloud users with changed passwords cannot authenticate to M365 services.

Cloud | Pattern: AZURE_CLOUD | Severity: CRITICAL | Confidence: 85% | Remote Hands | 23 correlated

Azure App Service Outage — 502 Errors

PASS

An Azure App Service Plan hosting 3 production web apps starts returning 502 Bad Gateway errors after an Azure platform update. The apps intermittently crash with out-of-memory exceptions. The Azure Status page shows degraded performance in the East US 2 region.

Cloud | Pattern: AZURE_CLOUD | Severity: CRITICAL | Confidence: 85% | Auto-Heal | 35 correlated
Page 2 of 2

Every scenario is tested against Corax's Neural Engine in a production environment with AI-powered root cause analysis.

Tests run continuously as new infrastructure patterns are added.