Infrastructure Scenario Tests

We test Corax against real-world infrastructure failures across every vendor, platform, and scenario. Browse the results below.

276 Total Tests · 100.0% Pass Rate · 276 Passed · 0 Failed

Database Connection Pool Exhausted

PASS

The application's database connection pool (PgBouncer) is exhausted after a slow query causes connections to pile up. New requests queue behind the pool, causing cascading timeouts. The application returns 503 errors while the database itself is healthy.
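The failure mode above can be sketched as a simple model: once slow queries pin every pooled connection for longer than the client checkout timeout, every queued request times out. The pool size and timeout below are hypothetical, not the scenario's actual PgBouncer settings.

```python
# Minimal model of a bounded connection pool (hypothetical numbers,
# not the PgBouncer configuration from the test scenario).
POOL_SIZE = 20          # max server connections the pooler will open
CHECKOUT_TIMEOUT_S = 5  # how long a client waits for a free connection

def requests_timing_out(in_flight_slow_queries: int,
                        incoming_requests: int,
                        slow_query_runtime_s: float) -> int:
    """Count requests that exceed the checkout timeout while
    pooled connections are held by slow queries."""
    free = max(POOL_SIZE - in_flight_slow_queries, 0)
    # Requests that find a free connection succeed immediately.
    queued = max(incoming_requests - free, 0)
    # If the slow queries outlive the checkout timeout, every queued
    # request times out and the app surfaces a 503.
    return queued if slow_query_runtime_s > CHECKOUT_TIMEOUT_S else 0

# 20 slow queries pin the whole pool; 50 new requests arrive:
print(requests_timing_out(20, 50, slow_query_runtime_s=30))  # → 50
```

The key point the model captures: the database itself stays healthy, yet the application fails, because the bottleneck is the checkout queue in front of the pool.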

Server · Pattern: DATABASE_EVENT · Severity: CRITICAL · Confidence: 85% · Remote Hands · 23 correlated

PostgreSQL Replication Lag Critical

PASS

The PostgreSQL streaming replication replica falls 2GB behind the primary due to a long-running analytical query holding a replication slot open. Applications reading from the replica see stale data. If the primary fails, 2GB of WAL data would be lost.
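Replication lag like this is measured by differencing WAL positions (LSNs) between primary and replica; the arithmetic below mirrors what PostgreSQL's `pg_wal_lsn_diff()` computes. The LSN values are made up to land roughly 2 GB apart, as in the scenario.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN like '3/4F2C1000' to an absolute byte
    position (high 32 bits / low 32 bits, both hexadecimal)."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) + int(lo, 16)

def replication_lag_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """Bytes of WAL the replica has not yet replayed."""
    return lsn_to_bytes(primary_lsn) - lsn_to_bytes(replica_lsn)

# Hypothetical positions 2 GiB apart, matching the scenario:
lag = replication_lag_bytes("5/80000000", "5/00000000")
print(lag // (1024 ** 2), "MiB")  # → 2048 MiB
```

In production you would pull the two LSNs from `pg_stat_replication` on the primary rather than hard-coding them.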

Server · Pattern: DATABASE_EVENT · Severity: CRITICAL · Confidence: 85% · Remote Hands · 18 correlated

Container OOMKilled Repeatedly

PASS

A Java-based microservice container is being repeatedly OOMKilled because the JVM heap (-Xmx) is set to 512MB but the container memory limit is also 512MB, leaving no room for JVM metaspace, thread stacks, and native memory. The pod restarts every 3-5 minutes.
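The budget problem is simple arithmetic: heap plus non-heap overhead must fit under the container limit. The overhead figures below are illustrative assumptions, not measurements from the scenario.

```python
# Back-of-the-envelope container memory budget for a JVM workload.
# Overhead numbers are illustrative assumptions, not measured values.
HEAP_MB = 512            # -Xmx from the scenario
METASPACE_MB = 128       # class metadata
THREAD_STACKS_MB = 50    # e.g. ~50 threads x 1 MB default stack
NATIVE_OVERHEAD_MB = 96  # code cache, GC structures, direct buffers

required = HEAP_MB + METASPACE_MB + THREAD_STACKS_MB + NATIVE_OVERHEAD_MB
limit = 512              # container memory limit from the scenario

print(f"required ≈ {required} MB, limit = {limit} MB")
print("OOMKill expected" if required > limit else "fits")
```

On JDK 10+, `-XX:MaxRAMPercentage` lets the JVM size its heap from the container limit instead of a fixed `-Xmx`, which avoids exactly this mismatch.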

Server · Pattern: CONTAINER_EVENT · Severity: CRITICAL · Confidence: 95% · Auto-Heal · 9 correlated

Kubernetes Node NotReady — Pods Rescheduling

PASS

A Kubernetes worker node enters NotReady state due to kubelet losing contact with the control plane after a network partition. 25 pods on the node are marked for eviction after the 5-minute toleration period. Pods reschedule to other nodes but some fail due to resource constraints.
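The roughly five-minute delay before eviction comes from two stacked timers. The values below are the upstream Kubernetes defaults (`node-monitor-grace-period` and the default `not-ready` toleration); verify them against your own cluster configuration.

```python
# Rough eviction timeline after a node goes silent (upstream Kubernetes
# defaults; individual clusters may override both values).
NODE_MONITOR_GRACE_S = 40   # kubelet silent this long -> node NotReady
TOLERATION_SECONDS = 300    # default not-ready toleration on pods

def seconds_until_eviction() -> int:
    """Worst-case delay from loss of contact to pods being marked
    for eviction, assuming the default timers."""
    return NODE_MONITOR_GRACE_S + TOLERATION_SECONDS

print(seconds_until_eviction(), "s until pods are marked for eviction")
```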

Server · Pattern: CONTAINER_EVENT · Severity: CRITICAL · Confidence: 85% · Remote Hands · 22 correlated

Docker Registry Unreachable — No Image Pulls

PASS

The private Docker registry (Harbor) becomes unreachable due to a TLS certificate renewal failure. All Kubernetes pods that need to pull or repull images fail with ImagePullBackOff. Existing running containers are fine but no new deployments or restarts work.

Server · Pattern: CONTAINER_EVENT · Severity: CRITICAL · Confidence: 95% · Remote Hands · 30 correlated

Print Server Spooler Crash — Enterprise-Wide Print Failure

PASS

The Windows Print Spooler service crashes on the central print server after processing a corrupted print job from an updated driver. All 45 network printers become inaccessible. 400 users across 3 floors cannot print.

Server · Pattern: PRINTER_EVENT · Severity: CRITICAL · Confidence: 85% · Auto-Heal · 7 correlated

PBX Server Crash — All Phones Unregistered

PASS

The FreePBX server crashes due to an Asterisk segfault after a problematic module update. All 150 SIP phones lose registration. IVR, voicemail, call queues, and ring groups all go offline. No redundant PBX in place.

Server · Pattern: VOIP_QUALITY · Severity: CRITICAL · Confidence: 85% · Auto-Heal · 18 correlated

SIP Trunk Registration Failure — All Outbound Calls Dead

PASS

The SIP trunk between the on-premises PBX and the ITSP loses registration after the provider changes their SBC IP without notice. All outbound and inbound PSTN calls fail. Internal extension-to-extension calls still work.

Server · Pattern: VOIP_QUALITY · Severity: CRITICAL · Confidence: 85% · Remote Hands · 18 correlated

Backup to Cloud Timeout — Offsite Backup Stalled

PASS

The offsite backup copy job to Wasabi S3 stalls at 23% due to ISP bandwidth throttling during business hours. The 8TB backup set cannot complete within the backup window. Offsite RPO violated for disaster recovery compliance.
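Whether an 8 TB set can finish inside a backup window is a transfer-time calculation. The link rates below are assumptions for illustration; the scenario does not state the actual bandwidth.

```python
# Can an 8 TB backup set finish inside the window at a throttled rate?
# Link rates are assumptions; protocol overhead is ignored.
def hours_to_upload(dataset_tb: float, throughput_mbps: float) -> float:
    """Transfer time in hours for a dataset at a given link rate
    (megabits per second)."""
    bits = dataset_tb * 1e12 * 8
    return bits / (throughput_mbps * 1e6) / 3600

print(round(hours_to_upload(8, 1000), 1), "h at a full 1 Gbps line")
print(round(hours_to_upload(8, 100), 1), "h throttled to 100 Mbps")
```

Even at a full 1 Gbps the transfer takes close to 18 hours, so any daytime throttling pushes it well past a nightly window; this is why the scenario's RPO violation follows directly from the throttling.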

Server · Pattern: BACKUP_FAILURE · Severity: CRITICAL · Confidence: 85% · Remote Hands · 18 correlated

Backup Chain Broken — Incremental Backup Failed

PASS

A Veeam incremental backup fails because the previous restore point was corrupted by a storage error. The backup chain is broken, requiring a new active full backup. With 95 VMs, the full backup will take 18 hours and consume 4TB of additional space.
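Why does one corrupted restore point force a new active full? Because an incremental is only restorable if every earlier point back to the last good full is intact. A toy model of that dependency (not Veeam's actual chain format):

```python
# Toy model of an incremental backup chain: a restore point is usable
# only if every earlier point back to the last good full is intact.
def usable_points(chain: list[tuple[str, bool]]) -> list[int]:
    """chain: [(kind, intact)] with kind 'full' or 'incr'.
    Returns indices of restore points that can actually be restored."""
    usable, broken = [], True
    for i, (kind, intact) in enumerate(chain):
        if kind == "full":
            broken = not intact      # a good full restarts the chain
        elif broken or not intact:
            broken = True            # a bad link breaks everything after
            continue
        if not broken:
            usable.append(i)
    return usable

# Full, two good incrementals, a corrupted one, then more incrementals:
chain = [("full", True), ("incr", True), ("incr", True),
         ("incr", False), ("incr", True), ("incr", True)]
print(usable_points(chain))  # → [0, 1, 2]
```

Every point after the corrupted one is unusable even though its own data is fine, which is exactly why the only fix is restarting the chain with a new full.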

Server · Pattern: BACKUP_FAILURE · Severity: CRITICAL · Confidence: 95% · Remote Hands · 5 correlated

Veeam Backup Job Failure — Repository Full

PASS

The Veeam backup repository runs out of disk space during the nightly backup window. All backup jobs fail with 'insufficient disk space' errors. No backups have completed for 3 nights. RPO violated for all protected VMs.

Server · Pattern: BACKUP_FAILURE · Severity: CRITICAL · Confidence: 95% · Remote Hands · 16 correlated

Exchange Database Failover — DAG Switchover

PASS

An Exchange Database Availability Group (DAG) member server experiences a storage failure causing the active database copy to dismount. The passive copy on the second server activates but is 5 minutes behind due to log shipping lag.

Server · Pattern: EXCHANGE_EVENT · Severity: CRITICAL · Confidence: 90% · Remote Hands · 21 correlated

Exchange Mail Queue Growing — Transport Frozen

PASS

The Exchange 2019 Hub Transport service freezes after a malformed email triggers an anti-malware scanning loop. The mail queue grows to 15,000 messages. Internal and external email delivery stops completely. Users unaware until critical business emails bounce.

Server · Pattern: EXCHANGE_EVENT · Severity: CRITICAL · Confidence: 92% · Auto-Heal · 22 correlated

Group Policy Processing Failure

PASS

Group Policy processing fails across the domain after a SYSVOL replication issue leaves the DFS-R replicated SYSVOL share inconsistent between DCs. Workstations receive incomplete or conflicting policies. Security baselines are no longer enforced.

Server · Pattern: ACTIVE_DIRECTORY · Severity: CRITICAL · Confidence: 92% · Remote Hands · 22 correlated

FSMO Role Holder DC Offline

PASS

The domain controller holding all 5 FSMO roles (PDC Emulator, RID Master, Infrastructure Master, Schema Master, Domain Naming Master) goes down due to a motherboard failure. Password changes, account creation, and domain joins all fail.

Server · Pattern: ACTIVE_DIRECTORY · Severity: CRITICAL · Confidence: 95% · Remote Hands · 30 correlated

AD Replication Failure Between Domain Controllers

PASS

Active Directory replication between the two domain controllers fails due to a lingering object conflict. Users at the branch office (authenticating against DC-02) see stale group memberships and GPOs. Password changes on DC-01 not replicating to DC-02.

Server · Pattern: ACTIVE_DIRECTORY · Severity: CRITICAL · Confidence: 90% · Remote Hands · 22 correlated

Internal CA Certificate Chain Broken

PASS

The internal enterprise CA intermediate certificate is revoked by mistake during a PKI cleanup. All certificates issued by the intermediate CA are now untrusted. Internal web apps, RADIUS 802.1X auth, and LDAPS all fail certificate validation.
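The blast radius follows from how chain validation works: a leaf is trusted only if every issuer above it is valid, so revoking one intermediate invalidates every certificate beneath it at once. A toy sketch (all names hypothetical):

```python
# Toy chain validation: a leaf is trusted only if no certificate in its
# chain has been revoked. Subject names are hypothetical examples.
def chain_trusted(chain: list[dict], revoked: set[str]) -> bool:
    """True if every certificate in the chain is outside the
    revocation set."""
    return all(cert["subject"] not in revoked for cert in chain)

chain = [{"subject": "web01.corp.local"},   # the leaf
         {"subject": "Corp Issuing CA"},    # the revoked intermediate
         {"subject": "Corp Root CA"}]

print(chain_trusted(chain, revoked={"Corp Issuing CA"}))  # → False
```

Real validators check signatures, validity dates, and CRL/OCSP status, not just a name set, but the dependency structure is the same: one bad intermediate fails every leaf below it.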

Server · Pattern: CERTIFICATE_EXPIRY · Severity: CRITICAL · Confidence: 95% · Remote Hands · 30 correlated

SSL Certificate Expired on Web Server

PASS

The SSL/TLS certificate on the public-facing customer portal expires at midnight. Chrome and Edge users see NET::ERR_CERT_DATE_INVALID. HSTS enforcement prevents bypass. API clients receiving TLS handshake failures. Revenue-impacting for e-commerce.
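Expiry incidents like this are preventable with a days-remaining check on the certificate's notAfter timestamp. A minimal sketch, assuming the timestamp has already been extracted in a simple UTC string form:

```python
from datetime import datetime, timezone

def days_until_expiry(not_after: str) -> float:
    """Days remaining on a certificate, given its notAfter timestamp
    as 'YYYY-MM-DD HH:MM:SS' in UTC (hypothetical input format)."""
    expiry = datetime.strptime(
        not_after, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return (expiry - datetime.now(timezone.utc)).total_seconds() / 86400

# A negative value means the site is already serving an expired cert.
print(round(days_until_expiry("2024-01-01 00:00:00"), 1))
```

In practice the notAfter value would come from the live TLS handshake (Python's standard-library `ssl` and `socket` modules can retrieve it), and alerting would fire well before zero, e.g. at 30 and 7 days out.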

Server · Pattern: CERTIFICATE_EXPIRY · Severity: CRITICAL · Confidence: 95% · Remote Hands · 22 correlated

DHCP Server Failover — Primary DHCP Down

PASS

The primary Windows DHCP server crashes and the DHCP failover partner does not transition to 'Partner Down' state due to a misconfigured maximum client lead time (MCLT). Clients with expiring leases cannot renew and start losing connectivity.

Server · Pattern: DHCP_EXHAUSTION · Severity: CRITICAL · Confidence: 92% · Remote Hands · 35 correlated

DHCP Scope Exhaustion — New Devices Can't Get IPs

PASS

The DHCP scope for the main user VLAN is 99% exhausted. New devices connecting to the network fail to obtain IP addresses. Users reporting 169.254.x.x APIPA addresses. Scope was sized for 200 but 210 devices now on the VLAN due to BYOD growth.
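Scope exhaustion is easy to see coming with a utilization check against an alert threshold. The scope size and device counts below come from the scenario; the threshold is an assumption.

```python
# Scope utilization for the user VLAN (scope size and device counts
# from the scenario; the alert threshold is an assumption).
def scope_utilization(leases_in_use: int, scope_size: int) -> float:
    """Percentage of the scope's addresses currently leased."""
    return leases_in_use / scope_size * 100

SCOPE_SIZE = 200   # addresses in the pool
ALERT_AT = 90.0    # warn before clients fall back to APIPA addresses

for devices in (150, 198, 210):
    # Leases can never exceed the scope; extra devices simply get none.
    pct = scope_utilization(min(devices, SCOPE_SIZE), SCOPE_SIZE)
    flag = " ALERT" if pct >= ALERT_AT else ""
    print(f"{devices} devices -> {pct:.0f}% used{flag}")
```

The 169.254.x.x addresses users report are the giveaway: APIPA self-assignment is what Windows clients do when no DHCP offer arrives.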

Server · Pattern: DHCP_EXHAUSTION · Severity: CRITICAL · Confidence: 92% · Auto-Heal · 25 correlated

Every scenario is tested against Corax's Neural Engine in a production environment with AI-powered root cause analysis.

Tests run continuously as new infrastructure patterns are added.