Infrastructure Scenario Tests

We test Corax against real-world infrastructure failures across every vendor, platform, and scenario. Browse the results below.

Total Tests: 276
Pass Rate: 100.0%
Passed: 276
Failed: 0

Password Hash Sync Failure

PASS

Active Directory password hash synchronization between on-premises AD and Azure AD breaks after a domain controller is decommissioned. Users who change their on-premises passwords find their Azure AD passwords still use the old value, causing login failures for cloud services.

Infrastructure | Pattern: ACTIVE_DIRECTORY | Severity: CRITICAL | Confidence: 92% | Remote Hands | 6 correlated

ADFS Token Signing Certificate Expired

PASS

The ADFS token signing certificate expires, breaking all federated SSO authentication. Users cannot sign into Office 365, SaaS applications, or any relying party trusts configured to use ADFS for authentication.

Infrastructure | Pattern: CERTIFICATE_EXPIRY | Severity: CRITICAL | Confidence: 95% | Remote Hands | 9 correlated

DFS Replication Backlog Critical

PASS

DFS Replication backlog between two domain controllers reaches critical levels, with SYSVOL replication lagging by thousands of files. Group Policy is inconsistent across the domain, and some users receive stale GPOs depending on which DC they authenticate against.

Infrastructure | Pattern: ACTIVE_DIRECTORY | Severity: CRITICAL | Confidence: 92% | Remote Hands | 6 correlated

AD Sites and Services Replication Topology Broken

PASS

Active Directory inter-site replication breaks after a network change removes the IP link between two AD sites. The Knowledge Consistency Checker (KCC) cannot generate a replication topology, and AD objects created in one site do not replicate to the other.

Infrastructure | Pattern: ACTIVE_DIRECTORY | Severity: CRITICAL | Confidence: 92% | Remote Hands | 14 correlated

Backup Tape Library Jam — Mechanical Failure

PASS

The robotic tape library experiences a mechanical jam in the tape picker mechanism, preventing all tape load/unload operations. Backup jobs queue indefinitely as no tapes can be mounted, and offsite tape rotation is halted.

Infrastructure | Pattern: BACKUP_FAILURE | Severity: CRITICAL | Confidence: 95% | Remote Hands | 8 correlated

S3-Compatible Object Storage Quota Exceeded

PASS

The S3-compatible object storage (MinIO) reaches its configured quota limit. Application uploads fail, log shipping stops, and backup jobs writing to object storage all fail simultaneously.

Infrastructure | Pattern: DISK_FULL | Severity: CRITICAL | Confidence: 92% | Remote Hands | 14 correlated

Ceph OSD Failure — Data Rebalancing

PASS

Multiple Ceph OSDs fail simultaneously on a storage node after a power supply unit failure, triggering a massive data rebalancing operation. The cluster enters HEALTH_WARN state and client I/O is severely impacted during recovery.

Infrastructure | Pattern: PERFORMANCE_DEGRADATION | Severity: CRITICAL | Confidence: 95% | Remote Hands | 13 correlated

iSCSI Target Unreachable — Multipath Failover

PASS

An iSCSI storage target becomes unreachable on the primary path after a switch failure. Multipath failover engages but the secondary path is congested, causing severe performance degradation for all connected initiators.

Infrastructure | Pattern: CONNECTION_REFUSED | Severity: CRITICAL | Confidence: 85% | Remote Hands | 12 correlated

ZFS Pool Degraded — Drive Failure

PASS

A ZFS storage pool enters degraded state after a drive failure in a RAIDZ2 vdev. The pool remains operational but with reduced redundancy. A second drive in the same vdev is showing SMART warnings, indicating imminent failure.

Infrastructure | Pattern: PERFORMANCE_DEGRADATION | Severity: CRITICAL | Confidence: 95% | Remote Hands | 5 correlated

Pure Storage Controller Failover

PASS

The active controller on a Pure Storage FlashArray fails, triggering an automatic failover to the standby controller. During the failover, I/O is briefly paused and disk queue depth spikes, causing latency-sensitive applications to experience errors.

Infrastructure | Pattern: STORAGE_IO_LATENCY | Severity: CRITICAL | Confidence: 90% | Remote Hands | 18 correlated

EMC Unity Thin-Provisioned LUN Overcommit

PASS

Thinly provisioned LUNs on an EMC Unity storage array exceed physical capacity. The storage pool runs out of space, causing write failures on all LUNs in the pool. VMware datastores become read-only and VMs begin crashing.

Infrastructure | Pattern: DISK_FULL | Severity: CRITICAL | Confidence: 95% | Remote Hands | 18 correlated

NetApp ONTAP Volume Offline

PASS

A NetApp ONTAP volume goes offline due to an aggregate running out of space, causing all LUNs and NFS exports on that volume to become unavailable. Multiple application servers lose access to their primary storage.

Infrastructure | Pattern: STORAGE_IO_LATENCY | Severity: CRITICAL | Confidence: 95% | Remote Hands | 18 correlated

Client Onboarding Discovery Scan Failure — Incomplete Asset Inventory

PASS

During a new client onboarding, the automated network discovery scan fails to complete due to aggressive IDS/IPS rules on the client firewall. The scan times out after 4 hours with only 30% of the network discovered. The MSP has an incomplete view of the client infrastructure.

Infrastructure | Pattern: TIMEOUT | Severity: CRITICAL | Confidence: 85% | Remote Hands | 25 correlated

NOC Monitoring Blind Spot — SNMP Community String Rotation

PASS

During a scheduled SNMP community string rotation across client infrastructure, 40% of devices fail to update to the new community string. The NOC monitoring platform can no longer poll these devices, creating a critical blind spot across 6 client networks.

Infrastructure | Pattern: SNMP_TRAP_ERROR | Severity: CRITICAL | Confidence: 85% | Auto-Heal | 35 correlated

PSA/Ticketing Platform Outage — Service Desk Paralyzed

PASS

The ConnectWise Manage PSA platform becomes completely unreachable after a database failover goes wrong. The MSP service desk cannot create, update, or view tickets. Automated ticket creation from monitoring alerts queues up and eventually starts dropping. SLA tracking is offline.

Infrastructure | Pattern: CONNECTION_REFUSED | Severity: CRITICAL | Confidence: 85% | Remote Hands | 42 correlated

Multi-Tenant Backup Failure — Cloud Repository Corruption

PASS

The shared cloud backup repository used for 15 MSP clients becomes corrupted after a storage controller firmware bug. Backup jobs for all tenants fail with integrity check errors. The most recent valid restore point for some clients is 72 hours old, violating SLA RPO requirements.

Infrastructure | Pattern: BACKUP_FAILURE | Severity: CRITICAL | Confidence: 95% | Remote Hands | 42 correlated

RMM Agent Mass Disconnect — Monitoring Blind Spot

PASS

A failed RMM platform update pushes a corrupt agent binary to all managed endpoints. The agent enters a crash loop on 400+ devices across 12 client organizations, leaving the MSP completely blind to endpoint health and unable to run remote management tasks.

Infrastructure | Pattern: PROCESS_CRASH_LOOP | Severity: CRITICAL | Confidence: 92% | Remote Hands | 28 correlated

Captive Portal Failure — Guest Network Unusable

PASS

The captive portal web server crashes, preventing guests from completing the splash page authentication. Guest devices connect to WiFi but cannot access the internet because the portal redirect times out. The portal VM ran out of memory handling a spike in connections.

Infrastructure | Pattern: TIMEOUT | Severity: CRITICAL | Confidence: 85% | Auto-Heal | 19 correlated

WiFi Channel Interference from Neighboring Tenant

PASS

A neighboring tenant in the shared office building installs high-power wireless equipment on overlapping channels, causing severe co-channel interference. Client devices experience packet loss, low throughput, and frequent disconnections across the entire 2.4 GHz band and on the DFS channels in the 5 GHz band.

Infrastructure | Pattern: WIFI_INTERFERENCE | Severity: CRITICAL | Confidence: 90% | Remote Hands | 21 correlated

802.1X RADIUS Certificate Mismatch — Enterprise WiFi Auth Failure

PASS

The RADIUS server certificate used for EAP-TLS authentication expires, causing all 802.1X wireless clients to fail authentication. Supplicants reject the expired certificate, and no clients can connect to the enterprise SSID. Guest network remains functional.

Infrastructure | Pattern: CERTIFICATE_EXPIRY | Severity: CRITICAL | Confidence: 95% | Remote Hands | 21 correlated

Every scenario is tested against Corax's Neural Engine in a production environment with AI-powered root cause analysis.

Tests run continuously as new infrastructure patterns are added.