Infrastructure Scenario Tests

We test Corax against real-world infrastructure failures across every vendor, platform, and scenario. Browse the results below.

276
Total Tests
100.0%
Pass Rate
276
Passed
0
Failed

Azure AD Connect Sync Loop

PASS

Azure AD Connect enters a synchronization loop where the delta sync cycle never completes, continuously restarting. Password hash synchronization stops working, and on-premises changes are not reflecting in Azure AD.

CloudPattern: AZURE_CLOUDSeverity: CRITICALConfidence: 85%Remote Hands5 correlated

ADFS Token Signing Certificate Expired

PASS

The ADFS token signing certificate expires, breaking all federated SSO authentication. Users cannot sign into Office 365, SaaS applications, or any relying party trusts configured to use ADFS for authentication.

InfrastructurePattern: CERTIFICATE_EXPIRYSeverity: CRITICALConfidence: 95%Remote Hands9 correlated

DFS Replication Backlog Critical

PASS

DFS Replication backlog between two domain controllers reaches critical levels, with SYSVOL replication lagging by thousands of files. Group Policy is inconsistent across the domain, and some users receive stale GPOs depending on which DC they authenticate against.

InfrastructurePattern: ACTIVE_DIRECTORYSeverity: CRITICALConfidence: 92%Remote Hands6 correlated

AD Sites and Services Replication Topology Broken

PASS

Active Directory inter-site replication breaks after a network change removes the IP link between two AD sites. The Knowledge Consistency Checker (KCC) cannot generate a replication topology, and AD objects created in one site are not replicating to the other.

InfrastructurePattern: ACTIVE_DIRECTORYSeverity: CRITICALConfidence: 92%Remote Hands14 correlated

Backup Tape Library Jam — Mechanical Failure

PASS

The robotic tape library experiences a mechanical jam in the tape picker mechanism, preventing all tape load/unload operations. Backup jobs queue indefinitely as no tapes can be mounted, and offsite tape rotation is halted.

InfrastructurePattern: BACKUP_FAILURESeverity: CRITICALConfidence: 95%Remote Hands8 correlated

S3-Compatible Object Storage Quota Exceeded

PASS

The S3-compatible object storage (MinIO) reaches its configured quota limit. Application uploads fail, log shipping stops, and backup jobs writing to object storage all fail simultaneously.

InfrastructurePattern: DISK_FULLSeverity: CRITICALConfidence: 92%Remote Hands14 correlated

Ceph OSD Failure — Data Rebalancing

PASS

Multiple Ceph OSDs fail simultaneously on a storage node after a power supply unit failure, triggering a massive data rebalancing operation. The cluster enters HEALTH_WARN state and client I/O is severely impacted during recovery.

InfrastructurePattern: PERFORMANCE_DEGRADATIONSeverity: CRITICALConfidence: 95%Remote Hands13 correlated

iSCSI Target Unreachable — Multipath Failover

PASS

An iSCSI storage target becomes unreachable on the primary path after a switch failure. Multipath failover engages but the secondary path is congested, causing severe performance degradation for all connected initiators.

InfrastructurePattern: CONNECTION_REFUSEDSeverity: CRITICALConfidence: 85%Remote Hands12 correlated

ZFS Pool Degraded — Drive Failure

PASS

A ZFS storage pool enters degraded state after a drive failure in a RAIDZ2 vdev. The pool remains operational but with reduced redundancy. A second drive in the same vdev is showing SMART warnings, indicating imminent failure.

InfrastructurePattern: PERFORMANCE_DEGRADATIONSeverity: CRITICALConfidence: 95%Remote Hands5 correlated

Pure Storage Controller Failover

PASS

The active controller on a Pure Storage FlashArray fails, triggering an automatic failover to the standby controller. During the failover, I/O is briefly paused and disk queue depth spikes, causing latency-sensitive applications to experience errors.

InfrastructurePattern: STORAGE_IO_LATENCYSeverity: CRITICALConfidence: 90%Remote Hands18 correlated

EMC Unity LUN Thinly Provisioned Overcommit

PASS

Thinly provisioned LUNs on an EMC Unity storage array exceed physical capacity. The storage pool runs out of space, causing write failures on all LUNs in the pool. VMware datastores become read-only and VMs begin crashing.

InfrastructurePattern: DISK_FULLSeverity: CRITICALConfidence: 95%Remote Hands18 correlated

NetApp ONTAP Volume Offline

PASS

A NetApp ONTAP volume goes offline due to an aggregate running out of space, causing all LUNs and NFS exports on that volume to become unavailable. Multiple application servers lose access to their primary storage.

InfrastructurePattern: STORAGE_IO_LATENCYSeverity: CRITICALConfidence: 95%Remote Hands18 correlated

Package Repository Corruption

PASS

The internal package repository mirror becomes corrupted after a failed rsync, causing all package installation and update operations across the Linux fleet to fail. Servers cannot install security patches or new application dependencies.

ServerPattern: UNKNOWNSeverity: CRITICALConfidence: 85%Auto-Heal4 correlated

SSH Key Rotation Breaking Automation

PASS

An automated SSH key rotation process replaces host keys on 20 servers but fails to update the known_hosts files on the Ansible control node and monitoring servers. All SSH-based automation, configuration management, and monitoring breaks simultaneously.

ServerPattern: UNKNOWNSeverity: CRITICALConfidence: 95%Auto-Heal6 correlated

Cron Daemon Crash — All Scheduled Jobs Stopped

PASS

The cron daemon on a critical infrastructure server crashes and enters a crash loop due to a corrupted crontab file. All scheduled jobs including log rotation, backup scripts, and health checks stop executing. The issue goes undetected for 48 hours.

ServerPattern: PROCESS_CRASH_LOOPSeverity: CRITICALConfidence: 92%Auto-Heal4 correlated

SELinux Denial Blocking Application

PASS

SELinux enforcing mode blocks a newly deployed application from binding to its configured port and accessing its data directory. The application fails to start with permission denied errors, and the audit log fills with AVC denial messages.

ServerPattern: UNKNOWNSeverity: CRITICALConfidence: 95%Auto-Heal4 correlated

NFS Mount Stale — Server Unreachable

PASS

An NFS mount on a production application server becomes stale after the NFS server becomes unreachable due to a network switch failure. All processes attempting to access the NFS mount hang in uninterruptible sleep, and the application server becomes partially unresponsive.

ServerPattern: CONNECTION_REFUSEDSeverity: CRITICALConfidence: 85%Remote Hands4 correlated

Inode Exhaustion — Disk Not Full but Cannot Create Files

PASS

A Linux server runs out of inodes on the root filesystem despite having 40% disk space free. Millions of small session files created by a PHP application consumed all available inodes. No new files can be created, causing application failures and log rotation to break.

ServerPattern: DISK_FULLSeverity: CRITICALConfidence: 95%Auto-Heal4 correlated

systemd Service Dependency Loop

PASS

A misconfigured systemd unit file creates a circular dependency between three services, causing them to enter a restart loop. Each service depends on another in the cycle, and systemd cannot satisfy all dependencies simultaneously.

ServerPattern: PROCESS_CRASH_LOOPSeverity: CRITICALConfidence: 92%Auto-Heal6 correlated

Kernel Panic — OOM Killer Invoked

PASS

A Linux production server experiences kernel panic after the OOM killer is invoked repeatedly, killing critical processes including the database and web server. The system becomes unresponsive after the OOM killer targets the init process.

ServerPattern: MEMORY_EXHAUSTIONSeverity: CRITICALConfidence: 92%Remote Hands10 correlated
PreviousPage 7 of 14Next

Every scenario is tested against Corax's Neural Engine in a production environment with AI-powered root cause analysis.

Tests run continuously as new infrastructure patterns are added.