We test Corax against real-world infrastructure failures spanning a broad range of vendors, platforms, and failure scenarios. Browse the results below.
During a new client onboarding, the automated network discovery scan fails to complete due to aggressive IDS/IPS rules on the client firewall. The scan times out after 4 hours with only 30% of the network discovered. The MSP has an incomplete view of the client infrastructure.
During a scheduled SNMP community string rotation across client infrastructure, 40% of devices fail to update to the new community string. The NOC monitoring platform can no longer poll these devices, creating a critical blind spot across 6 client networks.
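A minimal sketch of the kind of check that finds the devices left behind, assuming the net-snmp `snmpget` CLI is installed and that device addresses live in a `devices.txt` file (the file name and community strings below are placeholders, not part of the scenario):

```python
# Hypothetical sketch: find devices still answering on the old SNMP community
# string after a rotation. Assumes the net-snmp `snmpget` CLI is available and
# that device IPs are listed one per line in devices.txt (both assumptions).
import subprocess

OLD_COMMUNITY = "old-string"         # placeholder values
NEW_COMMUNITY = "new-string"
SYS_DESCR_OID = "1.3.6.1.2.1.1.1.0"  # sysDescr.0

def responds(host: str, community: str) -> bool:
    """Return True if the host answers an SNMPv2c GET with this community."""
    result = subprocess.run(
        ["snmpget", "-v2c", "-c", community, "-t", "2", "-r", "1",
         host, SYS_DESCR_OID],
        capture_output=True,
    )
    return result.returncode == 0

with open("devices.txt") as f:
    hosts = [line.strip() for line in f if line.strip()]

stale = [h for h in hosts
         if not responds(h, NEW_COMMUNITY) and responds(h, OLD_COMMUNITY)]
print(f"{len(stale)} devices still on the old community string:")
for host in stale:
    print(" ", host)
```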
A firewall policy update pushed to 8 client firewalls fails on 3 of them, leaving those clients with an incomplete ruleset that allows unrestricted outbound traffic. The failure goes unnoticed because the management platform reports a false success status.
A client's VPN credentials are found in a dark web dump. Unauthorized connections from foreign IPs are detected through the client's site-to-site VPN tunnel, and the attacker is pivoting through the VPN to access internal resources. Immediate tunnel teardown and credential rotation are required.
The ConnectWise Manage PSA platform becomes completely unreachable after a database failover goes wrong. The MSP service desk cannot create, update, or view tickets. Automated ticket creation from monitoring alerts queues up and eventually starts dropping. SLA tracking is offline.
The shared cloud backup repository used for 15 MSP clients becomes corrupted after a storage controller firmware bug. Backup jobs for all tenants fail with integrity check errors. The most recent valid restore point for some clients is 72 hours old, violating SLA RPO requirements.
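One way to surface the resulting RPO breach is a simple age check on the last known-good restore point per client; the sketch below assumes those timestamps are available as ISO-8601 strings and uses made-up client names and an assumed 24-hour RPO:

```python
# Minimal RPO check, assuming the last known-good restore point per client is
# available as an ISO-8601 timestamp. Client names, timestamps, and the RPO
# value are placeholders for illustration.
from datetime import datetime, timezone

RPO_HOURS = 24  # assumed contractual recovery point objective

last_good_restore = {
    "client-a": "2024-01-10T03:00:00+00:00",
    "client-b": "2024-01-08T03:00:00+00:00",  # ~72h old -> violation
}

now = datetime.now(timezone.utc)
for client, ts in last_good_restore.items():
    age_hours = (now - datetime.fromisoformat(ts)).total_seconds() / 3600
    status = "VIOLATION" if age_hours > RPO_HOURS else "ok"
    print(f"{client}: last good restore {age_hours:.0f}h ago [{status}]")
```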
A failed RMM platform update pushes a corrupt agent binary to all managed endpoints. The agent enters a crash loop on 400+ devices across 12 client organizations, leaving the MSP completely blind to endpoint health and unable to run remote management tasks.
The captive portal web server crashes, preventing guests from completing the splash page authentication. Guest devices connect to WiFi but cannot access the internet because the portal redirect times out. The portal VM ran out of memory handling a spike in connections.
A neighboring tenant in the shared office building installs high-power wireless equipment on overlapping channels, causing severe co-channel interference. Client devices experience packet loss, low throughput, and frequent disconnections across the entire 2.4 GHz band and the DFS channels on 5 GHz.
The RADIUS server certificate used for EAP-TLS authentication expires, causing all 802.1X wireless clients to fail authentication. Supplicants reject the expired certificate, and no clients can connect to the enterprise SSID. Guest network remains functional.
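An expiry check on the EAP-TLS server certificate would catch this before supplicants start rejecting it. A hedged sketch, assuming the certificate is readable locally as a PEM file (placeholder path) and a recent `cryptography` package is installed:

```python
# Sketch: warn before the EAP-TLS server certificate expires. The certificate
# path is a placeholder, and a reasonably recent `cryptography` package is
# assumed.
from datetime import datetime, timedelta
from cryptography import x509

CERT_PATH = "/etc/raddb/certs/server.pem"  # placeholder path
WARN_BEFORE = timedelta(days=30)

with open(CERT_PATH, "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())

remaining = cert.not_valid_after - datetime.utcnow()
if remaining < timedelta(0):
    print(f"EXPIRED {abs(remaining).days} days ago -- 802.1X auth will fail")
elif remaining < WARN_BEFORE:
    print(f"Expires in {remaining.days} days -- rotate the EAP-TLS certificate")
else:
    print(f"OK, {remaining.days} days remaining")
```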
A 500-person conference event overwhelms the venue WiFi infrastructure. The wireless controller reports channel utilization above 90% on all APs in the event hall. Client devices experience severe contention, with most unable to maintain stable connections.
The Aruba wireless controller cluster loses sync after a firmware mismatch between the primary and standby controllers. APs managed by the out-of-sync controller drop into standalone mode with degraded functionality, and roaming between controller zones fails completely.
A misconfigured ACL on the layer 3 switch allows traffic from the guest VLAN to reach the server VLAN, bypassing network segmentation. The IDS detects lateral scanning from a compromised guest device targeting internal servers.
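A simple way to regression-test segmentation after ACL changes is to probe server-VLAN services from a host on the guest VLAN and expect every connection to fail. The target addresses and ports below are placeholders, not part of the scenario:

```python
# Illustrative segmentation check: run from a host on the guest VLAN and
# attempt TCP connections to server-VLAN addresses. Any successful connect
# means the ACL is not enforcing segmentation. Targets/ports are placeholders.
import socket

SERVER_VLAN_TARGETS = [("10.20.0.10", 445), ("10.20.0.10", 3389), ("10.20.0.20", 22)]

def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

leaks = [(h, p) for h, p in SERVER_VLAN_TARGETS if can_connect(h, p)]
if leaks:
    print("Segmentation FAILURE -- guest VLAN can reach:", leaks)
else:
    print("Guest-to-server traffic blocked as expected")
```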
Both RADIUS servers (backed by Active Directory) become unreachable after an AD domain controller crash. All 802.1X network authentication fails, preventing users from connecting to wired and wireless networks. Existing sessions remain active but no new authentications succeed.
The primary DNS server's zone transfer (AXFR) to the secondary fails due to a firewall rule change blocking TCP port 53. The secondary DNS server continues serving increasingly stale records, causing intermittent name resolution failures as its copy of the zone drifts out of date.
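The staleness shows up when the SOA serial numbers reported by the two servers are compared; a sketch using the `dnspython` package, with a placeholder zone and server addresses:

```python
# Zone-freshness check, assuming the dnspython package is installed. The zone
# name and server IPs are placeholders. A secondary whose serial stops
# advancing while the primary's keeps climbing points at a broken AXFR path.
import dns.resolver  # pip install dnspython

ZONE = "corp.example.com"                      # placeholder zone
PRIMARY, SECONDARY = "10.0.0.53", "10.0.0.54"  # placeholder server IPs

def soa_serial(server_ip: str) -> int:
    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = [server_ip]
    answer = resolver.resolve(ZONE, "SOA")
    return answer[0].serial

primary_serial = soa_serial(PRIMARY)
secondary_serial = soa_serial(SECONDARY)
print(f"primary={primary_serial} secondary={secondary_serial}")
if secondary_serial < primary_serial:
    print("Secondary is stale -- check that TCP/53 to the primary is allowed")
```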
The primary power supply in the core switch stack fails, causing the switch to reboot onto the secondary PSU. During the reboot, the switch stack ring breaks and a stack master re-election occurs, disrupting all traffic through the core for 90 seconds.
The primary network monitoring platform enters a crash loop after a database corruption event during a power fluctuation. All alerting stops, creating a blind spot where infrastructure failures go undetected. The secondary monitoring server was decommissioned last month.
After a firewall firmware upgrade, the MTU on the WAN interface drops from 1500 to 1400 without the MSS clamp being updated. Packets larger than the new MTU from the server VLAN hit the firewall and are silently dropped, causing intermittent failures for large file transfers and database replication.
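A quick way to confirm this kind of silent drop is a Don't-Fragment ping sweep across packet sizes; the sketch assumes a Linux host with iputils `ping` (the `-M do` flag sets Don't-Fragment) and a placeholder target on the far side of the firewall:

```python
# Rough path-MTU probe: find the largest packet that passes unfragmented.
# Assumes a Linux host with iputils ping; the target IP is a placeholder.
import subprocess

TARGET = "10.30.0.5"   # placeholder host behind the firewall
ICMP_IP_OVERHEAD = 28  # 20-byte IP header + 8-byte ICMP header

def ping_df(payload: int) -> bool:
    """Return True if a DF-flagged ping with this payload size gets through."""
    result = subprocess.run(
        ["ping", "-M", "do", "-c", "1", "-W", "2", "-s", str(payload), TARGET],
        capture_output=True,
    )
    return result.returncode == 0

for payload in (1472, 1452, 1412, 1372):  # 1500, 1480, 1440, 1400 on the wire
    status = "pass" if ping_df(payload) else "DROPPED"
    print(f"{payload + ICMP_IP_OVERHEAD}-byte packets: {status}")
```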
A junior admin accidentally changes a trunk port to access mode on a distribution switch, pruning all VLANs except the native VLAN. The spanning tree topology reconverges, causing a 30-second outage across multiple VLANs and triggering TCN flooding.
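Configuration drift like this is easy to flag offline by parsing a saved running-config for interfaces that should be trunks. The interface names, config path, and Cisco-style stanza format in this sketch are all assumptions:

```python
# Hypothetical drift check: parse a saved running-config and flag interfaces
# that are expected to be trunks but are configured as access ports. The
# expected-trunk list and config file path are placeholders.
import re

EXPECTED_TRUNKS = {"GigabitEthernet1/0/47", "GigabitEthernet1/0/48"}
CONFIG_PATH = "distribution-switch.cfg"  # saved `show running-config`

mode_by_interface = {}
current = None
with open(CONFIG_PATH) as f:
    for line in f:
        if m := re.match(r"interface (\S+)", line):
            current = m.group(1)
        elif current and (m := re.match(r"\s+switchport mode (\S+)", line)):
            mode_by_interface[current] = m.group(1)

for intf in sorted(EXPECTED_TRUNKS):
    mode = mode_by_interface.get(intf, "unknown")
    if mode != "trunk":
        print(f"{intf}: expected trunk, found {mode} -- VLANs will be pruned")
```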
A misconfigured route-map on the border router leaks internal BGP prefixes to the upstream ISP. The ISP begins routing external traffic into a blackhole. Customer-facing services become unreachable from the internet while internal connectivity remains functional.
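A leak check against an allow-list of owned prefixes catches this before the upstream does; the sketch assumes the routes advertised to the ISP have been exported to a text file, one prefix per line, and uses documentation address space as the allow-list:

```python
# Simplified prefix-leak check. The allow-list and export file are placeholder
# assumptions: ALLOWED holds the only space that should ever be announced, and
# advertised-routes.txt is an export of what the border router is advertising.
import ipaddress

ALLOWED = [ipaddress.ip_network("203.0.113.0/24")]  # placeholder public space
ADVERTISED_FILE = "advertised-routes.txt"           # exported from the router

def is_allowed(prefix) -> bool:
    return any(prefix.subnet_of(allowed) for allowed in ALLOWED)

leaked = []
with open(ADVERTISED_FILE) as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        prefix = ipaddress.ip_network(line)
        if prefix.is_private or not is_allowed(prefix):
            leaked.append(prefix)

for prefix in leaked:
    print(f"LEAKED: {prefix} should not be advertised to the upstream")
```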
Every scenario is tested against Corax's Neural Engine in a production environment with AI-powered root cause analysis.
Tests run continuously as new infrastructure patterns are added.