PASSED | infrastructure / captive_portal_failure

Captive Portal Failure — Guest Network Unusable

The captive portal web server crashes, preventing guests from completing the splash page authentication. Guest devices connect to WiFi but cannot access the internet because the portal redirect times out. The portal VM ran out of memory handling a spike in connections.

Pattern: TIMEOUT
Severity: CRITICAL
Confidence: 85%
Remediation: Auto-Heal

Test Results

Metric | Expected | Actual | Result
Pattern Recognition | TIMEOUT | TIMEOUT
Severity Assessment | CRITICAL | CRITICAL
Incident Correlation | Yes | 19 linked
Cascade Escalation | N/A | No
Remediation | Auto-Heal: Corax resolves autonomously

Scenario Conditions

Guest WiFi SSID with captive portal authentication. Portal running on Nginx VM (2GB RAM). Conference event driving 300+ guest connections. Nginx OOMKilled. Portal page redirect returning timeout. Guest VLAN has no internet without portal auth.

Injected Error Messages (2)

Captive portal server down — Nginx process killed, portal-vm memory exhausted handling 300+ concurrent sessions, portal redirect returning timeout after 30 seconds, guest authentication completely broken, all new guest connections stuck in walled garden
Guest WiFi splash page timeout — HTTP redirect to guest.company.com/splash returning timeout, guests connecting to WiFi but unable to reach internet, timeout on portal authentication page, 300+ guests affected at conference event, no fallback auth method configured
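
The timeout symptom in the messages above can be checked with a small probe. The following is a minimal sketch, not the monitor used in this test: the function names `probe_portal` and `default_fetch`, the `http://` scheme, and the OK/TIMEOUT/ERROR labels are assumptions made for illustration. Only the host, the path, and the 30-second limit come from the injected messages.

```python
import socket
import urllib.error
import urllib.request


def default_fetch(url, timeout):
    """Return the HTTP status code for url, or raise on failure."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def probe_portal(url, timeout=30.0, fetch=default_fetch):
    """Classify one probe of the splash page as OK, TIMEOUT, or ERROR.

    `fetch` is injectable so the classifier can be exercised without a network.
    """
    try:
        status = fetch(url, timeout)
    except urllib.error.HTTPError as exc:
        # The server answered with an error status, so this is not a timeout.
        return "ERROR", f"HTTP {exc.code}"
    except urllib.error.URLError as exc:
        # Connect-phase timeouts arrive wrapped in URLError.reason.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "TIMEOUT", f"no response within {timeout:g}s"
        return "ERROR", str(exc.reason)
    except (socket.timeout, TimeoutError):
        # Read-phase timeouts are raised directly.
        return "TIMEOUT", f"no response within {timeout:g}s"
    label = "OK" if 200 <= status < 400 else "ERROR"
    return label, f"HTTP {status}"
```

Calling `probe_portal("http://guest.company.com/splash")` from a client on the guest VLAN would be expected to return a TIMEOUT result while Nginx is down, and an OK result once the portal answers again.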

Neural Engine Root Cause Analysis

The captive portal server on portal-vm exhausted its memory while handling 300+ concurrent sessions, and the system killed the Nginx process (most likely via the Linux OOM killer). With Nginx down, the captive portal is completely non-functional: no new guest can authenticate, and new connections stay trapped in the walled garden. The 8 correlated incidents suggest this may be part of a broader infrastructure issue or a traffic spike affecting multiple systems.
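
The analysis only says the OOM killer is the likely cause. One way to confirm it is to scan the kernel log on portal-vm for OOM kill records. The sketch below assumes the log lines have already been collected (for example from `dmesg` or the journal) and that they contain the phrase `Killed process <pid> (<name>)`. The exact wording varies between kernel versions, so the pattern is an assumption, and the sample lines are made up for illustration rather than taken from this incident.

```python
import re

# Assumed shape of a kernel OOM kill record, e.g.
#   "Out of memory: Killed process 2187 (nginx) total-vm:..."
# Wording differs between kernel versions; adjust the pattern as needed.
OOM_KILL = re.compile(r"Killed process (\d+) \(([^)]+)\)")

# Made-up log lines for illustration only; not from the incident.
SAMPLE_LOG = [
    "[81234.101] nginx invoked oom-killer: gfp_mask=0x100cca, order=0",
    "[81234.207] Out of memory: Killed process 2187 (nginx) total-vm:1843200kB",
    "[81240.015] Out of memory: Killed process 991 (php-fpm) total-vm:402100kB",
]


def find_oom_kills(log_lines, process_name="nginx"):
    """Return the PIDs of process_name that the log reports as OOM-killed."""
    pids = []
    for line in log_lines:
        match = OOM_KILL.search(line)
        if match and match.group(2) == process_name:
            pids.append(int(match.group(1)))
    return pids
```

With the sample lines, `find_oom_kills(SAMPLE_LOG)` returns `[2187]`. An empty list on the real log would point away from the OOM killer and toward another cause for the Nginx exit.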

Remediation Plan

1. Immediately restart the Nginx service on portal-vm to restore basic functionality.
2. Check memory usage and available resources on the VM.
3. If memory is still critically low, restart the portal-vm to clear memory leaks.
4. Monitor concurrent session counts and implement rate limiting if traffic spike continues.
5. Investigate the 8 correlated incidents to determine if this is part of a larger infrastructure issue.
6. Consider scaling up VM resources if high concurrent load is legitimate business traffic.
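
Steps 1 to 4 of the plan amount to a small decision procedure, which can be sketched as below. This is an illustration under stated assumptions, not the Auto-Heal implementation: the function names, the action strings, and the 10% "critically low" threshold are hypothetical. The memory figures are read from text in the format of Linux `/proc/meminfo`, using its `MemTotal` and `MemAvailable` fields.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def next_action(nginx_running, meminfo_text, min_available_fraction=0.10):
    """Pick the next remediation step for portal-vm.

    min_available_fraction is an assumed threshold for "critically low".
    """
    if not nginx_running:
        return "restart nginx"        # step 1
    info = parse_meminfo(meminfo_text)  # step 2
    available = info["MemAvailable"] / info["MemTotal"]
    if available < min_available_fraction:
        return "restart portal-vm"    # step 3
    return "monitor sessions"         # step 4
```

Called once per check interval, this restarts Nginx first, escalates to a VM restart only if memory remains below the threshold afterwards, and otherwise falls through to monitoring. Steps 5 and 6 need human judgment and are left out of the sketch.
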
Tested: 2026-03-30 | Monitors: 2 | Incidents: 2 | Test ID: cmncjqs8k03tcobqezdn511to