PASSED | server / dhcp_server_failover

DHCP Server Failover — Primary DHCP Down

The primary Windows DHCP server crashes and the DHCP failover partner does not transition to 'Partner Down' state due to a misconfigured maximum client lead time (MCLT). Clients with expiring leases cannot renew and start losing connectivity.

Pattern: DHCP_EXHAUSTION
Severity: CRITICAL
Confidence: 92%
Remediation: Remote Hands

Test Results

Metric | Expected | Actual | Result
Pattern Recognition | DHCP_EXHAUSTION | DHCP_EXHAUSTION |
Severity Assessment | CRITICAL | CRITICAL |
Incident Correlation | Yes | 35 linked |
Cascade Escalation | Yes | Yes |
Remediation | Remote Hands: Corax contacts on-site support via call, email, or API

Scenario Conditions

Windows DHCP failover (hot standby mode). Primary DC-01 crashed. Partner DC-02 in 'Communication Interrupted' state. MCLT: 1 hour. 400 active leases. Leases expiring at staggered intervals.
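
The conditions above can be sketched as a small state model. This is a simplified illustration, not the Windows DHCP implementation: the class `StandbyPartner` and the field `auto_switchover_minutes` are hypothetical names, and the source does not say whether any automatic switchover was configured. The sketch assumes the standby stays in 'Communication Interrupted' until an operator declares the partner down or an optional automatic switchover timer elapses, and that once in 'Partner Down' it waits out the MCLT before serving the full address range.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FailoverState(Enum):
    NORMAL = "Normal"
    COMMUNICATION_INTERRUPTED = "Communication Interrupted"
    PARTNER_DOWN = "Partner Down"


@dataclass
class StandbyPartner:
    """Simplified model of the standby server (DC-02 in the scenario)."""
    mclt_minutes: int = 60                         # MCLT: 1 hour, from the scenario
    auto_switchover_minutes: Optional[int] = None  # assumption: None means manual only
    manually_declared_down: bool = False           # operator action on the standby

    def state(self, minutes_since_contact_lost: float) -> FailoverState:
        """Return the failover state after contact with the primary is lost."""
        if minutes_since_contact_lost <= 0:
            return FailoverState.NORMAL
        if self.manually_declared_down:
            return FailoverState.PARTNER_DOWN
        if (self.auto_switchover_minutes is not None
                and minutes_since_contact_lost >= self.auto_switchover_minutes):
            return FailoverState.PARTNER_DOWN
        # Without operator action or an elapsed timer, the standby stays here.
        return FailoverState.COMMUNICATION_INTERRUPTED

    def full_pool_available(self, minutes_in_partner_down: float) -> bool:
        """After entering Partner Down, the full range is usable once MCLT has passed."""
        return minutes_in_partner_down >= self.mclt_minutes


if __name__ == "__main__":
    dc02 = StandbyPartner()
    print(dc02.state(90).value)
```

Under these assumptions, a standby with no switchover timer and no operator action never leaves 'Communication Interrupted', which matches the stuck state described in the scenario.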

Injected Error Messages (3)

DHCP primary server DC-01 down — DHCP scope full, no leases available for new clients, Windows DHCP service terminated unexpectedly (Event ID 1041), server unreachable, 400 active lease scopes unserved, DHCP pool exhausted
DHCP failover partner DC-02 in 'Communication Interrupted' state — DHCP scope full on failover partner, Event ID 20252: failover partner unreachable, not transitioning to Partner Down, MCLT period not yet expired, no leases available for new devices
Network-wide DHCP pool exhausted — clients with expired leases losing connectivity, new devices cannot obtain addresses, DHCP failover not fully active, 47 devices reporting APIPA addresses (169.254.x.x) in last 30 minutes
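
The APIPA count in the last message can be reproduced from a list of reported client addresses. The following is a minimal sketch using Python's standard `ipaddress` module; the helper names `is_apipa` and `count_apipa` and the sample addresses are illustrative and do not come from the test data.

```python
import ipaddress
from typing import Iterable

# APIPA (IPv4 link-local) range that clients fall back to when no DHCP lease is obtained.
APIPA_NETWORK = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(address: str) -> bool:
    """Return True if the string is an IPv4 address inside 169.254.0.0/16."""
    try:
        return ipaddress.ip_address(address) in APIPA_NETWORK
    except ValueError:
        # Not a parseable IP address at all.
        return False


def count_apipa(addresses: Iterable[str]) -> int:
    """Count how many reported addresses are APIPA fallbacks."""
    return sum(1 for addr in addresses if is_apipa(addr))


if __name__ == "__main__":
    sample = ["169.254.12.7", "10.0.0.5", "169.254.200.1", "192.168.1.20"]
    print(count_apipa(sample))
```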

Neural Engine Root Cause Analysis

The DHCP primary server DC-01 has experienced a cascading failure where the DHCP scope pool became exhausted (400 active lease scopes), leading to the Windows DHCP service terminating unexpectedly with Event ID 1041. This service termination has rendered the server unreachable and unable to serve new IP lease requests, creating a network-wide outage. The 14 correlated incidents within the same timeframe strongly suggest this DHCP exhaustion is causing widespread connectivity issues across dependent systems.

Remediation Plan

1. Immediately restart the Windows DHCP service on DC-01 to restore basic functionality.
2. Verify DHCP scope configuration and expand IP address ranges if capacity limits are legitimate.
3. Identify and release stale/expired DHCP leases to free up address space.
4. Check for DHCP lease conflicts or rogue devices consuming excessive leases.
5. Implement DHCP scope monitoring and alerting to prevent future exhaustion.
6. Consider implementing DHCP failover/redundancy if not already configured.
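
Step 5 of the plan (scope monitoring and alerting) could look roughly like the sketch below. The function names, the 80% and 95% thresholds, and the scope sizes in the usage lines are assumptions chosen for illustration; the report gives only the lease count (400) and not the scope size or any alerting policy.

```python
def scope_utilization(active_leases: int, scope_size: int) -> float:
    """Return the fraction of the scope's addresses currently leased."""
    if scope_size <= 0:
        raise ValueError("scope_size must be positive")
    return active_leases / scope_size


def alert_level(active_leases: int, scope_size: int,
                warn: float = 0.80, crit: float = 0.95) -> str:
    """Map scope utilization to an alert level (thresholds are hypothetical)."""
    utilization = scope_utilization(active_leases, scope_size)
    if utilization >= crit:
        return "CRITICAL"
    if utilization >= warn:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # 400 active leases comes from the scenario; the scope sizes are made up.
    print(alert_level(400, 400))
    print(alert_level(400, 1000))
```

In practice the lease count would come from the DHCP server's own statistics; this sketch covers only the threshold logic.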
Tested: 2026-03-30 | Monitors: 3 | Incidents: 3 | Test ID: cmncjhvcm01mxobqel9836s17