PASSED | cascade / dns_failure_cascade

DNS Server Failure — Both Internal DNS Servers Down

Both internal DNS servers fail simultaneously (primary: disk full, secondary: expired DNSSEC key). All internal name resolution fails. Every service that depends on DNS stops working.

Pattern: DNS_FAILURE
Severity: CRITICAL
Confidence: 85%
Remediation: Auto-Heal

Test Results

Metric | Expected | Actual | Result
Pattern Recognition | DNS_FAILURE | DNS_FAILURE
Severity Assessment | CRITICAL | CRITICAL
Incident Correlation | Yes | 75 linked
Cascade Escalation | Yes | Yes
Remediation | Auto-Heal: Corax resolves autonomously

Scenario Conditions

Two internal DNS servers (BIND 9). Primary: disk full, cannot write zone files. Secondary: DNSSEC key expired, validation failures. 500+ internal DNS zones. All clients configured with these two DNS servers.

Injected Error Messages (5)

DNS primary server failure — DNS resolution failing for all zones, BIND named process unable to serve queries, SERVFAIL on all lookups, dns resolution failed for internal domains
DNS secondary server DNSSEC validation failure — expired signing key, all DNSSEC-signed zone lookups returning SERVFAIL, dns resolution failed
Internal web portal unreachable — ENOTFOUND portal.internal.company.com, DNS resolution failed, all internal hostname lookups timing out
Email server unreachable — dns resolution failed for mail.internal.company.com, ENOTFOUND, mail delivery queuing on external MTAs
Active Directory LDAP unreachable — ENOTFOUND dc01.internal.company.com, DNS resolution failed, user authentication failing campus-wide
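
The five messages above share a small set of recurring phrases (SERVFAIL, ENOTFOUND, "dns resolution failed"). As a minimal sketch of how such messages could be mapped to the DNS_FAILURE pattern, the snippet below uses keyword matching. This is an illustration only, not Corax's actual recognition logic; the signature list and the function names `classify` and `extract_hostnames` are hypothetical.

```python
import re

# Hypothetical signature list, built from phrases that appear in the
# injected messages above. A real classifier would likely be richer.
DNS_SIGNATURES = [
    r"dns resolution fail(?:ed|ing)",
    r"\bSERVFAIL\b",
    r"\bENOTFOUND\b",
    r"dnssec validation failure",
]
_DNS_RE = re.compile("|".join(DNS_SIGNATURES), re.IGNORECASE)

# Matches dotted names such as mail.internal.company.com.
_HOST_RE = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b", re.IGNORECASE)


def classify(message: str) -> str:
    """Return 'DNS_FAILURE' if any DNS signature appears, else 'UNKNOWN'."""
    return "DNS_FAILURE" if _DNS_RE.search(message) else "UNKNOWN"


def extract_hostnames(message: str) -> list[str]:
    """Return the dotted hostnames mentioned in a message, in order."""
    return _HOST_RE.findall(message)
```

Extracting the hostnames alongside the pattern is one way to see which dependent services (portal, mail, directory) are affected by the same resolver outage.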

Neural Engine Root Cause Analysis

The DNS primary server (ns1) at 10.0.0.53 has failed completely: the BIND named process cannot serve queries and returns SERVFAIL for all lookups. This appears to be a service-level failure rather than a network connectivity problem, because the process still responds but cannot resolve any query. The 19 correlated incidents within a 5-minute window strongly suggest that this DNS failure is cascading across the dependent services and applications that rely on this primary DNS server for resolution.
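
The analysis above relies on incidents falling inside a 5-minute window of the root DNS failure. A minimal sketch of that kind of time-window correlation follows, assuming each incident carries a service name and a timestamp. The `Incident` type and `correlate` function are hypothetical and are not taken from the engine itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    service: str   # hypothetical field: name of the affected service
    at: datetime   # hypothetical field: time the incident was raised


def correlate(
    root: Incident,
    others: list[Incident],
    window: timedelta = timedelta(minutes=5),
) -> list[Incident]:
    """Return incidents raised at or after the root incident, within the window."""
    return [i for i in others if timedelta(0) <= i.at - root.at <= window]
```

Only incidents that follow the root failure are linked here; an engine could equally use a symmetric window or weight incidents by known service dependencies.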

Remediation Plan

1. Immediately attempt to restart the BIND named service on the DNS server to restore basic functionality.
2. Check system resources (disk space, memory) as BIND failures are commonly caused by resource exhaustion or corrupted zone files.
3. Verify DNS zone file integrity and configuration syntax.
4. If restart fails, check system logs for specific error messages and consider failover to secondary DNS servers.
5. Monitor resolution of correlated incidents as DNS restoration should resolve cascade failures.
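
The first four steps of the plan can be sketched as a dry-run command list. This is a sketch under stated assumptions, not the Auto-Heal implementation: it assumes a systemd host where the BIND unit is called `named` (some distributions use `bind9`), and the zone directory `/var/named` is a placeholder. The functions `disk_used_fraction` and `build_plan` are hypothetical; nothing is executed, the plan is only assembled for review.

```python
import shutil


def disk_used_fraction(path: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def build_plan(service: str = "named", zone_dir: str = "/var/named"):
    """Return (description, command) pairs mirroring steps 1 to 4 of the plan.

    Assumptions: `service` is the systemd unit name for BIND and `zone_dir`
    is where zone files live. Both vary by distribution.
    """
    return [
        ("Restart BIND", ["systemctl", "restart", service]),
        ("Check disk space for zone files", ["df", "-h", zone_dir]),
        # named-checkconf -z also test-loads the primary zones in named.conf
        ("Verify configuration and zones", ["named-checkconf", "-z"]),
        ("Inspect recent service logs",
         ["journalctl", "-u", service, "-n", "50", "--no-pager"]),
    ]
```

In this scenario the primary's disk is full, so a restart alone would not be expected to help until space is freed; keeping the disk check as an explicit step makes that visible before any command is run.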
Tested: 2026-03-30 | Monitors: 5 | Incidents: 5 | Test ID: cmncjfccy00ymobqea3esy4lv