PASSED | infrastructure / ad_sites_replication_topology

AD Sites and Services Replication Topology Broken

Active Directory inter-site replication breaks after a network change removes the IP link between two AD sites. The Knowledge Consistency Checker (KCC) cannot generate a replication topology, and AD objects created in one site are not replicating to the other.

Pattern: ACTIVE_DIRECTORY
Severity: CRITICAL
Confidence: 92%
Remediation: Remote Hands

Test Results

Metric | Expected | Actual | Result
Pattern Recognition | ACTIVE_DIRECTORY | ACTIVE_DIRECTORY |
Severity Assessment | CRITICAL | CRITICAL |
Incident Correlation | Yes | 14 linked |
Cascade Escalation | N/A | No |

Remediation: Remote Hands. Corax contacts on-site support via call, email, or API.

Scenario Conditions

Two AD sites: HQ and Branch. Inter-site IP link removed during network migration. KCC failing to generate replication topology. AD replication latency: infinite (no replication occurring). New user accounts created at HQ not visible at Branch. GPO updates not propagating.

Injected Error Messages (2)

AD replication failure between sites — dc01 ad replication to dc02.corp.local failing with error 1722 (RPC server unavailable), KCC unable to generate inter-site replication topology, repadmin /replsummary showing 100% failure rate for HQ-to-Branch replication, last successful ad replication 72 hours ago
Branch domain controller dc02 ad replication stale — inbound ad replication from dc01 failing for 72 hours, USN not advancing, new user objects not replicating from HQ, group policy updates not propagating to branch site, ad replication event ID 1864: no replication partner available
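
As an illustration of the signals these two messages carry, here is a minimal Python sketch that pulls out the error code, event ID, staleness window, and failure rate with regular expressions. It is not the scenario's actual pattern-recognition logic, which the report does not describe; the `ReplicationSignals` type, the `extract_signals` function, and the regular expressions are all hypothetical and tuned only to the wording of the injected messages above.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReplicationSignals:
    """Numeric signals found in one alert message (hypothetical structure)."""
    win32_error: Optional[int]       # e.g. 1722 (RPC server unavailable)
    event_id: Optional[int]          # e.g. 1864
    hours_stale: Optional[int]       # e.g. 72
    failure_rate_pct: Optional[int]  # e.g. 100


# Assumed phrasings, matched case-insensitively against the alert text.
_PATTERNS = {
    "win32_error": re.compile(r"\berror (\d+)\b", re.IGNORECASE),
    "event_id": re.compile(r"\bevent ID (\d+)\b", re.IGNORECASE),
    "hours_stale": re.compile(r"\b(\d+) hours\b", re.IGNORECASE),
    "failure_rate_pct": re.compile(r"\b(\d+)% failure rate\b", re.IGNORECASE),
}


def extract_signals(message: str) -> ReplicationSignals:
    """Return the first match for each pattern, or None where nothing matches."""
    values = {}
    for name, pattern in _PATTERNS.items():
        match = pattern.search(message)
        values[name] = int(match.group(1)) if match else None
    return ReplicationSignals(**values)
```

Applied to the first message, this sketch would report error 1722, a 100% failure rate, and 72 hours of staleness, with no event ID; applied to the second, it would report event ID 1864 and 72 hours, with no error code.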

Neural Engine Root Cause Analysis

The Active Directory domain controller dc01.corp.local is experiencing a complete RPC communication failure (error 1722) that prevents replication with dc02.corp.local. An unavailable RPC server indicates one of three conditions: the RPC Endpoint Mapper service is down, network connectivity between the sites is lost, or a firewall is blocking the required RPC ports (135 and the dynamic RPC range). The 8 correlated incidents and 72 hours of failed replication suggest a systemic infrastructure issue affecting multiple services that depend on AD authentication and communication.

Remediation Plan

1. Verify network connectivity between dc01 and dc02: use ping for basic reachability, then telnet (or an equivalent TCP test) against ports 135, 389, 636, and the dynamic RPC range.
2. Check the RPC Endpoint Mapper service status on both DCs and restart it if stopped.
3. Verify DNS resolution between sites: ensure both DCs can resolve each other's FQDNs.
4. Check Windows Firewall and network firewall rules for RPC port blocking.
5. Run 'dcdiag /test:connectivity' and 'repadmin /showrepl' for detailed diagnostics.
6. If services are healthy, check for recent network changes, security updates, or GPO modifications that might affect RPC communication.
7. Once connectivity is restored, force a replication sync using 'repadmin /syncall'.
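
Steps 1 and 3 above can be sketched in Python as a quick pre-check before reaching for dcdiag. This is a minimal sketch, not part of the scenario's tooling: the function names `resolves`, `port_open`, and `check_dc` are hypothetical, and only the fixed ports named in the plan (135, 389, 636) are probed. The dynamic RPC range is left out because the Endpoint Mapper assigns those ports at runtime, so a fixed-port probe cannot confirm them.

```python
import socket
from typing import Dict, Iterable

# Fixed ports named in the remediation plan:
# 135 = RPC Endpoint Mapper, 389 = LDAP, 636 = LDAPS.
FIXED_PORTS = (135, 389, 636)


def resolves(hostname: str) -> bool:
    """Return True if the local resolver can resolve the hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or unresolvable
        return False


def check_dc(host: str,
             ports: Iterable[int] = FIXED_PORTS,
             timeout: float = 3.0) -> Dict[str, bool]:
    """Check DNS first, then each TCP port; ports are skipped if DNS fails."""
    report = {"dns": resolves(host)}
    if report["dns"]:
        for port in ports:
            report[f"tcp/{port}"] = port_open(host, port, timeout)
    return report
```

Running `check_dc("dc02.corp.local")` from dc01 (and the reverse from dc02) would return a dictionary such as `{"dns": ..., "tcp/135": ..., "tcp/389": ..., "tcp/636": ...}`. A DNS failure points to step 3, while closed ports alongside working DNS point to steps 1 and 4.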
Tested: 2026-03-30 | Monitors: 2 | Incidents: 2 | Test ID: cmnck0mp3062aobqe4dnf4hfb