PASSED | server / raid_degradation

RAID 5 Degradation — Two Drives Showing SMART Warnings

A RAID 5 array on a file server has one failed drive and a second drive showing SMART predictive failure warnings. The array is rebuilding, but RAID 5 tolerates only a single drive failure: if the second drive fails before the rebuild completes, all data on the array is lost.
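The single-failure limit follows from how RAID 5 parity works. The sketch below is a toy illustration, not the controller's implementation: it assumes a hypothetical 4-block stripe (three data blocks plus one XOR parity block) and shows that any one missing block can be recomputed from the survivors, while two missing blocks cannot be separated.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a hypothetical 4-drive array: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
stripe = data + [parity]

def rebuild_missing(stripe, missing):
    """Recover the block at index `missing` from all surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != missing]
    return xor_blocks(survivors)

# One lost drive: the missing block is fully recoverable.
recovered = rebuild_missing(stripe, 1)

# Two lost drives: the survivors only yield the XOR of the two missing
# blocks, which is not enough to recover either block on its own.
two_lost = xor_blocks([blk for i, blk in enumerate(stripe) if i not in (1, 2)])
ambiguous = two_lost == xor_blocks([data[1], data[2]])
```

Here `recovered` equals the original second data block, while `ambiguous` is true: with two blocks gone, only their combined XOR is known.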

Pattern: RAID_DEGRADATION
Severity: CRITICAL
Confidence: 95%
Remediation: Remote Hands

Test Results

Metric                Expected            Actual
Pattern Recognition   RAID_DEGRADATION    RAID_DEGRADATION
Severity Assessment   CRITICAL            CRITICAL
Incident Correlation  Yes                 9 linked
Cascade Escalation    N/A                 No
Remediation           Remote Hands: Corax contacts on-site support via call, email, or API

Scenario Conditions

Dell PowerEdge R740. RAID 5 with 8x 2TB SAS drives. Drive 3 failed. Drive 7 showing SMART warnings (reallocated sectors). Rebuild in progress at 23% after 4 hours.
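The figures above allow a rough estimate of the exposure window. The sketch below assumes the rebuild rate stays constant, which real rebuilds often do not (controller load and I/O contention change it), so treat the result as an approximation. The function names are illustrative only.

```python
def rebuild_eta_hours(percent_done, hours_elapsed):
    """Estimate remaining rebuild hours, assuming a constant rebuild rate."""
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    total_hours = hours_elapsed * 100 / percent_done
    return total_hours - hours_elapsed

def raid5_usable_tb(drive_count, drive_tb):
    """RAID 5 spends one drive's worth of capacity on parity."""
    return (drive_count - 1) * drive_tb

remaining = rebuild_eta_hours(23, 4)   # 23% done after 4 hours
usable = raid5_usable_tb(8, 2)         # 8 drives of 2 TB each

print(f"Estimated time remaining: {remaining:.1f} h")
print(f"Usable capacity at risk: {usable} TB")
```

Under the constant-rate assumption, about 13.4 hours of rebuild remain, during which drive 7 must survive to protect 14 TB of usable capacity.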

Injected Error Messages (1)

RAID degradation on FileServer-01 — drive 3 failed, RAID 5 array degraded, rebuild in progress (23%), predictive failure on drive 7 (SMART reallocated sectors: 847), risk of double disk failure and total data loss
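As an illustration of how the fields in this message could be extracted for pattern matching, here is a minimal sketch using regular expressions. It is not the product's actual parser: the field names, the `PATTERNS` table, and the `is_double_failure_risk` rule are assumptions made for this example.

```python
import re

# The injected message from this scenario (\u2014 is the dash after the host name).
MESSAGE = (
    "RAID degradation on FileServer-01 \u2014 drive 3 failed, "
    "RAID 5 array degraded, rebuild in progress (23%), "
    "predictive failure on drive 7 (SMART reallocated sectors: 847), "
    "risk of double disk failure and total data loss"
)

# Hypothetical field names mapped to the phrase each one is read from.
PATTERNS = {
    "host": r"RAID degradation on (\S+)",
    "failed_drive": r"drive (\d+) failed",
    "raid_level": r"RAID (\d+) array",
    "rebuild_percent": r"rebuild in progress \((\d+)%\)",
    "predictive_drive": r"predictive failure on drive (\d+)",
    "reallocated_sectors": r"reallocated sectors: (\d+)",
}

def parse_raid_alert(message):
    """Return a dict of extracted fields; missing fields map to None."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, message)
        if match is None:
            fields[name] = None
        elif name == "host":
            fields[name] = match.group(1)
        else:
            fields[name] = int(match.group(1))
    return fields

def is_double_failure_risk(fields):
    """True when one drive has failed and a different drive predicts failure."""
    failed = fields["failed_drive"]
    predictive = fields["predictive_drive"]
    return failed is not None and predictive is not None and failed != predictive

fields = parse_raid_alert(MESSAGE)
at_risk = is_double_failure_risk(fields)
```

For the message above, `fields` holds the host, both drive numbers, the RAID level, the rebuild percentage, and the reallocated sector count, and `at_risk` is true.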

Neural Engine Root Cause Analysis

FileServer-01 has a critical RAID 5 hardware failure: drive 3 has failed completely and drive 7 shows predictive failure indicators (SMART reallocated sector count: 847). The array rebuild is at 23%, and because a second drive shows signs of imminent failure, there is a high risk of total data loss if drive 7 fails before the rebuild completes. The 5 correlated incidents suggest that this hardware failure may be affecting dependent systems or services that rely on this file server.

Remediation Plan

1. IMMEDIATE: Contact datacenter team to replace failed drive 3 and prepare replacement for drive 7.
2. Monitor RAID rebuild progress closely and ensure no interruptions.
3. Initiate emergency backup of critical data if not already running.
4. Prepare for immediate replacement of drive 7 once current rebuild completes.
5. Consider temporarily migrating critical services to alternate storage if available.
6. Schedule maintenance window for full hardware replacement once immediate crisis is resolved.
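Step 2 of the plan (monitoring the rebuild) could be sketched as a simple polling loop. This is an assumption-laden illustration, not part of the tested remediation: `read_progress` is a hypothetical callable that returns the current rebuild percentage, and in practice it would wrap whatever the RAID controller's management tooling reports. The stall threshold and poll interval are arbitrary defaults.

```python
import time

def watch_rebuild(read_progress, poll_seconds=60, stall_polls=3, sleep=time.sleep):
    """Poll rebuild progress; return 'complete' at 100% or 'stalled'
    after `stall_polls` consecutive polls without forward progress."""
    last = None
    unchanged = 0
    while True:
        progress = read_progress()  # hypothetical: percent complete, 0-100
        if progress >= 100:
            return "complete"
        if last is not None and progress <= last:
            unchanged += 1
            if unchanged >= stall_polls:
                return "stalled"
        else:
            unchanged = 0
        last = progress
        sleep(poll_seconds)

# Demo with canned readings and a no-op sleep, so it finishes instantly.
readings = iter([23, 41, 78, 100])
result = watch_rebuild(lambda: next(readings), sleep=lambda _seconds: None)
```

A "stalled" result would be a reasonable trigger to escalate, since an interrupted rebuild lengthens the window in which drive 7 can fail.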
Tested: 2026-03-30 | Monitors: 1 | Incidents: 1 | Test ID: cmncjena200qwobqe6v9n9ppw