PASSED
infrastructure / emc_unity_lun_overcommit

EMC Unity LUN Thinly Provisioned Overcommit

Thinly provisioned LUNs on an EMC Unity storage array exceed physical capacity. The storage pool runs out of space, causing write failures on all LUNs in the pool. VMware datastores become read-only and VMs begin crashing.

Pattern: DISK_FULL
Severity: CRITICAL
Confidence: 95%
Remediation: Remote Hands

Test Results

Metric | Expected | Actual | Result
Pattern Recognition | DISK_FULL | DISK_FULL |
Severity Assessment | CRITICAL | CRITICAL |
Incident Correlation | Yes | 18 linked |
Cascade Escalation | N/A | No |
Remediation | Remote Hands: Corax contacts on-site support via call, email, or API

Scenario Conditions

EMC Unity 450F. Storage pool with 10TB physical, 25TB thin-provisioned. Actual usage reaches 100% of physical capacity. 15 LUNs in pool. VMware VMFS datastores going read-only. VM write operations failing.
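The conditions above imply an overcommit ratio of 2.5:1, with 15 TB of provisioned capacity that has no physical backing. A minimal Python sketch of that arithmetic follows. The function names are hypothetical, and only the 10 TB and 25 TB figures come from the scenario.

```python
def overcommit_ratio(provisioned_tb: float, physical_tb: float) -> float:
    """Return provisioned capacity divided by physical capacity."""
    if physical_tb <= 0:
        raise ValueError("physical capacity must be positive")
    return provisioned_tb / physical_tb


def unbacked_tb(provisioned_tb: float, physical_tb: float) -> float:
    """Return how much provisioned capacity has no physical backing."""
    return max(0.0, provisioned_tb - physical_tb)


# Figures from the scenario: 25 TB thin-provisioned on 10 TB physical.
ratio = overcommit_ratio(25, 10)
shortfall = unbacked_tb(25, 10)
print(f"overcommit ratio: {ratio:.1f}:1, unbacked: {shortfall:.0f} TB")
```

With these inputs the script reports a 2.5:1 ratio and a 15 TB shortfall, which is the exposure that materialises once the thin LUNs fill.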

Injected Error Messages (2)

EMC Unity storage pool critically overcommitted — no space left on storage pool pool_prod, 25TB thin provisioned on 10TB physical, physical capacity exhausted, all LUN write operations failing with SCSI sense code: no space left, 15 LUNs affected across pool
VMware datastores going read-only — underlying EMC Unity pool has no space left for thin-provisioned writes, VMFS volumes reporting no space left on device, VM disk write operations failing, VMs crashing due to inability to write to virtual disks
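As an illustration only, both injected messages contain the phrase "no space left", so even a simple keyword match would map them to the DISK_FULL pattern. The sketch below is a hypothetical stand-in and does not describe how the engine actually classifies messages. The pattern list, the `classify` function, and the `UNKNOWN` fallback are all assumptions.

```python
import re

# Hypothetical keyword patterns; assumed for illustration, not taken from the product.
DISK_FULL_PATTERNS = [
    re.compile(r"no space left", re.IGNORECASE),
    re.compile(r"capacity exhausted", re.IGNORECASE),
]


def classify(message: str) -> str:
    """Return 'DISK_FULL' if any keyword pattern matches, else 'UNKNOWN'."""
    for pattern in DISK_FULL_PATTERNS:
        if pattern.search(message):
            return "DISK_FULL"
    return "UNKNOWN"


# Shortened excerpts of the two injected messages.
messages = [
    "no space left on storage pool pool_prod, physical capacity exhausted",
    "VMFS volumes reporting no space left on device",
]
for msg in messages:
    print(classify(msg), "<-", msg)
```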

Neural Engine Root Cause Analysis

The EMC Unity storage array has experienced physical capacity exhaustion due to thin provisioning overcommitment. The pool pool_prod has 25TB of thin-provisioned storage allocated on only 10TB of physical capacity, and the physical storage has been completely consumed. This has caused all write operations to fail with SCSI 'no space left' errors, affecting 15 LUNs and triggering 10 correlated incidents indicating widespread application and service failures dependent on this storage pool.

Remediation Plan

1. Immediately identify and delete any unnecessary snapshots, temp files, or old backups from the storage pool.
2. Move non-critical LUNs to alternative storage pools if available.
3. Add physical disk capacity to pool_prod by installing additional drives or expanding existing RAID groups.
4. Consider migrating some LUNs to other storage arrays if expansion is not immediately possible.
5. Implement storage monitoring and alerting at 80% capacity threshold to prevent future occurrences.
6. Review thin provisioning ratios and establish policies to prevent overcommitment beyond safe thresholds.
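Steps 5 and 6 of the plan could be approximated with a check like the one below. This is a sketch under stated assumptions: the `PoolUsage` type and `check_pool` function are hypothetical, the capacity figures would have to come from whatever monitoring source is available, and the 2.0 maximum ratio is an arbitrary placeholder, since the plan does not say what a safe threshold is. Only the 80% usage threshold is taken from the plan.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PoolUsage:
    """Capacity figures for one storage pool (hypothetical structure)."""
    name: str
    physical_tb: float
    used_tb: float
    provisioned_tb: float


def check_pool(pool: PoolUsage,
               used_threshold: float = 0.80,  # 80% threshold from the plan
               max_ratio: float = 2.0) -> List[str]:  # assumed policy limit
    """Return alert strings for usage and overcommit policy violations."""
    alerts = []
    used_fraction = pool.used_tb / pool.physical_tb
    if used_fraction >= used_threshold:
        alerts.append(
            f"{pool.name}: physical usage at {used_fraction:.0%}, "
            f"threshold {used_threshold:.0%}"
        )
    ratio = pool.provisioned_tb / pool.physical_tb
    if ratio > max_ratio:
        alerts.append(
            f"{pool.name}: overcommit ratio {ratio:.1f}:1 exceeds {max_ratio:.1f}:1"
        )
    return alerts


# The scenario's pool: 10 TB physical, fully used, 25 TB provisioned.
for alert in check_pool(PoolUsage("pool_prod", 10, 10, 25)):
    print(alert)
```

Run against the scenario's figures, both checks fire. In practice the usage alert matters most, because it leaves time to add capacity before writes start failing.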
Tested: 2026-03-30 | Monitors: 2 | Incidents: 2 | Test ID: cmncjzte605xrobqegg6pyxd8