PASSED | infrastructure / s3_object_storage_quota_exceeded

S3-Compatible Object Storage Quota Exceeded

The S3-compatible object storage (MinIO) reaches its configured quota limit. Application uploads, log shipping, and backup jobs that write to object storage all fail at the same time.

Pattern: DISK_FULL
Severity: CRITICAL
Confidence: 92%
Remediation: Remote Hands

Test Results

Metric                Expected    Actual
Pattern Recognition   DISK_FULL   DISK_FULL
Severity Assessment   CRITICAL    CRITICAL
Incident Correlation  Yes         14 linked
Cascade Escalation    N/A         No
Remediation           Remote Hands (Corax contacts on-site support via call, email, or API)

Scenario Conditions

MinIO cluster with 50TB quota. Usage at 100% after unexpected data growth. Application upload API returning errors. Log aggregation pipeline failing to ship logs. Nightly backup job to object storage failing.

Injected Error Messages (2)

MinIO object storage quota exceeded — no space left in bucket prod-data, quota limit 50TB reached, all PutObject operations returning 'no space left on device', application uploads failing, log shipping pipeline broken, bucket usage at 100% of configured quota
Backup to object storage failing — MinIO returning no space left error on all write operations, nightly backup job failed after 2 hours of partial upload, backup data incomplete, object storage at capacity limit

Neural Engine Root Cause Analysis

The MinIO cluster has reached its configured storage quota limit of 50TB for the prod-data bucket, causing all write operations to fail with 'no space left on device' errors. This is a storage quota exhaustion issue rather than an actual shortage of disk space, and it prevents application uploads and breaks the log shipping pipeline. The 8 correlated incidents within the same time window indicate a cascade effect in which dependent services fail because they cannot write to MinIO storage.
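The distinction drawn above, between a bucket quota being exhausted and the underlying disks being full, can be illustrated with a minimal sketch. Everything here is hypothetical: the `StorageSnapshot` fields, the `classify_write_failure` helper, and the lowercase labels are illustrative names, not Corax pattern names or MinIO API fields, and the numbers would have to come from whatever metrics source the deployment exposes.

```python
from dataclasses import dataclass

TB = 10**12  # decimal terabyte; the report does not say whether TB or TiB is meant


@dataclass
class StorageSnapshot:
    """Hypothetical point-in-time view of one bucket and its backing drives."""
    bucket_used_bytes: int   # bytes currently stored in the bucket
    bucket_quota_bytes: int  # configured bucket quota; 0 means no quota is set
    disk_free_bytes: int     # free raw capacity on the underlying drives


def classify_write_failure(snap: StorageSnapshot) -> str:
    """Return a coarse, illustrative cause label for failing writes."""
    if snap.disk_free_bytes <= 0:
        # The drives themselves are out of space.
        return "disk_exhausted"
    if snap.bucket_quota_bytes > 0 and snap.bucket_used_bytes >= snap.bucket_quota_bytes:
        # Drives still have room, but the configured quota blocks writes.
        return "quota_exhausted"
    return "other"


# Assumed figures resembling the scenario: 50TB quota fully used, drives not full.
scenario = StorageSnapshot(
    bucket_used_bytes=50 * TB,
    bucket_quota_bytes=50 * TB,
    disk_free_bytes=10 * TB,  # assumption: the report gives no free-disk figure
)
print(classify_write_failure(scenario))
```

Under these assumptions the sketch separates the two cases even though both surface to clients as the same 'no space left on device' error text.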

Remediation Plan

1. Immediately assess if quota can be safely increased based on actual disk capacity
2. Identify and archive/delete old or unnecessary data from prod-data bucket
3. Implement data lifecycle policies to automatically manage storage
4. If urgent, temporarily increase quota limit to restore service
5. Set up proactive monitoring for storage usage at 80% and 90% thresholds
6. Review and optimize data retention policies across all applications
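Steps 3 and 5 of the plan can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the remediation Corax performs: `usage_alert_level` and `expiration_rule` are hypothetical helpers, the `logs/` prefix and 30-day retention are made-up values, and only the 80% and 90% thresholds come from the plan itself. The rule dictionary follows the shape of an S3 lifecycle rule; whether a given MinIO deployment accepts it should be checked against its own documentation.

```python
TB = 10**12  # decimal terabyte (assumption; the report does not specify TB vs TiB)


def usage_alert_level(used_bytes: int, quota_bytes: int,
                      warn: float = 0.80, crit: float = 0.90) -> str:
    """Map quota usage to an alert level using the plan's 80% / 90% thresholds."""
    if quota_bytes <= 0:
        raise ValueError("quota_bytes must be positive")
    ratio = used_bytes / quota_bytes
    if ratio >= crit:
        return "critical"
    if ratio >= warn:
        return "warning"
    return "ok"


def expiration_rule(rule_id: str, prefix: str, days: int) -> dict:
    """Build one S3-style lifecycle rule that expires objects under a prefix."""
    if days <= 0:
        raise ValueError("days must be positive")
    return {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        "Expiration": {"Days": days},
    }


# Example with assumed values: 46TB used of a 50TB quota.
print(usage_alert_level(46 * TB, 50 * TB))

# Hypothetical rule: expire shipped logs after 30 days.
lifecycle = {"Rules": [expiration_rule("expire-old-logs", "logs/", 30)]}
print(lifecycle)
```

A dictionary of this shape is what an S3 client such as boto3 takes as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`. The alert function is deliberately free of any I/O so it can be driven by whichever usage metric the monitoring stack already collects.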
Tested: 2026-03-30 | Monitors: 2 | Incidents: 2 | Test ID: cmnck0l8h0628obqeqvz6377p