server / disk_full: PASSED

Disk Full — Log Files Filling Production Database Server

The database server's data partition fills to 100% due to an unrotated slow query log. PostgreSQL can no longer write WAL files. All database writes fail. Application is read-only.

Pattern: DISK_FULL
Severity: CRITICAL
Confidence: 95%
Remediation: Remote Hands

Test Results

Metric               | Expected     | Actual
Pattern Recognition  | DISK_FULL    | DISK_FULL
Severity Assessment  | CRITICAL     | CRITICAL
Incident Correlation | Yes          | 29 linked
Cascade Escalation   | Yes          | Yes
Remediation          | Remote Hands | Corax contacts on-site support via call, email, or API

Scenario Conditions

PostgreSQL 15 on Ubuntu 22.04. 500GB data partition. Slow query log grew to 180GB overnight. No log rotation configured. 3 applications depend on this database.
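The root cause here is the absence of rotation on the slow query log. A standard logrotate policy would have capped its growth; a minimal sketch (the log path, size threshold, and retention count are illustrative assumptions, not from the scenario):

```
# /etc/logrotate.d/postgresql-slowquery  (illustrative path)
/data/pg/log/slow_query.log {
    size 1G          # rotate as soon as the log exceeds 1GB
    rotate 7         # keep seven compressed generations
    compress
    delaycompress    # leave the newest rotation uncompressed
    missingok
    notifempty
    copytruncate     # truncate in place so PostgreSQL keeps its open file handle
}
```

`copytruncate` matters for a log the server holds open: plain rotation would leave PostgreSQL writing to the renamed file, and the space would never be reclaimed.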

Injected Error Messages (3)

disk full on /data partition — no space left on device, PostgreSQL PANIC: could not write WAL file, disk space 100% used (500GB/500GB), filesystem full
Database connection errors — PostgreSQL rejecting writes due to disk full, application returning 500 errors on all write operations
Application write failures cascading from database disk full — no space left on device propagating through connection pool

Neural Engine Root Cause Analysis

The PostgreSQL database server has crashed due to complete disk space exhaustion on the /data partition (500GB/500GB used). PostgreSQL requires free disk space to write Write-Ahead Log (WAL) files for transaction durability, and when this fails, the database enters a PANIC state and shuts down to prevent data corruption. The 12 correlated incidents indicate this disk full condition is likely impacting multiple services that depend on this database or share the same underlying storage infrastructure.
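The exhaustion the analysis describes is directly observable from the shell. A minimal, hedged check (the `PART` variable is an assumption; on the scenario host it would be `/data`, and the 95% threshold is illustrative):

```shell
#!/bin/sh
# Report percent-used for a partition and flag near-exhaustion.
# Uses GNU coreutils df; PART defaults to / so the sketch runs anywhere.
PART=${PART:-/}
used=$(df --output=pcent "$PART" | tail -1 | tr -dc '0-9')
echo "usage on $PART: ${used}%"
if [ "$used" -ge 95 ]; then
    echo "CRITICAL: $PART is nearly or completely full"
fi
```

On the scenario host this would print 100% for /data, matching the 500GB/500GB figure above.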

Remediation Plan

1. Immediately check disk usage and identify the largest files and directories on the /data partition.
2. Archive or compress old PostgreSQL WAL files and logs, if safe to do so.
3. Check for large temporary files, core dumps, or application logs that can be cleaned up.
4. If possible, move non-critical data to alternate storage or expand the /data partition.
5. Once sufficient space is freed (a minimum of 10-15% free is recommended), restart the PostgreSQL service.
6. Verify database integrity and confirm that all dependent services recover.
7. Implement disk space monitoring and log rotation policies to prevent recurrence.
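Steps 1-3 can be sketched as shell commands. To keep this runnable anywhere, the demo operates on a throwaway directory standing in for /data; all paths and filenames are illustrative, and on a real host you would point `du`/`df` at the actual partition:

```shell
#!/bin/sh
set -eu
# Throwaway directory standing in for the /data partition.
demo=$(mktemp -d)
mkdir -p "$demo/pg_log"
head -c 1048576 /dev/zero > "$demo/pg_log/slow_query.log"  # stand-in for the 180GB log

# Step 1: find what is eating space.
# On the real host: df -h /data && du -xh --max-depth=2 /data | sort -rh | head
du -sk "$demo"/* | sort -rn | head

# Step 2/3: reclaim space from the oversized log. Truncating in place
# (rather than rm) keeps the inode that PostgreSQL still holds open.
: > "$demo/pg_log/slow_query.log"

# Confirm the space was reclaimed (should print 0).
wc -c < "$demo/pg_log/slow_query.log"
rm -rf "$demo"
```

After freeing space on the real host, step 5 would be `systemctl restart postgresql`, followed by checking the server log for clean recovery.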
Tested: 2026-03-30 | Monitors: 3 | Incidents: 3 | Test ID: cmncjemmj00qvobqe6op3qalu