PASSED

server / nfs_mount_stale

NFS Mount Stale — Server Unreachable

An NFS mount on a production application server goes stale when the NFS server becomes unreachable because of a network switch failure. Every process that touches the mount hangs in uninterruptible sleep, leaving the application server partially unresponsive.

Pattern

CONNECTION_REFUSED

Severity

CRITICAL

Confidence

85%

Remediation

Remote Hands

Test Results

Metric	Expected	Actual
Pattern Recognition	CONNECTION_REFUSED	CONNECTION_REFUSED
Severity Assessment	CRITICAL	CRITICAL
Incident Correlation	Yes	4 linked
Cascade Escalation	N/A	No
Remediation	—	Remote Hands — Corax contacts on-site support via call, email, or API

Scenario Conditions

CentOS 8 application server. NFS v4 mount from 10.10.6.100:/exports/data. NFS server unreachable after upstream switch port failure. Hard mount (default) causing processes to hang indefinitely. 12 application threads blocked on NFS I/O.
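The blocked threads described in the scenario conditions can be spotted from procfs. The following is a minimal sketch, not part of the tested system: it assumes a Linux host and reads the state field from each `/proc/<pid>/stat` file. The function names `parse_stat_line` and `find_blocked` are hypothetical. It inspects only each process's main thread; per-thread state lives in `/proc/<pid>/task/<tid>/stat`, which uses the same line format.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx].strip())
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def find_blocked(proc_root="/proc"):
    """List (pid, comm) for processes whose main thread is in state D."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError):
            # The process exited while we were scanning, or the line was odd.
            continue
        if state == "D":
            blocked.append((pid, comm))
    return blocked
```

Reading procfs does not touch the NFS mount itself, so this check should not hang the way a directory listing of /mnt/data would.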

Injected Error Messages (1)

NFS mount stale on app-nfs-01 — connection refused from NFS server 10.10.6.100, mount point /mnt/data showing stale file handle, 12 application threads in uninterruptible sleep (D state), server partially unresponsive, all file operations on /mnt/data hanging, connection refused on port 2049 to NFS server

Neural Engine Root Cause Analysis

The NFS server at 10.10.6.100 is refusing connections on port 2049, causing stale file handles and hanging file operations on the app server. This has resulted in 12 application threads entering uninterruptible sleep state, making the application server partially unresponsive. The presence of 4 correlated incidents in the same time window suggests this is likely part of a broader infrastructure failure affecting multiple services.
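When confirming the analysis above, it can help to separate an actively refused connection from one that simply times out. A refusal generally means something answered the connection attempt, while a timeout is more typical of dropped packets on a broken path. The sketch below is illustrative only and uses Python's standard `socket` module; the function name `probe_tcp` and its return labels are assumptions, not output of the tested system.

```python
import socket


def probe_tcp(host, port=2049, timeout=3.0):
    """Classify a TCP connection attempt as open, refused, timeout, or unreachable."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer (or a device on the path) rejected the connection.
        return "refused"
    except socket.timeout:
        # No answer within the deadline, e.g. packets silently dropped.
        return "timeout"
    except OSError:
        # Other failures such as "no route to host" or "network unreachable".
        return "unreachable"
```

For this scenario the call would be `probe_tcp("10.10.6.100")`, run from app-nfs-01.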

Remediation Plan

1. Check NFS server (10.10.6.100) health and restart NFS services if needed
2. Force unmount stale NFS mount on app-nfs-01 using 'umount -f /mnt/data'
3. Verify NFS server accessibility and port 2049 connectivity
4. Remount NFS share with 'mount -t nfs 10.10.6.100:/path /mnt/data'
5. Restart application services on app-nfs-01 to clear hung threads
6. Investigate correlated incidents to address potential shared infrastructure issues
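The unmount and remount steps in the plan above could be scripted roughly as follows. This is a hedged sketch rather than the tool's actual remediation logic: `build_remount_commands` and `run_plan` are hypothetical helpers, and the export path is passed in as a parameter because the plan leaves it as a placeholder. The default is a dry run that only renders the commands.

```python
import shlex
import subprocess


def build_remount_commands(server, export, mount_point, fs_type="nfs"):
    """Return the force-unmount and remount commands as argv lists."""
    source = f"{server}:{export}"
    return [
        ["umount", "-f", mount_point],
        ["mount", "-t", fs_type, source, mount_point],
    ]


def run_plan(commands, dry_run=True, timeout=30):
    """Render each command; execute it only when dry_run is False."""
    rendered = []
    for cmd in commands:
        rendered.append(shlex.join(cmd))
        if not dry_run:
            # check=True stops at the first failing command; the timeout
            # keeps the script itself from hanging on a stuck mount.
            subprocess.run(cmd, check=True, timeout=timeout)
    return rendered
```

Usage, assuming the export from the scenario conditions: `run_plan(build_remount_commands("10.10.6.100", "/exports/data", "/mnt/data"))` returns the two command strings without running them. Running with `dry_run=False` requires root and should follow the connectivity check in step 3.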

Tested: 2026-03-30 | Monitors: 1 | Incidents: 1 | Test ID: cmncjz12t05t9obqez5585ahc