Disk Space Crisis: How to Resolve a Full Root Partition and Overloaded Docker Overlays


Introduction

Running out of disk space on a server can be a frustrating experience. It leads to application malfunctions, system instability, and even crashes. Recently, I encountered a critical situation where my server’s root partition (/dev/vda3) and multiple Docker overlay filesystems were reporting 100% utilization. This article outlines the steps I took to diagnose and resolve the issue, providing a practical guide for anyone facing a similar disk space crisis.

The Problem: 100% Disk Usage

The issue became apparent when monitoring tools alerted me to critically low disk space. Running the command df -h confirmed the problem:

[root@iZbp15wv3kw8nmk6nxi8y4Z ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        912M     0  912M   0% /dev
tmpfs           930M     0  930M   0% /dev/shm
tmpfs           930M   95M  836M  11% /run
tmpfs           930M     0  930M   0% /sys/fs/cgroup
/dev/vda3        40G   40G   20K 100% /
/dev/vda2       100M  5.8M   95M   6% /boot/efi
overlay          40G   40G   20K 100% /var/lib/docker/overlay2/1d2d1bfe1ffc4f6db4e6e094cbc5fea573fabb2d3ea595aef4dad90fed80d10d/merged
overlay          40G   40G   20K 100% /var/lib/docker/overlay2/fef67938518265be3b5e3097ab46c925693d918a6a87e106f9062fe08cc98ae3/merged
overlay          40G   40G   20K 100% /var/lib/docker/overlay2/8a63356428e0a5dab3f05e2457812a39b1867844e772a9ac6d202c4e1ac89b4a/merged
overlay          40G   40G   20K 100% /var/lib/docker/overlay2/8cbf2ca23251fb29e1c23b4dbc530d0f10e346eb671e6ca254ae9912051b93f9/merged
overlay          40G   40G   20K 100% /var/lib/docker/overlay2/66c638bf6da59051457776c6ade9949173f83ffb0f4394a055460ce54dc7811d/merged
overlay          40G   40G   20K 100% /var/lib/docker/overlay2/8b34ba9c8f3527a51635ce0918d52577e6cb7ccd33c163c3c795e8d4908c9270/merged
overlay          40G   40G   20K 100% /var/lib/docker/overlay2/49ababe15627beb203b86ca6ba9fe59ffea504dfc55496c735f471ee48cf2891/merged
overlay          40G   40G   20K 100% /var/lib/docker/overlay2/8e3a3e4496a2a486d399c0f520f5a808608e482bf0d50176b7fe980ef199a474/merged
tmpfs           186M     0  186M   0% /run/user/0

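As you can see, /dev/vda3 (the root partition) was completely full, with only 20K available. The overlay entries are not separate disks: each one is the merged view of a running container, stored under /var/lib/docker on the root partition, so df reports the same 40G size and 100% usage for every one of them. In other words, there was a single full filesystem, and Docker was a prime suspect. With no space left for new files or even temporary operations, the system became severely unstable.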
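Because the overlay lines simply repeat the root filesystem's numbers, it can help to filter them out and look at the backing filesystem directly. The following is a minimal sketch, assuming GNU coreutils df; the helper name usage_pct is made up for illustration and was not part of the original troubleshooting session.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Use% (as a bare number) of the filesystem
# that backs a given path. Defaults to the root filesystem.
usage_pct() {
    df -P "${1:-/}" | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# Hide overlay and tmpfs noise so only real block devices remain (GNU df).
# df exits non-zero if the filter leaves nothing to show, hence "|| true".
df -h -x overlay -x tmpfs -x devtmpfs || true

# A filesystem can also report "no space left" when it runs out of inodes.
df -i /

echo "root filesystem usage: $(usage_pct /)%"
```

The numeric output of usage_pct is convenient for a cron job or alert threshold, for example warning once the value passes 85.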

Solution: A Step-by-Step Approach

Resolving this issue required a systematic approach, starting with the most likely and least disruptive solutions:

1. Reclaiming Space from Docker (Highly Recommended)

Docker is notorious for accumulating unused resources over time. These can include stopped containers, unused images (including dangling ones), volumes, and networks. Here’s how to clean them up:

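  • Prune Stopped Containers: docker container prune
  • Prune Unused Images (including dangling images): docker image prune -a
    Caution: The -a flag removes every image that is not used by at least one container, not just dangling ones. Ensure you don’t need any of them before proceeding.
  • Prune Unused Volumes: docker volume prune
    Caution: Volumes hold persistent data. On Docker 23.0 and later this removes only anonymous volumes by default; add -a to include unused named volumes.
  • Prune Unused Networks: docker network prune (this tidies up but frees essentially no disk space)
  • One-Command Cleanup (Use with Caution): docker system prune -a
    This powerful command removes stopped containers, unused networks, all unused images, and the build cache. It does not touch volumes unless you add --volumes. Remember that the -a flag removes all unused images, potentially including ones you might want to keep.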

After running these commands, use df -h again to check if sufficient space has been freed.
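The cleanup above can be staged from least to most destructive, checking what Docker considers reclaimable before deleting anything. This is only a sketch: the DRY_RUN variable and the run and docker_cleanup functions are illustrative names, not Docker features, and by default the script prints the commands instead of executing them.

```shell
#!/usr/bin/env bash
# Staged Docker cleanup sketch. DRY_RUN=1 (the default here) only prints
# the commands; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

docker_cleanup() {
    run docker system df            # show reclaimable space per resource type
    run docker container prune -f   # stopped containers
    run docker image prune -f       # dangling images only (no -a)
    run docker builder prune -f     # build cache
    run docker system df            # compare with the first report
}

docker_cleanup
```

Starting without -a keeps tagged images that are merely idle; if that does not free enough space, docker image prune -a is the next step up.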

2. Identifying and Removing Large Files/Directories

If cleaning up Docker resources doesn’t solve the problem, the next step is to locate large files or directories that can be safely removed.

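  • Using du and sort: du -ahx / 2>/dev/null | sort -rh | head -n 20
    This command lists the 20 largest files and directories under the root (/) directory. The -a flag includes individual files (plain du reports directories only), and -x keeps du on the root filesystem so that /proc, tmpfs mounts, and the container overlay mounts are not counted. You can adjust the number after head -n as needed.
  • Using ncdu (Recommended for Interactive Navigation): Install ncdu if it’s not already present (on a completely full disk the installation itself may fail, so free a little space first):
    • Debian/Ubuntu: sudo apt-get install ncdu
    • CentOS/RHEL: ncdu is typically packaged in EPEL, so enable EPEL first (on CentOS, sudo yum install epel-release), then run sudo yum install ncdu
    • Fedora: sudo dnf install ncdu

    Then run: ncdu -x /
    ncdu provides an interactive interface for browsing directory sizes; the -x option keeps the scan on the root filesystem. You can navigate using arrow keys, delete files/directories with the d key (be extremely careful!), and quit with q.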

  • Common Targets for Cleanup:
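    • Log Files: Check /var/log for large log files. Consider using logrotate to manage log sizes. On a Docker host, also check /var/lib/docker/containers: with the default json-file logging driver, each container’s log is stored there as a *-json.log file and is not rotated unless you configure it.
    • Core Dumps: Look in /var/crash or similar directories (for example /var/lib/systemd/coredump on systemd-based systems) for core dump files, which can often be safely removed.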
    • Temporary Files: Examine /tmp. While usually cleaned on reboot, manual cleanup might be necessary.
    • Database Files: If running a database, check the size of its data files.
    • Downloaded Files: Check your Downloads or similar directories.
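The search for large files and directories can be wrapped in two small helpers. This is a rough sketch assuming GNU du, sort, and find; the function names largest_entries and large_files and their default arguments are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: the N largest files and directories under DIR,
# staying on DIR's filesystem (-x). Sizes are in KiB for a plain numeric sort.
largest_entries() {
    local dir="${1:-/}" count="${2:-20}"
    du -xak "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Hypothetical helper: regular files under DIR larger than MIN_KB KiB.
large_files() {
    local dir="${1:-/var/log}" min_kb="${2:-102400}"
    find "$dir" -xdev -type f -size +"${min_kb}"k 2>/dev/null
}

# Ten largest entries under /var/log, then any log file over 100 MiB.
largest_entries /var/log 10
large_files /var/log 102400
```

Pointing the same helpers at /var/lib/docker (as root) shows whether images, container layers, or container logs account for most of the usage. On a disk with no free space at all, sort may fail to create its temporary files when the input is very large; GNU sort accepts -T to place them on another filesystem such as /dev/shm.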

3. Resizing the /dev/vda3 Partition (Advanced and Risky)

If removing files doesn’t free up enough space, you might need to resize the root partition. This is an advanced procedure that carries a risk of data loss and should only be attempted as a last resort.

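  • Virtual Machines: If your server is a VM, you can usually expand the disk size through your virtualization management platform. After expanding the disk, you still need to grow the partition so that it covers the new space and then extend the filesystem within the operating system; enlarging the virtual disk alone does not change what df reports.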
  • Physical Servers: Resizing partitions on physical servers typically involves using a partitioning tool like gparted from a Live CD/USB. This often requires data migration and carries a higher risk of data loss.

Always back up your important data before attempting any partition resizing.
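For the VM case, the in-guest part usually comes down to two commands: one to grow the partition and one to grow the filesystem. The sketch below only prints the commands rather than running them. It assumes the layout from the df output above (partition 3 on /dev/vda mounted at /), that growpart is installed (it ships in the cloud-utils-growpart or cloud-guest-utils package on most distributions), and that the filesystem is ext4 or XFS; the helper name grow_fs_command is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the command that grows a filesystem of the given
# type to fill its (already enlarged) partition. Nothing is executed here.
grow_fs_command() {
    local fstype="$1" device="$2" mountpoint="${3:-/}"
    case "$fstype" in
        ext2|ext3|ext4) echo "resize2fs $device" ;;
        xfs)            echo "xfs_growfs $mountpoint" ;;
        *)              echo "no known grow command for '$fstype'"; return 1 ;;
    esac
}

# Detect the root filesystem type; fall back to "unknown" if findmnt is missing.
fstype=$(findmnt -no FSTYPE / 2>/dev/null || echo unknown)

echo "Step 1 (grow partition 3 of /dev/vda): growpart /dev/vda 3"
echo "Step 2 (grow the filesystem): $(grow_fs_command "$fstype" /dev/vda3 /)"
```

growpart may need a small amount of free space for its temporary files, so it is worth reclaiming a little space first (steps 1 and 2) before attempting the resize.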

4. Checking for Malware

While less common, it’s possible that a malicious program is consuming disk space by constantly writing data.

  • Monitor Processes: Use top or htop to monitor CPU and memory usage for suspicious processes.
  • Monitor Disk I/O: Use iotop to identify processes with high disk write activity.
  • Run a Virus Scan: Install and run a reputable antivirus scanner to check for malware.
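Whether the writer is malicious or just a misbehaving application, a quick way to narrow it down is to look for large files that changed recently. The following is a minimal sketch assuming GNU find; the function name recent_large_files and the thresholds are arbitrary choices for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: regular files under DIR that are larger than MIN_KB KiB
# and were modified within the last MINUTES minutes.
recent_large_files() {
    local dir="${1:-/}" minutes="${2:-60}" min_kb="${3:-51200}"
    find "$dir" -xdev -type f -mmin -"$minutes" -size +"${min_kb}"k 2>/dev/null
}

# Files over 50 MiB under /var that were modified in the last hour.
recent_large_files /var 60 51200

# For a live view of the processes doing the writing, iotop can run
# non-interactively (needs root):
#   iotop -o -b -n 3
```

A file that keeps reappearing in this list between runs points to the process worth investigating with iotop.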

Conclusion

Running out of disk space can be a critical issue, but with a systematic approach, it’s often solvable. Start by cleaning up Docker resources, then identify and remove large, unnecessary files. Only consider resizing partitions as a last resort and after backing up your data. Regularly monitoring disk usage and performing routine cleanup will help prevent such issues from recurring. Remember to stay vigilant and keep your systems healthy!

vrain
Copyright Notice: Our original article was published by vrain on 2025-01-22, total 5223 words.
Reproduction Note: Unless otherwise specified, all articles are published by cc-4.0 protocol. Please indicate the source of reprint.