Disk Full: Symptoms, Diagnosis & Fixes

A full disk doesn't crash the server — it quietly stops it from writing, and everything that needs to write falls over.

What It Is

A full disk is exactly what it sounds like, and also almost never quite what you think. A filesystem fills up, the next process that tries to write a byte gets told No space left on device, and from there the failures fan out in directions that have nothing obvious to do with storage: a database refuses connections, logging stops, a deploy half-finishes, a web app throws 500s, apt bails mid-upgrade. The server is still up. The CPU is bored. Memory's fine. But the one resource that nearly everything quietly depends on — somewhere to put the next byte — is gone, and a surprising amount of software handles that about as gracefully as a car handles running out of road.

Here's the reassuring part, and it's worth saying up front because a full disk feels like an emergency: nothing is broken. No hardware died, no data is corrupt (yet — there's one nasty exception we'll get to), and you almost never lose anything you'd written before it filled. A full disk is a capacity problem, not a hardware problem — which makes it the friendliest of the disk emergencies, because the fix is "find what's eating the space and deal with it," and the space is always somewhere. That's the opposite of a disk failing, where the hardware is genuinely dying and no amount of cleanup helps. Same four-letter word, "disk," completely different night.

This is also one of the two or three most common things that go wrong on a server, ever. Logs grow. Caches accumulate. A runaway process writes a 400 GB file overnight. Someone untars a backup into /tmp and forgets. It is so routine that the real skill isn't fixing it once (anyone can rm a big file); it's learning to read which mount is full, what filled it, and the handful of traps that make a full disk lie to your face. By the end of this page you'll diagnose it in two commands, know the trap that catches everyone (the file you deleted that's still eating the disk), and have it set up so the next fill-up is a notification, not an outage.

How You Notice

A full disk announces itself in two registers: the blunt, obvious one, and the baffling one where nothing says "disk" at all. Here's each, with the command to confirm it on your own box right now.

  • No space left on device. The blunt one. Any command that writes — touch, cp, a text editor saving, a package install — fails with the same plain sentence. It's the kernel's ENOSPC error in human words, and it is refreshingly honest. The instant you see it, the question stops being what's broken and becomes which mount is full:

    df -h
    

    One line per filesystem, the Use% column tells you which one's at 100% in a single glance. (Full reference on df — it's the first command on this whole page.)

  • A service falls over for no obvious reason. This is the baffling register, and it's how most people actually meet a full disk — not by trying to write a file, but by a database that won't start, an app throwing write errors, or a deploy that dies halfway. MySQL and PostgreSQL refuse writes (and often connections) the moment they can't extend a data file or write their journal. Check the journal and the disk together:

    journalctl -xe | grep -iE "no space|enospc|disk full|write error"
    df -h
    

    When a service that worked yesterday dies today and the logs mention writing, check df -h before anything else. It's a five-second test that saves an hour of chasing the wrong ghost.

  • Logging just… stops. A subtler tell, and a vicious one: when the disk fills, the very logs you'd use to debug the problem can't be written either. A log file frozen at an old timestamp while the service is clearly still running is a classic full-disk fingerprint. (And on a systemd box, journalctl may itself complain it can't write.)

  • No space left on device — but df says there's space. The one that makes people doubt their sanity, and it has two completely different causes, both covered below: you're out of inodes (millions of tiny files), or a deleted file is still being held open. When the obvious reading contradicts itself, it's almost always one of those two.

Any of these means the same first move: run df -h, find the mount at the top of the Use% column, and read on. Everything downstream depends on knowing which filesystem is full — because they fill independently, and fixing the wrong one does nothing.

How I Read It

The whole diagnosis starts with one command, and the art is in reading it by the right column. Here's a real run from one of our boxes:

df -h
Filesystem               Size  Used Avail Use% Mounted on
udev                      16G     0   16G   0% /dev
tmpfs                    3.1G  2.1M  3.1G   1% /run
/dev/mapper/xps--vg-root 1.8T  1.4T  316G  82% /
tmpfs                     16G  844M   15G   6% /dev/shm
/dev/nvme0n1p2           944M  481M  398M  55% /boot
/dev/nvme0n1p1           975M   23M  952M   3% /boot/efi
tmpfs                    3.1G  148K  3.1G   1% /run/user/1000

It looks like a wall of mounts, but you read it the way you'd read a fuel gauge: jump straight to the Use% column, find the biggest number, then glance left to see which mount it is and how much absolute headroom is left. Let me walk it line by line, because two-thirds of these lines are noise you should learn to skip past.

  • /dev/mapper/xps--vg-root: 82%, mounted on /. This is the one that matters: the real root filesystem, a logical volume on actual storage, 1.8 TB, and 316 GB still free. At 82% it's comfortable but worth knowing, sitting just under the 85% mark where CleverUptime's first band starts (more on that below). When this line creeps toward 95%, that's the page you came here for.
  • /boot — 55%, /dev/nvme0n1p2. A small separate partition for the kernel and boot files, often under a gigabyte. It fills in its own special way — old kernels piling up after upgrades — so it gets its own fix below. A full /boot won't break a running server but will break the next apt upgrade.
  • The tmpfs and udev lines — ignore them, mostly. These aren't disks at all; they're RAM dressed up as filesystems (tmpfs is the page cache's cousin — files that live in memory and vanish on reboot). /run, /dev/shm, /run/user/1000 — all RAM. They can technically fill (a program dumping gigabytes into /dev/shm will fill /dev/shm), but nine times in ten they're near zero and you skip straight over them. Knowing they're RAM is what lets you skip them with confidence instead of squinting.

So the honest version of "read df -h" is: ignore the RAM-backed lines, find the real filesystem with the highest Use%, and check both the percentage and the absolute Avail. That second number matters more than people expect — 95% of a 20 GB disk leaves a single gigabyte and you're in trouble tonight; 95% of a 20 TB array leaves a terabyte and you have a leisurely week. The percentage tells you the alarm; the Avail tells you the clock.
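That reading can be sketched as a small filter. This is a minimal sketch, assuming GNU df (for the -T type column) and assuming tmpfs and devtmpfs are the only RAM-backed types you want to skip; the function name real_filesystems is made up for illustration.

```shell
# Hypothetical helper: reads `df -hPT` output on stdin, drops RAM-backed
# filesystems, and prints the rest sorted by Use% (highest first).
# Assumes device names and mount points contain no spaces.
real_filesystems() {
    awk 'NR > 1 && $2 != "tmpfs" && $2 != "devtmpfs" {
             pct = $6; sub(/%/, "", pct)      # "82%" -> 82
             printf "%3d%%  avail=%-8s %s\n", pct, $5, $7
         }' | sort -rn
}

# Usage: df -hPT | real_filesystems
```

The percentage leads each line so the alarm is the first thing you see, with Avail right beside it as the clock.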

Pro Tip

Add -h always (human-readable sizes) and, when you suspect the weird case, -T to show the filesystem type in its own column. Knowing a mount is btrfs or zfs rather than ext4 changes the whole reading — those filesystems get unhappy long before 100% (we'll see why), so 89% on btrfs is a different conversation than 89% on ext4.

The Two Ways df Lies to You

df -h is honest about bytes, but bytes aren't the only thing a filesystem can run out of — and that's where it appears to lie. Both liar-cases produce the same No space left on device while df -h insists there's plenty free. Learn to spot them and you'll never lose the hour everyone else loses here.

Liar #1 — out of inodes. Every file, no matter how tiny, consumes one inode — the little record that holds its metadata (owner, permissions, timestamps, and where its data blocks live). A classic ext4 filesystem fixes the number of inodes when it's created, so you can run out of inodes while you still have tons of free bytes — a mail spool, a session-cache directory, or a broken cron job can spawn millions of zero-byte files and exhaust the inode table while barely touching the disk's capacity. df -h looks fine; the real story is in df -i:

df -i
Filesystem                  Inodes   IUsed     IFree IUse% Mounted on
/dev/mapper/xps--vg-root 120004608 2125526 117879082    2% /
/dev/nvme0n1p2               62592     362     62230    1% /boot

That's the healthy reading — 2% of inodes used. The pathological one looks like this (a real shape from a session-cache blowup):

Filesystem        Inodes   IUsed   IFree IUse% Mounted on
/dev/sda1        2621440 2621440       0  100% /

IUse% pinned at 100%, IFree at 0, while df -h on the same mount might read a relaxed 40% bytes used. That's the signature, and once you've seen it once you never miss it: bytes free, inodes gone. The fix is to find and delete the swarm of tiny files (find /path -xdev -type f | wc -l per suspect directory points you at the culprit), not to free up gigabytes — gigabytes were never the problem.
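The per-directory count can be wrapped in a loop so one run ranks the suspects. A minimal sketch; inode_hogs is a hypothetical name, and it only looks at non-hidden immediate subdirectories of the path you give it.

```shell
# Hypothetical helper: counts entries (files, directories, links) under
# each immediate subdirectory of $1 and prints the busiest first.
# Every entry costs one inode, so the top line is the likely culprit.
inode_hogs() {
    for dir in "${1:-.}"/*/; do
        [ -d "$dir" ] || continue
        count=$(find "$dir" -xdev | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$count" "$dir"
    done | sort -rn | head
}

# Usage: inode_hogs /var
```

Run it on the full mount, then again on whichever directory tops the list, the same narrowing descent you would do with du. On systems with a recent GNU coreutils, du --inodes should give a similar per-directory count.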

Liar #2 — the deleted-but-still-open file. This is the single best trick on the page, the one that earns the bookmark. On Unix, deleting a file doesn't actually free its space if some process still has it open. The directory entry vanishes — ls no longer shows it, du can't find it — but the data blocks stay allocated until the last file descriptor pointing at them is closed. So you rm a 50 GB log file to free space, df still shows the disk full, and du swears the file is gone. They disagree, and both are telling the truth: du walks the directory tree (the file's not in it anymore) while df asks the filesystem how many blocks are allocated (still 50 GB, held open by the process that was logging to it). The culprit is almost always a long-running daemon that had the log open when you deleted it. lsof finds these ghosts:

lsof +L1
COMMAND   PID     USER   FD   TYPE DEVICE    SIZE/OFF NLINK   NODE NAME
nginx    1287 www-data    9w  REG  253,0  53687091200     0 131074 /var/log/nginx/access.log (deleted)

NLINK 0 and the (deleted) tag are the smoking gun: a ~50 GB file (SIZE/OFF) with zero directory links, still held open by nginx (PID 1287). The space won't come back until that descriptor closes. You don't even have to kill the process — telling it to reopen its logs does it (systemctl reload nginx, or a kill -HUP), and the blocks free instantly. When df and du disagree about how full a disk is, lsof +L1 is the one-line answer almost every time.
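When lsof +L1 returns dozens of lines rather than one, it helps to total them per process. The following is a rough sketch that assumes the ten-column layout shown above; ghost_space is a made-up name, and the totals are an estimate because lsof can print extra columns and a process holding the same file on two descriptors is counted twice.

```shell
# Hypothetical helper: reads `lsof +L1` output on stdin and totals the
# SIZE/OFF column (field 7) per process for entries tagged "(deleted)".
ghost_space() {
    awk '/\(deleted\)$/ { bytes[$1 " pid " $2] += $7 }
         END {
             for (p in bytes)
                 printf "%.1f GiB  %s\n", bytes[p] / (1024 * 1024 * 1024), p
         }' | sort -rn
}

# Usage: lsof +L1 | ghost_space
```

Fed the nginx line above, it reports 50.0 GiB held by nginx pid 1287, which is the process to reload.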

Note

There's a third, quieter reason df shows less free space than you'd expect, and it's not a bug: reserved blocks. By default ext4 reserves 5% of the filesystem for root, so Avail comes out smaller than Size minus Used (in the run above, 1.8T minus 1.4T leaves roughly 400G, yet Avail reads 316G, and most of that gap is the reserve), and a "full" disk stops letting non-root users write a little before true 100%. That headroom isn't waste: it's what lets you still sudo your way in and run cleanup commands when the disk is "full," and it keeps fragmentation sane. On a giant data volume with no system files, 5% of 20 TB is a wasteful terabyte; tune2fs -m 1 /dev/sdX (or -m 0) reclaims it safely for a data-only mount. Never drop it on /, because that 5% is your rescue rope.
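To see the reserve as a number rather than infer it from df, you can ask the filesystem for both of its free counts. A small sketch, assuming GNU stat in filesystem mode, where %f is free blocks, %a is free blocks available to non-root users, and %S is the block size; reserved_bytes is a hypothetical name.

```shell
# Hypothetical helper: prints how many bytes are free on the filesystem
# holding $1 but held back from non-root users.
reserved_bytes() {
    stat -f -c '%f %a %S' "${1:-/}" | {
        read -r free avail bsize
        echo $(( (free - avail) * bsize ))
    }
}

# Usage: reserved_bytes /
```

On filesystems without a root reserve the result is simply 0.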

Finding the Culprit

df tells you which mount is full; it can't tell you what filled it. For that you walk the tree and add up sizes, which is the job of du. The move I make every single time, on the full mount:

du -h -x -d1 / | sort -rh | head

-x keeps it on one filesystem (so it doesn't wander down into other mounts and count them too — crucial when / is full but /home is a separate disk), -d1 stops it descending past the first level so you get a clean top-level breakdown, sort -rh puts the biggest at the top. You get the handful of directories eating the disk; then you cd into the fattest one and run it again, narrowing each time. It's a binary search through the filesystem, and three or four rounds finds almost anything.

That manual descent works everywhere and depends on nothing — but if you can install one tool, install ncdu (apt install ncdu). It does the same walk once and then hands you an interactive, arrow-key-navigable map of where the space went, sorted biggest-first, delete-from-inside-it included. The first time you use it to find a forgotten 80 GB core dump in four keystrokes, you'll wonder how you lived without it.

Pro Tip

du reads every inode under the path, so on a huge tree it can take minutes and hammer the disk — and on the already stressed full disk that's not free. Two refinements pay off: run it from the specific suspect mount (du -x -d1 /var) rather than all of /, and remember du measures files in the tree — if it comes back small but df says full, you've hit Liar #2 above (deleted-but-open) and the answer is lsof, not more du.
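The df-versus-du disagreement can be put into one number. This is a minimal sketch with hypothetical function names; it is only meaningful when the argument is a mount point, it should run as root so du can read everything, and a small gap is normal filesystem overhead.

```shell
# Hypothetical helpers, both reporting kilobytes.
df_used_kb()  { df -Pk "$1" | awk 'NR == 2 { print $3 }'; }          # blocks the filesystem has allocated
du_total_kb() { du -skx "$1" 2>/dev/null | awk '{ print $1 }'; }     # blocks reachable by walking names

# Space the filesystem has allocated that no file in the tree accounts for.
# A gap of many gigabytes hints at deleted-but-open files.
usage_gap_kb() {
    echo $(( $(df_used_kb "$1") - $(du_total_kb "$1") ))
}

# Usage: usage_gap_kb /var
```

If the gap is large, skip further du rounds and go straight to lsof +L1.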

Reading It by Example

Train the pattern-match. The readout on the left, what I'd actually conclude on the right:

  • df -h shows / at 85–89%, gigabytes still free → Getting tight, not urgent. This is CleverUptime's Info band — worth knowing, nothing to do tonight. Find out what's growing (du -x -d1) so you're not surprised, then move on.
  • / at 90–94% → The Warning band: little headroom left, free up space before it fills. Rotate logs, clear the package cache, find the big files now while you still have room to work in.
  • / at 95%+ (or Avail down to a gigabyte or two) → The Error band. Once it hits 100%, writes start failing and services fall over. This is the page's main event — go to the fix ladder below and start at the top.
  • No space left on device but df -h shows free space, and df -i shows IUse% 100% → Out of inodes. Millions of tiny files somewhere. Find the directory with the absurd file count and clear it; freeing bytes won't help.
  • df says full, du says the disk is half empty → A deleted-but-open file. Run lsof +L1, find the (deleted) file with NLINK 0, reload (don't necessarily kill) the process holding it. Space returns instantly.
  • /boot at 100%, / fine → Old kernels piling up. apt autoremove (Debian/Ubuntu) prunes the unused ones; this one's almost always a one-liner.
  • A btrfs or zfs mount at ~85% and acting slow / refusing writes → Copy-on-write filesystems degrade and can fail allocations well before 100%. On these, 85% is the danger zone — treat it the way you'd treat 98% on ext4.
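The three percentage bands above reduce to a few comparisons. The sketch below uses the cutoffs stated on this page for a conventional filesystem; the function name and the "ok" label are invented for illustration, and this is not CleverUptime's actual code.

```shell
# Hypothetical helper: maps a Use% figure to the bands described above.
disk_band() {
    pct=${1%"%"}                 # accept "91" or "91%"
    if   [ "$pct" -ge 95 ]; then echo error
    elif [ "$pct" -ge 90 ]; then echo warning
    elif [ "$pct" -ge 85 ]; then echo info
    else                         echo ok
    fi
}

# Usage: disk_band 91%
```

It deliberately ignores the absolute Avail figure and the lower copy-on-write ceiling, both of which a real check would have to weigh as well.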

How to Fix It

Work the ladder top to bottom — the early rungs are cheap and safe, the later ones change the layout of your storage. First, one warning that costs more careers than running out of space ever has:

Danger

When you're hunting for space to delete, you are one careless rm -rf away from deleting the wrong thing, and on a server there is no Recycle Bin. Never rm -rf a path you haven't just looked at with ls or ncdu, never run it with a variable that might be empty (rm -rf "$DIR"/* with an unset $DIR expands to rm -rf /*, which GNU rm's built-in refusal to remove / itself does not stop), and be especially careful in /var, /etc, and anything directly under /. Free space is recoverable; a deleted database file or a wiped /var/lib is not. The whole point of the diagnosis above is that you delete the known fat thing, not a guess.
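One way to make the empty-variable mistake harder is to put the checks in a wrapper instead of trusting yourself at 3 a.m. A minimal sketch; safe_rm_dir is a hypothetical name, and the same guard can be written inline with the standard ${DIR:?} expansion, which aborts a script when DIR is unset or empty.

```shell
# Hypothetical wrapper: refuses to delete when the target is empty, "/",
# or not an existing directory.
safe_rm_dir() {
    target=$1
    if [ -z "$target" ] || [ "$target" = "/" ] || [ ! -d "$target" ]; then
        echo "refusing to remove '${target}'" >&2
        return 1
    fi
    rm -rf -- "$target"
}

# Usage: safe_rm_dir "$DIR"
```

It does not replace looking at the path first; it only catches the cases where the path is obviously not what you meant.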

Then, easiest-first:

  • Rotate and compress the logs. Logs are the number-one cause of a slowly-filling /, and the fix is logrotate — it's almost certainly already installed and running daily; the problem is usually that one noisy service writes faster than the default rotation keeps up, or isn't covered by a rotation rule at all. Check /var/log first (du -x -d1 /var/log | sort -rh | head), force a rotation right now with logrotate -f /etc/logrotate.conf to reclaim immediately, then add or tighten the rule so it doesn't recur. For systemd's own journal, journalctl --vacuum-size=500M caps it on the spot. Logs that fill a disk are a configuration bug, not a capacity problem — fix the rotation and the symptom stops coming back.

  • Clear the package-manager cache. Every install leaves the downloaded package behind. On Debian/Ubuntu, apt clean empties /var/cache/apt/archives and apt autoremove rips out kernels and dependencies nothing needs anymore — that pair alone routinely frees gigabytes and is the standard fix for a full /boot. On RHEL-family, dnf clean all. Safe, fast, and reversible (it just re-downloads next time you need it).

  • Find and remove the big, old, obviously-junk files. Forgotten tarballs, core dumps, an unpacked backup in /tmp, a 100 GB stray file from a runaway process. Find them by size:

    find / -xdev -type f -size +1G -exec ls -lh {} \; 2>/dev/null
    

    -xdev keeps find on the one filesystem; -size +1G catches the whales. Eyeball each result before deleting — a 40 GB file might be junk or might be your database. Core dumps (core or core.NNNN) and anything in /tmp older than a few days are usually safe; anything under /var/lib is usually not.

  • Truncate, don't delete, an actively-written log. If the file eating the disk is a log a running service still has open, deleting it triggers Liar #2 (the space won't come back). Instead, empty it in place without breaking the file handle: : > /var/log/huge.log (or truncate -s 0 /var/log/huge.log). The service keeps writing to the same now-empty file, the space frees immediately, and nothing needs restarting.

  • Move data to another filesystem. If one mount is full and another has room, relocate a big directory with rsync -a --remove-source-files and a symlink (or a bind mount) where it used to live. A common move: a database or Docker's storage outgrows / while /home sits empty — relocate the data directory there.

  • Grow the volume. When there's genuinely just not enough disk, add some. On LVM (the /dev/mapper/...-root you saw above is exactly this) it's two commands with zero downtime, provided the volume group has free extents: lvextend -L +50G /dev/mapper/vg-root then resize2fs /dev/mapper/vg-root (or xfs_growfs / on XFS), and the filesystem is bigger while it's still mounted and serving. On a cloud instance, expand the block volume in the console, then grow the partition or LVM physical volume to match (growpart and pvresize are the usual tools) before running those same resize commands. This is LVM's whole reason to exist, and the first time you grow a live root filesystem with users on it and nothing so much as hiccups, you understand why nobody who's used it goes back to raw partitions.
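The truncate rung is easy to try safely on a scratch file before you need it in anger. This sketch stands in for a daemon by holding a descriptor open in append mode, which is an assumption about how the real service opened its log; truncate_demo is a made-up name.

```shell
# Hypothetical demo, run in a subshell so descriptor 3 does not leak out.
truncate_demo() (
    log=$(mktemp) || exit 1
    exec 3>>"$log"               # stand-in for a daemon holding its log open (append mode)
    echo "old noisy line" >&3
    : > "$log"                   # truncate in place; the name and the inode survive
    echo "new line" >&3          # the "daemon" keeps writing through the same descriptor
    cat "$log"                   # only the line written after truncation remains
    exec 3>&-                    # close the descriptor
    rm -f "$log"
)

truncate_demo
```

A writer that did not open the file in append mode keeps its old offset, so its next write can leave a sparse hole at the start of the file; the space is still freed, but the file's apparent size jumps back up.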

Pro Tip

The fastest real reclaim on a full / is usually the boring trio, in order: journalctl --vacuum-size=500M, apt clean && apt autoremove, then logrotate -f /etc/logrotate.conf. That sequence frees gigabytes on a typical box in under a minute and touches nothing irreplaceable. Buy yourself room to breathe with those first, then go hunt the real culprit with a calm head instead of deleting under pressure.

How to Avoid It

A full disk is one of the few server problems you genuinely can prevent — it doesn't sneak up at the speed of hardware physics, it grows at the speed of your logs, and that's a speed you can stay ahead of. In rough order of payoff:

  1. Watch the trend, not the number. A disk at 70% is meaningless on its own; a disk that went 50% → 70% in a week is an outage with a date on it. The single most useful thing is knowing the rate it's filling, because that tells you when — and 95% with a two-week runway is a calm Tuesday task, while 95% gaining a percent an hour is tonight. This is exactly the kind of thing a monitor exists to do; a human checking df by hand always checks it the day after it filled.
  2. Fix log rotation properly. Most slow fills are logs, so the prevention is mostly logrotate done right: every noisy service covered by a rule, sensible rotate count and maxsize, compress on. Get this once and the most common cause of a full / simply stops happening. Cap the systemd journal too — SystemMaxUse= in /etc/systemd/journald.conf — so it can't quietly eat the disk by default.
  3. Give noisy, unbounded writers their own mount. Databases, Docker/container storage, and user uploads are the three that grow without limit and take down everything else when they fill /. Put them on a separate volume (or at least a separate LVM logical volume) so a runaway upload fills its disk and the rest of the system — including your ability to log in and fix it — keeps working. Isolation turns "the server is down" into "one mount is full."
  4. Leave more headroom on btrfs and zfs. Copy-on-write filesystems need free space to function, not just to store — they write new blocks before freeing old ones, so they slow down and can fail allocations above ~80%. If you run them, treat 80% as your ceiling, not 95%.
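The "outage with a date on it" from the first item is plain arithmetic on two readings. A minimal sketch, assuming growth stays linear, which real disks only roughly obey; days_until_full is a hypothetical name.

```shell
# Hypothetical helper: linear estimate of days until a mount hits 100%.
# Arguments: earlier Use%, later Use%, days between the two readings (non-zero).
days_until_full() {
    awk -v earlier="$1" -v later="$2" -v days="$3" 'BEGIN {
        rate = (later - earlier) / days      # percentage points per day
        if (rate <= 0) { print "not growing"; exit }
        printf "%.1f\n", (100 - later) / rate
    }'
}

days_until_full 50 70 7     # the 50% -> 70% in a week example: prints 10.5
```

Twenty points a week with thirty points left gives about ten and a half days, which is the date a monitor would put on that 70% reading.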

Note

The cheapest disk space is the disk space you don't fill in the first place. Before reaching for a bigger volume, ask why it's filling — an unbounded log, a backup written to the same disk it's backing up, a cache with no eviction. A bigger disk fills too, just later; fixing the writer fixes it forever. "We need a bigger server" is the reflex; "what's writing this?" is the cure.

How a Filesystem Actually Runs Out of Space

Now the part you don't need mid-emergency but that turns the whole page from a checklist into an instinct: why does a disk fill the way it does, and why do inodes, reserved blocks, and copy-on-write all behave so differently? It all comes down to what a filesystem actually is — and once you see the machinery, every weird symptom above becomes something you can simply reason out.

Two Ledgers: Blocks and Inodes

A disk, to the hardware, is just a long row of fixed-size blocks — typically 4 KB each, numbered from zero to however many fit. That's all the hardware offers: numbered boxes you can read or write. Everything else — files, directories, names, permissions — is a fiction the filesystem layers on top, and it maintains that fiction with two separate bookkeeping systems, which is the key to the whole page.

The first ledger tracks which blocks are free (a bitmap: one bit per block, set if used). When you write a file, the filesystem finds enough free blocks, marks them used, and dumps your data in. When df reports usage, it's reading the totals from this ledger: it asks the kernel (a statfs call) for the filesystem's running count of used and free blocks. That's why df is instant no matter how many files you have: it never looks at files at all, just the block accounting. It also can't tell you what's using the space, only how much, which is exactly the gap du fills by walking the files themselves.

The second ledger is the inode table — one record per file, holding everything about the file except its name and contents: owner, permissions, timestamps, size, and the list of block numbers where the data lives. Here's the thing that explains Liar #1: on a classic ext-family filesystem, the inode table is a fixed size, decided when the filesystem is created. The formatter guesses how many files you'll have (roughly one inode per 16 KB of disk) and allocates exactly that many records, forever. Create more files than that guess — even empty ones, even one byte each — and you run out of inodes while the block ledger is barely touched. Two independent ledgers, two independent ways to hit "full." df -h reads one; df -i reads the other. Now the contradiction isn't mysterious at all — it's two different counters, and you've been reading the wrong one.
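The formatter's guess is a single division. The sketch below assumes the common default ratio of 16384 bytes per inode (adjustable with mke2fs -i); the real count is rounded per block group, so treat the result as approximate. inodes_for is a made-up name.

```shell
# Hypothetical helper: inodes a filesystem of $1 bytes gets at creation,
# given a bytes-per-inode ratio ($2, defaulting to 16384).
inodes_for() {
    echo $(( $1 / ${2:-16384} ))
}

inodes_for $(( 40 * 1024 * 1024 * 1024 ))   # a 40 GiB volume: prints 2621440
```

That figure matches the Inodes column in the pathological df -i reading earlier, which is what you would expect from a volume of about 40 GiB formatted with defaults. Fill it with files averaging well under 16 KB each and the inode ledger runs dry first.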

(Newer filesystems sidestep this. XFS allocates inodes dynamically from free space as you go, and btrfs and ZFS don't have a fixed inode table at all — which is why inode exhaustion is overwhelmingly an ext4 story. If you've never hit it, there's a fair chance your busy boxes just aren't ext4.)

The Name Is Not the File

Now the mechanism behind the best trick on the page — the deleted-but-open file — because it falls straight out of how Unix separates names from data, and it's one of those design decisions that looks like a quirk and turns out to be quietly brilliant.

A filename, on Unix, is not the file. The file is the inode (and its data blocks); the name is just an entry in a directory that points at an inode — a hard link. One inode can have several names pointing at it (that's literally what a hard link is), and the inode carries a link count: how many names currently point here. Create a file, link count is 1. Make a second hard link, it's 2. And here's the elegant part: rm doesn't "delete a file" at all — it can't, because it only ever sees a name. What rm actually does is unlink: remove one directory entry and decrement the link count. The data blocks are freed only when the link count hits zero.

So far so tidy — but there's a second counter the filesystem checks before freeing anything: how many processes have the file open right now. A file's blocks come back only when both are zero — no names left and no open file descriptors. That's the whole secret. When you rm a log file that nginx has open, you take the link count from 1 to 0 — the name is gone, ls and du can't find it — but nginx still holds a descriptor, so the open count is 1, so the blocks stay allocated. df counts allocated blocks; du walks names; the name is gone but the blocks aren't, and the two tools disagree correctly. The space waits, patient and invisible, until nginx closes that descriptor — which is why a reload frees it instantly, no delete needed. lsof +L1 finds them by asking precisely the right question: show me every open file whose link count is below 1 — open, but nameless. Ghosts, holding the disk hostage.
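Both counters can be watched on a scratch file. This is a small demonstration, assuming GNU stat (for -c %h, the hard-link count) and a temp directory that allows hard links; unlink_demo is a hypothetical name.

```shell
# Hypothetical demo, run in a subshell so descriptor 3 does not leak out.
unlink_demo() (
    f=$(mktemp) || exit 1
    echo "still here" > "$f"
    ln "$f" "$f.alias"                        # a second name for the same inode
    echo "links after ln: $(stat -c %h "$f")"
    exec 3<"$f"                               # hold the file open for reading
    rm -f "$f" "$f.alias"                     # unlink both names; link count drops to 0
    [ -e "$f" ] || echo "name gone"
    echo "read via descriptor: $(cat <&3)"    # the data is still there
    exec 3<&-                                 # last reference closed; blocks can be freed
)

unlink_demo
```

The third line of output is the point: the content stays readable after every name has been removed, because the open descriptor is still a reference to the inode.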

This isn't a bug they never got around to fixing — it's a feature, and a load-bearing one. It's why you can replace a running program's binary or its config and it keeps running happily off the old version until restart (the old inode lingers, nameless, fully open). It's why a log rotation can move a file out from under a daemon without losing a single line. It's why deleting a file another process is reading never corrupts that read — the reader keeps its consistent view to the end. "Unlink, don't delete" looks like pedantry until you realize it's the thing that makes a live, running Unix system safe to rearrange underneath itself. Tuck that away: it's one of the prettiest ideas in the whole operating system, hiding inside an annoyance.

Why Copy-on-Write Fills Differently

One last piece, because it explains why CleverUptime's threshold note treats btrfs and ZFS specially. A traditional filesystem overwrites data in place — change a block, the new bytes land on the old block. A copy-on-write filesystem never does that: to change a block it writes a new copy elsewhere and only then updates the pointers, leaving the old version untouched (which is what makes instant snapshots and checksummed self-healing possible — the magic of these filesystems). The catch is that writing always needs somewhere new to write, so a copy-on-write filesystem needs a healthy margin of free space just to operate, not merely to store. Run one near full and it can't find contiguous free space to copy into; it slows to a crawl, fragments badly, and eventually fails writes with ENOSPC while df still claims a chunk free — because that "free" space is too scattered to use. That's why 85% on btrfs is the alarm that 98% is on ext4, and why our disk-space check says so right in the finding. Same word "full," a different definition of the line.

So: one disk, two ledgers, and a filesystem keeping an elaborate fiction on top of numbered boxes. Blocks run out, or inodes run out, or the free space gets too fragmented to use — three roads to the same No space left on device, and now you can tell which one you're on from across the room. The diary the filesystem keeps is honest; it just keeps it in two columns, and the trick was always knowing which column to read.

See Also

  • df — the first command on this page; reads the block ledger, names the full mount
  • du — walks the files to tell you what filled it
  • ncdu — the interactive disk-usage map; install it, you'll never go back
  • lsof — finds the deleted-but-open files holding space hostage
  • rsync — move data off a full mount cleanly
  • logrotate — the fix for the number-one cause of a full disk
  • LVM — grow a live filesystem with no downtime
  • filesystem — blocks, inodes, and how the fiction is maintained
  • /var/log — where the logs that fill your disk live
  • disk failing — the other disk emergency: hardware, not capacity
  • high I/O wait — what a thrashing or stressed disk does to the rest of the box
  • degraded RAID array — when a full or failing disk drops out of an array

Disk filling up and not sure which mount, or what's eating it?

CleverUptime watches every filesystem's usage and the rate it's climbing, flags the one mount that's getting tight by name — / at 91%, /boot full of old kernels — and warns you with days of runway to spare, so a full disk is a calm notification instead of a 3 a.m. outage.
