High Memory Usage: Symptoms, Diagnosis & Fixes

Linux spends every spare byte of RAM on purpose — so "used" looks scary, and "available" is the number that tells the truth.

What It Is

High memory usage is the alarm that gets misread more than any other on a server, and almost always in the same direction: someone runs free or top, sees used sitting at 90-something percent, and concludes the box is about to fall over and needs more RAM. Nine times out of ten it doesn't. What they're looking at isn't pressure — it's a healthy, hard-working Linux kernel doing exactly the clever thing it was built to do: filling otherwise-idle RAM with a cache it will hand straight back the moment a real program asks for it.

So let's lead with the single most liberating sentence on this page, the one that turns a panic into a shrug, and then spend the rest of it making sure you can tell the harmless version from the genuine one in about three seconds flat:

Free memory is wasted memory. RAM that sits empty does no work. It cost you money, it draws power, and giving it nothing to do is pure waste — so Linux refuses to. Every byte not currently held by a program, the kernel borrows to cache files it has read from (or is about to write to) disk, because serving the next read from RAM is thousands of times faster than going back to the platter. That borrowed cache shows up in the used (or buff/cache) column and makes a perfectly healthy server look 95% full. It isn't. The kernel is a thrifty landlord who fills every empty room with a paying short-term tenant — and evicts them instantly, no notice, no fuss, the second a long-term resident (a real process) needs the space.

This is so famously misunderstood it has its own monument on the internet — the page "Linux ate my RAM" (linuxatemyram.com), a one-joke website built entirely to talk panicked admins down off this exact ledge. By the end of this page you'll never need it: you'll read the one number that actually measures memory pressure (MemAvailable, not used), tell harmless cache from a real squeeze without a moment's doubt, name the process eating the most when there genuinely is one, and know exactly which rung of the fix ladder you're on. We'll start where it counts — spotting it, reading it, fixing it — and save the beautiful why (how Linux's virtual-memory machine juggles far more memory than you physically own) for the end, where it belongs once nobody's worried.

How You Notice

High memory usage shows up in two completely different registers, and telling them apart is the whole skill. There's the harmless one — a big used number that's really just cache — and the real one, where a program genuinely can't get the memory it needs and the whole box starts to suffer. Here's each place it surfaces, with the command to check it on your own server right now.

free shows used near the top. The classic trigger, and usually the false alarm. Run it human-readable:
```
free -h
```
A used figure at 90%+ is the line that launches a thousand support tickets — but used is the wrong column to panic over. The one that matters is available, all the way on the right, and we'll read it together in the next section. If available is healthy, you're looking at cache, not trouble.
top says memory is nearly full. top's summary header carries the same numbers free does: the MiB Mem line shows total / free / used / buff/cache, and crucially avail Mem sits at the end of the MiB Swap line just below it. Same rule: the used figure is the headline, avail Mem is the truth. Press M (capital) inside top to sort the process list by memory and the biggest consumer floats to the top instantly.
The box gets slow and swap starts filling. This is the register that matters. When RAM genuinely runs short, the kernel starts pushing pages out to swap on disk — and disk is glacial next to RAM, so everything stutters. Watch it move in real time with vmstat:
```
vmstat 1
```
The columns to watch are si and so (swap-in, swap-out, in KB/s). Idle zeros are healthy; a steady stream of non-zero so means the kernel is evicting real working memory to disk to make room — that's genuine pressure, and if it gets bad it becomes swap thrashing. Cache eviction is free and invisible; swap eviction is the tell that the cache cushion is already spent.
The OOM killer leaves a corpse in the log. The end of the line. When memory is so tight that even evicting all the cache and filling swap isn't enough, the kernel stops asking nicely and starts killing processes outright — the out-of-memory killer. It writes a tombstone to the kernel log every time:
```
dmesg -T | grep -iE "out of memory|killed process|oom"
journalctl -k | grep -i "Out of memory"
```
A line like Out of memory: Killed process 4821 (mysqld) is the kernel telling you, after the fact, that it had to shoot a process to keep the system alive. If you find these, you're past "high usage" and into out-of-memory — a different, sharper page.
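
To turn the vmstat reading into a quick yes/no, a small filter can count how many samples showed swap-out. This is a minimal sketch, not part of any tool: the function name count_swapout_samples is made up for illustration, it locates the so column from vmstat's header instead of assuming a position, and it skips the first data row because that row is an average since boot.

```shell
# Count how many vmstat samples showed swap-out activity.
# Reads vmstat output on stdin. (Hypothetical helper, for illustration only.)
count_swapout_samples() {
  awk '
    # Header row: find the "so" column by name.
    $1 == "r" {
      for (i = 1; i <= NF; i++) if ($i == "so") col = i
      # Only the first data row after the first header is the since-boot average.
      if (!seen) { skip = 1; seen = 1 }
      next
    }
    # Data rows start with a number.
    col && $1 ~ /^[0-9]+$/ {
      if (skip) { skip = 0; next }
      total++
      if ($col + 0 > 0) busy++
    }
    END { printf "%d of %d samples swapped out\n", busy, total }
  '
}

# Usage: vmstat 1 11 | count_swapout_samples
```

A result of 0 of 10 alongside a healthy available figure points to cache; a majority of samples swapping out points to the genuine pressure described above.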

The fork is everything here: a big used number with healthy available and zero swap activity is harmless cache — close the ticket. A shrinking available, climbing swap (so non-zero), or OOM kills in the log is real pressure — read on, because that's the problem this page is actually about. CleverUptime makes that exact call for you, every time, off MemAvailable — never off used — which is precisely why it doesn't page you for a cache-fat server that's running perfectly.

How I Read It

The whole diagnosis rests on one idea — read available, never used — and once it clicks, every memory readout on Linux stops lying to you. Let's read the two tools that report it, starting with the friendly one and then dropping down to the raw source it's built from.

free -h

               total        used        free      shared  buff/cache   available
Mem:            19Gi        15Gi       1.0Gi       272Mi       3.1Gi       4.6Gi
Swap:          2.0Gi          0B       2.0Gi

It looks like six numbers and a panic, but it's really three ideas. Let me walk every column, because the layout is a trap that snares almost everyone the first time:

total — 19 GiB. All the physical RAM the kernel can see. Boring, fixed, the only column nobody misreads.
used — 15 GiB. The headline that scares people: memory actively held by running programs plus kernel structures, not counting reclaimable cache. This is already much better-behaved than it used to be — for years free lumped cache into used and terrified everyone, but modern free (procps-ng 3.3.10+) split cache out into its own column precisely to end this misunderstanding. Even so, 15 of 19 GiB "used" still reads alarming. Don't stop here.
free — 1.0 GiB. Genuinely untouched, unborrowed RAM — and here's the counterintuitive part: a low free is good. A big free number means the kernel is leaving expensive RAM sitting idle, doing nothing. A healthy, warmed-up server wants free near the floor, because everything else has been put to work as cache. A high free on a long-running box means it just rebooted or its workload is tiny — not that it's "healthy."
shared — 272 MiB. Mostly tmpfs and shared-memory segments (the /dev/shm you may know from disk full) — RAM dressed up as files. Usually small; worth a glance only if it's surprisingly large.
buff/cache — 3.1 GiB. The hero of this page. Buffers and page cache: copies of file data the kernel is keeping in RAM because there was room and it might speed up the next read or write. Every byte of this is reclaimable — the kernel drops it instantly when a program needs the space. This is the "ate my RAM" memory. It's not lost; it's lent.
available — 4.6 GiB. The only number that matters. This is the kernel's own honest estimate of how much memory a new program could grab right now, without swapping — free plus the reclaimable slice of buff/cache, minus a small reserve the kernel keeps for itself. It already does the "give the cache back" arithmetic for you. So this box, despite reading 15 GiB "used," genuinely has 4.6 GiB ready to hand out. Healthy. Read available, ignore used, and 90% of memory panics evaporate.

The mental shortcut: used is what's spoken for; available is what you can actually have. What separates available from free is, roughly, the reclaimable cache, and that gap is the kernel's thrift, not your problem. And that near-zero free column? It's the single most-misread number on a Linux box, and it means the opposite of what it looks like: not "out of memory" but the kernel having successfully put your expensive RAM to work as cache instead of letting it idle. On a healthy, long-running server free should be small and available comfortable. The day to worry is when available gets small, regardless of what free says.
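
If you want that rule as a single number, the available column can be expressed as a percentage of total. A minimal sketch, assuming a procps-ng free that prints an available column; avail_pct_from_free is a hypothetical helper name, not an existing command.

```shell
# Print "available" as a percentage of "total" from free's default output.
# Reads free's output on stdin. (Hypothetical helper, for illustration only.)
avail_pct_from_free() {
  awk '
    # The header has no label column, so data fields sit one position to the right.
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "available") col = i + 1 }
    $1 == "Mem:" && col { printf "%.1f\n", 100 * $col / $2 }
  '
}

# Usage: LC_ALL=C free | avail_pct_from_free
```

It prints nothing on a free too old to have an available column, which is itself a useful signal that the used figure on that box still includes cache.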

Reading It at the Source

free and top are both just friendly faces over a single kernel file — /proc/meminfo, the virtual file the kernel writes its memory accounting into. (Yes: like so much of Linux, the canonical memory report is "just a file" — cat /proc/meminfo and you're reading the same bytes every memory tool on the system reads. Everything is a file, even the memory.) Reading it raw is worth doing once, because it shows the parts free quietly folds together:

cat /proc/meminfo

MemTotal:       19995464 kB
MemFree:         1063344 kB
MemAvailable:    4812204 kB
Buffers:          406124 kB
Cached:          2609256 kB
SwapTotal:       2097148 kB
SwapFree:        2097148 kB
Slab:             511064 kB
SReclaimable:     249660 kB
SUnreclaim:       261404 kB

Now the cache story is fully visible, and you can see exactly how available is built:

MemTotal — 19995464 kB. All physical RAM, the same as free's total.
MemFree — 1063344 kB. Truly idle RAM, untouched and uncached. free's free column, to the byte.
MemAvailable — 4812204 kB. The number. The kernel's estimate of memory available for new work without swapping — added to the kernel in 2014 (kernel 3.14) specifically because admins kept doing the cache arithmetic by hand and getting it wrong. Before this line existed, the folk formula was "free + buffers + cached," which over-counted (some cache can't actually be reclaimed). MemAvailable does the honest calculation inside the kernel, which alone knows how much of the cache is genuinely droppable. This is the line CleverUptime judges by — never used.
Buffers — 406124 kB. Cache for raw block-device I/O (filesystem metadata, mostly). Reclaimable.
Cached — 2609256 kB. The page cache: file contents held in RAM. The big, reclaimable cushion. free adds Buffers + Cached + SReclaimable to make its buff/cache column.
SReclaimable — 249660 kB. The part of the kernel's slab (its own internal object caches — directory entries, inode info) that can be handed back under pressure. Counted as reclaimable.
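SUnreclaim: 261404 kB. The part of slab that can't be reclaimed: genuine kernel working memory, locked in. The naive "free + buffers + cached" formula over-counts for a related reason: Cached also includes tmpfs and shared-memory pages that can't simply be dropped, and the kernel holds back a small reserve for itself. MemAvailable is more honest because the kernel knows which slices are off-limits.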
SwapTotal / SwapFree — both 2097148 kB. Total and unused swap. Equal here means nothing has been swapped out — the cleanest possible sign that there's no real memory pressure. The moment SwapFree starts dropping below SwapTotal is the moment to take a big used seriously.

Put MemAvailable: 4812204 kB (about 4.6 GiB) next to SwapFree untouched, and the verdict is open-and-shut: this server has comfortable headroom and not a byte has been swapped out. The scary used was a costume.
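
The same verdict can be scripted straight from the file. A rough sketch under the assumptions already stated: meminfo_summary is a hypothetical helper name, it takes an optional path so it can be pointed at a saved copy, and it rebuilds buff/cache as Buffers + Cached + SReclaimable, as described in the Cached entry.

```shell
# Summarise the three readings that decide the verdict.
# $1 = path to a meminfo-format file (defaults to the live /proc/meminfo).
# (Hypothetical helper, for illustration only.)
meminfo_summary() {
  awk '
    # Store every "Name: value" pair, keyed by name without the colon.
    { key = $1; sub(/:$/, "", key); kb[key] = $2 }
    END {
      gib = 1024 * 1024   # kB per GiB
      printf "buff/cache: %.1f GiB\n", (kb["Buffers"] + kb["Cached"] + kb["SReclaimable"]) / gib
      printf "available: %.1f GiB\n", kb["MemAvailable"] / gib
      printf "swap used: %d kB\n", kb["SwapTotal"] - kb["SwapFree"]
    }
  ' "${1:-/proc/meminfo}"
}

# Usage: meminfo_summary
```

Fed the sample values above, it reports 3.1 GiB of buff/cache, 4.6 GiB available, and 0 kB of swap used, matching the free -h readout.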

Naming the Consumer

So far we've established when high usage is harmless. When available is genuinely shrinking, the next question is who's eating it — and there's one command that answers it cleanly, sorting every process by resident set size (RSS: the actual physical RAM a process holds, the honest measure, as opposed to the virtual address space it merely reserves):

ps -eo pid,comm,rss,%mem --sort=-rss | head

  PID COMMAND            RSS %MEM
 2014 mysqld         6815744 33.4
 3187 java           4128820 20.2
 1422 redis-server   1048576  5.1
  988 nginx           184320  0.9
  742 systemd-journald 51200  0.2

Read it top-down: mysqld is holding about 6.5 GiB of resident RAM, a third of the box, and that's where you look first. (This is the same call CleverUptime's MemoryUsage finding makes for you: when one process holds 25%+ of RAM it names it outright, for example "mysqld is the biggest consumer at 33% of RAM", and points you there first; when no single process dominates, it tells you that too, so you know it's many small ones adding up rather than one whale.) The %MEM column is RSS as a fraction of physical RAM, the quick eyeball; RSS itself, in KiB, is the absolute figure that matters when you're deciding what to restart or tune.
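
When the list is dominated by many processes with the same name rather than one whale, it can help to total RSS per command. A minimal sketch: rss_by_command is a hypothetical helper, and because RSS counts shared pages once per process, treat the totals as an upper bound rather than an exact figure.

```shell
# Total RSS (KiB) and process count per command name, largest first.
# Reads "rss command" pairs on stdin; $1 = how many rows to show (default 5).
# (Hypothetical helper, for illustration only.)
rss_by_command() {
  awk '
    {
      kb = $1
      $1 = ""; sub(/^ +/, "")   # what remains in $0 is the command name
      rss[$0] += kb; n[$0]++
    }
    END { for (c in rss) printf "%d\t%d\t%s\n", rss[c], n[c], c }
  ' | sort -rn | head -n "${1:-5}"
}

# Usage: ps -e -o rss= -o comm= | rss_by_command 5
```

Each output row is total KiB, number of processes, and command name, so forty modest workers show up as one large line instead of forty small ones.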

Pro Tip

RSS overstates shared memory — two processes sharing the same library both count its pages in full, so summing the RSS column comes out larger than your actual RAM. For a single greedy process that's fine (the biggest RSS really is the biggest consumer), but don't try to reconcile the column total against free. If you need the de-duplicated, honest per-process figure, smem -tk reports PSS (proportional set size), which splits shared pages fairly across the processes using them — the number you actually want when chasing "where did all my RAM go."

Reading It by Example

Train the pattern-match. The readout on the left, what I'd actually conclude on the right:

free -h shows used 90%+, available 25%+ of total, Swap: 0B used → Harmless cache. The "Linux ate my RAM" case, by far the most common. Nothing to do — the kernel is working exactly as designed. Close the ticket.
available 5–10% of total, swap still empty → Getting tight, not urgent. This is CleverUptime's Info band (MemoryUsage:Elevated). Worth knowing what's growing — run the ps --sort=-rss above — but nothing's failing tonight.
available 2–5% of total, vmstat showing occasional so → The Warning band (MemoryUsage:High). Little headroom left, and the kernel has started leaning on swap. Find the consumer and act before it tips over.
available under 2%, so streaming non-zero, box sluggish → The Error band (MemoryUsage:Critical). The server is about to run out and the OOM killer may fire any moment. Free memory now — see the fix ladder.
free low but available high, top calm, no swap → Textbook healthy warmed-up server. The low free is the good sign, not the bad one. Don't "fix" this.
available shrinking steadily over hours/days while load is flat → A memory leak. A process whose RSS only ever climbs, never falls, is leaking — the cure is a restart now and a bug fix later, not more RAM.
Out of memory: Killed process in dmesg → You're past this page. Go to out-of-memory: the kernel already had to kill something to survive.
available fine but Swap 80%+ full → Not a RAM emergency — old, idle pages parked on disk and never reclaimed. That's swap full, a slower-burning and separate story.

The fork is simple once it's internalised: available healthy = cache (relax); available low or swap climbing = pressure (act). The size of used never, by itself, sorts you into the act column.
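
The percentage bands in the list above can be written down as a tiny classifier. This is only a sketch of the thresholds as stated on this page: memory_band is a hypothetical name, and both the treatment of everything at or above 10% as OK and the side each boundary value falls on are assumptions.

```shell
# Map "available as a percentage of total" to the bands listed above.
# $1 = percentage, e.g. 24.1 (Hypothetical helper, for illustration only.)
memory_band() {
  awk -v pct="$1" 'BEGIN {
    if      (pct < 2)  print "Critical"   # under 2%: OOM killer may fire
    else if (pct < 5)  print "High"       # 2-5%: little headroom left
    else if (pct < 10) print "Elevated"   # 5-10%: getting tight, not urgent
    else               print "OK"         # assumption: 10% and up is fine
  }'
}

# Usage: memory_band 24.1
```

Note that the size of used never appears as an input; only available does.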

How to Fix It

First, the fix nobody needs but everybody reaches for: if available is healthy and swap is empty, there is nothing to fix. The most common "fix" for high memory usage is realising it was never a problem. Don't drop caches, don't add RAM, don't restart anything; you'd be paying to undo the kernel's best work. (There genuinely is an echo 3 > /proc/sys/vm/drop_caches knob that forces the kernel to dump its clean caches, and it does make buff/cache plummet and free shoot up, which is exactly why people run it to "fix" the number. All it actually achieves is throwing away a cache the kernel will immediately, patiently rebuild, making the next reads slower for no gain. It's a benchmarking tool, not a remedy. Pretend you never saw it.)

When available is genuinely low, work the ladder top to bottom — cheap and reversible first, structural last. But before any of it, one warning that costs more outages than running out of RAM ever does:

Danger

Reaching for the OOM killer by hand — kill -9 on the biggest RSS you see — is how a memory squeeze becomes data loss. SIGKILL gives a process no chance to flush its buffers or close its files cleanly, so killing a database, a queue worker, or anything mid-write can corrupt exactly the data you were trying to protect. Always try a graceful stop first (systemctl restart, or kill without -9, which sends SIGTERM and lets the process clean up). Save the -9 for something that's truly wedged and holds no data you can't lose — and never, ever lead with it on a database.
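
For a bare PID that isn't managed by systemd, the "graceful first" rule can be sketched as a helper that sends SIGTERM, waits, and deliberately stops short of SIGKILL. The name graceful_stop and the 30-second default are assumptions for illustration; for a real service, systemctl stop or restart remains the better tool.

```shell
# Ask a process to exit with SIGTERM and wait for it; never escalates to -9.
# $1 = PID, $2 = seconds to wait (default 30).
# (Hypothetical helper, for illustration only.)
graceful_stop() {
  gs_pid=$1
  gs_timeout=${2:-30}
  gs_waited=0
  kill -TERM "$gs_pid" 2>/dev/null || { echo "cannot signal $gs_pid"; return 1; }
  # kill -0 sends no signal; it only tests whether the process still exists.
  while kill -0 "$gs_pid" 2>/dev/null; do
    if [ "$gs_waited" -ge "$gs_timeout" ]; then
      echo "$gs_pid ignored SIGTERM for ${gs_timeout}s; check what it holds before considering kill -9"
      return 2
    fi
    sleep 1
    gs_waited=$((gs_waited + 1))
  done
  echo "$gs_pid exited after SIGTERM"
}

# Usage: graceful_stop 4821 60
```

Leaving the escalation to a human is the point of the design: the decision to use -9 depends on what data the process holds, which a script cannot know.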

Then, by cause:

One process is hogging it (the ps --sort=-rss whale): restart or cap it. If a single service holds most of RAM and it's a leak or a runaway, a graceful restart (systemctl restart mysql) reclaims it instantly and buys you time to find the root cause. Better, give it a ceiling so it can never take the whole box down again: a systemd unit's MemoryMax= (or MemoryHigh= for a soft throttle) caps a service's memory at the cgroup level, so it gets pushed back or OOM-killed in isolation instead of dragging everything else down with it. This is the single highest-leverage fix for "one app keeps eating the server."
The app is simply configured to use too much: tune its memory settings. Most big servers have a knob you set too high (or left at a wasteful default): MySQL's innodb_buffer_pool_size, the JVM's -Xmx heap, PostgreSQL's shared_buffers and work_mem (which applies per sort or hash operation, so one connection can use several multiples of it; multiply by your connection count and the surprise is often there), PHP-FPM's pm.max_children. The classic first-timer trap is sizing the database's cache to nearly all of RAM and forgetting the OS, the connections, and everything else also need some. Leave real headroom; the kernel's own page cache wants room too.
It's a genuine leak: restart now, fix the code later. If RSS for one process only ever climbs and never falls across hours, no amount of tuning saves you — it's a memory leak. Restart to reclaim immediately, then chase the bug (or, as a stopgap, set the systemd unit to restart on a MemoryMax= trip). A leak ignored ends in an OOM kill at the worst possible hour.
Many small processes, none dominant: reduce concurrency. When ps shows no single hog but the total is high — a web server forking hundreds of workers, a connection pool sized for a load you don't have — the fix isn't a bigger box, it's fewer workers. Cap PHP-FPM children, trim your app server's worker count, bound your database connection pool. (A green programmer's instinct is "more workers = faster"; past the point where they fit in RAM, more workers = swapping = slower. Counterintuitive, but the math is unforgiving.)
Add swap as a shock absorber, not as RAM. A little swap gives the kernel somewhere to park genuinely-cold pages so they're not wasting RAM, and it turns a sudden spike into a slowdown instead of an OOM kill. It is not a substitute for enough RAM: leaning on swap for hot working memory is swap thrashing, and disk is thousands of times slower than RAM. Add a sensible amount (1–2× RAM on a small box, less on a big one), tune swappiness if the kernel swaps too eagerly, and treat it as a safety net, never a crutch.
Genuinely need more: add RAM. Sometimes the workload honestly outgrew the machine — the working set legitimately doesn't fit. When you've ruled out leaks, over-tuning, and runaway concurrency, then more RAM is the right answer. But it's the last rung precisely because "we need a bigger server" is the reflexive first guess, and four times in five the real fix was a config line above it. So before you provision a single byte, run the ps -eo pid,comm,rss,%mem --sort=-rss from above and read the top three lines: one process holding most of memory is a tuning or leak problem (a bigger box just delays the same crash on a pricier machine); RAM spread evenly across hundreds of workers is a concurrency problem (the fix is fewer workers). Adding RAM is right only when the top consumers are all legitimate and the working set genuinely doesn't fit — rarer than the invoice would suggest.
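
The cgroup ceiling from the first rung can be sketched as a systemd drop-in. The helper below only prints the file content; render_memory_dropin is a hypothetical name, the 6G and 8G figures are placeholders rather than recommendations, and the sketch assumes a systemd recent enough to support MemoryHigh= and MemoryMax= (the cgroup v2 settings).

```shell
# Print a systemd drop-in that sets a soft and a hard memory limit.
# $1 = MemoryHigh (soft throttle), $2 = MemoryMax (hard cap).
# (Hypothetical helper, for illustration only.)
render_memory_dropin() {
  printf '[Service]\nMemoryHigh=%s\nMemoryMax=%s\n' "$1" "$2"
}

# Usage, as root ("mysql" is the example service from this page):
#   mkdir -p /etc/systemd/system/mysql.service.d
#   render_memory_dropin 6G 8G > /etc/systemd/system/mysql.service.d/memory.conf
#   systemctl daemon-reload && systemctl restart mysql
```

Setting MemoryHigh below MemoryMax gives the service a zone where it is throttled before it is killed, which usually leaves time to notice.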

How to Avoid It

Genuine memory pressure is one of the more preventable server problems, because unlike a failing disk it grows at the speed of your own configuration and workload, not the speed of physics. In rough order of payoff:

Cap your big services with cgroup limits. The single highest-leverage habit: give every memory-hungry service a MemoryMax= in its systemd unit. A leak or a runaway then gets contained to its own limit — throttled or OOM-killed alone — instead of taking the whole box and your ability to log in down with it. Isolation turns "the server is dead" into "one service restarted."
Size your app's memory deliberately, with headroom. Add up what each service is configured to take — database buffer pools, JVM heaps, per-connection buffers times max connections, worker counts times per-worker footprint — and make sure the total leaves room for the OS, the page cache, and a margin. The most common cause of real pressure isn't a leak; it's the sum of optimistic defaults quietly exceeding the RAM you have.
Bound your concurrency. Connection pools, worker processes, and thread pools should have ceilings tied to how much RAM each unit costs, not set to "as many as possible." Unbounded concurrency under load is how a fine server suddenly isn't.
Keep a little swap as a shock absorber. Not as RAM, but as the safety net that converts a sudden spike into a survivable slowdown and gives genuinely-cold pages somewhere to rest. A box with zero swap meets the OOM killer the instant demand outruns RAM and there is no cache left to evict; a box with a sane sliver of swap usually rides the spike out.
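
The "add it all up" habit is plain arithmetic, and writing it down once makes the overshoot obvious. A minimal sketch with made-up numbers: memory_budget is a hypothetical helper, all figures are in MiB, and the per-connection and per-worker costs are values you would have to measure on your own system.

```shell
# Compare configured memory commitments against installed RAM (all in MiB).
# $1 total RAM, $2 fixed caches (buffer pool, heap),
# $3 max connections, $4 MiB per connection, $5 workers, $6 MiB per worker.
# Returns non-zero when nothing is left for the OS and page cache.
# (Hypothetical helper, for illustration only.)
memory_budget() {
  committed=$(( $2 + $3 * $4 + $5 * $6 ))
  headroom=$(( $1 - committed ))
  echo "committed=${committed}MiB headroom=${headroom}MiB"
  [ "$headroom" -gt 0 ]
}

# Usage: a 16 GiB box with a 12 GiB buffer pool, 200 connections at 16 MiB
# each, and 40 workers at 120 MiB each:
#   memory_budget 16384 12288 200 16 40 120
```

That example commits 20288 MiB on a 16384 MiB machine, a shortfall of 3904 MiB before the OS or page cache gets anything, even though no single setting looks unreasonable on its own.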

Note

The deepest version of all of this isn't a one-off command — it's watching the trend of MemAvailable. A single reading tells you today's headroom; it's the slope that distinguishes a stable server from a memory leak that hits the OOM killer next Tuesday at 3 a.m. A flat available line is a healthy server; one that ratchets down a little each day, never recovering, is a leak with a delivery date — and you only ever see that slope if something reads the number every day and remembers yesterday's.
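
Watching the slope needs nothing more than a timestamped log and a line fit. The sketch below assumes a log of "epoch-seconds MemAvailable-kB" pairs, one per line; the log path and the name avail_trend_kb_per_day are hypothetical, and the fit is an ordinary least-squares slope, which is one reasonable choice among several.

```shell
# Least-squares slope of MemAvailable over time, in kB per day.
# $1 = log file with lines of the form "<epoch seconds> <MemAvailable kB>".
# (Hypothetical helper, for illustration only.)
avail_trend_kb_per_day() {
  awk '
    NR == 1 { t0 = $1 }
    {
      x = ($1 - t0) / 86400   # days since the first sample
      y = $2
      n++; sx += x; sy += y; sxx += x * x; sxy += x * y
    }
    END {
      if (n < 2) { print "need at least two samples"; exit 1 }
      d = n * sxx - sx * sx
      if (d == 0) { print "samples share one timestamp"; exit 1 }
      printf "%.0f\n", (n * sxy - sx * sy) / d
    }
  ' "$1"
}

# To collect samples (for example from cron; the path is a placeholder):
#   printf '%s %s\n' "$(date +%s)" "$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)" >> /var/log/memavail.log
# Usage: avail_trend_kb_per_day /var/log/memavail.log
```

A result near zero is the flat line of a healthy server; a persistently negative figure, divided into the current MemAvailable, gives a rough number of days until the headroom runs out.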

How Linux Actually Manages Memory

Now the part you don't need mid-panic but that turns this whole page from a checklist into an instinct — and it's one of the more elegant pieces of engineering in the whole system. Once you can picture what the kernel is actually doing with your RAM, every weird number above stops being trivia and becomes something you can simply reason out.

Every Address Is a Lie (a Useful One)

Start with the thing every other tutorial leaves muddy: when a program asks for memory, it does not get a chunk of physical RAM. It gets virtual memory — a private, pretend address space all its own, where addresses are made-up labels the kernel and the CPU's memory-management unit translate, on the fly, into wherever the real bytes happen to live. Two programs can both believe they own address 0x400000, and the hardware quietly maps each to a different physical page. The program lives in a comfortable fiction; the kernel runs the real estate behind the curtain.

This is the foundation everything rests on, and it buys three superpowers at once. Isolation: one process literally cannot name another's memory, so a bug in one can't scribble on another. Flexibility: the kernel can move a program's pages around physical RAM, or out to swap, without the program ever noticing — its addresses don't change, only the hidden mapping underneath. And the page cache: because the kernel controls every mapping, it can lend spare physical pages to a file cache and reclaim them the instant a process needs them, all invisibly. That last one is the entire reason "used" looks scary and "available" is the truth — the cache lives in physical RAM the kernel hasn't promised to anyone, so it can take it back without asking.

Pages, the Cache, and the Eviction Trick

Memory is managed in fixed-size chunks called pages — 4 KB on a typical x86 box. Every page of physical RAM is in one of a few states the kernel tracks obsessively: free (idle, available now), anonymous (a program's actual working memory — its heap and stack, backed by nothing but RAM and swap), or file-backed cache (a copy of something on disk).

Here's the move that makes the whole page click. When you read a file, the kernel copies its pages into RAM and keeps them — that's the page cache. The next read of that file comes from RAM at memory speed instead of disk speed, which is the difference between a microsecond and ten milliseconds — four orders of magnitude. So the kernel caches aggressively: any RAM not currently spoken for by a running program, it fills with file cache, because empty RAM is wasted RAM. That's why a healthy server's free is always near the floor.

And the genius is in the giving-back. File-cache pages are clean — an exact copy of what's already on disk — so the kernel can drop them with zero cost: just forget the mapping, the disk copy is still there. When a process asks for memory and free is low, the kernel instantly evicts cache pages to satisfy it. No write, no delay, no fuss. That instant, costless eviction is exactly why MemAvailable counts the reclaimable cache as available — to a new program, droppable cache is as good as free RAM, because that's precisely what it's about to become. The "used" RAM was never a wall; it was a curtain.

(Dirty pages — cache that's been written to but not yet flushed to disk — are the one exception: those must be written out before the page can be reused, which is why a sudden burst of writes can briefly stall things while the kernel flushes. But that's a write-throughput story, not a memory-pressure one.)

Overcommit: Promising More Than You Own

Now the last piece, the one that ties memory pressure to its scarier sibling out-of-memory. Linux, by default, will hand out more virtual memory than it physically has, a policy called overcommit. A process can malloc gigabytes it never touches, many processes can do so at once, and the calls succeed even when the total promised exceeds RAM plus swap (the default heuristic refuses only a single request that is obviously too large to ever satisfy), because the kernel knows from long experience that programs routinely reserve far more than they ever touch. Memory only becomes real when a page is actually written to (a "page fault" that makes the kernel find a physical page right then). So the kernel cheerfully writes cheques it couldn't all cash at once, betting (correctly, almost always) that they won't all be cashed at the same moment.

It's the airline overbooking the flight, and it works for the same reason: most reservations are no-shows. But "almost always" is doing real work in that sentence. When too many processes do touch their promised pages at once, the kernel runs out of physical pages to back the cheques — there's no cache left to evict, swap is full or absent, and a write that must have a page simply can't be satisfied. The kernel can't tell a program "actually, that memory I promised isn't there." So it does the only thing left: it picks a victim and kills it. That's the OOM killer — the bill finally coming due on the overcommit bet — and it's why high memory usage and out-of-memory are two points on one continuum, not two unrelated problems. High usage with healthy available is the bet running fine. The OOM killer is the bet called in.
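
You can see which overcommit policy a box is running and how much it has promised. The sketch below relies on two kernel interfaces: /proc/sys/vm/overcommit_memory for the mode, and the CommitLimit and Committed_AS lines in /proc/meminfo. The function name describe_overcommit_mode and its wording are illustrative assumptions.

```shell
# Translate the vm.overcommit_memory value into plain words.
# $1 = the mode number read from /proc/sys/vm/overcommit_memory.
# (Hypothetical helper, for illustration only.)
describe_overcommit_mode() {
  case "$1" in
    0) echo "heuristic: refuse only obviously excessive requests (the default)" ;;
    1) echo "always: never refuse an allocation" ;;
    2) echo "strict: refuse once commitments reach CommitLimit" ;;
    *) echo "unknown mode: $1"; return 1 ;;
  esac
}

# Usage:
#   describe_overcommit_mode "$(cat /proc/sys/vm/overcommit_memory)"
#   grep -E '^(CommitLimit|Committed_AS):' /proc/meminfo
```

Committed_AS is the total the kernel has promised so far. In the default mode it can sit well above physical RAM on a perfectly healthy server, which is the overbooked flight in numbers.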

So: virtual memory lets every program live in a private fiction; the page cache fills the gaps with reclaimable speed; and overcommit lets the kernel promise more than it has, trusting the no-shows. Hold those three in your head and every number on this page reads itself — the scary used, the reassuring available, the swap that fills under real pressure, the killer that arrives when the bet goes bad. The kernel keeps an honest ledger through all of it; the only trick was ever knowing which column to read. And it's available.