Swap: Explanation & Insights
Disk pretending to be memory, so the kernel can stash cold pages and keep the hot ones close.
What It Is
Swap is space outside RAM — on a disk, or, as we'll see, cleverly compressed back inside RAM itself — that the kernel uses as an overflow for memory. When physical RAM gets tight, the kernel takes pages of memory that haven't been touched in a while, writes them out to swap, and hands the freed RAM to something that needs it more. Later, if a page that got evicted is touched again, the kernel faults it back in. That move out and back — paging — is the whole of swap in one sentence.
The key thing to grasp from the start is what kind of memory gets swapped. A process's memory comes in two broad flavours. Some of it is file-backed: it's a copy of something already on disk — program code mapped in from an executable, a library, a file the process opened — and the kernel doesn't need swap to reclaim it, because the original is right there on disk; it just drops the copy and re-reads it later. That reclaimable file-backed memory is the page cache. The other flavour is anonymous memory: the heap, the stack, anything a program allocated and scribbled into that has no file behind it. There's no original to fall back on, so if the kernel wants that RAM back, it has nowhere to put the contents — unless there's swap. Swap is the disk-backed home for anonymous pages. That's its job, and it explains every behaviour on this page.
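To see the two flavours on a live box, a minimal sketch along these lines reads them out of /proc/meminfo. The helper name mem_flavours is made up for illustration, and Cached is only a rough proxy for file-backed memory, because it also counts shared memory and tmpfs pages.

```shell
# Summarise anonymous vs file-backed memory from a meminfo-style file.
# mem_flavours is a hypothetical helper name; it defaults to /proc/meminfo.
mem_flavours() {
    awk '
        $1 == "AnonPages:" { anon = $2 }    # heap, stack: reclaimable only via swap
        $1 == "Cached:"    { file = $2 }    # page cache: can be dropped and re-read
        $1 == "SwapTotal:" { total = $2 }
        $1 == "SwapFree:"  { free = $2 }
        END {
            printf "anonymous_kib=%d file_backed_kib=%d swap_used_kib=%d\n",
                   anon, file, total - free
        }
    ' "${1:-/proc/meminfo}"
}
```

Calling mem_flavours with no argument reads the running system; passing a saved copy of /proc/meminfo lets you compare snapshots taken at different times.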
This is the orientation map for swap on a Linux server. It ties together how the kernel decides which pages to evict, the single most misunderstood tuning knob in Linux (swappiness, and what it actually controls), how to make swap with a real swap file versus a swap partition, the relationship between swap and the page cache, the real danger (it isn't swap — it's thrashing), where swap sits in the seconds before the OOM killer fires, and the genuinely surprising idea that "swapping" can happen with no disk involved at all. By the end, the tired old advice to "just disable swap" will look exactly as wrong as it is.
Why It Matters
Swap matters because it changes the shape of how a server behaves under memory pressure — and because almost everything people believe about it is half-true at best. Get it right and swap is a quiet shock-absorber that lets a box ride out a transient spike without killing anything. Get it wrong — or panic and rip it out — and you trade a graceful slowdown for an abrupt out of memory kill, which is a far worse outcome.
It also matters because swap is where one of the nastiest performance failures on Linux is born. A server leaning gently on swap is fine. A server thrashing — paging frantically in and out because its working set no longer fits in RAM — slows to a crawl while the CPU sits mostly idle, blocked in iowait, waiting on disk that is a thousand times slower than the memory it's standing in for. That pattern, swap thrashing, is the real thing to fear, and learning to tell healthy swap from thrashing swap is most of what this page is for.
How the Kernel Decides What to Evict
When the kernel needs to free memory, it doesn't pick at random and it doesn't simply dump the oldest thing. It runs page reclaim, and the mental model is two lists. The kernel keeps memory on a pair of LRU (least-recently-used) lists — one for anonymous pages, one for file-backed pages — each split into active and inactive halves. Pages that get referenced get promoted toward active; pages that sit untouched drift down to inactive. When reclaim runs, it harvests from the inactive end of both lists: cold file pages get dropped (they can be re-read from disk for free), and cold anonymous pages get written to swap (they can't be reconstructed any other way).
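The four list halves described above show up as counters in /proc/meminfo, so a small sketch can print them side by side. The helper name lru_lists is an assumption for illustration, and the numbers are a snapshot in KiB rather than a view of reclaim in progress.

```shell
# Print the sizes of the active/inactive LRU lists for anon and file pages.
# lru_lists is a hypothetical helper name; it defaults to /proc/meminfo.
lru_lists() {
    awk '
        $1 == "Active(anon):"   { anon_active = $2 }
        $1 == "Inactive(anon):" { anon_inactive = $2 }   # swap-out candidates
        $1 == "Active(file):"   { file_active = $2 }
        $1 == "Inactive(file):" { file_inactive = $2 }   # drop candidates
        END {
            printf "anon active=%d inactive=%d\n", anon_active, anon_inactive
            printf "file active=%d inactive=%d\n", file_active, file_inactive
        }
    ' "${1:-/proc/meminfo}"
}
```

A large inactive file list means reclaim has plenty of cheap pages to drop before it needs to lean on swap.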
So swap is not "what happens when RAM is full." It's one of two parallel reclaim paths the kernel is always balancing: shrink the page cache, or push anonymous pages to swap. The interesting question is never "is it swapping?" but "which path is it favouring, and is the working set genuinely too big?" — and that balance is exactly what the next section's knob tunes.
Swappiness: The Knob Everyone Misreads
Linux exposes one famous dial for this balance, and it is almost universally misunderstood. You read it here:
cat /proc/sys/vm/swappiness
60
Sixty is the historical default on most distributions. Nearly everyone reads swappiness as "how much swap to use" or "how eager the kernel is to swap" — a throttle on a single behaviour. That's not what it is. swappiness is a ratio. It biases the reclaim balance described above: how aggressively the kernel reclaims anonymous memory (which means swapping it out) versus reclaiming the page cache (dropping file-backed pages). Higher swappiness tilts reclaim toward swapping anonymous pages and sparing the cache; lower swappiness tilts it toward shrinking the cache first and leaving anonymous pages in RAM longer.
That reframing changes what the knob is for. Lowering it to, say, 10 doesn't mean "swap less, generally" — it means "when you must reclaim, prefer to throw away cached file pages and only swap anonymous pages as a last resort." On a database box where the page cache is precious that can be exactly wrong; on a desktop where interactivity matters and you'd rather drop cache than have your editor's pages on disk, low swappiness makes sense. The value is a hint, not a hard cap, and the kernel still swaps under genuine pressure no matter how low you set it.
# temporary, until reboot
echo 10 > /proc/sys/vm/swappiness
# permanent (read at boot; run `sysctl --system` as root to apply it now)
echo 'vm.swappiness=10' > /etc/sysctl.d/99-swappiness.conf
Note
The range isn't 0–100 anymore. Since kernel 5.8 the ceiling is 200, which lets you tell the kernel to favour swapping anonymous memory even more heavily than reclaiming cache. That is useful in the zram world below, where "swapping" is cheap because it never touches a disk. Setting 0 does not disable swap; it merely tells the kernel to avoid anonymous reclaim until it's truly out of cheaper options.
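The ratio is easier to feel with numbers. As a simplified reading of the kernel's documented model on the 0 to 200 scale, anonymous memory gets a weight equal to swappiness and the page cache gets 200 minus swappiness, so 100 means the two are treated as equally costly. The sketch below only does that arithmetic; swappiness_weights is a hypothetical helper, and the real kernel also factors in how reclaim has been going recently, so read the result as a bias, not a prediction.

```shell
# Rough reclaim weights implied by a swappiness value (0-200 scale).
# swappiness_weights is a hypothetical helper; with no argument it reads
# the live value from /proc/sys/vm/swappiness.
swappiness_weights() {
    s=${1:-$(cat /proc/sys/vm/swappiness)}
    if [ "$s" -lt 0 ] || [ "$s" -gt 200 ]; then
        echo "swappiness must be between 0 and 200" >&2
        return 1
    fi
    anon=$s                  # weight given to reclaiming anonymous pages
    file=$((200 - s))        # weight given to reclaiming page cache
    echo "swappiness=$s anon_weight=$anon file_weight=$file anon_share=$((anon * 100 / 200))%"
}
```

With the default of 60 this prints an anonymous share of 30%, which is why a stock system still leans toward dropping cache first; at 10 the share falls to 5%.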
Swap File vs Swap Partition
Swap can live in two forms, and they perform identically on modern kernels — the choice is about flexibility, not speed.
A swap partition is a whole partition (or LVM volume) given over to swap. It's the traditional approach, set up at install time, and it has the faint advantage of being a contiguous region the kernel owns outright. The downside is that resizing it means repartitioning, which is a chore.
A swap file is just a regular file on an existing filesystem, marked as swap. It's the modern default on most distros precisely because you can create, grow, or delete it in seconds without touching the partition table. Here's the whole dance, with the real commands:
# create a 2 GiB file of zeroes (count = size in MiB)
dd if=/dev/zero of=/swapfile bs=1M count=2048
# lock down the permissions: root-only, otherwise swapon warns that they are insecure
chmod 600 /swapfile
# write the swap signature and metadata into the file
mkswap /swapfile
# activate it
swapon /swapfile
After that, swapon with the summary flag shows what's live:
swapon -s
Filename Type Size Used Priority
/swapfile file 2097148 0 -2
To make it survive a reboot, add a line to /etc/fstab:
/swapfile none swap sw 0 0
The same swapon/swapoff pair turns any swap area on and off live. swapoff /swapfile will refuse only if there isn't enough RAM (plus other swap) to hold everything currently paged out — because turning swap off means faulting all of it back into memory first.
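Before running swapoff on a busy box it's worth checking that the paged-out data will actually fit. The sketch below compares swap in use against MemAvailable from /proc/meminfo. The helper name can_swapoff is made up for illustration, MemAvailable is itself a kernel estimate, and the check is deliberately conservative: it ignores any other swap area that could absorb the pages.

```shell
# Exit 0 if everything currently in swap should fit into available RAM.
# can_swapoff is a hypothetical helper; it defaults to /proc/meminfo.
can_swapoff() {
    awk '
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "SwapTotal:"    { total = $2 }
        $1 == "SwapFree:"     { free = $2 }
        END {
            used = total - free
            printf "swap_used_kib=%d mem_available_kib=%d\n", used, avail
            exit ((used < avail) ? 0 : 1)
        }
    ' "${1:-/proc/meminfo}"
}
```

One way to use it is as a guard: can_swapoff && swapoff /swapfile, so the swapoff only runs when the numbers say it has room.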
Warning
On a btrfs filesystem a swap file needs special handling: it must sit on a nodatacow file with no compression and no snapshots, or swapon rejects it. The old dd recipe also won't do on btrfs; use btrfs filesystem mkswapfile instead. On ext4 and XFS the dd + mkswap recipe above is exactly right.
Swap and the Page Cache
Swap and the page cache are best understood as the two halves of one balance, not two unrelated features. Both are the kernel reclaiming memory; they differ only in which kind. The cache is clean overflow you can always recreate — drop a cached file page and the original is still on disk, free to re-read. Swap is dirty overflow you must preserve — an anonymous page has no backing file, so the only way to free its RAM is to write its contents somewhere first, and that somewhere is swap.
This is why a healthy server's memory looks the way it does. The kernel keeps RAM nearly full on purpose — mostly with cache, because cached files make everything faster and the cache costs nothing to give back. Swap, meanwhile, may show a little used and that's perfectly fine: it usually means the kernel spotted some genuinely cold anonymous pages early, parked them on disk, and reclaimed the RAM for cache and hot pages. That's the system working as designed. The mistake is to read any swap usage as a problem; the number that matters is the rate of paging, not the static amount sitting in swap.
The Myth: "Swap Is Bad, Disable It"
Here's the belief this whole page exists to dismantle. It floats around forums and Stack Overflow answers and the occasional confident colleague: swap is slow, swap is where performance goes to suffer, so turn it off and force everything to stay in fast RAM. It sounds logical. It's wrong, and understanding why it's wrong is the moment swap finally clicks.
Idle swap costs you nothing. A few hundred megabytes parked in swap are not slowing anything down — they're cold pages that haven't been touched in hours and won't be touched again soon, sitting harmlessly on disk while their former RAM does useful work as cache or holds something hot. Swap lets the kernel make a good trade: evict the genuinely-cold so the genuinely-hot gets the fast memory. Take swap away and you don't make those cold pages hot — you just forbid the kernel from reclaiming them at all, so they squat in RAM doing nothing while your cache shrinks and the box gets slower. And you remove the cushion: with no swap, the first real spike that exhausts RAM goes straight to the OOM killer, no warning, no graceful degradation.
The disease people are actually remembering when they say "swap is slow" is thrashing — and thrashing is not swap, it's a symptom of not enough RAM. When the working set (the pages a program genuinely needs right now) is larger than physical memory, the kernel is forced to evict pages it'll need again a moment later, fault them back, evict others to make room, fault those back — a frantic shuttle between RAM and disk where the CPU spends its life in iowait and almost no real work gets done. That misery is real. But the cause is the missing RAM, not the swap that's merely exposing it. Rip out swap and a thrashing box doesn't get faster; it gets OOM-killed instead. Same disease, uglier death.
Pro Tip
Don't watch how much swap is used; watch the paging rate: pages swapped in and out per second (si/so in vmstat 1, or the per-second deltas the kernel exposes). Steady non-zero swap with a paging rate near zero is a healthy server that parked some cold pages. A high, sustained paging rate is thrashing, and that's your real alarm, regardless of how full or empty the swap area looks.
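If vmstat isn't to hand, the same signal can be derived from the pswpin and pswpout counters in /proc/vmstat, which count pages swapped in and out since boot. The sketch below samples them twice and divides by the interval. The function names are hypothetical, the division truncates to whole pages per second, and it assumes a kernel built with swap support so that both counters exist.

```shell
# Print one counter (e.g. pswpin) from a vmstat-style file.
vmstat_field() {
    awk -v key="$1" '$1 == key { print $2 }' "${2:-/proc/vmstat}"
}

# Pages swapped in/out per second between two counter readings.
# usage: swap_rate IN_BEFORE OUT_BEFORE IN_AFTER OUT_AFTER SECONDS
swap_rate() {
    echo "swap_in_per_s=$(( ($3 - $1) / $5 )) swap_out_per_s=$(( ($4 - $2) / $5 ))"
}

# Sample the live counters over an interval (default 5 seconds).
paging_rate() {
    secs=${1:-5}
    in0=$(vmstat_field pswpin); out0=$(vmstat_field pswpout)
    sleep "$secs"
    in1=$(vmstat_field pswpin); out1=$(vmstat_field pswpout)
    swap_rate "$in0" "$out0" "$in1" "$out1" "$secs"
}
```

Running paging_rate 10 on a healthy server should print zeros or close to it; numbers that stay high across repeated runs are the thrashing signature described above.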
zram: Swap Inside RAM
Now the part that genuinely surprises people the first time they meet it — and the reason "swap = slow disk" is doubly outdated. Swapping doesn't have to touch a disk at all. Modern Linux can give you swap that lives compressed, inside RAM itself.
zram creates a block device backed not by storage but by a slice of RAM, with transparent compression on every page that lands in it. You point swap at that device, and now when the kernel "swaps out" a cold anonymous page, it isn't written to disk — it's compressed and kept in memory, typically shrinking to a third or less of its size. You spend a little CPU on compression and buy back a lot of effective memory: cold pages that were wasting full-size RAM now occupy a fraction of it, and the rest is freed for hot pages and cache. There's no disk in the loop, so the "swap" is orders of magnitude faster than the disk-backed kind — a paging operation that used to mean a millisecond seek now means a microsecond decompress.
This isn't exotic. It's the default on recent Fedora and easy to enable on most other distributions, it's how ChromeOS and Android keep many apps alive in modest RAM, and it's the quiet reason a cheap phone juggles more apps than its raw memory should allow. A close cousin, zswap, sits one layer over: it compresses pages into a RAM pool as a write-back cache in front of a real disk swap, only spilling the genuinely coldest pages through to disk when the compressed pool fills. zram is "swap that is RAM"; zswap is "a compressed buffer that softens the trip to disk swap." Both turn the old slur on its head: with compressed in-RAM swap, swapping can be the fast path, and trading a sliver of CPU to fit more in memory is simply a good deal.
# is compressed in-RAM swap already running? (priority 100 = zram, ahead of disk)
swapon -s
Filename Type Size Used Priority
/dev/zram0 partition 8388604 124800 100
/swapfile file 2097148 0 -2
That higher priority number is the kernel preferring the fast in-RAM device first and only falling through to the disk swap file when zram is exhausted — exactly the layering you want.
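To check how much the compression is really buying you, the zram device publishes its counters in /sys/block/zram0/mm_stat, where the first three fields are the original data size, the compressed size, and the total RAM used, all in bytes. The sketch below turns those into a ratio. The helper name zram_ratio is made up for illustration, and the default path assumes the device is zram0.

```shell
# Report the compression ratio of a zram device from its mm_stat file.
# zram_ratio is a hypothetical helper; the default path assumes zram0.
zram_ratio() {
    awk '{
        orig = $1; compr = $2; used = $3      # all three are in bytes
        if (compr == 0) { print "zram is empty"; exit }
        printf "original_bytes=%d compressed_bytes=%d ram_used_bytes=%d ratio=%.2f\n",
               orig, compr, used, orig / compr
    }' "${1:-/sys/block/zram0/mm_stat}"
}
```

A ratio around 3 matches the "third or less" rule of thumb; a ratio close to 1 suggests the swapped pages are already compressed or encrypted data that zram can't shrink.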
A Little Swap Is Wise; Sustained Swapping Is a Red Flag
The honest server stance, stated plainly as best practice: put a little swap on every server, and treat sustained swapping as an alarm, not a feature.
A modest swap area — disk-backed, zram, or both — is cheap insurance. It gives the kernel somewhere to park truly cold pages, lets it reclaim that RAM for cache, and provides a cushion so a brief memory spike degrades gracefully instead of detonating into an OOM kill. That's the wise default, and it's why the major distros ship swap on by default.
But swap is a shock-absorber, not a capacity plan. If a server is paging continuously (not a blip during a nightly job, but hour after hour of steady swap-in/swap-out), that is the box telling you something is wrong. Either it's genuinely under-provisioned for its workload, or something is leaking memory and slowly squeezing everything else onto disk. The fix in that case is never "add more swap"; that just makes the slow death slower. The fix is to find the cause: right-size the RAM, cap or repair the leaking process, or move load off the box. Swap buys you time to do that calmly; it is not a substitute for doing it.
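Finding the cause usually starts with asking which processes own the pages sitting in swap. Each process reports that as VmSwap in /proc/PID/status, so a sketch like the one below can rank them. The helper name swap_by_process is hypothetical, the output is KiB followed by the process name, and reading other users' processes generally needs root.

```shell
# List the processes with the most memory in swap, largest first.
# usage: swap_by_process [PROC_DIR] [HOW_MANY]   (defaults: /proc, 10)
swap_by_process() {
    proc=${1:-/proc}
    for status in "$proc"/[0-9]*/status; do
        [ -r "$status" ] || continue
        awk '
            $1 == "Name:"   { name = $2 }
            $1 == "VmSwap:" { swap = $2 }     # KiB; absent for kernel threads
            END { if (swap > 0) printf "%d %s\n", swap, name }
        ' "$status" 2>/dev/null
    done | sort -rn | head -n "${2:-10}"
}
```

A process whose figure keeps growing between runs is a good first suspect for a leak; one that holds steady is more likely just cold.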
Just Before the OOM Killer
Swap is the kernel's second-to-last line of defence against running out of memory, and knowing where it sits in that sequence tells you how much warning you'll get. As pressure builds, the kernel first shrinks the page cache — free and painless. Then it starts swapping anonymous pages — slower, but still graceful. Only when there's nothing left to evict and nowhere left to swap does it reach for the last resort: the OOM killer, which picks a process by a crude "how much would killing this free, how expendable is it" score and sends it SIGKILL. No cleanup, no save, just gone — and it tends to pick your biggest process, which is very often the database you most needed alive.
So swap activity is your early-warning system. A box that has started paging steadily is a box walking toward the cliff edge; the rising paging rate is the warning the OOM kill won't give you. Watch the paging rate climb and you can intervene — add RAM, kill the leak, shed load — while you still have a running server to intervene on. Ignore it, run swap dry, and the next thing you read about it is a line in dmesg explaining which daemon got shot.
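You can also ask the kernel in advance who it would shoot. Every process exposes its current badness score in /proc/PID/oom_score, and the sketch below lists the highest scorers. The helper name oom_candidates is an assumption for illustration, and the output is score, PID, and command name.

```shell
# Show the processes the OOM killer would currently favour, worst first.
# usage: oom_candidates [PROC_DIR] [HOW_MANY]   (defaults: /proc, 5)
oom_candidates() {
    proc=${1:-/proc}
    for dir in "$proc"/[0-9]*; do
        [ -r "$dir/oom_score" ] && [ -r "$dir/comm" ] || continue
        # A process can exit between the glob and the read; skip it if so.
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        name=$(cat "$dir/comm" 2>/dev/null) || continue
        echo "$score ${dir##*/} $name"
    done | sort -rn | head -n "${2:-5}"
}
```

If the top entry is the service you can least afford to lose, its oom_score_adj file in the same directory is where that preference can be adjusted.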
History and Philosophy
Swap is old enough that its rules of thumb have curdled into folklore. The famous one — "make swap twice your RAM" — was sound advice in an era it no longer describes. It dates from the 1990s, when RAM cost a small fortune and a server might have 64 megabytes if it was lucky; doubling that for swap gave the machine room to run more than its meagre memory could otherwise hold, and the disk-to-RAM size ratio made the maths reasonable. Apply the same rule to a box with 256 GB of RAM and you're asking for half a terabyte of swap — a number that helps nobody and, if you ever actually filled it, would mean the machine had been thrashing into oblivion for hours. The rule outlived its arithmetic. Today the sane guidance is a few gigabytes for most servers, sized to hibernation needs on laptops, and the old 2× line is best filed next to "defragment your hard drive every week" — true once, quaint now.
There's a deeper lineage worth a moment, because it's where the word page on this page comes from. Swap is the surviving, simplified descendant of demand paging and virtual memory, ideas pioneered on the Atlas computer at the University of Manchester around 1962. Atlas was the first machine to let programs address more memory than physically existed, automatically shuttling pages between fast core memory and a slow magnetic drum, and presenting the programmer with the lovely fiction of one large, uniform memory. Every modern OS inherited that trick; swap is the piece of it you can still see and touch. When you swapon a file in 2026, you're operating the great-grandchild of a Manchester drum, and the kernel is still telling each process the same kind, benevolent lie Atlas told sixty years ago: you have all the memory you need, so don't worry about where it actually lives.
See Also
- RAM — the fast memory swap is the overflow for
- page cache — the other half of reclaim: clean file pages you can drop for free
- disk — where disk-backed swap lives, and why it's a thousand times slower than RAM
- iowait — the CPU-stalled-on-storage signal that thrashing produces
- process — what owns the anonymous memory that ends up in swap
- kernel — the bookkeeper running page reclaim and choosing what to evict
- out of memory — the OOM killer, the last resort after swap runs dry
- swap thrashing — the slow-to-a-crawl pattern, diagnosed
- swappiness too high — when the reclaim balance tips the wrong way
- free: see used and free swap at a glance
- swapon: list, activate, and prioritise swap areas
- mkswap: write the swap signature onto a file or partition