top Command: Tutorial & Examples

The live pulse of your server — every process, every core, refreshed every few seconds.

What It Is

top is the first command you reach for when a Linux server feels wrong — slow, unresponsive, and you don't yet know why. Type three letters and you get a live, self-updating x-ray of the machine: what's running, what's eating the CPU, how much memory is left, and whether the box is calm or on fire. It's installed on every Linux server on earth, needs no setup, and has looked basically the same for decades.

If you've never touched a server before, this is the perfect place to start. top is safe — it only looks, it never changes anything — and almost everything about how a Linux machine spends its time is on this one screen. We'll explain every field and teach you to read it the way a twenty-year veteran does, and along the way you'll pick up how CPUs, processes and the kernel actually work. By the end it'll be both your first lesson and the reference you come back to for the flag you can never remember.

Your First Look

Run it (and remember: q quits — that one stumps everybody the first time):

top

top - 14:21:07 up 9 days,  2:14,  2 users,  load average: 0.98, 1.02, 0.74
Tasks: 142 total,   2 running, 140 sleeping,   0 stopped,   0 zombie
%Cpu(s):  4.2 us,  1.1 sy,  0.0 ni, 94.5 id,  0.1 wa,  0.0 hi,  0.1 si,  0.0 st
MiB Mem :  15998.4 total,    412.6 free,   4821.0 used,  10764.8 buff/cache
MiB Swap:   2048.0 total,   2048.0 free,      0.0 used.  10812.1 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 2323 mysql     20   0   19.8g   6.0g  11196 S  99.7  37.7 101:56.24 mysqld
 1576 root      20   0    1.9g   260m  35772 S   3.0   1.6   0:12.45 java
 2746 www-data  20   0    568m    87m  20740 S   0.7   0.5   0:05.27 apache2

It comes in two halves: the five summary lines (the whole machine at a glance) and the process table below (one row per running program, sorted with the biggest CPU user on top, refreshed every few seconds). It looks like a lot, but once you know the reading order you only ever check two or three numbers. Let's learn the mental model first, then explain every single field so this doubles as your reference.

How I Read It

This is the part I wish someone had just shown me the first time. When a server's in trouble, here's the exact sequence in my head — about three seconds total.

First I look at one number — the load average — and then ask how many cores the box has. That one comparison gives me the whole picture. Count cores with nproc. Load average is "how many processes wanted to run," averaged over 1, 5 and 15 minutes, so left-to-right it reads like a trend. Load 0.98 on an 8-core box means the machine is nearly asleep — so whatever's slow isn't capacity, and I look elsewhere. Load 30 on 4 cores means it's drowning.
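The load-versus-cores comparison is easy to script. Here is a minimal sketch, assuming a Linux box with /proc/loadavg and nproc available; the function name load_per_core is made up for illustration and is not part of top.

```shell
# Hypothetical helper: divide a load average by a core count.
load_per_core() {
  awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Live reading: 1-minute load from /proc/loadavg, logical cores from nproc.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
  load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
fi

load_per_core 0.98 8   # the nearly asleep box from the text: prints 0.12
load_per_core 30 4     # the drowning one: prints 7.50
```

A result well under 1 means spare capacity; well over 1 means work is queuing for a core.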

Second I glance at %Cpu(s) for two things: is there wa (CPU stalled waiting for the disk), and is the busy time us (your programs) or sy (the kernel)? Third, only now, I read the process list — already sorted by CPU, so the culprit is usually line one.

And the move that took me years: if one process sits near 100% but load is low and most cores look idle, I press 1 to split the CPU line per core. Nine times out of ten one core is pinned at 100% and the rest are asleep — the program is single-threaded and can only use one core, no matter how big the server is. People buy a bigger machine to fix this; it just adds idle cores. (Catching exactly that — "one core maxed, the rest idle" — is the sort of thing CleverUptime flags for you, so you don't have to know to press 1.)
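If you want the "press 1" view without opening top, the per-core counters are in /proc/stat. What follows is a rough sketch, not how top itself is implemented: it takes two snapshots one second apart and prints how busy each core was in between. The helper name per_core_busy and the one-second gap are assumptions for illustration.

```shell
# Hypothetical helper: given two saved copies of /proc/stat, print the
# busy share of each core between them. Fields 2..9 of a cpuN line are
# user nice system idle iowait irq softirq steal, in clock ticks.
per_core_busy() {
  awk '
    /^cpu[0-9]+ / {
      total = 0
      for (i = 2; i <= 9; i++) total += $i
      idle = $5 + $6                      # idle and iowait both count as not busy
      if (FNR == NR) { t0[$1] = total; i0[$1] = idle; next }   # first snapshot
      dt = total - t0[$1]; di = idle - i0[$1]
      if (dt > 0) printf "%s %.0f%%\n", $1, 100 * (dt - di) / dt
    }' "$1" "$2"
}

# Live use: snapshot, wait a second, snapshot again.
if [ -r /proc/stat ]; then
  a=$(mktemp); b=$(mktemp)
  cat /proc/stat > "$a"; sleep 1; cat /proc/stat > "$b"
  per_core_busy "$a" "$b"
  rm -f "$a" "$b"
fi
```

One line near 100% with the rest near 0% is the single-threaded pattern described above.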

That's the craft: load vs cores, wa for disk, the sorted list for the culprit, 1 for the single-threaded trap. Now let's explain everything behind those glances.

Pro Tip

Press M to sort by memory instead of CPU — the second most useful key in top after q, and the fast way to find a leaking process. P flips you back to CPU sort.

The Summary Lines, Explained

Line 1 — the heartbeat. Current time; up 9 days since the last reboot (a server up for years is both a brag and a red flag — it means missing kernel security patches); users logged in; and the load average. One thing worth knowing about that core count from nproc: it counts logical CPUs, which includes hyperthreads — Intel's and AMD's trick of presenting one physical core as two, so it can keep crunching one thread while the other waits on memory. A hyperthread isn't a whole core, though (figure ~1.3× throughput, not 2×), so load sitting right at nproc is a little more loaded than it looks.

Line 2 — Tasks. Total processes, split into running (on a CPU or queued), sleeping (waiting for something — the normal state for nearly everything), stopped (suspended), and zombie (finished but not yet cleaned up by their parent). A zombie or two is harmless; hundreds means a buggy parent.
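When the zombie count does climb, the useful question is which parent is failing to clean up. A small sketch, assuming a ps that accepts -eo with the pid, ppid, stat and comm columns (procps does); the helper name zombies_by_parent is hypothetical.

```shell
# Hypothetical helper: read "PID PPID STAT COMMAND" rows on stdin and
# count zombies (STAT starting with Z) per parent PID, biggest first.
zombies_by_parent() {
  awk '$3 ~ /^Z/ { n[$2]++ } END { for (p in n) print p, n[p] }' | sort -k2,2nr
}

# Live use. The header row is harmless: its STAT column reads "STAT".
if command -v ps >/dev/null 2>&1; then
  ps -eo pid,ppid,stat,comm | zombies_by_parent
fi
```

Each output line is a parent PID followed by how many zombies it owns; no output means no zombies.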

Line 3 — %Cpu(s), where the CPU's time goes. And here's a genuinely surprising idea worth pausing on: there are several different things a CPU can be doing, and "crunching numbers" is only one of them. Yes, a CPU can also be busy waiting. That sounds like a contradiction, but think about it: if a program asks for data from a disk, that data takes ages in CPU terms, and the program can't proceed until it arrives. Strictly speaking, the kernel books this as idle time during which some task had disk I/O outstanding (the core is free to run other work if there is any), but the stall is real, measurable time, and it has its own bucket. Once you internalize that "busy" isn't one thing, this whole line snaps into focus:

  • us (user) — running your programs. The "good" busy.
  • sy (system) — running the kernel on their behalf (system calls, memory management, etc.).
  • ni (nice) — running deliberately low-priority (niced) work.
  • id (idle) — genuinely doing nothing; spare capacity.
  • wa (I/O wait) — that "busy waiting" above: stalled on the disk. High wa means your bottleneck is storage, not the CPU — chase it with iostat.
  • hi / si — servicing hardware / software interrupts (usually tiny; high si often means heavy network traffic).
  • st (steal) — the one almost nobody knows, and it's gold on a cloud VM. It's time the hypervisor took your virtual CPU away to run someone else's. Steady high st means your VM is fine but the physical host is overcommitted — a noisy neighbour is eating your cycles, and it's your cue to complain or move.
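The eight buckets above come from the first line of /proc/stat, which holds running tick counters. Percentages only make sense as the difference between two readings. The sketch below shows that arithmetic under simple assumptions (it ignores the guest fields, and cpu_pct is a made-up name), so treat it as an illustration rather than top's exact method.

```shell
# Hypothetical helper: take two "cpu ..." lines from /proc/stat and print
# each bucket as a share of the ticks that elapsed between them.
# Field order in the file: user nice system idle iowait irq softirq steal.
cpu_pct() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    NR == 1 { for (i = 2; i <= 9; i++) a[i] = $i; next }
    {
      total = 0
      for (i = 2; i <= 9; i++) { d[i] = $i - a[i]; total += d[i] }
      if (total == 0) total = 1          # avoid dividing by zero
      printf "us=%.1f sy=%.1f ni=%.1f id=%.1f wa=%.1f hi=%.1f si=%.1f st=%.1f\n",
        100*d[2]/total, 100*d[4]/total, 100*d[3]/total, 100*d[5]/total,
        100*d[6]/total, 100*d[7]/total, 100*d[8]/total, 100*d[9]/total
    }'
}

# Live use: two readings one second apart.
if [ -r /proc/stat ]; then
  first=$(head -n 1 /proc/stat)
  sleep 1
  cpu_pct "$first" "$(head -n 1 /proc/stat)"
fi
```

The output uses the same two-letter labels as the %Cpu(s) line, so you can compare it directly against what top shows.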

Line 4 — MiB Mem: total, free, used, and buff/cache (RAM Linux is borrowing as disk cache). Line 5 — MiB Swap (total/free/used) plus avail Mem, the amount actually available to programs. Read avail, not free — see Gotchas for the trap that panics every beginner. (When the box runs low and starts using swap, performance falls off a cliff, because swap lives on the slow disk.)
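To check "avail, not free" from a script, read MemAvailable out of /proc/meminfo. A minimal sketch, assuming a kernel recent enough to publish MemAvailable (3.14 or later); the helper name mem_available_pct is hypothetical.

```shell
# Hypothetical helper: read /proc/meminfo-style text on stdin and report
# MemAvailable as a share of MemTotal (both are in kB in the real file).
mem_available_pct() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%.1f%% available\n", 100 * avail / total }'
}

if [ -r /proc/meminfo ]; then
  mem_available_pct < /proc/meminfo
fi
```

A box with a tiny free figure will often still report a comfortable share here, because cache counts as available.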

The Columns, Explained

Every column in the process table, so you never have to wonder:

  • PID — the process ID, the number you feed to kill or renice.
  • USER — who owns it (a process running as root that shouldn't be is a security smell).
  • PR / NI — kernel priority and the "nice" value (-20 greedy … 19 yielding); rt means real-time. This is how the kernel's scheduler decides who runs next when more threads want a core than there are cores.
  • VIRT — total memory mapped, often huge and mostly unused — not real usage.
  • RES — resident memory: the actual RAM in use. This is the memory number that matters.
  • SHR — the part of RES shared with others (libraries and such).
  • S — state, and the letters tell a story: R running, S sleeping (the normal wait), D uninterruptible sleep (stuck in a disk/kernel call — you can't even kill a D process, and a pile of them is what high wa looks like), T stopped, Z zombie, I idle kernel thread.
  • %CPU — share of one core, so it can exceed 100% (350 = 3.5 cores). This is also where you see whether a program knows how to use more than one core at all.
  • %MEM — RES as a share of physical RAM.
  • TIME+ — total CPU time used since launch, to 1/100s.
  • COMMAND — the program; press c for the full command line with arguments.

Reading It by Example

Now that you know the fields, here's the fast way to build instinct — real readings and what they mean. Assume a 4-core box unless noted, and remember the three numbers are 1, 5, 15 minutes.

load average: 5.00, 3.00, 1.00 — getting worse, fast. Fifteen minutes ago this box was calm (1.0); now it's 5. Something recent is escalating — stop and dig in before it becomes the next one.

load average: 100.00, 80.00, 60.00 — something is badly wrong. This isn't always a CPU on fire, and the most memorable time I saw it the CPU was nearly idle. I'd mounted a home directory over NFS through a WireGuard tunnel, and the tunnel dropped. Every process that so much as touched that mount froze in D state (uninterruptible sleep), waiting for a network reply that would never come — and you can't even kill a D process. They piled up, load sailed past 100, and top showed almost no CPU use at all. The lesson burned in: load counts everyone waiting, not just everyone running — a stuck disk or a dead network mount can drive it to the moon while the cores sit idle. (A runaway loop or fork storm gets you to load 100 too — but those at least show the CPU busy.)

load average: 1.00, 3.00, 5.00 — the storm already passed. The mirror image: it was bad, now clearing. If someone says "the site was slow ten minutes ago," this says it's recovering on its own.

load average: 3.90, 4.00, 3.80 on 4 cores — healthy and fully used. Load around the core count, steady, is a well-sized server doing real work and keeping up. Don't let busy-looking numbers scare you — this is what "good" looks like.

load average: 0.10, 0.15, 0.12 — you're paying too much. Nearly idle, and has been. Safe — but if it's always like this you're renting a server several sizes too big. Drop to a smaller instance and pocket the difference; for a startup that's real money every month.
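The trend reading in these examples boils down to comparing the 1-minute number with the 15-minute one. Here is a toy sketch of that comparison; the name load_trend and the 20% band are arbitrary choices for illustration, not a standard threshold.

```shell
# Hypothetical helper: arguments are the 1, 5 and 15 minute averages.
# Only the first and last are compared; the middle one is ignored here.
load_trend() {
  awk -v l1="$1" -v l15="$3" 'BEGIN {
    if (l1 > l15 * 1.2)      print "rising"
    else if (l1 < l15 * 0.8) print "falling"
    else                     print "steady"
  }'
}

load_trend 5.00 3.00 1.00   # prints rising
load_trend 1.00 3.00 5.00   # prints falling
load_trend 3.90 4.00 3.80   # prints steady
```

It says nothing about whether the level itself is healthy; for that you still divide by the core count.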

A few whole-screen patterns that each point to a specific mistake:

  • One process at 100%, everything else idle, low load → a single-threaded app. The fix is in the code (or scaling out), not a bigger box.
  • free near zero, Swap used climbing, high wa, a database with a huge RES → out of RAM and swapping. The classic "I put MySQL on a tiny VM and never tuned it." Tune the memory or size up.
  • Tasks: 4000 total, load in the hundreds, a program spawning copies of itself → a runaway/fork bug.
  • st parked at 20+ → not your fault: the hypervisor host is overcommitted. Complain or move.

Cheat Sheet

Interactive keys (the real way to use it):

  • q quit · h help · M sort by memory (the key I press most) · P sort by CPU (default) · T by total time
  • 1 one line per core · H show threads · c full command line
  • k kill · r renice · e/E cycle memory units · u filter by user · W save your layout

Flags worth knowing:

  • top -b -n 1 — one plain-text snapshot, then exit (the version for scripts/logs)
  • top -o %MEM sort by a column · top -p PID watch specific processes · top -u USER one user · top -H threads from the start

How You'll Actually Use It

Honestly, most of the time you just type top, read it, and quit — one glance is the point. The few things beyond that: open it and press M to find a memory hog; scope it to one service with pgrep:

top -p "$(pgrep -d',' nginx)"

or grab a snapshot for a script:

top -b -n 1 | head

If you ever build a big pipe out of top, that's the sign you actually want ps: top is for looking, ps is for scripting.
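For comparison, this is roughly what the ps route looks like. It is a sketch that assumes the procps version of ps, since --sort is a procps extension; the wrapper name top_cpu is made up. One caveat: the %cpu that ps reports is averaged over the life of the process, so it will not match top's per-refresh figure.

```shell
# Hypothetical helper: the N biggest CPU users as a one-shot table
# (N defaults to 5; the extra line is the header row).
top_cpu() {
  ps -eo pid,user,%cpu,%mem,rss,comm --sort=-%cpu | head -n "$(( ${1:-5} + 1 ))"
}

top_cpu 3
```

The output has fixed columns and no screen control codes, which is what makes it friendlier to pipe into other tools.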

Gotchas

  • VIRT is not real memory — read RES. VIRT counts everything mapped (huge for Java and databases).
  • "Linux ate my RAM!" A tiny free looks alarming, but read avail Mem. Linux uses spare RAM as disk cache and returns it instantly — empty RAM is wasted RAM. Real pressure shows as swap climbing.
  • %CPU over 100% is normal — it's per core. Press 1.
  • High load with idle CPU isn't a contradiction — load also counts D-state processes waiting on the disk.
  • A D-state process won't die — not even kill -9 reaches it until its I/O returns.
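To see which processes are stuck in D state and what they are waiting on, ps can print the kernel wait channel next to the state. A small sketch, assuming procps ps (the wchan:32 width syntax is procps-specific); the filter name only_d_state is hypothetical.

```shell
# Hypothetical helper: keep the header row plus rows whose STAT column
# (second field) starts with D.
only_d_state() {
  awk 'NR == 1 || $2 ~ /^D/'
}

if command -v ps >/dev/null 2>&1; then
  ps -eo pid,stat,wchan:32,comm | only_d_state
fi
```

On a healthy box this usually prints just the header, or a row or two that vanish on the next run. Rows that stay put across runs are the ones holding the load average up.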

History & Philosophy

The load average is older than Linux — it comes from late-1960s mainframes, because operators needed to answer "is this machine keeping up?" in one glance at a teletype. That ruthless simplicity is why three little numbers are still the first thing you read fifty years later.

And here's the secret that makes the whole system click: top isn't doing anything magic. Everything it shows already lives in plain text files under /proc:

cat /proc/loadavg

There's your load average, in a file. cat /proc/meminfo for memory, /proc/stat for the CPU counters. The kernel publishes the entire live state of the machine as files you can read, and top just rereads them on every refresh (every 3 seconds by default). The first time that lands — that a running Linux system is just files you can look at — the box stops being a black box and becomes something you're curious to explore. That's the "everything is a file" idea, one of the most beautiful in computing.
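The /proc/loadavg line has five fields, and only the first three show up in top's header. A short sketch that labels them; the helper name explain_loadavg and the sample line are made up for illustration.

```shell
# Hypothetical helper: label the five fields of a /proc/loadavg line.
# Field 4 is "runnable/total" scheduling entities (threads, not just
# processes); field 5 is the PID most recently handed out.
explain_loadavg() {
  awk '{
    split($4, tasks, "/")
    printf "load 1m=%s 5m=%s 15m=%s runnable=%s total=%s last_pid=%s\n",
           $1, $2, $3, tasks[1], tasks[2], $5
  }'
}

if [ -r /proc/loadavg ]; then
  explain_loadavg < /proc/loadavg
fi
echo "0.98 1.02 0.74 2/142 31337" | explain_loadavg
```

Because the total counts threads, it is normally larger than the "Tasks: total" figure top prints for the same machine.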

Column Reference

A quick lookup for the %Cpu(s) summary line — the eight buckets the CPU's time falls into:

Field  Meaning                                             Watch for
us     User-space CPU — running your programs              The "good" busy
sy     Kernel CPU — system calls on your behalf            High = syscall-heavy work
ni     Niced user CPU — deliberately low-priority work     Background batch jobs
id     Idle — spare capacity                               Low = the machine is full
wa     I/O wait — CPU stalled on the disk                  High = disk-bound; run iostat
hi     Hardware IRQ servicing                              Usually tiny
si     Software IRQ servicing                              High = heavy network traffic
st     Stolen by the hypervisor                            High = noisy neighbour on a VM

See Also

  • htop — the friendlier, colorful cousin
  • ps — a one-shot snapshot; the tool for scripting
  • iostat — when wa is high and you suspect the disk
  • free — the clearest view of "Linux ate my RAM"
  • load average — what those three numbers really count
  • /proc — where all of this actually lives
  • high load — a full diagnose-and-fix walkthrough
