dstat Command: Tutorial & Examples

One colored live dashboard for CPU, disk, network and memory — vmstat, iostat and ifstat in a single row.

What It Is

dstat is the tool you run when you want to watch everything your server is doing right now on one tidy, color-coded line that scrolls down the screen once a second. CPU, disk, network, memory, paging, interrupts — side by side, same instant, same units, so you can finally answer the question that older tools made annoyingly hard: when the disk got busy, what was the network doing? It's the unification of three classic commands — vmstat, iostat, and ifstat — into one readable stream, and it does it in color, which sounds like a gimmick until the first time a red number jumps out at you across a wall of green.

If you've never run a server, this is a wonderful place to build intuition, because dstat is safe (it only reads, never changes anything) and it shows you the whole machine breathing in real time. We'll explain every column and every plugin, teach you to read the stream the way someone who has stared at thousands of these does, and pick up how CPUs, disks and the network actually trade time along the way. One honest caveat up front, and we'll come back to it: the classic dstat is a Python 2 program that's been retired, so on a modern box you may type dstat and get command not found. Don't panic. There's a drop-in successor with the same name, and we'll sort that out before you need it.

Your First Look

Type the name and watch:

dstat

----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai stl| read  writ| recv  send|  in   out | int   csw
  4   1  95   0   0|  18k   42k|   0     0 |   0     0 | 412   980
  3   1  96   0   0|   0   128k| 1.2k  840B|   0     0 | 388   910
 38   6  55   1   0|   0    12M| 6.4k  3.1k|   0     0 |2.1k  4.8k
 41   7  50   2   0|4096B   28M|  12k  5.8k|   0     0 |2.4k  5.2k

There it is: the entire machine in one row, refreshed every second, a new line each tick. The header groups the columns into blocks (total-cpu-usage, dsk/total, net/total, paging, system), and on a real terminal each block is tinted its own color so your eye learns the layout in seconds. Watch the bottom two lines: the CPU climbed from near idle to around 40% busy while the disk write column jumped from kilobytes to 28 megabytes a second. That visual alignment, that "these two things happened together," is the entire reason dstat exists. Press Ctrl-C to stop it; it runs forever otherwise.

Run it with a number to change the interval, and a second number to cap how many lines:

dstat 5 12      # every 5 seconds, 12 samples, then quit

How I Read It

This is the part I wish someone had shown me. dstat throws a dozen numbers at you, but you don't read them left to right — you read them like a chord, all at once, looking for the one column that's loud.

First, my eye goes to the CPU block, and within it, two columns: wai and stl. Everyone watches usr (your programs) and sys (the kernel), and sure, those tell you the box is busy. But wai — I/O wait — is the one that tells me why it feels slow when the CPU looks free: the cores are stalled, parked, waiting for the disk to answer. A high wai and a busy dsk block on the same line is a storage bottleneck, full stop, and now I know to reach for iostat or iotop. And stl — steal — is the quiet tell on a cloud VM: it's CPU time the hypervisor took away to run somebody else's virtual machine. Steady stl above zero means a noisy neighbour is eating cycles you're paying for.

Second, I read across the row, not down the column. That's the whole superpower. A spike in dsk/writ lined up with a spike in net/recv says "we're receiving data and writing it to disk" — a backup landing, an upload, a database ingesting. A spike in dsk/read with idle network says "we're serving from disk" — a cold cache, a big query scanning a table. The correlation across the row is the diagnosis. Older tools showed you these in separate windows on separate clocks; dstat puts them on one timeline so the story tells itself.

Third, I watch the paging block like a hawk on a box I suspect is short on memory. The in/out columns there are not network — they're pages swapped between RAM and disk. The instant those climb off zero and stay there, the machine is out of memory and thrashing: every swap-in is a disk read standing between a program and the data it wanted, and performance falls off a cliff. Zero is healthy; a steady trickle is the early warning; a flood is the emergency.

That's the craft: wai/stl in the CPU block for the hidden stalls, reading across the row for cause-and-effect, paging for the memory cliff. Everything else is detail — and now we'll cover all of it.

The Default Columns, Explained

Run bare dstat and you get five blocks. Here's every column, so you never have to guess at a unit again. (dstat auto-scales and labels magnitudes — k, M, G, B for bytes — which is one of its nicest touches; a raw 4096B and a 28M sit in the same column and you always know which is which.)

total-cpu-usage is the percentage of CPU time, averaged over all cores so the five columns add up to roughly 100, in the same buckets top uses:

usr — running your programs. The "good" busy.
sys — running the kernel on their behalf (system calls, memory management).
idl — idle; genuinely spare capacity.
wai — I/O wait: stalled waiting on the disk. The column that explains "slow but not busy."
stl — steal: time the hypervisor gave to another VM. Gold on a cloud box.

dsk/total — aggregate disk throughput across every block device:

read / writ — bytes per second read from and written to disk.

net/total — aggregate network throughput across every interface:

recv / send — bytes per second in and out.

paging — the memory-pressure block, and the one people misread:

in / out — pages moved in from and out to swap, per second. Not the network. Sustained non-zero here is the memory alarm.

system — how often the kernel is being interrupted from its work:

int — hardware interrupts per second (a busy NIC or disk controller raises this).
csw — context switches per second: how often the scheduler swaps one process off a core for another. A sky-high csw with little real work done is a classic sign of too many things fighting over too few cores — lock contention, a thundering herd of threads — and it's a number you'd struggle to even find without dstat or vmstat.
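If you want to see where the paging and system numbers come from, here is a minimal Python sketch. It assumes the counters live in /proc/stat (the intr and ctxt lines) and /proc/vmstat (pswpin and pswpout), and that swap counts are in pages, so they are multiplied by an assumed 4096-byte page size to get bytes. The snapshots are hand-written samples, not real captures, and the helper names are made up for illustration; this is not dstat's own code.

```python
def read_counters(text, names):
    """Pick named counters out of 'name value ...' lines."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] in names:
            found[fields[0]] = int(fields[1])
    return found


def per_second(before, after, seconds, scale=1):
    """Rate of change for every counter present in the later snapshot."""
    return {name: (after[name] - before[name]) * scale / seconds
            for name in after}


PAGE_SIZE = 4096  # assumed; os.sysconf("SC_PAGE_SIZE") reports the real value

# Hand-written samples shaped like /proc/stat and /proc/vmstat, one second apart.
STAT_T0 = "ctxt 900000\nintr 500000 12 0 7\n"
STAT_T1 = "ctxt 900980\nintr 500412 12 0 9\n"
VMSTAT_T0 = "pswpin 100\npswpout 200\n"
VMSTAT_T1 = "pswpin 100\npswpout 456\n"

system = per_second(read_counters(STAT_T0, {"intr", "ctxt"}),
                    read_counters(STAT_T1, {"intr", "ctxt"}), 1.0)
paging = per_second(read_counters(VMSTAT_T0, {"pswpin", "pswpout"}),
                    read_counters(VMSTAT_T1, {"pswpin", "pswpout"}), 1.0,
                    scale=PAGE_SIZE)

print("int/s:", system["intr"], "csw/s:", system["ctxt"])
print("swap in B/s:", paging["pswpin"], "swap out B/s:", paging["pswpout"])
```

In this sample the swap-out counter moved by 256 pages in one second, which is why the paging column would show about a megabyte a second rather than a page count.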

Pro Tip

The single most useful flag is -d -D sda,sdb (or whatever your disks are called): it splits the dsk block into one column per drive instead of one lumped total. On a box with a fast SSD and a slow archive disk, the aggregate hides everything — per-device is where you see which disk is the bottleneck. The same trick works for network interfaces with -n -N eth0,eth1.
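To make the "aggregate hides everything" point concrete, here is a rough Python sketch of per-device throughput computed from two /proc/diskstats snapshots. It relies on the kernel's documented layout (device name in the third field, sectors read in the sixth, sectors written in the tenth, sectors always 512 bytes). The snapshots are hand-written samples and the function names are hypothetical; dstat's own implementation may differ.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text):
    """Map device name -> (sectors_read, sectors_written)."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or truncated lines
        counters[fields[2]] = (int(fields[5]), int(fields[9]))
    return counters


def per_device_rates(before, after, seconds, devices):
    """Bytes/second (read, written) for each named device between snapshots."""
    old, new = parse_diskstats(before), parse_diskstats(after)
    rates = {}
    for dev in devices:
        read_delta = new[dev][0] - old[dev][0]
        write_delta = new[dev][1] - old[dev][1]
        rates[dev] = (read_delta * SECTOR_BYTES / seconds,
                      write_delta * SECTOR_BYTES / seconds)
    return rates


# Hand-written sample snapshots taken one second apart (not real captures).
BEFORE = """\
   8       0 sda 1000 0 80000 0 500 0 40000 0 0 0 0
   8      16 sdb 2000 0 160000 0 100 0 8000 0 0 0 0
"""
AFTER = """\
   8       0 sda 1000 0 80000 0 900 0 97344 0 0 0 0
   8      16 sdb 2010 0 160080 0 100 0 8000 0 0 0 0
"""

rates = per_device_rates(BEFORE, AFTER, seconds=1.0, devices=["sda", "sdb"])
for dev, (rd, wr) in rates.items():
    print(f"{dev}: read {rd:.0f} B/s, write {wr:.0f} B/s")
```

Summed together, these two disks would just read as "about 28M of writes"; split out, it's obvious that sda is doing all the writing while sdb only serves a trickle of reads.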

The Plugins, Explained

Here's the design that makes dstat more than a prettier vmstat: every block of columns is a plugin, and you compose the dashboard you want by switching them on. The defaults above are just a sensible starting set. Each plugin has a short flag, and you stack them in any order:

-c CPU · -d disk · -n network · -g paging · -m memory (used/buffers/cache/free) · -s swap · -y system (int/csw) · -p process count (run/blocked/new) · -l load average · -r I/O requests (not bytes — operations per second, the IOPS view) · -i interrupts (per IRQ).

So dstat -cdngy is the explicit way to ask for the defaults, and dstat -tcmsdn adds a timestamp and memory and swap to the mix. Speaking of which — -t puts a wall-clock time on every line, which sounds trivial until you're scrolling back through a captured log trying to line a spike up with something in /var/log. Always add -t; future-you will thank present-you.

Then there are the named plugins, invoked with --name, and this is where dstat gets genuinely clever. A few worth knowing:

--top-cpu / --top-mem / --top-io — adds a column naming the single biggest process in that category, right there in the row. So instead of "the disk is busy," you get "the disk is busy and mysqld is the one hammering it" — the what and the who on the same line. This is the feature that turns dstat from a gauge into an accusation.
--top-bio — the biggest block-I/O process specifically.
--net-packets — packets/sec instead of bytes/sec, which separates "one big transfer" from "a flood of tiny packets" (often the signature of a SYN flood or a chatty, badly-batched app).
--tcp / --socket — TCP connection states and socket counts.

List every plugin your build has with dstat --list. The greatest-hits line that lives in a lot of admins' muscle memory is:

dstat -tcmsdn --top-cpu --top-mem

Time, CPU, memory, swap, disk, network — and the top CPU and memory process named on every line. That one command is a remarkably complete picture of a server's life, scrolling past once a second.

The Magic: It Writes Itself Into a Spreadsheet

Here's the trick almost nobody discovers, and it's genuinely lovely. dstat can stream its numbers straight to a CSV file at the same time as it's painting the colored display for you:

dstat -tcmdn --output /tmp/perf.csv 5

You watch the live dashboard on screen; meanwhile /tmp/perf.csv fills up with a clean, timestamped, comma-separated row every five seconds — headers and all. Leave it running through a nightly batch job or a load test, come back in the morning, open the CSV in a spreadsheet, and plot it. Suddenly the 3am disk spike you slept through is a line on a graph, lined up against the CPU and the network, and the cause is obvious. A monitoring tool you can read with your eyes and hand to a spreadsheet, from one command, with no separate logging agent to install — the first time you graph a real incident from a --output file, dstat earns a permanent spot in your toolbox.
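If you'd rather find the spike with a script than a spreadsheet, a small Python sketch like the one below can do it. It assumes the file starts with a few non-numeric preamble and header rows and that the row just before the first numeric row holds the column names; the exact layout can vary between dstat builds, so check your own file first. The sample text is hand-written for illustration, not captured output, and read_dstat_csv is a made-up helper name.

```python
import csv
import io


def _is_number(cell):
    try:
        float(cell)
        return True
    except ValueError:
        return False


def read_dstat_csv(text):
    """Return data rows as dicts keyed by the column-name row.

    Assumption: the column names sit on the last row before the first row
    that contains a number. Duplicate column names would overwrite each other.
    """
    rows = [r for r in csv.reader(io.StringIO(text)) if any(c.strip() for c in r)]
    for i, row in enumerate(rows):
        if i > 0 and any(_is_number(c) for c in row):
            header = rows[i - 1]
            return [dict(zip(header, r)) for r in rows[i:]]
    return []


# Hand-written sample in the assumed layout (hypothetical, not a real capture).
SAMPLE = '''"preamble line (hypothetical)"
"system","total cpu usage",,,,,"dsk/total",
"time","usr","sys","idl","wai","stl","read","writ"
"01-01 02:14:00",4,1,95,0,0,18432,43008
"01-01 02:14:05",41,7,50,2,0,4096,29360128
'''

records = read_dstat_csv(SAMPLE)
peak = max(records, key=lambda r: float(r["writ"]))
print("busiest write sample:", peak["time"], peak["writ"], "bytes/s")
```

For a real file you would pass open("/tmp/perf.csv").read() instead of SAMPLE.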

(That single capability also quietly explains its successor's whole existence, which we'll get to in History — the idea of one stream feeding both a human and a chart turned out to be the future of the entire tool.)

Reading It by Example

Real rows, and what each one is telling you. The columns are usr sys idl wai stl | read writ | recv send | in out | int csw.

41 7 50 2 0 | 0 28M | 12k 5.8k | 0 0 | ... — healthy heavy work. CPU busy on real user code, disk writing hard, a little network, no I/O wait to speak of, no paging. This is a box earning its rent — don't let the big numbers scare you; this is what "working" looks like.

3 2 15 80 0 | 4M 200k | 0 0 | 0 0 | ... — a disk bottleneck. The CPU is 80% in wai, barely doing user work, and the disk is reading. The machine feels maxed out but the cores are mostly idle — they're standing in line behind slow storage. A bigger CPU buys you nothing here; chase the disk with iostat and iotop.

10 5 60 0 25 | ... — the noisy neighbour. A quarter of your CPU is stl, stolen by the hypervisor for someone else's VM. Nothing on your box is wrong; you're sharing a physical host with a greedy tenant. Your cue to complain to the provider or move.

paging in 8.0M out 12M holding steady — out of memory, thrashing. Those aren't network bytes — they're swap traffic. Every program that touches memory is now waiting on the disk. Add RAM, find the leak, or size up; nothing else recovers a thrashing box.

int 45k csw 90k with modest usr — the box is busier herding threads than doing work. A storm of context switches and interrupts with little to show for it points at lock contention, a tight retry loop, or an interrupt-heavy device. Pair --top-cpu to name the culprit.
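The same reading habits can be written down as a few rules of thumb. The Python sketch below is only an illustration: the thresholds (20% wai, 10% stl) are assumptions picked to match the example rows above, not official limits, and a single row can't tell you whether paging is sustained, so treat the memory flag as a prompt to keep watching.

```python
WAIT_ALARM = 20   # percent of CPU time in wai; assumed threshold
STEAL_ALARM = 10  # percent of CPU time in stl; assumed threshold


def diagnose(row):
    """Return a list of likely problems for one dstat-style row (a dict)."""
    findings = []
    if row.get("wai", 0) >= WAIT_ALARM:
        findings.append("disk bottleneck: cores are stalled in I/O wait")
    if row.get("stl", 0) >= STEAL_ALARM:
        findings.append("noisy neighbour: the hypervisor is taking CPU time")
    if row.get("in", 0) > 0 or row.get("out", 0) > 0:
        findings.append("memory pressure: swap traffic is non-zero")
    return findings or ["no obvious bottleneck"]


# The example rows from this section, as dicts.
rows = [
    {"usr": 41, "sys": 7, "idl": 50, "wai": 2, "stl": 0},
    {"usr": 3, "sys": 2, "idl": 15, "wai": 80, "stl": 0},
    {"usr": 10, "sys": 5, "idl": 60, "wai": 0, "stl": 25},
    {"usr": 5, "sys": 5, "idl": 60, "wai": 30, "stl": 0,
     "in": 8_000_000, "out": 12_000_000},
]

for row in rows:
    print(row["usr"], row["wai"], row["stl"], "->", "; ".join(diagnose(row)))
```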

Cheat Sheet

The flags and moves worth keeping:

dstat — defaults (cpu/disk/net/paging/system), 1-second tick · dstat 5 20 — every 5s, 20 samples · Ctrl-C to stop
-t timestamp each line (always do this) · -f full per-device detail (don't aggregate) · --nocolor plain text for logs/pipes
Plugins: -c cpu · -d disk · -n net · -m mem · -s swap · -g paging · -y system · -l load · -p procs · -r I/O requests (IOPS) · -i interrupts
Per-device: -D sda,sdb named disks · -N eth0 named interfaces · -C 0,1,total named CPUs
Named: --top-cpu · --top-mem · --top-io · --net-packets · --tcp · --list to see them all
--output file.csv stream to CSV while you watch · --bits show network in bits not bytes
The classic one-liner: dstat -tcmsdn --top-cpu --top-mem

How You'll Actually Use It

In practice there are three modes. First, the quick glance: type dstat, watch ten seconds, see whether the load is CPU, disk, or network, and move on — exactly the way you'd use top, but with disk and network in the same view. Second, the targeted watch: you suspect a specific subsystem, so you compose a focused dashboard — dstat -tdn -D sda --top-io to stare at one disk and the process hitting it while you reproduce a slow query. Third, the capture: --output to a CSV during a load test or overnight, then graph it. That third mode is the one that separates "I think the disk was busy" from "here's the graph proving the disk pegged at 02:14, three seconds after the backup kicked off."

What you won't do is pipe dstat's screen output into awk for scripting. The man page describes the default display as designed for humans to read in real time, and the colors and auto-scaled units make it awkward to parse. When you need numbers for a script, use --output for CSV, read the raw /proc files, or reach for sar from sysstat. The live display is for looking.

Common Errors and Troubleshooting

dstat: command not found. The most common one on current distros. Classic dstat is Python 2 and has been dropped from many of them. Install the successor: on Debian/Ubuntu it's apt install pcp (it ships pcp-dstat, often symlinked as dstat); on Fedora/RHEL it's dnf install pcp-system-tools. The core columns and flags carry over, so your muscle memory still works, though some plugins may differ between builds. (See History for the why.)
Module dstat_top_io failed to load or similar plugin warnings. A named plugin needs data the kernel isn't exposing (older kernel, missing /proc entry, or a container with a restricted view). Drop that plugin; the rest still run.
All disk/net columns sit at zero inside a container. Containers often can't see host-level block-device and interrupt counters. Run dstat on the host for whole-machine I/O, not inside the container.
The numbers look wrong on the first line. The very first sample is a since-boot average, not a one-second rate, because dstat needs two readings to compute a delta. Ignore line one; trust line two onward. (This trips people up on vmstat too, for the same reason.)

History & Philosophy

dstat was written around 2004 by Dag Wieers, a Belgian sysadmin who was tired of having three terminals open running vmstat, iostat, and ifstat side by side and trying to eyeball whether their numbers lined up. He wrote a single Python program that read the same /proc and /sys files those tools read, put the results on one timeline, added color, and let you snap plugins together like Lego. It was an instant hit, precisely because it solved a real daily annoyance with taste.

And then it ran headlong into one of the quiet tragedies of the open-source 2010s: Python 2. dstat was written in it, leaned on it, and when Python 2 reached its long-announced end of life on January 1, 2020, a tool that hadn't seen much active development was suddenly built on a foundation the distros were ripping out. Rather than let a beloved utility simply rot, the Performance Co-Pilot project (PCP) adopted it: they rebuilt dstat as pcp-dstat, a faithful reimplementation that keeps the exact same command-line interface and colored output you know, but draws its data from PCP's richer collection engine underneath. So on a modern server, dstat the experience is alive and well — it just wears a new engine under the hood, the way a classic car gets a modern drivetrain while keeping the dashboard you love. The lesson tucked in here is one every admin eventually learns: a tool's interface can outlive its implementation by decades if people care enough to carry it across.

There's a deeper idea worth carrying away, too. Almost every rate dstat shows you is something it computes: it reads a counter from /proc now, reads it again a second later, and reports the difference. The kernel doesn't store "28 MB/s"; it stores a single ever-growing tally of sectors written since boot, and the rate is something dstat works out by subtracting. cat /proc/stat (or /proc/diskstats for the disk tallies) once, wait, cat it again, and you've done by hand exactly what dstat does in a loop. Once that lands, that the live, breathing dashboard is just arithmetic on plain-text files the kernel keeps updating, the machine stops being a black box. It's the same beautiful "everything is a file" idea that runs under top and every other tool here, and dstat is simply the one that does the subtraction for the whole machine at once.
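Here is that subtraction as a minimal Python sketch for the CPU block. It takes two snapshots of the aggregate cpu line from /proc/stat (fields: user, nice, system, idle, iowait, irq, softirq, steal) and turns the differences into percentages. The way the fields are grouped into usr and sys is an assumption made for this sketch and may not match dstat exactly; the two snapshots are hand-written samples.

```python
def parse_cpu_ticks(stat_text):
    """Return the tick counters on the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before_text, after_text):
    """Turn two /proc/stat snapshots into dstat-style CPU percentages."""
    old = parse_cpu_ticks(before_text)
    new = parse_cpu_ticks(after_text)
    # The counters only grow, so new - old is the time spent in each state
    # during the interval. Pad to 8 fields for kernels that report fewer.
    delta = [n - o for n, o in zip(new, old)]
    delta = (delta + [0] * 8)[:8]
    user, nice, system, idle, iowait, irq, softirq, steal = delta
    total = sum(delta) or 1  # avoid dividing by zero on identical snapshots
    # Assumed grouping for this sketch; dstat's own bucketing may differ.
    buckets = {
        "usr": user + nice,
        "sys": system + irq + softirq,
        "idl": idle,
        "wai": iowait,
        "stl": steal,
    }
    return {name: round(100.0 * ticks / total, 1)
            for name, ticks in buckets.items()}


# Hand-written sample snapshots one interval apart (not real captures).
BEFORE = "cpu  1000 0 500 8000 500 0 0 0 0 0\n"
AFTER = "cpu  1041 0 507 8050 502 0 0 0 0 0\n"

print(cpu_percentages(BEFORE, AFTER))
```

To try it on a live Linux box, read /proc/stat into a string, sleep for a second, read it again, and pass both strings in.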