iostat Command: Tutorial & Examples

The command top sends you to when wa is high: it names the disk and measures the queue.

What It Is

iostat answers exactly one question, and it's the question top makes you ask: is my disk the bottleneck? You've looked at top, you've seen the wa column (I/O wait) sitting at 30% or 50%, and you know the CPU is stalled waiting for storage — but you don't know which device, how slow, or by how much. That's the handoff. You type iostat -x 2 3, and within four seconds you have a name (nvme0n1, sda, dm-0), a queue depth, a per-request latency, and a verdict: this disk is the suspect, with 12ms average wait per request and a queue of 8 requests stacked up.

If you've never run a server, this is the second performance tool to learn after top. top sees the symptom (the CPU is bored, waiting on something); iostat sees the cause (this block device is the bottleneck). Together they cover most "why is this server slow?" questions you'll ever face. We'll explain every field, teach the diagnostic loop the way an experienced admin runs it, and along the way pick up how Linux talks to storage — block devices, request queues, merges, device mapper, and the one beautiful old metric (%util) that lies on modern SSDs.

Your First Look

The canonical invocation — extended stats, every 2 seconds, three samples:

iostat -x 2 3
Linux 6.12.86-amd64 (xps)   05/27/2026   _x86_64_   (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.41    0.42    1.41    0.18    0.00   94.58

Device   r/s   rkB/s  rrqm/s  %rrqm r_await rareq-sz   w/s   wkB/s  wrqm/s  %wrqm w_await wareq-sz   f/s f_await  aqu-sz  %util
dm-0    23.8   898.8    0.00   0.00    0.25    37.70  285.7  2498.9    0.00   0.00    4.09     8.75  0.00    0.00    1.17   1.19
nvme0n1 21.5   898.9    2.32   9.74    0.18    41.75  272.1  2499.1   13.62   4.77    1.69     9.18  1.22    1.62    0.47   0.97

Two reports stacked on top of each other: a CPU summary (one row, the same %user/%system/%iowait/%idle you know from top, averaged across all cores) and a device table (one row per block device: IOPS, throughput, merged requests, latency, queue depth, utilization). The -x is "extended" — without it you get a short toy view (tps plus read/write totals); with it you get every number that matters.

The 2 3 is "sample every 2 seconds, three times." Together they're the version of iostat you'll type for the rest of your life — short enough to read, long enough to spot trends, with one critical catch about the very first sample we'll get to in a moment.

How I Read It

Three seconds, the same order every time, with the trick that took me years.

First — and this is the move: I ignore the first sample. The first table iostat prints is the average since the machine booted — meaningless. A server up for 30 days will smooth a 5-minute disk storm into invisibility. The second and third samples are the live numbers, each one covering the 2-second window since the last report. I scroll past the first table and read the second.

Second, I scan %util and aqu-sz together. %util is the percentage of wall time the device had at least one request in flight; aqu-sz is how many requests were stacked up on average. On a calm box both are near zero. When something hurts: %util will be high and aqu-sz will be greater than 1 — meaning requests are queuing behind each other instead of being served instantly. That's saturation. If %util is high but aqu-sz is well under 1, the device is doing real work but not yet backed up.

Third, I check the latency columns — r_await and w_await. These are what your processes actually feel: average milliseconds per read / write request, queue time included. A healthy NVMe reads under 1ms; a healthy SSD under 5ms; a healthy spinning HDD under 20ms. Numbers above those by 5× or 10× mean the device is hurting — and they translate directly to the wa percentage you saw in top.
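The second and third steps can be written down as two small shell functions. This is only a sketch: the function names and verdict labels are made up for illustration, the aqu-sz cut-offs (1 and 10) and the read-latency budgets (1, 5, 20 ms) are the ones used on this page, and the 80% threshold for "high" %util is an assumption.

```shell
# classify_device UTIL AQU_SZ
# Queue depth decides first; %util only matters once the queue is short.
# The 80% cut-off for "high" %util is an assumption made for this sketch.
classify_device() {
    awk -v util="$1" -v aqu="$2" 'BEGIN {
        if (aqu > 10)        print "buried"
        else if (aqu > 1)    print "queuing"
        else if (util >= 80) print "busy, not backed up"
        else                 print "calm"
    }'
}

# read_latency_verdict KIND R_AWAIT_MS   (KIND is nvme, ssd or hdd)
# Budgets are the healthy read latencies quoted above; 5x over means trouble.
read_latency_verdict() {
    awk -v kind="$1" -v ms="$2" 'BEGIN {
        budget["nvme"] = 1; budget["ssd"] = 5; budget["hdd"] = 20
        if (!(kind in budget)) { print "unknown device type"; exit 1 }
        b = budget[kind]
        if (ms > 5 * b)  print "hurting"
        else if (ms > b) print "elevated"
        else             print "healthy"
    }'
}
```

For example, classify_device 95 0.4 reports a device that is busy but not backed up, and read_latency_verdict nvme 15 reports one that is hurting. Treat the labels as a reading aid, not as alert thresholds.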

The handoff to the next tool: iostat names the device. It does not tell you which process is hammering it. For that, you reach for iotop (per-process I/O — the top of the disk world). And if the disk is full instead of slow, df tells you how full and du tells you which directory ate the space. Five tools, one diagnostic chain: top → iostat → iotop, with df + du for the "full" axis. Once you know that loop, "the server is slow" stops being scary.

Pro Tip

Always run iostat -x 2 3 and ignore the first sample. It's the average since boot — meaningless for anything live. The second and third reports cover the 2-second windows you actually care about. This single discipline separates people who use iostat correctly from people who get confused by it.
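If the output has already been captured (pasted into a ticket, saved to a file), the since-boot report can be stripped after the fact. A minimal sketch, assuming the CPU report is enabled so that every report starts with an "avg-cpu:" line; the function name is hypothetical.

```shell
# drop_first_report: read saved `iostat -x INTERVAL COUNT` output on stdin
# and print everything from the second report onward.
# Assumption: each report begins with "avg-cpu:" (i.e. iostat ran without -d).
drop_first_report() {
    awk '/^avg-cpu:/ { reports++ } reports >= 2'
}
```

Typical use would be iostat -x 2 3 | drop_first_report, or drop_first_report < saved-output.txt. On a live system, iostat -y gives the same result without a filter.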

The Columns Explained

Every column in iostat -x, in the order they print, with what each one is really telling you.

  • Device — the block device name as it appears in /dev. sda (first SATA disk), nvme0n1 (first NVMe namespace), dm-0/dm-1 (device mapper layers — usually LVM volumes or dm-crypt containers stacked on top of a physical disk; the same I/O often shows up twice, once on dm-X and once on the underlying nvme0n1). Cross-check with lsblk to see the stacking.
  • r/s, w/s — reads and writes completed per second after merging. These are IOPS — the I/O-operations metric storage vendors quote. As rough orders of magnitude: a spinning HDD does 100–200 random IOPS, a SATA SSD tens of thousands, an NVMe 100k–1M.
  • rkB/s, wkB/s — kilobytes per second of throughput. Multiply IOPS × average request size (rareq-sz/wareq-sz) and you get this. Sequential large requests show high kB/s with modest IOPS; tiny random requests show the opposite.
  • rrqm/s, wrqm/s — requests merged per second. Here's something quietly elegant: when your program asks for sectors 100, 101, 102, the kernel's I/O scheduler notices they're adjacent and merges them into one bigger request before sending it to the device. That coalescing is gold for HDDs (one seek instead of three) and still helps on SSDs. High rrqm/s means your workload is sequential-friendly; near zero means it's random and the kernel can't help you.
  • %rrqm, %wrqm — what percentage of submitted requests got merged. Useful for understanding how much the merging helps.
  • r_await, w_await — the latency columns, and the most honest numbers on the screen. Average milliseconds per request including time in the queue and time on the device. This is how long your processes actually wait. If r_await is 15ms on an NVMe, something is wrong (queue saturated, or a sibling VM is stealing the controller). Old versions of iostat printed a single await; current versions split it so you see read and write latency separately — writes are often slower because of write barriers and flush operations.
  • rareq-sz, wareq-sz — average request size in kilobytes. Tells you whether the workload is "many small" (a database doing 8K page reads, request size near 8) or "few large" (a backup streaming a file, request size in the hundreds).
  • d/s, dkB/s, drqm/s, %drqm, d_await, dareq-sz — the same six columns for discard operations (TRIM on SSDs; how the filesystem tells the drive "these blocks are free, you can erase them"). Mostly zero unless fstrim is running.
  • f/s, f_await — flush requests per second and their average latency. A flush is when the filesystem says "really write this to physical media, don't buffer it." Databases issue flushes on every commit; high f_await is exactly what hurts transactional workloads. f_await is often the truest "is my SSD struggling?" number — far better than %util.
  • aqu-sz — average queue size: how many requests were waiting or in flight at any moment, averaged over the interval. This is the saturation signal that actually works on modern hardware. Sustained aqu-sz > 1 means requests are queuing behind each other; aqu-sz > 10 means the device is buried. (Old iostat spelled this avgqu-sz.)
  • %util — percentage of wall time the device had at least one request in flight. This is the beautiful old metric that lies on modern hardware — see the Warning below.
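Several of these columns can be cross-checked against each other. The sketch below redoes the arithmetic for the nvme0n1 row from the sample output earlier on this page: throughput as IOPS times request size, %rrqm as merged over merged plus completed, and queue depth estimated from rates times latencies (Little's law). The function name is hypothetical, and the queue-depth line is an approximation; iostat derives aqu-sz from a kernel time counter, not from this product.

```shell
# derive_columns: recompute derived columns from the nvme0n1 sample row.
derive_columns() {
    awk 'BEGIN {
        rs = 21.5;  rareq = 41.75; rrqm = 2.32; r_await = 0.18   # read side
        ws = 272.1; w_await = 1.69                               # write side
        fs = 1.22;  f_await = 1.62                               # flushes
        # throughput = requests per second x average request size (kB)
        printf "rkB/s ~ %.1f\n", rs * rareq
        # share of read requests that were merged before reaching the device
        printf "%%rrqm ~ %.2f\n", 100 * rrqm / (rrqm + rs)
        # requests in flight ~ arrival rate x time each one spends (ms -> s)
        printf "aqu-sz ~ %.2f\n", (rs * r_await + ws * w_await + fs * f_await) / 1000
    }'
}
```

Calling derive_columns gives about 897.6 kB/s, 9.74 %rrqm and 0.47 for the queue, against 898.9, 9.74 and 0.47 in the sample. The small throughput gap is rounding in the printed columns.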

Warning

%util was designed in the era when a disk could service one request at a time — a single spinning platter, one head, one seek. On that hardware, "100% utilized" meant truly saturated. On modern NVMe and even SATA SSDs, one device can serve dozens of requests in parallel (NVMe specs up to 64K queues × 64K commands). %util can show 100% while the drive is barely working, because as soon as one request lands a new one starts. For modern storage, trust aqu-sz (average queue depth) and w_await/f_await (real latency) instead. %util is still useful on spinning rust and as a "is anything happening?" indicator, but treat it as a hint, not a verdict.
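A toy calculation makes the difference concrete. The sketch below takes a list of request intervals (start and end in milliseconds, sorted by start time) and computes both definitions: %util as the share of the window with at least one request in flight, and aqu-sz as the average number in flight. The function name and input format are invented for this illustration; this is not how iostat obtains its numbers.

```shell
# util_and_queue WINDOW_MS: read "start_ms end_ms" pairs (sorted by start) on stdin.
util_and_queue() {
    awk -v win="$1" '
        {
            s = $1; e = $2
            total += e - s                       # summed service time of all requests
            if (NR == 1 || s > cur_end) {        # gap: close the current busy stretch
                busy += cur_end - cur_start
                cur_start = s; cur_end = e
            } else if (e > cur_end) {            # overlap: extend the busy stretch
                cur_end = e
            }
        }
        END {
            busy += cur_end - cur_start
            printf "%%util=%.1f aqu-sz=%.2f\n", 100 * busy / win, total / win
        }'
}
```

Four back-to-back 250 ms requests in a 1000 ms window and eight fully overlapping 1000 ms requests both come out at 100% utilization, but the first has an average queue of 1 and the second of 8. Only the queue figure tells the two apart.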

Reading It by Example

The patterns that turn into instinct. All from the second or third sample of iostat -x 2 3 — never the first.

%util 1, aqu-sz 0.01, r_await 0.2, w_await 1.5 on an NVMe. Idle. The disk is not your problem. If top said the server was slow, look elsewhere — CPU, network, application-level locking.

%util 95, aqu-sz 0.4, w_await 1.2 on an NVMe. Busy but not hurting. The drive almost always has something in flight (high %util), but the queue isn't backing up (aqu-sz < 1) and processes get their writes in 1.2ms. This is exactly the modern-hardware "lying %util" pattern — perfectly healthy.

%util 99, aqu-sz 12.5, r_await 45, w_await 80. Genuinely saturated. Twelve requests stacked up, 80ms write latency — your processes are waiting. This matches a high %iowait in the CPU row and high wa in top. Next steps: iotop to find the culprit process; consider whether the workload should be throttled, batched, or moved to faster storage.

%iowait 30% in the CPU row, every device at %util 5. The wait isn't this server's local disks. Could be an NFS mount (the kernel still counts NFS waits as %iowait, but per-device stats only cover local block devices — same trap as the load-100, idle-CPU story under top); could be a stuck remote mount that doesn't even show as a row.
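A quick way to test the network-mount theory is to list what is mounted over the network. A minimal sketch, assuming the usual /proc/mounts layout (device, mount point, filesystem type, options, dump, pass); the function name and the set of filesystem types matched are choices made for this example.

```shell
# list_network_mounts [MOUNTS_FILE]
# Print mount point and type for NFS/CIFS entries; defaults to /proc/mounts.
list_network_mounts() {
    awk '$3 ~ /^(nfs|nfs4|cifs|smb3)$/ { print $2, $3 }' "${1:-/proc/mounts}"
}
```

If this prints anything while every local device looks idle, the network mounts are the next thing to check.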

dm-0 and nvme0n1 both show ~250 w/s, with dm-0's w_await 4.1 and nvme0n1's w_await 1.7. Normal: dm-0 is a device mapper layer (LVM or dm-crypt) sitting on top of nvme0n1, so the same I/O appears twice. The extra w_await on dm-0 is the cost of going through the mapper layer (encryption, snapshot bookkeeping). Read the physical row to judge hardware; read the dm row to judge what your filesystem sees.

rrqm/s 0, wrqm/s 0 everywhere, on a workload you'd expect to be sequential. The kernel isn't merging — likely a database with O_DIRECT or fsync-per-write that defeats merging. Either confirm that's intended, or check whether the application is sending tiny scattered writes when it could batch.

f/s climbing, f_await 30ms+. Flush latency on a transactional workload. This is the classic "Postgres commits got slow" or "MySQL binlog is dragging" symptom. The drive's write cache is full or the underlying media is overcommitted; the fix is faster storage, a controller with battery-backed cache, or batching commits at the application level.

Cheat Sheet

The invocations worth memorizing:

  • iostat -x 2 3 — the canonical one. Extended stats, two-second windows, three samples. Ignore the first.
  • iostat -xz 2 — extended, only devices with activity (-z skips zero-rows), runs forever. Best for staring at a live system.
  • iostat -xt 2 3 — add a timestamp before each report.
  • iostat -xm 2 3 — show throughput in megabytes/sec instead of kilobytes (handy on fast NVMe).
  • iostat -d -x 2 — device stats only, skip the CPU summary.
  • iostat -p sda — include partition-level stats for sda (sda1, sda2, …) — useful for seeing which partition is hot.
  • iostat -p ALL 2 3 — every device and every partition.
  • iostat -y 1 5 — skip the since-boot first report (-y). Equivalent to manually ignoring it, but saves you scrolling.
  • iostat -o JSON -x 2 3 — machine-readable output for scripts and dashboards.
  • iostat -c 2 3 — only the CPU report (same numbers as top in a one-shot form).
  • iostat --human -x 2 3 — auto-scaled units in the throughput columns.

How You'll Actually Use It

In real life, iostat lives in two moments. The 3am disk question: top shows wa 40, you ssh over, type iostat -x 2 3, scroll past the first table, read aqu-sz and w_await on each row, name the suspect device. Total time: ten seconds. Then iotop -oP to find the process doing the I/O. The pre-deployment sanity check: before turning on debug logging or shipping a write-heavy feature, you sit with iostat -xz 2 open in a tmux pane and watch the queue and latency under load. If aqu-sz climbs into the double digits in staging, it will do worse in production.

What iostat is not for: scripting alerts. The output format has shifted between sysstat versions (the await → r_await/w_await split happened around v11; aqu-sz was avgqu-sz before that), so awk-based parsers break across distros. For long-term archives use sar (same sysstat package, designed for historical storage); for JSON dashboards use iostat -o JSON. And if you find yourself wanting per-process I/O — which iostat deliberately doesn't show — that's iotop or pidstat -d.
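If a quick text parse is unavoidable, looking columns up by header name instead of by position at least survives reordering, and makes a missing column visible instead of silently reading the wrong number. A hedged sketch; the function name is hypothetical, and it assumes the header row starts with "Device" (or "Device:" on older versions) and that a blank line ends each table.

```shell
# iostat_cols COL [COL...]: read `iostat -x` output on stdin and print
# the device name followed by the requested columns, or n/a if absent.
iostat_cols() {
    awk -v want="$*" '
        BEGIN { n = split(want, names, " ") }
        $1 == "Device" || $1 == "Device:" {      # header row: map name -> position
            split("", idx)                       # clear the map from any earlier report
            for (i = 1; i <= NF; i++) idx[$i] = i
            in_table = 1; next
        }
        NF == 0 { in_table = 0; next }           # blank line ends the table
        in_table {
            line = $1
            for (k = 1; k <= n; k++)
                line = line " " ((names[k] in idx) ? $(idx[names[k]]) : "n/a")
            print line
        }'
}
```

For instance, iostat -dxy 2 1 | iostat_cols aqu-sz w_await prints one line per device with just those two values. This is still a stopgap; the JSON output is the sturdier interface.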

Gotchas

  • The first sample is the average since boot. Always. Ignore it, or pass -y to suppress it. Reading the first sample is the #1 iostat mistake.
  • %util lies on modern storage. Reread the Warning above — on NVMe and parallel SATA SSDs, 100% utilization does not mean saturated. Use aqu-sz and *_await instead.
  • iostat may not be installed. It ships in the sysstat package, which isn't on minimal Debian/Ubuntu/Alpine installs. If you get iostat: command not found, run apt install sysstat (Debian/Ubuntu), dnf install sysstat (RHEL/Fedora), or apk add sysstat (Alpine). While you're there, enable the package's sadc collector if you want sar to record history.
  • Device mapper layers double-count. I/O through LVM, dm-crypt, or mdadm RAID appears on the dm-*/md* rows and on the physical device. Don't sum them. Use lsblk to see the stack.
  • %iowait is a CPU state, not a disk state. It only counts when the CPU was idle and waiting on a disk request. A busy CPU with simultaneous heavy I/O can show low %iowait and still be I/O-bound — the cores are running other work while the I/O runs in parallel. The device-level aqu-sz and await columns are the real indicator.
  • NFS and other network filesystems don't appear as devices. iostat reads /proc/diskstats, which only tracks local block devices. For NFS, use nfsiostat (shipped with nfs-utils on current distros; older sysstat releases bundled their own version).
  • Transactions per second (tps) is requests, not bytes. A tps of 1000 could be 1000 × 4KB random reads (4 MB/s) or 1000 × 1MB sequential reads (1 GB/s). Always cross-read with kB/s.

History & Philosophy

iostat ships in sysstat, Sébastien Godard's package, alongside sar, pidstat, mpstat, tapestat, and cifsiostat. Sébastien has maintained that suite since 1999, by himself, for over twenty-five years — and the design is delightfully consistent: every tool reads counters from /proc, takes a sample interval, and prints the deltas. Once you know one, you know the family. The original iostat goes further back: it was already a standard tool on BSD and SunOS in the 1980s, long before the Linux version landed in sysstat. The six-character name is a Unix classic: io + stat.

And the payoff that lands on every tool in this corner of the docs: iostat isn't doing anything you can't do yourself. Run this:

cat /proc/diskstats
259       0 nvme0n1 1234567 89012 98765432 234567 8901234 567890 ...

Eleven integers per device on older kernels, seventeen on current ones (4.18 added four discard counters, 5.5 added two flush counters), nearly all cumulative since boot: reads completed, reads merged, sectors read, time spent on reads, then the same four for writes, I/Os currently in flight (the one gauge that isn't cumulative), time spent doing I/O, weighted time in I/O, and finally the discard and flush counters. The kernel updates them on every request; iostat reads the file twice with your interval between, subtracts, divides by the elapsed time, and prints. That's the whole tool — and it's the same "everything is a file" idea you'll see under top, free, ps, and df. The kernel publishes the live state of the storage subsystem as plain text in /proc; the classic tools are tiny formatters on top. Once that lands, you stop being afraid to cat a /proc file yourself.
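The read-twice-and-subtract step fits in a few lines of awk. The sketch below is a rough imitation of that idea, not sysstat's actual code: it takes two saved copies of /proc/diskstats and the seconds between them, and prints IOPS, average latency and %util per device. The function name is hypothetical; the field positions follow the kernel's documented layout (reads completed in field 4, ms reading in 7, writes completed in 8, ms writing in 11, ms doing I/O in 13).

```shell
# diskstats_delta EARLIER_FILE LATER_FILE SECONDS
diskstats_delta() {
    awk -v secs="$3" '
        NR == FNR {                              # first file: remember the old counters
            r[$3] = $4; rms[$3] = $7; w[$3] = $8; wms[$3] = $11; busy[$3] = $13
            next
        }
        ($3 in r) {                              # second file: subtract and scale
            dr = $4 - r[$3]; dw = $8 - w[$3]
            r_await = dr ? ($7 - rms[$3]) / dr : 0
            w_await = dw ? ($11 - wms[$3]) / dw : 0
            util = 100 * ($13 - busy[$3]) / (secs * 1000)
            printf "%s r/s=%.1f w/s=%.1f r_await=%.2f w_await=%.2f %%util=%.1f\n",
                   $3, dr / secs, dw / secs, r_await, w_await, util
        }' "$1" "$2"
}
```

To try it: cat /proc/diskstats > a; sleep 2; cat /proc/diskstats > b; diskstats_delta a b 2. The numbers should land close to what iostat -x prints for the same window.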

The deeper design lesson — and the reason aqu-sz and await exist alongside %util — is that good metrics survive changes in hardware. When iostat was new, %util told you the truth because a disk really could only do one thing at a time. When SSDs and NVMe arrived, queue depth and latency stayed honest while %util quietly stopped meaning what it used to. The metric didn't break; the world moved underneath it. Worth remembering the next time you build a dashboard.

Column Reference

The iostat -x device columns that matter most, at a glance:

Column    Meaning
r/s       Read requests completed per second (IOPS)
w/s       Write requests completed per second (IOPS)
rkB/s     Kilobytes per second read
wkB/s     Kilobytes per second written
await     Average ms per I/O request, queue time included (older sysstat only; current versions print r_await and w_await instead)
r_await   Average ms per read request, queue time included
w_await   Average ms per write request, queue time included
aqu-sz    Average queue length (sustained > 1 = requests queuing)
%util     Percentage of wall time the device was busy (lies on SSD/NVMe)

See Also

  • top — the tool that sends you here when wa is high
  • iotop — the per-process I/O view (which process is hammering the disk)
  • vmstat — the bigger-sister overview: CPU, memory, swap, and I/O rates in one screen
  • pidstat — per-process stats (pidstat -d for I/O per PID); same sysstat family
  • mpstat — per-CPU stats; same sysstat family
  • sar — the historical archive; long-term storage of the same numbers
  • df — when the disk is full instead of slow
  • du — which directory ate the space
  • lsof — which process is holding which file; pair with iotop to find the I/O culprit
  • lsblk — the block-device tree; understand what dm-0 sits on top of
  • smartctl — the underlying drive's own health view
  • /proc/diskstats — the raw file iostat reads
  • block device — what iostat is measuring
  • iowait — what the %iowait and wa numbers actually count
  • high I/O wait — the diagnose-and-fix walkthrough
  • high load — when high I/O is the reason load is high
  • disk full — the other way a disk causes trouble
