/proc/mdstat: Explanation & Insights
The kernel's live scoreboard for every software RAID array on the box.
What It Is
/proc/mdstat is a virtual file maintained by the Linux kernel's md (multiple devices) driver — the subsystem behind Linux software RAID. It doesn't live on any filesystem; it's generated on the fly every time you read it, straight from the kernel's in-memory state. One cat and you see every software array on the machine: what level, which members, who's up, who's down, whether anything is rebuilding, and how fast.
cat /proc/mdstat
That's the whole command. No flags, no options, no root required. The file is world-readable because knowing the state of your arrays is never a secret — it's a necessity. If you administer a server with software RAID, you'll read this file more often than almost any other file in /proc. Most monitoring tools — including CleverUptime — parse it on every collection cycle, because a degraded array is the quietest emergency in computing: the server keeps working perfectly, and the only sign is one character changing from U to _ in a file nobody's watching.
The file only shows software RAID arrays managed by md. If your server uses a hardware RAID controller (LSI MegaRAID, HP SmartArray, Dell PERC), the controller presents each array to the kernel as one ordinary disk and keeps the member drives to itself, so /proc/mdstat will either be missing (md driver not loaded) or show nothing beyond the Personalities line and unused devices: <none>. That's not broken. It's a different world, managed through vendor tools like megacli, ssacli, or perccli. Everything on this page assumes md, the kind you build and manage with mdadm.
What It Looks Like
Here's a real-world /proc/mdstat from a server with two NVMe drives mirrored across three arrays:
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md2 : active raid1 nvme1n1p3[0] nvme0n1p3[1]
931913024 blocks super 1.2 [2/2] [UU]
bitmap: 0/7 pages [0KB], 65536KB chunk
md1 : active raid1 nvme1n1p2[0] nvme0n1p2[1]
67042304 blocks super 1.2 [2/2] [UU]
md0 : active raid1 nvme1n1p1[0] nvme0n1p1[1]
1046528 blocks super 1.2 [2/2] [UU]
unused devices: <none>
And a more complex setup — four SATA HDDs, two RAID 1 mirrors for boot and swap, two RAID 5 arrays for system and data:
Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10]
md1 : active raid1 sdb2[1] sda2[0] sdc2[3] sdd2[2]
523264 blocks super 1.2 [4/4] [UUUU]
md2 : active raid5 sdb3[1] sda3[0] sdc3[4] sdd3[2]
67055616 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
md0 : active raid1 sdb1[1] sda1[0] sdc1[3] sdd1[2]
16759808 blocks super 1.2 [4/4] [UUUU]
md3 : active raid5 sdb4[1] sda4[0] sdc4[4] sdd4[2]
29179894272 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
bitmap: 2/73 pages [8KB], 65536KB chunk
unused devices: <none>
That second server has 27 TB of usable RAID 5 space across md3 alone — four disks, three worth of capacity, one worth of parity. Every byte intact, every member present, [UUUU]. All good.
Every Line, Explained
The file has a fixed structure. Let's walk through it piece by piece, because every character carries meaning.
The Personalities Line
Personalities : [raid1] [raid6] [raid5] [raid4] [raid10]
This lists the RAID levels the kernel currently has loaded. Each one is a kernel module (raid1.ko, raid456.ko — yes, 4, 5, and 6 share one module — raid10.ko). If a level isn't listed here, the kernel hasn't loaded that module, and you can't create arrays of that type until you do (modprobe raid5). On most servers this line is just furniture — the modules auto-load when you create an array. But if you're debugging why mdadm --create fails with a cryptic error, check here first.
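A quick way to script that check is sketched below. The function name has_personality is made up for this page, and the sketch assumes the Personalities line keeps the bracketed format shown above. It reads mdstat-formatted text on stdin so it can be tried against a saved copy as well as the live file.

```shell
# has_personality LEVEL  (hypothetical helper, not part of mdadm)
# Reads mdstat-formatted text on stdin and exits 0 if [LEVEL] appears
# on the Personalities line, 1 otherwise.
has_personality() {
    level=$1
    awk -v want="[$level]" '
        $1 == "Personalities" {
            # Fields: "Personalities", ":", then one [level] per field.
            for (i = 3; i <= NF; i++)
                if ($i == want) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

Typical use: has_personality raid5 < /proc/mdstat || echo "raid5 personality not loaded". The comparison is exact, so asking for raid1 does not match [raid10].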
The Array Block
Each array gets a block of two or three lines: an identity line, a geometry line, and an optional bitmap line. Here's a RAID 5 example, annotated:
md3 : active raid5 sdb4[1] sda4[0] sdc4[4] sdd4[2]
29179894272 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
bitmap: 2/73 pages [8KB], 65536KB chunk
Line 1 — the identity line:
| Fragment | Meaning |
|---|---|
| md3 | The array's device name; you'll find it at /dev/md3 |
| active | The array is assembled and running. Alternatives: inactive (stopped or partially assembled) |
| raid5 | The RAID level |
| sdb4[1] | Member device sdb4, in slot 1. The number in brackets is the md internal index, not the disk's position in the chassis |
| sda4[0] | Member sda4, slot 0 |
| sdc4[4] | Member sdc4, slot 4. The index can be higher than the member count: md reuses slots but doesn't always renumber after replacements |
| sdd4[2] | Member sdd4, slot 2 |
If a member is a spare, you'll see (S) after the brackets: sde4[5](S). If a member has been marked faulty, you'll see (F): sdb4[1](F). Those two suffixes are the first thing to look for when something is wrong.
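Spotting those suffixes can be automated with a few lines of awk. This is a minimal sketch, assuming identity lines look like the examples on this page; flagged_members is a hypothetical name, and it only looks for the (S) and (F) markers described above.

```shell
# flagged_members  (hypothetical helper)
# Reads mdstat-formatted text on stdin and prints one line per member
# carrying an (S) or (F) suffix: "<array> <member> spare|faulty".
flagged_members() {
    awk '
        /^md[^ ]* *:/ {
            for (i = 3; i <= NF; i++) {
                name = $i
                sub(/\[.*/, "", name)      # strip "[index]" and any suffix
                if (index($i, "(F)"))      print $1, name, "faulty"
                else if (index($i, "(S)")) print $1, name, "spare"
            }
        }
    '
}
```

Run it as flagged_members < /proc/mdstat. No output means no member is currently marked spare or faulty.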
Line 2 — the geometry line:
| Fragment | Meaning |
|---|---|
| 29179894272 blocks | Total usable capacity in 1 KiB blocks. Divide by 1048576 for GiB; this is about 27.2 TiB |
| super 1.2 | Metadata format version. 1.2 means the superblock sits 4 KiB from the start of the device (the modern default). You'll also see 1.0 (superblock at the end) and 0.90 (ancient, pre-2011) |
| level 5 | RAID level again, redundant with line 1 but useful when scripting |
| 512k chunk | The chunk size (stripe unit). Data is written in 512 KB chunks, alternating across disks. Larger chunks favour sequential I/O; smaller chunks spread small writes more evenly |
| algorithm 2 | The parity rotation layout. 2 is left-symmetric, the default and the one that distributes parity most evenly across disks. You'll never need to change this |
| [4/4] | The status fraction, [expected/present]. Four members expected, four present. [4/3] means one is missing and the array is degraded |
| [UUUU] | The status string. One character per expected member. U = up. _ = missing or failed. This is the single most important thing in the entire file: if there's an underscore, something is wrong |
For RAID 1 arrays, the geometry line is shorter — no chunk size or algorithm, because mirrors don't stripe:
931913024 blocks super 1.2 [2/2] [UU]
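The block count is easy to turn into a human-sized number. The sketch below pairs each identity line with the geometry line that follows it and divides by 1048576, on the assumption stated above that blocks are 1 KiB each. array_sizes is a hypothetical name.

```shell
# array_sizes  (hypothetical helper)
# Reads mdstat-formatted text on stdin and prints
# "<array> <blocks> <size> GiB" for each array, assuming 1 KiB blocks.
array_sizes() {
    awk '
        /^md[^ ]* *:/ { name = $1; next }       # identity line
        name != "" && $2 == "blocks" {          # geometry line that follows
            printf "%s %s %.1f GiB\n", name, $1, $1 / 1048576
            name = ""
        }
    '
}
```

Fed the two geometry lines shown in this section, it reports 888.7 GiB for the 931913024-block mirror and 27828.1 GiB (about 27.2 TiB) for the 29179894272-block RAID 5 array.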
Line 3 — the bitmap line (optional):
bitmap: 2/73 pages [8KB], 65536KB chunk
A write-intent bitmap tracks which regions of the array have been written to recently. If a member drops and comes back quickly (a transient cable glitch, a temporary power issue), the array only needs to resync the dirty regions instead of the entire disk — turning a multi-hour rebuild into a minutes-long touch-up. The numbers: 2/73 means 2 of 73 bitmap pages are currently dirty; 65536KB chunk is the granularity — each bit covers 64 MB of array space. Not every array has a bitmap (you add one with mdadm --grow --bitmap=internal), but on large arrays it's cheap insurance.
The Rebuild Progress Line
When an array is rebuilding (resyncing, recovering, or checking), a progress line appears directly after the geometry line:
md3 : active raid5 sde4[4] sda4[0] sdc4[5] sdd4[2]
29179894272 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
[=====>...............] recovery = 28.3% (2754048000/9726631424) finish=1284.7min speed=90340K/sec
bitmap: 2/73 pages [8KB], 65536KB chunk
That progress line is the heartbeat of a rebuild, and every piece matters:
| Fragment | Meaning |
|---|---|
| [=====>...............] | Visual progress bar, roughly how far along the rebuild is |
| recovery | The operation type. recovery = rebuilding a replacement. Others: resync (initial sync after creation), check (scrub, reading everything to verify parity) |
| 28.3% | How far along, as a percentage |
| (2754048000/9726631424) | Blocks done / blocks total, counted per member in 1 KB blocks (the total here is one third of the array's 29179894272 blocks) |
| finish=1284.7min | Estimated time remaining, about 21 hours in this case. This number fluctuates with I/O load; don't hold it to the minute |
| speed=90340K/sec | Current rebuild speed in KB/s. Governed by /proc/sys/dev/raid/speed_limit_min and speed_limit_max; you can tune these to speed up or slow down the rebuild |
A rebuild on a large array can take hours or even days, and during that time the array is vulnerable — it's running on fewer members than it needs. See RAID rebuilding for the full survival guide.
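If you want the progress numbers in a script rather than on screen, the line splits cleanly on whitespace. This is a rough sketch, assuming the "operation = percent" layout shown above and only the three operation names this page mentions; rebuild_status is a hypothetical name.

```shell
# rebuild_status  (hypothetical helper)
# Reads mdstat-formatted text on stdin and prints, for each array with a
# progress line: "<array> <operation> <percent> <eta> <speed>".
rebuild_status() {
    awk '
        /^md[^ ]* *:/ { name = $1 }
        / (recovery|resync|check) = */ {
            op = ""; pct = ""; eta = ""; speed = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^(recovery|resync|check)$/) op = $i
                else if ($i ~ /%$/)                   pct = $i
                else if ($i ~ /^finish=/)             eta = substr($i, 8)
                else if ($i ~ /^speed=/)              speed = substr($i, 7)
            }
            print name, op, pct, eta, speed
        }
    '
}
```

Run as rebuild_status < /proc/mdstat. An idle system prints nothing, which makes the function usable as a simple "is anything rebuilding?" test.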
The Footer
unused devices: <none>
Always present. Lists any block devices that md knows about but aren't part of any array. Almost always <none>. If you see device names here, something is partially assembled or a spare was removed from its array — investigate with mdadm --examine.
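Checking the footer from a script takes one awk rule. The sketch below assumes device names on that line are separated by spaces, as the field layout above suggests; unused_devices is a hypothetical name.

```shell
# unused_devices  (hypothetical helper)
# Reads mdstat-formatted text on stdin and prints each device named on
# the "unused devices:" line, one per line. Prints nothing for <none>.
unused_devices() {
    awk '
        /^unused devices:/ {
            for (i = 3; i <= NF; i++)
                if ($i != "<none>") print $i
        }
    '
}
```

Any output from unused_devices < /proc/mdstat is a cue to run mdadm --examine on the devices it names.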
Reading It by Example
The real skill with /proc/mdstat is pattern recognition. You don't parse it — you glance at it, and one shape tells you everything. Here are the shapes that matter.
Healthy — All U's, Nothing Else
md0 : active raid1 nvme1n1p1[0] nvme0n1p1[1]
1046528 blocks super 1.2 [2/2] [UU]
Two members expected, two present, both up. No bitmap line (small array, fast rebuild anyway), no progress line. This is the shape you want to see, and ideally the only one you ever see. The [UU] at the end is the all-clear signal.
Degraded — The Underscore
md2 : active raid1 nvme0n1p3[1]
931913024 blocks super 1.2 [2/1] [_U]
[2/1]: two expected, one present. [_U]: the first member is gone. Notice that the identity line only shows one device now (nvme0n1p3[1]); the missing member has vanished from the listing entirely. This is a degraded RAID mirror running on its last leg. One more failure and the data is gone. The fix: identify the dead disk (check dmesg for I/O errors, smartctl for SMART status), replace it physically, partition the replacement to match, and add the new partition to the array, for example mdadm /dev/md2 --add /dev/nvme1n1p3. The rebuild starts automatically.
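The underscore is also the easiest thing in the file to detect mechanically. Here is a minimal sketch of the check a monitoring script might run, assuming the status string is the last field of the geometry line as in every example on this page; degraded_arrays is a hypothetical name.

```shell
# degraded_arrays  (hypothetical helper)
# Reads mdstat-formatted text on stdin and prints
# "<array> <[expected/present]> <[status string]>" for every array whose
# status string contains an underscore.
degraded_arrays() {
    awk '
        /^md[^ ]* *:/ { name = $1; next }        # identity line
        name != "" && $NF ~ /^\[[U_]+\]$/ {      # geometry line with status
            if ($NF ~ /_/) print name, $(NF - 1), $NF
            name = ""
        }
    '
}
```

Run as degraded_arrays < /proc/mdstat. From cron, any non-empty output is worth an alert; on a healthy machine the function prints nothing.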
Rebuilding — The Progress Bar
md2 : active raid1 nvme1n1p3[2] nvme0n1p3[1]
931913024 blocks super 1.2 [2/1] [_U]
[===>.................] recovery = 18.4% (171637760/931913024) finish=62.3min speed=203124K/sec
Still [2/1] and [_U] — the new member hasn't fully joined yet. But there's a progress bar, which means the rebuild is running. The new disk (nvme1n1p3[2], note the higher slot index) is being written to block by block. When recovery hits 100%, the status flips to [2/2] [UU] and the progress line disappears. Until then, the array is still degraded — handle it gently, avoid unnecessary I/O spikes, and resist the urge to reboot.
Spare Present
md5 : active raid5 sdf3[5] sde3[4] sdd3[2] sdc3[1] sdb3[0] sdg3[6](S)
19531673600 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
sdg3[6](S) — the (S) marks a hot spare. It's part of the array but not active; when a member fails, the spare automatically takes its place and the rebuild starts without human intervention. The status string still shows [5/5] [UUUUU] because the spare doesn't count toward the active member count — it's standing by. Hot spares are the difference between "a disk failed and the rebuild started immediately" and "a disk failed and nobody noticed for three days."
Faulty Member (Before Removal)
md5 : active raid5 sde3[4] sdd3[2] sdc3[1] sdb3[0] sdf3[5](F)
19531673600 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/4] [UUUU_]
sdf3[5](F) — the (F) means md has marked this member faulty. It's still listed (unlike a fully removed member), but it's no longer contributing to the array. [5/4] [UUUU_] confirms: five expected, four working. The next step is mdadm /dev/md5 --remove /dev/sdf3 to formally eject it, then physically swap the drive.
Check (Scrub) in Progress
md3 : active raid5 sdb4[1] sda4[0] sdc4[4] sdd4[2]
29179894272 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
[========>............] check = 42.1% (4094911872/9726631424) finish=891.2min speed=105132K/sec
All members up ([UUUU]), but a check is running — this is a scrub, reading every stripe and verifying parity consistency. Triggered manually with echo check > /sys/block/md3/md/sync_action or by a cron job (many distros run a weekly scrub). After it finishes, check /sys/block/md3/md/mismatch_cnt — a 0 means everything matched; a non-zero value means some stripes had parity mismatches, which could indicate silent data corruption or a disk starting to go bad.
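Reading the counter for every array at once is a short loop over sysfs. The sketch below takes the sysfs block directory as an optional argument (defaulting to /sys/block) so it can be pointed at a test tree; mismatch_report is a hypothetical name.

```shell
# mismatch_report [SYSBLOCK_DIR]  (hypothetical helper)
# Prints "<array> <mismatch_cnt>" for every md array found under
# SYSBLOCK_DIR (default: /sys/block).
mismatch_report() {
    sysblock=${1:-/sys/block}
    for f in "$sysblock"/md*/md/mismatch_cnt; do
        [ -r "$f" ] || continue          # no arrays: the glob stays literal
        array=${f#"$sysblock"/}          # drop the directory prefix
        array=${array%%/*}               # keep only the mdN component
        printf '%s %s\n' "$array" "$(cat "$f")"
    done
}
```

Run mismatch_report after a scrub finishes; any array reporting a non-zero count deserves a closer look.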
Why It Matters
/proc/mdstat matters because RAID failures are silent by default. A degraded mirror doesn't crash your server, doesn't slow it down visibly, doesn't pop up a warning on your screen. It just quietly drops from [UU] to [U_] and waits, either for the next disk to fail or for you to notice. On an unmonitored server, the gap between "first disk died" and "admin noticed" can stretch to weeks, sometimes months. During that entire window the server is one failure away from total data loss, running without the safety net it was built with.
This is the file that closes that gap. A monitoring agent that reads /proc/mdstat every few minutes catches the underscore the moment it appears — not when the second disk fails, not when the backup turns out to be three months old, not when the client calls to ask why the site is down. Early, quietly, with time to act.
The file is also the most convenient way to watch a rebuild in progress: one read shows how fast the recovery is going, whether it has stalled, and how long it will take. During a rebuild the array is at its most vulnerable (every surviving member is being hammered with sequential reads), and knowing the ETA is the difference between "I'll swap the other aging disk after this rebuild finishes" and "I'll risk a hot-swap right now and hope."
How to Write to It
You can't. /proc/mdstat is strictly read-only: it's a window into kernel state, not a control interface. All management goes through mdadm, through sysfs entries under /sys/block/mdN/md/, or through the tunables in /proc/sys/dev/raid/. The most useful knobs:
# Trigger a scrub
echo check > /sys/block/md0/md/sync_action
# Cancel a running scrub or rebuild (careful — cancelling a rebuild leaves the array degraded)
echo idle > /sys/block/md0/md/sync_action
# Tune rebuild speed (KB/s)
echo 200000 > /proc/sys/dev/raid/speed_limit_min
cat /proc/sys/dev/raid/speed_limit_max
The speed limits live in /proc/sys/dev/raid/ and apply to all arrays system-wide. Raising speed_limit_min accelerates a rebuild at the cost of I/O bandwidth for everything else; lowering speed_limit_max does the opposite, protecting production workloads during a long resync. The defaults (1000 KB/s min, 200000 KB/s max) are conservative — on a dedicated storage box you can safely push speed_limit_min much higher.
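Because the limits are system-wide, it's worth remembering the old value before raising one. The sketch below is one way to do that: it writes a new speed_limit_min and prints the previous value so the caller can restore it later. set_speed_limit_min is a hypothetical name, the 50000 in the usage note is an arbitrary illustration, and writing to the real directory needs root.

```shell
# set_speed_limit_min NEW_KBPS [DIR]  (hypothetical helper)
# Writes NEW_KBPS to DIR/speed_limit_min (default DIR: /proc/sys/dev/raid)
# and prints the value that was there before.
set_speed_limit_min() {
    new=$1
    dir=${2:-/proc/sys/dev/raid}
    old=$(cat "$dir/speed_limit_min") || return 1
    printf '%s\n' "$new" > "$dir/speed_limit_min" || return 1
    printf '%s\n' "$old"
}
```

A typical session would be old=$(set_speed_limit_min 50000), wait for the rebuild to finish, then set_speed_limit_min "$old" > /dev/null to put things back.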
Gotchas
- The file lists no arrays if md isn't managing any (and is missing entirely if the md driver isn't loaded). That's normal on a server with hardware RAID, no RAID at all, or only ZFS/Btrfs pools (those have their own redundancy, invisible to md). Don't confuse "empty /proc/mdstat" with "no redundancy."
- Device names can reorder between reboots. sda today might be sdb tomorrow if the kernel enumerates disks in a different order. The array itself doesn't care (it identifies members by their superblock UUID, not their device name), but you might, if you're comparing two snapshots of /proc/mdstat across a reboot. Use lsblk with -o SERIAL to pin devices to physical slots.
- [2/2] [UU] doesn't mean the data is healthy. It means all expected members are present and the kernel considers them functional. Silent bit rot, bad sectors that haven't been read yet, firmware bugs: none of these show up in the status string. That's what scrubs are for: periodic check operations that read every stripe and verify parity. Run them weekly.
- The rebuild ETA lies. finish=62.3min is computed from the current speed, which fluctuates with system I/O load. A rebuild that says "20 minutes left" can stretch to an hour if a backup job kicks in. Treat the estimate as a rough guide, not a countdown.
- A bitmap isn't guaranteed. Recent mdadm releases typically add an internal write-intent bitmap on their own when the member devices are large (around 100 GB or more), but small arrays and arrays created by older versions have none unless you ask for one. On large arrays without one, add it (mdadm --grow /dev/md0 --bitmap=internal): it's the difference between a full multi-hour resync and a quick partial catch-up after a transient dropout.
History and Philosophy
The md driver has been in the Linux kernel since the mid-1990s — one of the oldest subsystems still in daily production use. Its /proc interface predates sysfs, predates /sys/block/, predates the idea that kernel state should be exposed through structured hierarchies rather than human-readable flat files. And yet it persists, because it turns out that a flat text file you can cat from any shell on any system with zero dependencies is a remarkably hard interface to beat.
The format hasn't changed in any breaking way in over two decades. Scripts written in 2005 to parse /proc/mdstat still work today. The [UU] / [U_] convention, the progress bar, the speed counter — all of it locked in by the weight of a million monitoring tools and admin habits. The kernel developers could have moved everything to sysfs and deprecated the file (and sysfs does expose the same data, in more structured form, under /sys/block/mdN/md/). They didn't, because /proc/mdstat does the one thing sysfs can't: show you everything at once, on one screen, in one command. When you're standing at a console at 3 a.m. with a degraded array, cat /proc/mdstat is exactly the right interface — not a tree of files to walk, not a JSON blob to parse, just text that tells you what you need to know.
There's a design lesson in that. Sometimes the best interface is the simplest one — a file that exists in memory, costs nothing to read, and answers the question "is everything okay?" in one glance. The U and _ convention is brilliantly compact: you don't read it so much as see it. A wall of U's is health; a single _ is trouble. Your eye catches it faster than any structured parser could.
See Also
- RAID — the concept, the levels, the maths behind parity
- mdadm — the tool that builds, inspects, and repairs every software array
- smartctl — read the S.M.A.R.T. health data on member disks
- lsblk — map device names to physical serials so you pull the right one
- /proc/diskstats — per-device I/O statistics, useful during rebuilds
- /proc/mounts — which arrays are mounted where
- degraded RAID — what to do when the status string shows an underscore
- RAID rebuilding — surviving the rebuild window
- failing disk — reading the SMART attributes on the member that dropped