Replace disk in mdadm before it fails

The server has two disks. Each carries one partition for / and one for swap, mirrored as the RAID1 arrays md0 and md1.

# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 136.8G 0 disk
├─sda1 8:1 0 130.4G 0 part
│ └─md0 9:0 0 130.3G 0 raid1 /
└─sda2 8:2 0 6.4G 0 part
  └─md1 9:1 0 6.4G 0 raid1 [SWAP]
sdd 8:48 0 136.8G 0 disk
├─sdd1 8:49 0 130.4G 0 part
│ └─md0 9:0 0 130.3G 0 raid1 /
└─sdd2 8:50 0 6.4G 0 part
  └─md1 9:1 0 6.4G 0 raid1 [SWAP]

One disk started to act strangely: utilization spiked and latency began to climb. After some investigation one thing stood out: the disk's Non-medium error count was increasing continuously, by hundreds per minute. The general opinion on the internet is that if this number keeps growing, you should start looking for a replacement disk. People usually see a few hundred of these errors in total, but we were getting that many per minute and the counter had reached the millions.

# smartctl --all /dev/sdd | grep "Non-medium error count"
Non-medium error count: 7564451

Compare that to the other disk and the difference is obvious.

# smartctl --all /dev/sda | grep "Non-medium error count"
Non-medium error count: 35
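What matters here is how fast the counter grows, not only its absolute value. The following is a minimal sketch of how the growth rate could be measured; the function names nonmedium_count and errors_per_minute are made up for this example, and it assumes smartctl prints the counter on a single "Non-medium error count:" line as shown above.

```shell
# Read smartctl output on stdin and print only the numeric value of
# the "Non-medium error count" line.
nonmedium_count() {
    awk -F: '/Non-medium error count/ { gsub(/[^0-9]/, "", $2); print $2 }'
}

# Given two samples of the counter and the number of seconds between
# them, print the growth in errors per minute (integer arithmetic).
errors_per_minute() {
    first=$1
    second=$2
    seconds=$3
    echo $(( (second - first) * 60 / seconds ))
}
```

Take one sample with "smartctl --all /dev/sdd | nonmedium_count", wait a minute, take a second sample, and pass both values plus the interval in seconds to errors_per_minute.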

Mark the partitions of the suspect disk as failed in both arrays.

# mdadm --manage /dev/md0 --fail /dev/sdd1
# mdadm --manage /dev/md1 --fail /dev/sdd2

Remove the failed partitions from the arrays.

# mdadm --manage /dev/md0 --remove /dev/sdd1
# mdadm --manage /dev/md1 --remove /dev/sdd2
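Before pulling the drive it is worth confirming that both arrays are running on a single member. One way to do that, sketched here under the assumption that /proc/mdstat uses its usual layout (an "mdN : ..." line followed by a status such as [U_]), is a small filter that lists degraded arrays. The name degraded_arrays is hypothetical.

```shell
# Read /proc/mdstat-style text on stdin and print the name of every
# array whose member status (for example [U_]) contains a missing
# member, marked by an underscore.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/      { name = $1 }
        /\[[U_]*_[U_]*\]/  { print name }
    '
}
```

Running "degraded_arrays < /proc/mdstat" at this point should list both md0 and md1. Once the replacement disk has fully rebuilt, it should print nothing.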

To be sure that we pull the right disk, we need to locate it. Turn the locate LED on, then turn it off once the drive has been found.

# ledctl locate=/dev/sdd
# ledctl locate_off=/dev/sdd

Remove the physical disk and replace it with the new one. Check lsblk to see the new disk; here it came up as sdd again.

# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 136.8G 0 disk
├─sda1 8:1 0 130.4G 0 part
│ └─md0 9:0 0 130.3G 0 raid1 /
└─sda2 8:2 0 6.4G 0 part
  └─md1 9:1 0 6.4G 0 raid1 [SWAP]
sdd 8:48 0 136.8G 0 disk

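Copy the partition table from sda to sdd. Double-check the direction: the first sfdisk reads from the healthy disk and the second one overwrites the target.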

# sfdisk -d /dev/sda | sfdisk /dev/sdd

The partition table is now copied to the new disk.

# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 136.8G 0 disk
├─sda1 8:1 0 130.4G 0 part
│ └─md0 9:0 0 130.3G 0 raid1 /
└─sda2 8:2 0 6.4G 0 part
  └─md1 9:1 0 6.4G 0 raid1 [SWAP]
sdd 8:48 0 136.8G 0 disk
├─sdd1 8:49 0 130.4G 0 part
└─sdd2 8:50 0 6.4G 0 part
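lsblk shows that the sizes match, but the two layouts can also be compared sector by sector. This is a rough sketch, assuming the "sfdisk -d" dumps of the two disks differ only in the device path; normalize_dump is a hypothetical helper name.

```shell
# Read an "sfdisk -d" dump on stdin and replace every occurrence of
# the given device path (for example /dev/sda) with a placeholder,
# so that dumps taken from two different disks can be compared.
normalize_dump() {
    sed "s|$1|DISK|g"
}
```

A possible way to use it:

# sfdisk -d /dev/sda | normalize_dump /dev/sda > /tmp/sda.layout
# sfdisk -d /dev/sdd | normalize_dump /dev/sdd > /tmp/sdd.layout
# diff /tmp/sda.layout /tmp/sdd.layout

If diff prints nothing, the start, size, and type of every partition are identical on both disks.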

Add the new disk to the RAID arrays, starting with swap because it is smaller and will rebuild quickly.

# mdadm --manage /dev/md1 --add /dev/sdd2

After that, add / and let it rebuild while you finish the rest of the checks.

# mdadm --manage /dev/md0 --add /dev/sdd1
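While the arrays rebuild, progress is reported in /proc/mdstat. The sketch below pulls out the percentage per array; rebuild_progress is a hypothetical name, and the parsing assumes the usual "recovery = NN.N%" wording of the progress line.

```shell
# Read /proc/mdstat-style text on stdin and print "<array> <percent>"
# for every array that currently shows a recovery progress line.
rebuild_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        /recovery/ {
            # The percentage is the field right after "recovery =".
            for (i = 2; i < NF; i++)
                if ($i == "=" && $(i - 1) == "recovery")
                    print name, $(i + 1)
        }
    '
}
```

Run it as "rebuild_progress < /proc/mdstat". To simply block until the rebuild is done, "mdadm --wait /dev/md0" returns once any resync or recovery on that array has finished.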

Once the rebuild has finished, the final state is the same as when we started.

# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 136.8G 0 disk
├─sda1 8:1 0 130.4G 0 part
│ └─md0 9:0 0 130.3G 0 raid1 /
└─sda2 8:2 0 6.4G 0 part
  └─md1 9:1 0 6.4G 0 raid1 [SWAP]
sdd 8:48 0 136.8G 0 disk
├─sdd1 8:49 0 130.4G 0 part
│ └─md0 9:0 0 130.3G 0 raid1 /
└─sdd2 8:50 0 6.4G 0 part
  └─md1 9:1 0 6.4G 0 raid1 [SWAP]

The difference is that we now have a functioning disk with the same utilization and latency as sda, and no increase in Non-medium error count.

All of this was done with zero downtime, because all the disks sit in hot-swap drive bays. Even if the disk had failed completely, the server would have kept running, because swap is on RAID1 as well, which was the intended configuration here.
