#7 Linux LVM, RAID, failed drive recovery 2021

In the previous SSD cached LVM/RAID notes some time ago (in a galaxy far, far away) we set up a modern, mdadm-less Linux software RAID.

However, in the meantime a drive may have failed, and as I found the error recovery a bit tricky, let's document exactly which commands are needed to recover the LVM RAID from such a failure:

RAID5 failed / missing drive

pvcreate /dev/sd*new                  # initialize the replacement disk as a physical volume
vgextend vg0 /dev/sd*new              # add it to the volume group
lvconvert --repair vg0/lv             # rebuild the failed RAID leg onto the new PV
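
Before and after the repair, a quick status query helps to see which leg is affected and whether the rebuild succeeded. A minimal sketch, assuming the volume group is called vg0 as above (the exact report output may vary with your lvm2 version):

lvs -a -o name,segtype,devices,lv_health_status vg0   # failed legs show up as [unknown] in the Devices column
pvs                                                    # a missing physical volume is likewise listed as [unknown]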

As the repair will still leave the old PV's UUID flagged as missing in the volume group, you might want to clean that up with:

vgreduce --removemissing vg0
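
To verify that the volume group no longer references the missing drive, check the VG summary. A small sketch, again assuming vg0 (the MissPV reporting field depends on your lvm2 version):

vgs -o +vg_missing_pv_count vg0   # the 'p' (partial) attribute and a non-zero MissPV count indicate missing PVs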

Easier: replacing a drive while still online

If you notice an aging, degrading, or already partially failed drive, you can replace it a bit more easily while it is still online:

pvcreate /dev/sd*new                                   # prepare the replacement disk as a physical volume
vgextend vg0 /dev/sd*new                               # add it to the volume group
lvconvert --replace /dev/sd*old vg0/lv /dev/sd*new     # move the RAID leg off the old drive onto the new one
vgreduce vg0 /dev/sd*old                               # finally remove the old drive from the volume group
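
The replace kicks off a resynchronisation onto the new drive in the background. As a rough sketch, still assuming vg0/lv and the placeholder device names from above, the progress can be watched like this; if vgreduce complains that the old PV is still in use, simply wait until the sync has finished:

lvs -a -o name,copy_percent,devices vg0   # Cpy%Sync reaches 100.00 once the new drive is fully in sync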

Added bonus: periodically scrub the whole thing to prevent bit rot!!1!

As magnetically or otherwise stored data bits might decay and thus "bit rot" over time, some consider it a good idea to periodically scrub and refresh them. This can also help to catch drives going bad earlier:

lvchange --syncaction check vg0/lv
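
Once the check has run through, the result can be inspected, and any detected inconsistencies can be rewritten with a repair pass. A minimal sketch, assuming the same vg0/lv (the reporting fields require a reasonably recent lvm2):

lvs -o +raid_sync_action,raid_mismatch_count vg0/lv   # SyncAction returns to idle; Mismatches should stay at 0
lvchange --syncaction repair vg0/lv                   # only needed if mismatches were found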

The Author

René Rebe studied computer science and digital media science at the University of Applied Sciences of Berlin, Germany. He is the founder of the T2 Linux SDE and ExactImage, and a contributor to various projects in the open source ecosystem for more than 20 years now. He also founded the Berlin-based software company ExactCODE GmbH, a company dedicated to reliable software solutions that just work, every day.