2023-06-24
Time to replace half of a mirrored disk (again)
Error messages like this make me fix things fast:

Jun 24 13:42:59 conway kernel: [6925745.388604] sd 0:0:0:0: [sda] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
Jun 24 13:42:59 conway kernel: [6925745.389388] sd 0:0:0:0: [sda] tag#6 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
Jun 24 13:42:59 conway kernel: [6925745.390157] print_req_error: I/O error, dev sda, sector 616464
Jun 24 13:42:59 conway kernel: [6925745.390923] md: super_written gets error=10
Jun 24 13:42:59 conway kernel: [6925745.391705] md/raid1:md127: Disk failure on sda3, disabling device.
Jun 24 13:42:59 conway kernel: [6925745.391705] md/raid1:md127: Operation continuing on 1 devices.
Jun 24 13:42:59 conway mdadm[2559]: Fail event detected on md device /dev/md127, component device /dev/sda3

The part that makes me go 'hmmm' is that this was another Kingston A400 SSD, just like the one that failed in December 2021, for which I ordered a replacement from a different brand. Since that disk failed under warranty, it was also replaced with another Kingston A400, which I still had available in its packaging. That one is now in use and the failed SSD has been removed. I wonder how long the replacement disk will keep working.

I did all the bits to replace the disk and recreate the software RAID mirror. This worked fine, and all my earlier work to make sure the system can boot from either disk of the mirror paid off.
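
For reference, here is a rough sketch of the usual steps for swapping one half of an md RAID1 mirror. It is not necessarily the exact sequence used on this machine. Only /dev/md127 and sda3 come from the log above. Everything else is an assumption made for illustration: that the surviving disk is /dev/sdb, that the new disk shows up as /dev/sda again, that the disks use GPT (hence sgdisk), and the helper name plan_member_replacement. The script only prints the commands and runs nothing. A real system with several arrays on the same disk would need the mdadm steps repeated per array, plus reinstalling the boot loader on the new disk.

```python
import shlex


def plan_member_replacement(array, failed_part, healthy_disk, new_disk, new_part):
    """Return the commands (as argument lists) for swapping one RAID1 member.

    Hypothetical helper for illustration; it builds a plan and executes nothing.
    """
    return [
        # Mark the member as failed and detach it from the array
        # before the physical disk is pulled.
        ["mdadm", "--manage", array, "--fail", failed_part],
        ["mdadm", "--manage", array, "--remove", failed_part],
        # After the physical swap: copy the GPT from the healthy disk onto
        # the new disk, then give the copy fresh GUIDs so they do not collide.
        ["sgdisk", f"--replicate={new_disk}", healthy_disk],
        ["sgdisk", "--randomize-guids", new_disk],
        # Add the new partition; md resyncs the mirror in the background.
        ["mdadm", "--manage", array, "--add", new_part],
    ]


if __name__ == "__main__":
    # Device names other than /dev/md127 and /dev/sda3 are assumptions.
    plan = plan_member_replacement(
        "/dev/md127", "/dev/sda3", "/dev/sdb", "/dev/sda", "/dev/sda3"
    )
    for cmd in plan:
        print(shlex.join(cmd))
```

Progress of the resync can be followed in /proc/mdstat once the new partition has been added.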