Date: Sun, 12 Mar 2006 12:19:04 +0100
From: Paul Schenkeveld <fb-geom@psconsult.nl>
To: freebsd-geom@freebsd.org
Subject: gvinum losing state about failed drives
Message-ID: <20060312111904.GA52139@psconsult.nl>
Hi,

My hardware:

  Intel L440GX+ serverboard, 2x 700MHz P3, 1GB ECC RAM
  2x Seagate SCSI 73GB off mainboard SCSI controller
  2x add-in Promise ATA133 controller
  4x Hitachi 500GB ATA133 disks off the Promise controllers
  add-in Intel gigabit ethernet controller

My gvinum config:

  12 volumes mirrored across da0 and da1
  1 volume 500GB mirrored across ad4 and ad8
  1 volume 500GB mirrored across ad6 and ad10

After my 4-STABLE to 6-STABLE upgrade of the first server I had two
occasions where two ATA disks became unavailable because the controller
stopped responding. The first time I lost ad8 and ad10 containing
vol12.p1 and vol13.p1, the second time (after everything was manually
repaired) I lost vol12.p0 and vol13.p0.

When the ATA controller stops, two gvinum drives go down, the plexes
and the subdisks on them go down as well. After a reboot, however, all
drives, plexes and subdisks are up again. By comparing the plexes by
hand (using optimized cmp which still takes 5.5 hours for 500GB) I see
that they are not equal, understandably because some data was updated
while one plex was down.

Seems that the failure of a drive and its subdisks is not recorded in
the metadata of the other drives.

I'm now contemplating a rollback of the upgrade as this server has been
down too long already but I'll try to get me a similar setup here to do
more testing.

Regards,

Paul Schenkeveld
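
For anyone wanting to repeat the by-hand plex comparison described above, it could be done roughly as follows. This is only a minimal sketch, not the optimized cmp that was actually used: the function names `diff_ranges` and `_read_full`, the 1 MiB chunk size, and the device paths in the usage comment are all assumptions for illustration. It reads both plexes in large chunks and reports the byte ranges where they differ, which also shows how much data diverged while one plex was down.

```python
CHUNK = 1 << 20  # 1 MiB per read; an arbitrary choice, tune for the disks


def _read_full(f, n):
    """Read up to n bytes, looping over short reads; shorter only at EOF."""
    parts = []
    while n > 0:
        buf = f.read(n)
        if not buf:
            break
        parts.append(buf)
        n -= len(buf)
    return b"".join(parts)


def diff_ranges(path_a, path_b, chunk=CHUNK):
    """Yield (offset, length) for each run of consecutive differing chunks."""
    with open(path_a, "rb", buffering=0) as a, \
         open(path_b, "rb", buffering=0) as b:
        offset = 0
        start = None  # offset where the current differing run began
        while True:
            ba = _read_full(a, chunk)
            bb = _read_full(b, chunk)
            if not ba and not bb:
                break  # both inputs exhausted
            if ba != bb:
                if start is None:
                    start = offset
            elif start is not None:
                yield (start, offset - start)
                start = None
            offset += max(len(ba), len(bb))
        if start is not None:
            yield (start, offset - start)


# Hypothetical usage; the plex device paths are assumed, not taken from
# the message above:
#   for off, length in diff_ranges("/dev/gvinum/plex/vol12.p0",
#                                  "/dev/gvinum/plex/vol12.p1"):
#       print(f"differ: offset={off} length={length}")
```

Differences are reported at chunk granularity, so a single changed byte shows up as one whole differing chunk. A smaller chunk size gives finer ranges at the cost of more read calls.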