Date: Fri, 17 Oct 2003 22:09:40 -0400
From: Barney Wolff <barney@databus.com>
To: stable@freebsd.org
Subject: unintended ATARAIDDELETE
Message-ID: <20031018020939.GA24917@pit.databus.com>

I've had a very odd problem with a -stable system on an Asus A7V333-raid,
which has a Promise raid controller on the motherboard. For several days
in a row the system lost its raid0 array during the 3am daily run, leaving
it with no disk. The raid was actually turned off in the bios, with manual
intervention required on reboot to turn it back on.

I suspected hardware, but in desperation booted a -stable kernel from
10/3/03. That kernel survived the daily run, and reported the following:

Oct 14 14:41:43 192.168.24.4 /kernel.maybe.ok: ad6: hard error reading fsbn 133757952 of 0-127 (ad6 bn 133757952; cn 132696 tn 6 sn 6) trying PIO mode

(I should note that I added a script in /usr/local/etc/periodic/daily to
back up this system, so files are read that normally see no access.)

I suspect that something in the newer -stable kernel reacted to this hard
error by doing, intentionally or not, an ioctl ATARAIDDELETE. Since the
error has since been remapped, I can't easily test this idea, but thought
I should report it in case it triggers a eureka moment in a developer.

The syndrome appears only in response to a disk error; I've been running a
-stable kernel from 10/16/03 with no problem after the bad block was
remapped. I added code to log and nop ata_raid_destroy, so I hope to notice
if it ever happens again.

-- 
Barney Wolff         http://www.databus.com/bwresume.pdf
I'm available by contract or FT, in the NYC metro area or via the 'Net.