Date: Mon, 13 Dec 2004 10:28:53 -0800 (PST)
From: Doug White <dwhite@gumbysoft.com>
To: Joe Rhett <jrhett@meer.net>
Cc: Søren Schmidt <sos@DeepCore.dk>
Subject: Re: drive failure during rebuild causes page fault
Message-ID: <20041213102333.V92964@carver.gumbysoft.com>
In-Reply-To: <20041213060549.GE78120@meer.net>
References: <20041213052628.GB78120@meer.net> <20041213054159.GC78120@meer.net> <20041213060549.GE78120@meer.net>

On Sun, 12 Dec 2004, Joe Rhett wrote:

> On Sun, Dec 12, 2004 at 09:59:16PM -0800, Doug White wrote:
> > Thats a nice shotgun you have there.
>
> Yessir. And that's what testing is designed to uncover. The question is
> why this works, and how do we prevent it?

I'm sure Soren appreciates you donating your feet to the cause :)

Why it works: the system assumes the administrator is competent enough to
not yank a disk that is being rebuilt to.

> Is there a proper way to handle these sort of events? If so, where is it
> documented?
>
> And fyi just pulling the drives causes the same failure so that means that
> RAID1 buys you nothing because your system will also crash.

This is why I don't trust ATA RAID for fault tolerance -- it'll save your
data, but the system will tank. Since the disk state is maintained by the
OS and not abstracted by a separate processor, if a disk dies in a
particularly bad way the system may not be able to cope.

-- 
Doug White                    |  FreeBSD: The Power to Serve
dwhite@gumbysoft.com          |  www.FreeBSD.org