From: Søren Schmidt
Date: Tue, 14 Dec 2004 07:58:53 +0100
To: Doug White
Cc: Joe Rhett, freebsd-stable@freebsd.org
Subject: Re: drive failure during rebuild causes page fault

Doug White wrote:
> On Mon, 13 Dec 2004, Joe Rhett wrote:
>
>
>>>This is why I don't trust ATA RAID for fault tolerance -- it'll save your
>>>data, but the system will tank. Since the disk state is maintained by
>>>the OS and not abstracted by a separate processor, if a disk dies in a
>>>particularly bad way the system may not be able to cope.
>>
>>Yes, but SATA isn't limited by this problem. It does have a processor per
>>disk. (this is all SATA, if I didn't make that clear)
>
> Actually on SATA it's worse -- the disk just stops responding to everything
> and hangs. If you don't detect this condition then you go into an
> infinite wait.
>
> In any case, yes the ATA RAID code could use a massive robustness pass. So
> could the core ATA code. Patches accepted :)

Actually I'm in the process of rewriting the ATA RAID code, so things
are rolling, albeit slowly; time is a precious resource. I believe that
it can be made pretty robust, but the rest of the kernel still has
issues with disappearing devices etc. that are outside ATA's realm.

Anyhow, I can only test with the HW I have here in the lab, which by no
means covers all possible permutations, so testing etc. by the community
is very much needed here to get things sorted out...

-- 
-Søren