From: Christoph Mallon
Date: Sat, 31 Jan 2009 13:21:56 +0100
To: Dylan Alex Simon
Cc: freebsd-current@FreeBSD.ORG
Subject: Re: SATA DMA errors on second ICH10 bus
Message-ID: <49844264.7000300@gmx.de>
In-Reply-To: <20090131010855.GA7991@datura.dylex.net>

Dylan Alex Simon wrote:
>> That advice seems to be particularly valuable given the
>> current firmware problems that particular Seagate disks
>> are exhibiting.
>
> I've confirmed with Seagate and others that the firmware these disks already
> have (CC1F) is not affected by the firmware problems. The instability (as
> described in kern/130726) continues with a kernel from today. I've traced it
> down to exclusively and reliably being caused by access to disks on multiple
> channels simultaneously (access to any pair of disks on the same channel works
> fine). If anyone has any suggestions or any other data I should collect let
> me know as I will have to put these machines into production shortly (without
> freebsd unfortunately).

I suspect I see the same problem with some nvidia SATA controller. If there is high load on both channels of one controller, I get exactly the errors you showed.

Your kernel does not use INVARIANTS, is this correct? Otherwise you should see a very specific panic caused by a KASSERT().

I analysed the problem a bit. You can see my findings in the thread "Question about panic in brelse()". I suspect a hardware bug plus incorrect error handling in the FreeBSD driver.

As a workaround, I suggest you connect each disk to a separate controller, provided you do not have more disks than controllers.
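
For anyone who wants to check whether their own hardware shows the same behaviour, the trigger described above (simultaneous access to disks on different channels) could be exercised with something like the following. This is only a rough sketch, not something taken from the PR or the driver: the function names read_all and read_concurrently are made up for illustration, and any device paths you pass (for example /dev/ad4 and /dev/ad6) are assumptions that depend on your system. It only reads, so it should be harmless, but it needs read permission on the devices.

```python
import sys
import threading

CHUNK = 1024 * 1024  # read in 1 MiB pieces (a multiple of the sector size)


def read_all(path, results, limit=None):
    """Read `path` sequentially; store (bytes_read, error) in results[path].

    `limit` is an optional byte count after which reading stops; the last
    chunk may overshoot it slightly.
    """
    total = 0
    error = None
    try:
        # buffering=0 gives unbuffered reads, so each read() goes to the OS
        with open(path, "rb", buffering=0) as f:
            while limit is None or total < limit:
                data = f.read(CHUNK)
                if not data:  # end of file or device
                    break
                total += len(data)
    except OSError as exc:
        # An I/O error from the device (or a missing path) ends up here
        error = exc
    results[path] = (total, error)


def read_concurrently(paths, limit=None):
    """Read all `paths` at the same time, one thread per path.

    Paths must be distinct, since the result is keyed by path.
    """
    results = {}
    threads = [
        threading.Thread(target=read_all, args=(p, results, limit))
        for p in paths
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Hypothetical usage: python readtest.py /dev/ad4 /dev/ad6
    for path, (total, error) in read_concurrently(sys.argv[1:]).items():
        print(path, total, "OK" if error is None else error)
```

Running it once with two disks on the same channel and once with two disks on different channels, while watching the console and dmesg for DMA errors, should show whether the problem follows the channel pairing. The script only reports errors that reach userland; a panic or a driver timeout will show up on the console instead.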