From owner-freebsd-current@FreeBSD.ORG Tue Oct 2 18:10:18 2007
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id ECB4516A419
	for ; Tue, 2 Oct 2007 18:10:18 +0000 (UTC)
	(envelope-from stevenschlansker@berkeley.edu)
Received: from smtp-out1.berkeley.edu (smtp-out1.Berkeley.EDU [128.32.61.106])
	by mx1.freebsd.org (Postfix) with ESMTP id C85C013C4A6
	for ; Tue, 2 Oct 2007 18:10:18 +0000 (UTC)
	(envelope-from stevenschlansker@berkeley.edu)
Received: from eva-wlan-110.airbears.berkeley.edu ([169.229.253.125])
	by fe2.calmail with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.68)
	(auth plain:stevenschlansker@berkeley.edu) (envelope-from )
	id 1IcmCI-0005FC-6k
	for freebsd-current@freebsd.org; Tue, 02 Oct 2007 11:10:18 -0700
Message-ID: <47028989.9080300@berkeley.edu>
Date: Tue, 02 Oct 2007 11:10:17 -0700
From: Steven Schlansker
User-Agent: Thunderbird 2.0.0.6 (X11/20070924)
MIME-Version: 1.0
To: freebsd-current@freebsd.org
References: <4701FE7C.8020200@berkeley.edu>
	<20071002143044.GL1693@garage.freebsd.pl>
In-Reply-To: <20071002143044.GL1693@garage.freebsd.pl>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: Repeatable kernel panic on -CURRENT using ZFS over SATA
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Tue, 02 Oct 2007 18:10:19 -0000

Pawel Jakub Dawidek wrote:
> On Tue, Oct 02, 2007 at 01:17:00AM -0700, Steven Schlansker wrote:
>> Hello everyone,
>> I recently set up a 6 drive SATA raidz2. Whenever I try to use the
>> array, the dmesg fills up with warnings that WRITE_DMA must be retried
>> (repeatedly)
>>
>> As soon as I remove the load, everything runs fine.
>>
>> Dmesg with errors here:
>> http://soda.csua.berkeley.edu/~steven/dmesg.txt
>>
>> The eventual end result:
>> http://soda.csua.berkeley.edu/~steven/Image053.jpg
>>
>>
>> The only references I can find to similar problems were either not
>> resolved, or seemed to be related to a chipset which I am not using.
>>
>> Is this a known issue? How can I make this machine stable? Is there
>> any more information I can provide to aid debugging? Thanks so very much,
>
> This looks like a problem a couple of folks already reported. For me it
> looks like ATA bug, as if I recall correctly various controllers from
> various vendors are affected. Unfortunately Soren isn't very active
> lately. As a work-around you may try disabling write cache on your
> disks (hw.ata.wc=0 to /boot/loader.conf), but this may only help to
> mitigate the problem.
>

I tried disabling the write cache, however that didn't do much. I think
the frequency of the WRITE_DMA timeouts decreased, but they are
definitely still happening.

Are there any other things I can try? I'd really like to get this
working, as I just spent a thousand dollars on all this equipment, and
to find out it can't stay online for more than a few minutes is quite
saddening...

I can try to help debug the problem if someone will guide me along - the
system is a production system but nobody will know if it crashes a few
times, so I'm perfectly willing to try things and panic it or whatever.
I'd like to help quash the bug, but I do not have the kernel knowledge
to do it myself, only the hardware that causes it :)

Any other suggestions are also welcome.

Thanks,
Steven