From: Ian Dowse
To: Garance A Drosihn
cc: freebsd-current@freebsd.org
cc: Søren Schmidt
Date: Wed, 08 Dec 2004 01:47:53 +0000
Subject: Re: Another twist on WRITE_DMA issues <- ProblemFound
Message-ID: <200412080147.aa24331@salmon.maths.tcd.ie>
In-Reply-To: Your message of "Mon, 06 Dec 2004 19:51:56 EST."
List-Id: Discussions about the use of FreeBSD-current

Garance A Drosihn writes:
>That isn't it either. I think the hardware is just mocking me. I
>had zero problems for more than 24 hours. I then copied one set of
>partitions to another, booted up to that second set, and immediately
>I was back to having the above warnings/errors, and before long I
>had a system panic. And when I try to 'call doadump()', that fails
>with an error writing to the disk, so I can't get a core dump of it
>either.

If you don't mind the risk of triggering some asserts, you could
try the patches in http://people.freebsd.org/~iedowse/callout/ to
see if they make any difference. The aim of callout.diff is to
provide callouts with race-free semantics, and then callout_ata.diff
is an attempt to fix some timeout race conditions in the ATA driver
using the new API.
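
To illustrate the general kind of timeout race mentioned above, here
is a minimal user-space sketch in C. It is a toy model only: every
name in it (toy_callout, toy_stop, toy_fire and so on) is hypothetical
and does not come from the FreeBSD kernel, from callout.diff, or from
callout_ata.diff. It assumes a timer that has already been taken off
the queue when the driver completes the request and stops the timer,
and it shows one common way to close that window: the dispatcher
re-checks, just before calling the handler, whether the callout was
stopped since it was dequeued. In a real kernel that re-check would
have to happen under the same lock the driver holds when it stops the
timer; the sketch is single-threaded and replays the interleaving
step by step instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; none of these names are real kernel APIs. */
struct toy_request {
    bool completed;          /* the I/O finished normally */
    int  spurious_timeouts;  /* timeouts delivered after completion */
};

struct toy_callout {
    bool pending;            /* still queued, not yet picked up */
    unsigned generation;     /* bumped on every stop */
    struct toy_request *arg; /* argument passed to the handler */
};

/* Snapshot taken by the timer code when it dequeues a callout. */
struct toy_dispatch {
    struct toy_callout *c;
    unsigned generation;
};

/* Driver side: arm the timeout for a request. */
static void toy_schedule(struct toy_callout *c, struct toy_request *r)
{
    c->arg = r;
    c->pending = true;
}

/* Timer side, step 1: take the callout off the queue. */
static struct toy_dispatch toy_dequeue(struct toy_callout *c)
{
    struct toy_dispatch d = { c, c->generation };
    c->pending = false;
    return d;
}

/* Driver side: stop the timer. Returns false if it was already dequeued. */
static bool toy_stop(struct toy_callout *c)
{
    bool was_pending = c->pending;
    c->pending = false;
    c->generation++;
    return was_pending;
}

/* Timeout handler: a timeout for a completed request is spurious. */
static void toy_timeout_handler(struct toy_request *r)
{
    if (r->completed)
        r->spurious_timeouts++;
}

/* Timer side, step 2: call the handler, optionally re-checking first. */
static bool toy_fire(struct toy_dispatch d, bool recheck)
{
    if (recheck && d.c->generation != d.generation)
        return false;        /* stopped since dequeue: skip the handler */
    toy_timeout_handler(d.c->arg);
    return true;
}

/* Replay the racy interleaving; returns the number of spurious timeouts. */
int toy_simulate(bool recheck)
{
    struct toy_request req = { false, 0 };
    struct toy_callout co = { false, 0, NULL };

    toy_schedule(&co, &req);
    struct toy_dispatch d = toy_dequeue(&co); /* timer picks it up */
    req.completed = true;                     /* request completes */
    (void)toy_stop(&co);                      /* too late to unqueue */
    (void)toy_fire(d, recheck);               /* handler runs or is skipped */
    return req.spurious_timeouts;
}
```

Calling toy_simulate(false) replays the race and returns 1 (one
timeout delivered for a request that had already completed), while
toy_simulate(true) returns 0 because the dispatcher notices the stop
and skips the handler.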

The patches seem to have fixed the ATA timeout messages I was getting
after suspend/resume events, but it's a bit early to tell for sure
as I've only been using them on my laptop now for a few days.

Depending on what drivers you use (especially ethernet), you may
get panics with those patches. This is because the patched callout
code requires that Giant is held while stopping and starting
non-mpsafe timers. It is safe to #if 0 out the new mtx_assert
conditions in kern_timeout.c for now as a workaround.

Ian