From owner-freebsd-stable@FreeBSD.ORG  Mon Feb 11 18:35:39 2008
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Date: Mon, 11 Feb 2008 19:35:37 +0100
To: Clifton Royston <cliftonr@lava.net>
Message-ID: <20080211183537.GA6497@marshal.spacemarines.us>
References: <479A0731.6020405@skyrush.com>
	<20080125162940.GA38494@eos.sc1.parodius.com>
	<479A3764.6050800@skyrush.com>
	<3803988D-8D18-4E89-92EA-19BF62FD2395@mac.com>
	<479A4CB0.5080206@skyrush.com>
	<20080126003845.GA52183@eos.sc1.parodius.com>
	<20080211120057.GA5821@marshal.spacemarines.us>
	<20080211172454.GB5323@lava.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20080211172454.GB5323@lava.net>
User-Agent: Mutt/1.5.13 (2006-08-11)
From: remco@spacemarines.us (Remco van Bekkum)
Cc: freebsd-stable@freebsd.org
Subject: Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

On Mon, Feb 11, 2008 at 07:24:55AM -1000, Clifton Royston wrote:
> On Mon, Feb 11, 2008 at 01:00:57PM +0100, Remco van Bekkum wrote:
> > On Fri, Jan 25, 2008 at 04:38:46PM -0800, Jeremy Chadwick wrote:
> > After having replaced my first SATA disk with one of the same type,
> > having still the same errors, I replaced this 1TB drive with 4x500GB
> > Hitachi P7K500 in raidz. It worked fine for a week, but yesterday I
> > cvsupped and rebuild world. This afternoon everything is breaking down
> > again with the same errors:
> > 
> > Feb 11 12:34:09 xaero kernel: ad6: WARNING - SETFEATURES SET TRANSFER
> > MODE taskqueue timeout - completing request directly
> > Feb 11 12:34:13 xaero kernel: ad6: WARNING - SETFEATURES SET TRANSFER
> > MODE taskqueue timeout - completing request directly
> > Feb 11 12:34:17 xaero kernel: ad6: WARNING - SETFEATURES ENABLE RCACHE
> > taskqueue timeout - completing request directly
> > Feb 11 12:34:21 xaero kernel: ad6: WARNING - SETFEATURES ENABLE WCACHE
> > taskqueue timeout - completing request directly
> > Feb 11 12:34:25 xaero kernel: ad6: WARNING - SET_MULTI taskqueue timeout
> > - completing request directly
> > Feb 11 12:34:25 xaero kernel: ad6: FAILURE - WRITE_DMA48 timed out
> > LBA=298014274
> 
>   Did you try replacing cabling as a previous poster recommended?  I've
> had similar problems with both traditional parallel ATA and SATA due to
> marginal cables, which of course are not solved by swapping drives.
> 
>   Not saying there's not a software problem here, just that there is
> still one area to eliminate.
>   -- Clifton

Hi Clifton,

I don't recall exactly anymore, but at least 3 cables have been used
without problems on other systems. The mainboard also acts weird
sometimes: when I press the reset button, it sometimes powers down
instead.

Also, I just did a reset after the system deadlocked on shutdown because
of the errors, and when it booted, 2 disks were not seen by the BIOS.
I had to power down the box, and when it came up again the disks were
back. Can software leave the disks in a state where the BIOS doesn't
detect them after the reset button is pressed?

I'm 100% certain that I got the same errors on my previous installation,
in a completely different system. That should normally mean either
software or disk. The disk has been replaced; the OS is the same. Either
I'm having really bad luck or something else is wrong.

What is a good way of stress testing disks?

Thanks!

- Remco