From owner-freebsd-hackers@FreeBSD.ORG  Sun Jun 28 15:43:55 2009
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 156261065675
	for <freebsd-hackers@freebsd.org>; Sun, 28 Jun 2009 15:43:55 +0000 (UTC)
	(envelope-from kpielorz_lst@tdx.co.uk)
Received: from lorca.tdx.co.uk (lorca.tdx.co.uk [62.13.128.6])
	by mx1.freebsd.org (Postfix) with ESMTP id A6E918FC08
	for <freebsd-hackers@freebsd.org>; Sun, 28 Jun 2009 15:43:54 +0000 (UTC)
	(envelope-from kpielorz_lst@tdx.co.uk)
Received: from Octa64 (rainbow.tdx.co.uk [62.13.130.232] (may be forged))
	(authenticated bits=0)
	by lorca.tdx.co.uk (8.14.0/8.14.0/Kp) with ESMTP id n5SFUNJ1090360
	for <freebsd-hackers@freebsd.org>; Sun, 28 Jun 2009 16:30:23 +0100 (BST)
Date: Sun, 28 Jun 2009 16:30:24 +0100
From: Karl Pielorz <kpielorz_lst@tdx.co.uk>
To: freebsd-hackers@freebsd.org
Message-ID: <20E145B15D43DBD9A741F1DB@Octa64>
X-Mailer: Mulberry/4.0.8 (Win32)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Subject: ata 'Flush Cache' errors, on non-failing disk?
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
	<freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
	<mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
	<mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 28 Jun 2009 15:43:55 -0000


Hi,

I've recently updated my amd64 system from 6.4 to 7.2-STABLE. It works 
fine, but I've started seeing errors on the console:

  ad36: TIMEOUT - FLUSHCACHE retrying (1 retry left)

The drive (a WD5000AAKS) appears healthy - SMART reports no errors or 
problems - and the timeouts only appear when that drive is 'being hammered' 
by write requests (e.g. during ZFS re-silvering to it).

The Western-Digi drive doctor CD/ISO runs a full test, and reports no 
problems (in that machine, with that drive).

I did find a number of posts, such as:

 <http://lists.freebsd.org/pipermail/freebsd-current/2009-April/005939.html>

Which point to the default timeout for the ATA flushcache command being 5 
seconds, when perhaps it should be 30...

But the code in 7.2-STABLE bears no resemblance to the code that the patch 
is for - so I'm guessing things have moved on since then...

Is there anywhere I might apply a similar patch to up the timeout to see if 
that cures the problem?

The only mentions of ATA_FLUSHCACHE appear to be calls to 
"ata_controlcmd(xxxx, ATA_FLUSHCACHE, 0, 0, 0);". "ata_controlcmd" in turn 
seems to set a request timeout of '1' - but I can't tell whether that's 
1 second, 1 tick, or something else - or whether it's a timeout for adding 
the command to the queue, or for actually executing that command...

Is upping that request timeout conditionally for cache flushes likely to 
have the effect I'm looking for?
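
To make the question concrete, here is roughly the kind of change I have in 
mind. This is only a standalone sketch, not code from the 7.2-STABLE ata 
driver: the SKETCH_* names and sketch_control_timeout() are made up for 
illustration, and it assumes the timeout is counted in seconds, which is 
exactly what I'm unsure of. The two opcode values are the ATA FLUSH CACHE 
and FLUSH CACHE EXT commands.

```c
#include <stdint.h>

/* ATA opcodes for FLUSH CACHE (28-bit) and FLUSH CACHE EXT (48-bit). */
#define SKETCH_ATA_FLUSHCACHE    0xe7
#define SKETCH_ATA_FLUSHCACHE48  0xea

/*
 * Hypothetical timeout values. The unit is ASSUMED to be seconds;
 * '1' mirrors the value I see being set, '30' is the value suggested
 * in the freebsd-current post.
 */
#define SKETCH_DEFAULT_TIMEOUT   1
#define SKETCH_FLUSH_TIMEOUT     30

/*
 * Hypothetical helper: pick a request timeout based on the command,
 * giving cache flushes longer to complete than other control commands.
 */
static int
sketch_control_timeout(uint8_t command)
{
    switch (command) {
    case SKETCH_ATA_FLUSHCACHE:
    case SKETCH_ATA_FLUSHCACHE48:
        return (SKETCH_FLUSH_TIMEOUT);
    default:
        return (SKETCH_DEFAULT_TIMEOUT);
    }
}
```

The idea would be to use the result of something like this where the 
request timeout is currently set to '1', so only cache flushes get the 
longer limit and every other control command keeps its existing behaviour.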


-Kp