From owner-freebsd-stable@FreeBSD.ORG  Wed Jul 16 01:10:26 2008
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 98F92106566C
	for <freebsd-stable@freebsd.org>; Wed, 16 Jul 2008 01:10:26 +0000 (UTC)
	(envelope-from andrew@modulus.org)
Received: from email.octopus.com.au (host-122-100-2-232.octopus.com.au
	[122.100.2.232])
	by mx1.freebsd.org (Postfix) with ESMTP id 510198FC08
	for <freebsd-stable@freebsd.org>; Wed, 16 Jul 2008 01:10:25 +0000 (UTC)
	(envelope-from andrew@modulus.org)
Received: by email.octopus.com.au (Postfix, from userid 1002)
	id 399CD17369; Wed, 16 Jul 2008 11:10:24 +1000 (EST)
X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on email.octopus.com.au
X-Spam-Level: 
X-Spam-Status: No, score=-1.4 required=10.0 tests=ALL_TRUSTED autolearn=failed
	version=3.2.3
Received: from [10.1.50.60] (138.21.96.58.exetel.com.au [58.96.21.138])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	(Authenticated sender: admin@email.octopus.com.au)
	by email.octopus.com.au (Postfix) with ESMTP id 54FD11736B
	for <freebsd-stable@freebsd.org>; Wed, 16 Jul 2008 11:10:19 +1000 (EST)
Message-ID: <487D4A2A.9010508@modulus.org>
Date: Wed, 16 Jul 2008 11:08:58 +1000
From: Andrew Snow <andrew@modulus.org>
User-Agent: Thunderbird 2.0.0.14 (X11/20080523)
MIME-Version: 1.0
To: freebsd-stable@freebsd.org
References: <487CCD46.8080506@ibctech.ca>
	<200807151711.m6FHBgVO007481@apollo.backplane.com>
In-Reply-To: <200807151711.m6FHBgVO007481@apollo.backplane.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: taskqueue timeout
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 16 Jul 2008 01:10:26 -0000

Matthew Dillon wrote:
>     Try that first.  If it helps then it is a known issue.  Basically
>     a combination of the on-disk write cache and possible ECC corrections,
>     remappings, or excessive remapped sectors can cause the drive to take
>     much longer then normal to complete a request.  The default 5-second
>     timeout is insufficient.

 From Western Digital's line of "enterprise" drives:

"RAID-specific time-limited error recovery (TLER) - Pioneered by WD, 
this feature prevents drive fallout caused by the extended hard drive 
error-recovery processes common to desktop drives."


Western Digital's information sheet on TLER states that they found most 
RAID controllers will wait 8 seconds for a disk to respond before 
dropping it from the RAID set.  Consequently they changed their 
"enterprise" drives to try reading a bad sector for only 7 seconds 
before returning an error.

Therefore I think the FreeBSD timeout should also be set to 8 seconds 
instead of 5 seconds.  Desktop-targeted drives can stall for more than 
10 seconds, and sometimes for minutes, while they retry a bad sector, so 
it's not worth setting the FreeBSD timeout any higher.
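
To make the arithmetic concrete, here is a minimal sketch that compares 
the figures quoted in this thread. The constant and function names are 
made up for illustration, and nothing here touches a real FreeBSD 
interface. It only checks whether a drive that gives up after a fixed 
recovery limit would report its error before the host timeout expires.

```python
# Timeout figures quoted in this thread, in seconds.
FREEBSD_DEFAULT_TIMEOUT = 5   # default timeout mentioned by Matthew Dillon
RAID_CONTROLLER_TIMEOUT = 8   # typical controller wait, per WD's TLER sheet
TLER_RECOVERY_LIMIT = 7       # WD "enterprise" drives stop retrying after this


def drive_answers_in_time(host_timeout, recovery_limit):
    """Return True if the drive reports its error before the host gives up."""
    return recovery_limit < host_timeout


def check(label, host_timeout, recovery_limit=TLER_RECOVERY_LIMIT):
    """Describe which side wins the race for one hypothetical timeout setting."""
    ok = drive_answers_in_time(host_timeout, recovery_limit)
    verdict = "drive answers first" if ok else "host times out first"
    return f"{label}: host {host_timeout}s vs drive {recovery_limit}s -> {verdict}"


if __name__ == "__main__":
    print(check("current default", FREEBSD_DEFAULT_TIMEOUT))
    print(check("proposed", RAID_CONTROLLER_TIMEOUT))
```

With a 5-second host timeout, a TLER drive still in its 7-second 
recovery window loses the race. With 8 seconds it does not. A desktop 
drive that retries for 10 seconds or longer loses either way, which is 
why going beyond 8 seconds buys little.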


More info:
http://www.wdc.com/en/library/sata/2579-001098.pdf
http://en.wikipedia.org/wiki/Time-Limited_Error_Recovery


- Andrew