From owner-freebsd-stable@FreeBSD.ORG  Wed Jul 16 03:27:43 2008
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C1F121065675
	for <freebsd-stable@freebsd.org>; Wed, 16 Jul 2008 03:27:43 +0000 (UTC)
	(envelope-from dillon@apollo.backplane.com)
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2])
	by mx1.freebsd.org (Postfix) with ESMTP id 915058FC1C
	for <freebsd-stable@freebsd.org>; Wed, 16 Jul 2008 03:27:43 +0000 (UTC)
	(envelope-from dillon@apollo.backplane.com)
Received: from apollo.backplane.com (localhost [127.0.0.1])
	by apollo.backplane.com (8.14.1/8.14.1) with ESMTP id m6G3Rh5G012576;
	Tue, 15 Jul 2008 20:27:43 -0700 (PDT)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.14.1/8.13.4/Submit) id m6G3Rh57012575;
	Tue, 15 Jul 2008 20:27:43 -0700 (PDT)
Date: Tue, 15 Jul 2008 20:27:43 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <200807160327.m6G3Rh57012575@apollo.backplane.com>
To: Steve Bertrand <steve@ibctech.ca>
References: <487CCD46.8080506@ibctech.ca>	<200807151711.m6FHBgVO007481@apollo.backplane.com>	<487CF077.2040201@ibctech.ca>
	<487CFA08.5000308@ibctech.ca>
	<200807151955.m6FJtf77008969@apollo.backplane.com>
	<487D5D08.9070102@ibctech.ca>
Cc: freebsd-stable@freebsd.org
Subject: Re: taskqueue timeout


:...
:>     and see if the problem reoccurs with just two drives.
:
:... I knew that was going to come up... my response is "I worked so hard 
:to get this system with ZFS all configured *exactly* how I wanted it".
:
:To test, I'm going to flip to 30 as per Matthew's recommendation, and see 
:how far that takes me. At this time, I'm only testing by backing up one 
:machine on the network. If it fails, I'll clock the time, and then 
:'reformat' with two drives.
:
:Is there a technical reason this may work better with only two drives?
:
:Is there anyone interested to the point where remote login would be helpful?
:
:Steve

    This issue is vexing a lot of people.

    Setting the timeout to 30 will not affect performance, but it will
    cause a 30 second delay in recovery when (if) the problem occurs.
    That is, when the disk stalls it will just sit there doing nothing
    for 30 seconds, then print the timeout message and try to recover.

    It occurs to me that it might be beneficial to actually measure the
    disk's response time to each request, and then graph it over a period
    of time.  Maybe seeing the issue visually will give some clue as to the
    actual cause.
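
    As a rough illustration of that kind of measurement, here is a
    minimal userland sketch in Python.  It is an assumption about how
    one might do it, not something taken from the driver: the function
    names (time_reads, find_stalls, ascii_graph) and the 1 second stall
    threshold are made up for the example.  It times each read request
    individually and draws one bar per request, so a stall shows up as
    a bar far longer than its neighbors.  Reads of a regular file can
    be satisfied from the OS cache, so for a meaningful result it would
    have to be pointed at the raw disk device, or at a file much larger
    than RAM.

```python
import time


def time_reads(path, block_size=65536, count=100):
    """Read up to `count` blocks from `path`; return (offset, seconds) pairs."""
    samples = []
    # buffering=0 bypasses Python's own buffer; the OS cache still applies.
    with open(path, "rb", buffering=0) as f:
        for i in range(count):
            offset = i * block_size
            f.seek(offset)
            start = time.monotonic()
            data = f.read(block_size)
            elapsed = time.monotonic() - start
            if not data:
                break  # reached the end of the file or device
            samples.append((offset, elapsed))
    return samples


def find_stalls(samples, threshold=1.0):
    """Return the samples whose latency is at least `threshold` seconds."""
    return [(off, secs) for off, secs in samples if secs >= threshold]


def ascii_graph(samples, width=50):
    """Render one bar per request, scaled to the slowest request seen."""
    if not samples:
        return ""
    peak = max(secs for _, secs in samples) or 1e-9  # avoid dividing by zero
    lines = []
    for off, secs in samples:
        bar = "#" * max(1, int(width * secs / peak))
        lines.append(f"{off:>12d} {secs * 1000:9.3f} ms {bar}")
    return "\n".join(lines)
```

    Running time_reads() in a loop during a backup and feeding the
    result to ascii_graph() gives a crude picture over time.  Note that
    this only sees latency as observed by a userland process; it cannot
    tell a stalled disk apart from a stalled controller or driver.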

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>