Date: Tue, 15 Jul 2008 20:27:43 -0700 (PDT)
From: Matthew Dillon
Message-Id: <200807160327.m6G3Rh57012575@apollo.backplane.com>
To: Steve Bertrand
Cc: freebsd-stable@freebsd.org
Subject: Re: taskqueue timeout

:...
:> and see if the problem reoccurs with just two drives.
:
:... I knew that was going to come up... my response is "I worked so hard
:to get this system with ZFS all configured *exactly* how I wanted it".
:
:To test, I'm going to flip to 30 as per Matthews recommendation, and see
:how far that takes me. At this time, I'm only testing by backing up one
:machine on the network.
:If it fails, I'll clock the time, and then 'reformat' with two drives.
:
:Is there a technical reason this may work better with only two drives?
:
:Is there anyone interested to the point where remote login would be helpful?
:
:Steve

This issue is vexing a lot of people.

Setting the timeout to 30 will not affect performance, but it will
cause a 30-second delay in recovery when (if) the problem occurs.
That is, when the disk stalls it will just sit there doing nothing for
30 seconds, then print the timeout message and try to recover.

It occurs to me that it might be beneficial to actually measure the
disk's response time to each request, and then graph it over a period
of time. Maybe seeing the issue visually will give some clue as to
the actual cause.

-Matt
Matthew Dillon
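One rough way to try the measure-and-graph idea from userland is sketched below in Python. This is only an illustration, not anything from the ATA driver or the thread: the function names, the `span` parameter, and the choice of random-offset reads are all assumptions. It times individual reads and draws one text bar per sample, so a stall would show up as a bar far longer than its neighbours. Reads that are served from cache will look fast and never reach the disk, so treat the numbers as a hint rather than a measurement of the drive itself.

```python
import os
import random
import time


def measure_read_latencies(path, samples=100, block_size=4096,
                           interval=0.0, span=None):
    """Time `samples` reads of `block_size` bytes at random offsets.

    `span` (hypothetical parameter) is how many bytes of the target to
    sample across; it defaults to the size reported by fstat(), which
    may be 0 for device nodes, in which case only offset 0 is read.
    Returns a list of (seconds_since_start, latency_seconds) tuples.
    """
    results = []
    fd = os.open(path, os.O_RDONLY)
    try:
        if span is None:
            span = os.fstat(fd).st_size
        blocks = max(1, span // block_size)
        start = time.monotonic()
        for _ in range(samples):
            offset = random.randrange(blocks) * block_size
            t0 = time.monotonic()
            os.pread(fd, block_size, offset)
            t1 = time.monotonic()
            results.append((t0 - start, t1 - t0))
            if interval:
                time.sleep(interval)
    finally:
        os.close(fd)
    return results


def text_graph(results, width=60):
    """Render one bar per sample, scaled to the slowest read."""
    worst = max((lat for _, lat in results), default=0.0) or 1e-9
    lines = []
    for when, lat in results:
        bar = "#" * max(1, round(width * lat / worst))
        lines.append(f"{when:9.3f}s {lat * 1000:10.3f}ms {bar}")
    return "\n".join(lines)
```

For example, `print(text_graph(measure_read_latencies("/some/large/file", samples=50, interval=0.1)))` samples a placeholder path about ten times a second. Logging the tuples to a file over several hours and plotting them with any graphing tool would give the longer view suggested above.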