Message-ID: <487CE4B8.5080900@ibctech.ca>
Date: Tue, 15 Jul 2008 13:56:08 -0400
From: Steve Bertrand
To: Matthew Dillon
Cc: freebsd-stable@freebsd.org
Subject: Re: taskqueue timeout
References: <487CCD46.8080506@ibctech.ca> <200807151711.m6FHBgVO007481@apollo.backplane.com>
In-Reply-To: <200807151711.m6FHBgVO007481@apollo.backplane.com>

Matthew Dillon wrote:
> If you are getting DMA timeouts, go to this URL:

Yes, I am.

> http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting

I fall under the category of "ATA/SATA DMA timeout issues".

> Then I would suggest going into /usr/src/sys/dev/ata (I think, on
> FreeBSD), locate all instances where request->timeout is set to 5,
> and change them all to 10.
>
>     cd /usr/src/sys/dev/ata
>     fgrep 'request->timeout' *.c
>     ... change all assignments of 5 to 10 ...
>
> Try that first. If it helps then it is a known issue. Basically
> a combination of the on-disk write cache and possible ECC corrections,
> remappings, or excessive remapped sectors can cause the drive to take
> much longer than normal to complete a request. The default 5-second
> timeout is insufficient.
>
> If it does help, post confirmation to prod the FBsd developers to
> change the timeouts.

I've just reproduced the problem, and will now try hacking the code to
see if the problem goes away.

Since the box won't take input, I can't tell what the disk usage is at
the time it dies. However, the problem seems to appear while running an
Amanda backup, when my network throughput hits about 90 Mbps at roughly
5 kpps.

I'll post back with the results of increasing the timeout.

Steve
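
For reference, the edit suggested in the quoted text could be scripted roughly as below. This is only a sketch: the function name bump_ata_timeouts is made up for illustration, and it assumes the assignments are spelled exactly "request->timeout = 5;" in the driver sources, which may not hold for every file. Check the grep output before trusting the rewrite.

```shell
#!/bin/sh
# Sketch only: raise ATA request timeouts from 5 to 10 seconds.
# Assumption: each assignment is written exactly as "request->timeout = 5;".
# bump_ata_timeouts is a hypothetical helper name, not part of FreeBSD.
bump_ata_timeouts() {
    dir="$1"

    # Show every line that touches the timeout so it can be reviewed first.
    grep -nF 'request->timeout' "$dir"/*.c

    # Rewrite only the files containing the exact assignment,
    # leaving a .orig copy of each one next to it.
    grep -lF 'request->timeout = 5;' "$dir"/*.c | while read -r f; do
        sed -i.orig 's/request->timeout = 5;/request->timeout = 10;/g' "$f"
    done
}
```

It would be called as "bump_ata_timeouts /usr/src/sys/dev/ata", followed by a kernel rebuild. Assignments with different spacing or a different value are left alone, so any lines the first grep prints that were not rewritten still need a manual look.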