From owner-freebsd-current@FreeBSD.ORG Sat Nov 13 16:41:26 2004
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 59CC816A4CE;
	Sat, 13 Nov 2004 16:41:26 +0000 (GMT)
Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57])
	by mx1.FreeBSD.org (Postfix) with ESMTP id D933343D31;
	Sat, 13 Nov 2004 16:41:25 +0000 (GMT)
	(envelope-from scottl@freebsd.org)
Received: from [192.168.254.11] (junior-wifi.samsco.home [192.168.254.11])
	(authenticated bits=0)
	by pooker.samsco.org (8.12.11/8.12.10) with ESMTP id iADGh8jV009642;
	Sat, 13 Nov 2004 09:43:09 -0700 (MST)
	(envelope-from scottl@freebsd.org)
Message-ID: <41963961.50306@freebsd.org>
Date: Sat, 13 Nov 2004 09:42:09 -0700
From: Scott Long
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.2) Gecko/20040929
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Poul-Henning Kamp
References: <26249.1100342074@critter.freebsd.dk>
In-Reply-To: <26249.1100342074@critter.freebsd.dk>
X-Enigmail-Version: 0.86.1.0
X-Enigmail-Supports: pgp-inline, pgp-mime
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, hits=0.0 required=3.8 tests=none autolearn=no version=2.63
X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on pooker.samsco.org
cc: Robert Watson
cc: Zoltan Frombach
cc: freebsd-current@freebsd.org
cc: Garance A Drosihn
cc: Søren Schmidt
Subject: Re: 5.3-RELEASE: WARNING - WRITE_DMA interrupt timout
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Sat, 13 Nov 2004 16:41:26 -0000

Poul-Henning Kamp wrote:
> In message <4195E1FF.5090906@DeepCore.dk>, Søren Schmidt writes:
>
>>>>Timeout is 5 secs, which is a pretty long time in this context IMHO..
>>>
>>>Five seconds counted from when?
>>
>>Now that's the nasty part :)
>>ATA starts the timeout when the request is issued to the device, so
>>theoretically the disk could take 4.9999 secs to complete the request
>>and then the timeout fires before the taskqueue gets its chance at it,
>>but IMHO that's pretty unlikely...
>
> I find that far more likely than kernel threads being stalled for that
> long. ATA disks doing bad-block stuff take several seconds on some
> of the disks I've had my hands on.
>

Bad block recovery takes a while, as do things like periodic thermal
recalibration. The IBM drives are famous for this 'feature'.

>
>>Anyhow, I can just remove the warning from ATA if that makes anyone
>>happy, since it's just a warning and ATA doesn't do anything with it at all.
>>However, IMNHO this points at a problem somewhere that we should better
>>understand and fix instead.
>
> I would prefer you reset the timer to five seconds in your interrupt
> routine so we can see exactly on which side of that the time is spent.
>

At least cancel the hardware timeout in the ithread. I don't doubt that
there are times when the system is going to get busy and not service
g_up right away (and thus the bio_taskqueue), and I won't argue that
this doesn't indicate buggy or poorly implemented code elsewhere in the
system. But the timeout warning that is given now does nothing to help
identify whatever real problem exists, and only confuses users.

Scott
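
The three timer policies discussed in this thread can be compared with a
small model. This is only an illustrative sketch, not code from the ATA
driver: the function name, the policy names and the warning strings are all
made up here, and the only figure taken from the thread is the five-second
timeout. It assumes a request spends device_time seconds in the drive before
the interrupt arrives, and then queue_delay seconds waiting for the taskqueue
to complete it.

```python
TIMEOUT = 5.0  # seconds; the figure quoted in the thread


def classify(device_time, queue_delay, policy, timeout=TIMEOUT):
    """Return the warnings a timer policy would emit (hypothetical model).

    device_time: seconds from issuing the request until the device interrupts.
    queue_delay: seconds from the interrupt until the taskqueue finishes.
    """
    if policy == "armed_at_issue":
        # One timer covers both phases, so a warning cannot say which
        # side was slow. This models the behaviour described in the thread.
        if device_time + queue_delay > timeout:
            return ["ambiguous timeout"]
        return []

    if policy == "cancel_in_ithread":
        # The timer is cancelled as soon as the interrupt is seen, so
        # only a slow device can trigger it.
        if device_time > timeout:
            return ["device timeout"]
        return []

    if policy == "rearm_in_ithread":
        # The timer is reset to the full timeout in the interrupt routine,
        # so each phase gets its own budget and its own warning.
        if device_time > timeout:
            return ["device timeout"]
        if queue_delay > timeout:
            return ["completion timeout"]
        return []

    raise ValueError("unknown policy: " + policy)
```

For the 4.9999-second case mentioned above, followed by a one-millisecond
wait for the taskqueue, the "armed_at_issue" policy reports a timeout even
though neither phase exceeded five seconds on its own, while the other two
policies stay silent. With "rearm_in_ithread", a request that interrupts
after one second and then waits six seconds for completion is reported as a
completion-side timeout, which is the distinction asked for in the quoted
text.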