From: "Zoltan Frombach"
To: Søren Schmidt, "Poul-Henning Kamp"
Cc: Garance A Drosihn, freebsd-current@freebsd.org, Robert Watson
Date: Sat, 13 Nov 2004 04:10:12 -0800
Subject: Re: 5.3-RELEASE: WARNING - WRITE_DMA interrupt timout
References: <26249.1100342074@critter.freebsd.dk> <4195E5DB.2070302@DeepCore.dk>

I will apply this patch first thing tomorrow. But I don't see how I will
notice any difference. Does it put something into a log file? Shouldn't it?

Zoltan

Poul-Henning Kamp wrote:
> In message <4195E1FF.5090906@DeepCore.dk>, Søren Schmidt writes:
>
>>>>Timeout is 5 secs, which is a pretty long time in this context IMHO..
>>>
>>>Five seconds counted from when ?
>>
>>Now thats the nasty part :)
>>ATA starts the timeout when the request is issued to the device, so
>>theoretically the disk could take 4.9999 secs to complete the request and
>>then the timeout fires before the taskqueue gets its chance at it, but
>>IMHO thats pretty unlikely...
>
> I find that far more likely than kernel threads being stalled for that
> long. ATA disks doing bad-block stuff takes several seconds on some
> of the disks I've had my hands on.
>
>>Anyhow, I can just remove the warning from ATA if that makes anyone happy,
>>since its just a warning and ATA doesn't do anything with it at all.
>>However, IMNHO this points at a problem somewhere that we should better
>>understand and fix instead.
>
> I would prefer you reset the timer to five seconds in your interrupt
> routine so we can see exactly on which side of that the time is spent.

It would be even better to time how long both ops take and be able to get
that via a sysctl or something (I have that on my TODO list but its
loooong :) ).

Anyhow resetting it is easy (patch against 5.3R):

Index: ata-queue.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/ata/ata-queue.c,v
retrieving revision 1.32.2.5
diff -u -r1.32.2.5 ata-queue.c
--- ata-queue.c 24 Oct 2004 09:27:37 -0000      1.32.2.5
+++ ata-queue.c 13 Nov 2004 10:44:40 -0000
@@ -216,6 +216,9 @@
        ata_completed(request, 0);
     }
     else {
+       if (!dumping)
+           callout_reset(&request->callout, request->timeout * hz,
+                         (timeout_t*)ata_timeout, request);
        if (request->bio && !(request->flags & ATA_R_TIMEOUT)) {
            ATA_DEBUG_RQ(request, "finish bio_taskqueue");
            bio_taskqueue(request->bio, (bio_task_t *)ata_completed, request);

-- 
-Søren
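
To make the timing argument in the thread concrete, here is a rough userland sketch of the two cases. It is not driver code: the function names and the millisecond model are assumptions made up for illustration. It assumes the timer is armed when the request is issued and only cancelled once the completion task has run, and that the patch re-arms it when the interrupt arrives.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model, NOT the ATA driver. All times are in milliseconds.
 *   device_ms: time from issuing the request until the interrupt arrives
 *   taskq_ms:  time from the interrupt until the completion task runs
 */

/* Unpatched: one timer, armed at issue time, has to cover both phases. */
static bool warns_without_reset(int timeout_ms, int device_ms, int taskq_ms)
{
    assert(timeout_ms > 0);
    return device_ms + taskq_ms >= timeout_ms;
}

/* Patched: a timeout before the interrupt means the device itself was slow. */
static bool warns_in_device_phase(int timeout_ms, int device_ms)
{
    assert(timeout_ms > 0);
    return device_ms >= timeout_ms;
}

/* Patched: the timer is re-armed at interrupt time, so a timeout here
 * means the completion task was delayed for a full timeout period. */
static bool warns_in_taskq_phase(int timeout_ms, int taskq_ms)
{
    assert(timeout_ms > 0);
    return taskq_ms >= timeout_ms;
}
```

With a 5000 ms timeout, a disk that needs 4999 ms followed by a 2 ms taskqueue delay trips the single timer in this model, but neither phase trips it once the timer is re-armed at interrupt time. A warning that still appears after the patch would then point at whichever phase really took five seconds on its own.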