Date: Sat, 13 Nov 2004 13:15:46 +0100
From: Søren Schmidt <sos@DeepCore.dk>
To: Zoltan Frombach <tssajo@hotmail.com>
Cc: Robert Watson <rwatson@freebsd.org>
Subject: Re: 5.3-RELEASE: WARNING - WRITE_DMA interrupt timout
Message-ID: <4195FAF2.4050409@DeepCore.dk>
In-Reply-To: <BAY2-DAV4EoPXCFbN4T00016204@hotmail.com>
References: <26249.1100342074@critter.freebsd.dk> <4195E5DB.2070302@DeepCore.dk> <BAY2-DAV4EoPXCFbN4T00016204@hotmail.com>

Zoltan Frombach wrote:
> I will apply this patch first thing tomorrow. But I don't see how I will
> see any difference? Does it put something into a log file? Shouldn't it?

The change is that if you see the "WARNING interrupt seen" we know that
it was the upper layers that used up the 5 secs of timeout, not that
some of it was used by a disk being slow to respond.

-Søren

> Zoltan
>
> Poul-Henning Kamp wrote:
>
>> In message <4195E1FF.5090906@DeepCore.dk>, Søren Schmidt writes:
>>
>>>>> Timeout is 5 secs, which is a pretty long time in this context IMHO..
>>>>
>>>> Five seconds counted from when?
>>>
>>> Now that's the nasty part :)
>>> ATA starts the timeout when the request is issued to the device, so
>>> theoretically the disk could take 4.9999 secs to complete the request
>>> and then the timeout fires before the taskqueue gets its chance at
>>> it, but IMHO that's pretty unlikely...
>>
>> I find that far more likely than kernel threads being stalled for that
>> long. ATA disks doing bad-block stuff takes several seconds on some
>> of the disks I've had my hands on.
>>
>>> Anyhow, I can just remove the warning from ATA if that makes anyone
>>> happy, since it's just a warning and ATA doesn't do anything with it
>>> at all.
>>> However, IMNHO this points at a problem somewhere that we should
>>> better understand and fix instead.
>>
>> I would prefer you reset the timer to five seconds in your interrupt
>> routine so we can see exactly on which side of that the time is spent.
>
> It would be even better to time how long both ops take and be able to
> get that via a sysctl or something (I have that on my TODO list but it's
> loooong :) ).
>
> Anyhow resetting it is easy (patch against 5.3R):
>
> Index: ata-queue.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/dev/ata/ata-queue.c,v
> retrieving revision 1.32.2.5
> diff -u -r1.32.2.5 ata-queue.c
> --- ata-queue.c 24 Oct 2004 09:27:37 -0000      1.32.2.5
> +++ ata-queue.c 13 Nov 2004 10:44:40 -0000
> @@ -216,6 +216,9 @@
>                 ata_completed(request, 0);
>             }
>             else {
> +               if (!dumping)
> +                   callout_reset(&request->callout, request->timeout * hz,
> +                                 (timeout_t*)ata_timeout, request);
>                 if (request->bio && !(request->flags & ATA_R_TIMEOUT)) {
>                     ATA_DEBUG_RQ(request, "finish bio_taskqueue");
>                     bio_taskqueue(request->bio, (bio_task_t *)ata_completed,
>                                   request);
>
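
To make the effect of the quoted patch concrete, here is a minimal userspace sketch in C. It is not driver code: the struct, the enum, and both function names are hypothetical, times are in milliseconds, and the model assumes a request has just two phases (issue to interrupt, then interrupt to completion by the taskqueue). It only illustrates why re-arming the timer in the interrupt path lets a later warning be attributed to the upper layers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one request; not a structure from the ATA driver. */
struct request_timing {
    long device_ms;     /* issue -> interrupt: time the disk took to respond */
    long completion_ms; /* interrupt -> completion: time waiting on the taskqueue */
};

enum timeout_cause { NO_TIMEOUT, DEVICE_SLOW, UPPER_LAYERS_SLOW };

/*
 * Timer armed once, when the request is issued: both phases share a
 * single budget, so a timeout does not say which phase was slow.
 */
static bool
timeout_fires_single_arm(const struct request_timing *t, long timeout_ms)
{
    assert(timeout_ms > 0);
    return (t->device_ms + t->completion_ms > timeout_ms);
}

/*
 * Timer re-armed when the interrupt arrives: each phase gets its own
 * full budget, so a timeout after the interrupt can only come from
 * the completion phase.
 */
static enum timeout_cause
timeout_cause_rearmed(const struct request_timing *t, long timeout_ms)
{
    assert(timeout_ms > 0);
    if (t->device_ms > timeout_ms)
        return (DEVICE_SLOW);
    if (t->completion_ms > timeout_ms)
        return (UPPER_LAYERS_SLOW);
    return (NO_TIMEOUT);
}
```

With a 5000 ms timeout, a disk that answers after 4999 ms followed by a 2 ms completion delay trips the single-arm timer but not the re-armed one, which is the "4.9999 secs" case discussed in the thread.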