Date: Sun, 14 Nov 2004 10:27:32 +0100
From: Frode Nordahl <frode@nordahl.net>
To: Søren Schmidt <sos@DeepCore.dk>
Cc: Garance A Drosihn <drosih@rpi.edu>
Subject: Re: 5.3-RELEASE: WARNING - WRITE_DMA interrupt timout
Message-ID: <69854275-361F-11D9-B78A-000A95A9A574@nordahl.net>
In-Reply-To: <4195E1FF.5090906@DeepCore.dk>
References: <25983.1100341229@critter.freebsd.dk> <4195E1FF.5090906@DeepCore.dk>

On Nov 13, 2004, at 11:29, Søren Schmidt wrote:

> Poul-Henning Kamp wrote:
>> In message <4195DB3E.2040807@DeepCore.dk>, Søren Schmidt writes:
>>>> It is not really the task of the ata driver to fail requests at that
>>>> time. How long is the timeout anyway ?
>>>
>>> Oh, ATA doesn't fail them, it just yells that the request hasn't
>>> been finished yet by the upper layers, it doesn't do anything to the
>>> request.
>>>
>>> Timeout is 5 secs, which is a pretty long time in this context IMHO..
>> Five seconds counted from when ?
>
> Now that's the nasty part :)
> ATA starts the timeout when the request is issued to the device, so
> theoretically the disk could take 4.9999 secs to complete the request
> and then the timeout fires before the taskqueue gets its chance at it,
> but IMHO that's pretty unlikely...
>
> Anyhow, I can just remove the warning from ATA if that makes anyone
> happy, since it's just a warning and ATA doesn't do anything with it at
> all.
> However, IMNHO this points at a problem somewhere that we should
> better understand and fix instead.

Please don't remove the warning until we can find and fix this problem!
Even if it may not be ATA related, your warning is by now the only way
to tell me when the problem occurs :-)

I have two brand new systems on which I have installed 5.3-R, and both
show this problem. They are two-way 3.06GHz Xeons, and when run in SMP
mode, the system will often panic shortly after these warnings occur,
most often in UFS code.
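
The timing window Søren describes in the quoted text above can be modelled in a few lines. This is only a toy sketch, not the ata driver's code: the function name and the split into "device time" and "taskqueue delay" are assumptions made for illustration. The only figure taken from the thread is the 5-second timeout, counted from when the request is issued to the device.

```c
#include <assert.h>
#include <stdbool.h>

/* From the thread: the timeout is 5 seconds, started at issue time. */
#define ATA_TIMEOUT_SECS 5.0

/*
 * Hypothetical model (not driver code): the warning fires when the
 * request has not been finished by the upper layers before the timer
 * expires. The elapsed time is the time the device needs plus the time
 * the completion waits before the taskqueue gets to it.
 */
static bool
timeout_warning_fires(double device_secs, double taskqueue_delay_secs)
{
	assert(device_secs >= 0.0 && taskqueue_delay_secs >= 0.0);
	return (device_secs + taskqueue_delay_secs >= ATA_TIMEOUT_SECS);
}
```

In this model a disk that answers in 4.9999 seconds triggers the warning as soon as the taskqueue is more than 0.1 ms late, but so does a disk that answers in 10 ms if the completion is left unprocessed for about 5 seconds. The warning alone therefore cannot tell a slow device from a stalled upper layer.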

Since we don't know what or where the problem is, the traces might be
completely bogus, but I include them anyway:
http://home.powertech.no/frode/freebsd/

I have just installed CURRENT on one of them. I haven't gotten around to
making it crash yet, but I get this:

ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=119959919
ad4: FAILURE - WRITE_DMA timed out
g_vfs_done():ad4s1f[WRITE(offset=43971141632, length=2048)]error = 5
initiate_write_filepage: already started
ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=120428639
ad4: FAILURE - WRITE_DMA timed out
g_vfs_done():ad4s1f[WRITE(offset=44211126272, length=16384)]error = 5

From ffs_softdep.c:

        if (pagedep->pd_state & IOSTARTED) {
                /*
                 * This can only happen if there is a driver that does not
                 * understand chaining. Here biodone will reissue the call
                 * to strategy for the incomplete buffers.
                 */
                printf("initiate_write_filepage: already started\n");
                return;
        }

Regards,
Frode Nordahl

> --
>
> -Søren
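
One way to sanity-check the log lines above is to see whether each ad4 LBA and the g_vfs_done() offset that follows it describe the same write. The sketch below assumes 512-byte sectors, assumes the reported LBA is the first sector of the failed request, and assumes each g_vfs_done() line belongs to the ad4 lines just before it. None of this is stated in the message, and the function name is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 512-byte sectors (not stated in the message). */
#define SECTOR_SIZE 512

/*
 * Given the absolute LBA from an "ad4: TIMEOUT" line and the byte offset
 * from the matching g_vfs_done() line (an offset within ad4s1f), return
 * the disk sector at which ad4s1f would have to start for the two
 * numbers to describe the same write.
 */
static int64_t
implied_partition_start(int64_t disk_lba, int64_t partition_offset_bytes)
{
	assert(partition_offset_bytes % SECTOR_SIZE == 0);
	return (disk_lba - partition_offset_bytes / SECTOR_SIZE);
}
```

Both pairs from the log, (119959919, 43971141632) and (120428639, 44211126272), give the same implied start sector, 34078783. Under the assumptions above, that agrees with the failed WRITE_DMA requests being the same writes that g_vfs_done() reports with error 5.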