Date: Wed, 19 Feb 2003 10:20:12 +1100 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: Ruslan Ermilov <ru@FreeBSD.ORG>
Cc: Alfred Perlstein <alfred@FreeBSD.ORG>, Thomas Moestl <tmm@FreeBSD.ORG>, Soren Schmidt <sos@FreeBSD.ORG>, <current@FreeBSD.ORG>
Subject: Re: cvs commit: src/sys/kern kern_intr.c src/sys/dev/ata ata-all.c
Message-ID: <20030219095525.R11144-100000@gamplex.bde.org>
In-Reply-To: <20030218102408.GA48010@sunbay.com>

On Tue, 18 Feb 2003, Ruslan Ermilov wrote:

> On Fri, Feb 14, 2003 at 05:10:40AM -0800, Alfred Perlstein wrote:
> > alfred      2003/02/14 05:10:40 PST
> >
> >   Modified files:
> >     sys/kern             kern_intr.c
> >     sys/dev/ata          ata-all.c
> >   Log:
> >   Fix crash dumps on ata and scsi.
> >
> [...]
> >   To fix ata, use what appears to be a polling method if we're dumping,
> >   I stole this from tmm but added code to ensure that this change is
> >   only in effect while dumping.
> >
> >   Tested by: des
> >
> FWIW, if I propagate this change to the !dumping case, it also
> fixes the ``resume stucks in "ata1: resetting devices .."'' bug
> I was having with my ThinkPad 600X:
>
> %%%
> Index: ata-all.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/dev/ata/ata-all.c,v
> retrieving revision 1.165
> diff -u -p -r1.165 ata-all.c
> --- ata-all.c   14 Feb 2003 13:10:40 -0000      1.165
> +++ ata-all.c   18 Feb 2003 10:08:22 -0000
> @@ -486,8 +486,7 @@ ata_getparam(struct ata_device *atadev,
>
>      /* apparently some devices needs this repeated */
>      do {
> -       if (ata_command(atadev, command, 0, 0, 0,
> -                       dumping ? ATA_WAIT_READY : ATA_WAIT_INTR)) {
> +       if (ata_command(atadev, command, 0, 0, 0, ATA_WAIT_READY)) {
>             ata_prtdev(atadev, "%s identify failed\n",
>                        command == ATA_C_ATAPI_IDENTIFY ? "ATAPI" : "ATA");
>             free(ata_parm, M_ATA);
> %%%

There is, or was, something near here that made the whole system go
unresponsive (as seen by nfs clients) for several seconds.  I guess the
main problem was just using polled mode in all cases here.  In RELENG_4,
polling is done at splbio() so normally only disk devices are blocked,
but under -current almost everything is blocked by Giant.

> The resume session (with apm(4)) now looks like this:
>
> : cbb0: PCI Memory allocated: 50103000
> : cbb1: PCI Memory allocated: 50102000
> : pcm0: detached
> : csa: card is Thinkpad 600X/A20/T20
> : pcm0: <CS461x PCM Audio> on csa0
> : pcm0: <Cirrus Logic CS4297A ac97 codec>
> : wakeup from sleeping state (slept 00:00:10)
> : ata0: resetting devices ..
> : done
> : ata1: resetting devices ..
> : ata1-slave: timeout waiting for cmd=ec s=01 e=24
> : ata1-slave: ATA identify failed
> : done

Apparently the timeout is too short or the interrupt got lost.

The timeout seems to be too short.  It is 10 seconds, but IIRC the spec
says 30 seconds for reset of the master and a bit more for the slave.
Since things work with polling, we know that the device state changed
properly.  We could test for this state change instead of always aborting
after the timeout, and use more, finer-grained sleeps to determine the
precise timeout required.

Bruce

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
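
The suggestion above (test for the state change, and sleep in small steps
to find out how long the device really needs) could be sketched roughly as
follows.  This is a user-space simulation, not the ata(4) driver code: the
sim_* names, the status bit values, the simulated clock and the timings
are all assumptions made up for illustration.

```c
#include <assert.h>

/* Hypothetical status bits, loosely modelled on an ATA status register. */
#define SIM_S_BUSY   0x80
#define SIM_S_READY  0x40

/* Simulated device: reports busy until the clock reaches ready_at_ms. */
struct sim_dev {
    long ready_at_ms;
};

/* Simulated clock in milliseconds; only sim_sleep_ms() advances it. */
static long sim_now_ms;

void sim_sleep_ms(long ms)
{
    sim_now_ms += ms;
}

unsigned char sim_read_status(const struct sim_dev *d)
{
    return (sim_now_ms >= d->ready_at_ms) ? SIM_S_READY : SIM_S_BUSY;
}

/*
 * Check the device state every step_ms instead of sleeping once for the
 * whole timeout.  Returns the elapsed time in ms when the device becomes
 * ready (so the required timeout can be measured), or -1 if max_ms
 * passes without the state changing.
 */
long wait_ready(const struct sim_dev *d, long step_ms, long max_ms)
{
    long start = sim_now_ms;

    assert(step_ms > 0);
    for (;;) {
        unsigned char s = sim_read_status(d);

        if ((s & SIM_S_BUSY) == 0 && (s & SIM_S_READY) != 0)
            return sim_now_ms - start;   /* state changed: done */
        if (sim_now_ms - start >= max_ms)
            return -1;                   /* budget exhausted */
        sim_sleep_ms(step_ms);
    }
}
```

With a made-up device that becomes ready after 12500 ms and the clock
reset to 0 before each call, wait_ready(&dev, 10, 10000) returns -1,
while wait_ready(&dev, 10, 31000) returns 12500, which is the kind of
measurement that would show how far short the 10 second timeout falls.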