Date: Tue, 13 Apr 1999 10:23:21 -0600 (MDT)
From: "Kenneth D. Merry" <ken@plutotech.com>
To: asami@cs.berkeley.edu (Satoshi Asami)
Cc: scsi@FreeBSD.ORG
Subject: Re: timed out while idle?
Message-ID: <199904131623.KAA03308@panzer.plutotech.com>
In-Reply-To: <199904131010.DAA47254@silvia.hip.berkeley.edu> from Satoshi Asami at "Apr 13, 1999 3:10:41 am"

Satoshi Asami wrote...
> Hi Justin, Ken and others,
>
> What exactly does "timed out while idle" mean? We're still seeing
> these stuff from time to time:
>
> ===
> Apr 1 18:34:47 m0 /kernel: (da44:ahc2:0:12:0): SCB 0x30 - timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
> Apr 1 18:34:47 m0 /kernel: SEQADDR == 0x8
> Apr 1 18:34:47 m0 /kernel: SSTAT1 == 0xa
> Apr 1 18:34:47 m0 /kernel: (da44:ahc2:0:12:0): Queuing a BDR SCB
> Apr 1 18:34:47 m0 /kernel: (da44:ahc2:0:12:0): Bus Device Reset Message Sent
> Apr 1 18:34:47 m0 /kernel: (da44:ahc2:0:12:0): no longer in timeout, status = 34b
> Apr 1 18:34:47 m0 /kernel: ahc2: Bus Device Reset on A:12. 1 SCBs aborted

The "timed out while idle" message means that the drive took longer than
the timeout (60 seconds) to respond to a read or write request, and
nothing was going on on the bus at the time.

In other words, your drive went out to lunch, and we hit it with a BDR to
get it to come back.

> Some of these eventually lead to panics or hangs.
>
> These are the same IBM disks we asked about a while ago.
>
> ===
> da34 at ahc2 bus 0 target 2 lun 0
> da34: <IBM OEM DCHS09Y 2424> Fixed Direct Access SCSI2 device
> da34: 20.0MB/s transfers (10.0MHz, offset 8, 16bit), Tagged Queueing Enabled
> da34: 8689MB (17796077 512 byte sectors: 255H 63S/T 1107C)
> ===
>
> I traced the message to ahc_timeout() in aic7xxx.c but not being a
> kernel hacker myself I can't really tell where it's called from. Is
> this like one of those alarm clocks ("wake me up in 5msecs if nothing
> happens")?

Yep. There's a timeout for each transaction. If the transaction doesn't
complete in the specified period of time (60 seconds for disk
reads/writes), the timeout fires, a BDR is sent and all transactions that
were queued to the disk are requeued.
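
As a rough illustration of the per-transaction timeout described above,
here is a minimal C sketch. It is not the aic7xxx driver code: the struct,
the function names and the integer "seconds" clock are all made up for
illustration, and the only facts taken from this message are the 60 second
disk timeout and the "timeout fires, reset the device, requeue everything"
behaviour.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only; none of these names come from aic7xxx.c. */
#define DISK_IO_TIMEOUT_SECS 60 /* disk read/write timeout quoted above */

enum txn_state { TXN_QUEUED, TXN_ACTIVE, TXN_DONE };

struct txn {
	enum txn_state state;
	int deadline;	/* absolute time (seconds) at which the timeout fires */
};

/* Issue a transaction at time `now` and arm its timeout. */
void txn_start(struct txn *t, int now)
{
	t->state = TXN_ACTIVE;
	t->deadline = now + DISK_IO_TIMEOUT_SECS;
}

/* The device answered in time; the transaction can no longer time out. */
void txn_complete(struct txn *t)
{
	t->state = TXN_DONE;
}

/*
 * Check all transactions belonging to one device at time `now`.
 * If any active transaction has reached its deadline, model a Bus
 * Device Reset: every unfinished transaction is put back in the
 * TXN_QUEUED state so it can be reissued.
 * Returns the number of transactions requeued (0 if nothing timed out).
 */
int device_check_timeouts(struct txn *txns, size_t n, int now)
{
	bool fired = false;
	int requeued = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (txns[i].state == TXN_ACTIVE && now >= txns[i].deadline)
			fired = true;
	}
	if (!fired)
		return 0;

	for (i = 0; i < n; i++) {
		if (txns[i].state != TXN_DONE) {
			txns[i].state = TXN_QUEUED;
			requeued++;
		}
	}
	return requeued;
}
```

In this model a single late transaction is enough to requeue every
unfinished transaction for the device, which is why one stuck command can
show up as several aborted SCBs in the log.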

> Also, I see 7 phases in case statements:
>
> ===
>	case P_DATAOUT:
>		printf("in dataout phase");
>		break;
>	case P_DATAIN:
>		printf("in datain phase");
>		break;
>	case P_COMMAND:
>		printf("in command phase");
>		break;
>	case P_MESGOUT:
>		printf("in message out phase");
>		break;
>	case P_STATUS:
>		printf("in status phase");
>		break;
>	case P_MESGIN:
>		printf("in message in phase");
>		break;
>	case P_BUSFREE:
>		printf("while idle, LASTPHASE == 0x%x",
>		       bus_state);
>		break;
> ===
>
> Is there some place that explains roughly what these correspond to?
> The ones we see most often are P_BUSFREE, P_COMMAND and P_DATAIN. I
> see that you refer to Adaptec databooks in aic7xxx.reg but since we
> don't have those, any web page or other on-line documentation that we
> can refer to will be great.

Those are SCSI bus phases. If you're seeing timeouts in datain phase or
command phase, that often indicates a termination or cabling problem.

Just look at the SCSI specs if you want to find out about the different
SCSI bus phases.

Ken
--
Kenneth Merry
ken@plutotech.com

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message