Date:      Tue, 13 Apr 1999 10:23:21 -0600 (MDT)
From:      "Kenneth D. Merry" <ken@plutotech.com>
To:        asami@cs.berkeley.edu (Satoshi Asami)
Cc:        scsi@FreeBSD.ORG
Subject:   Re: timed out while idle?
Message-ID:  <199904131623.KAA03308@panzer.plutotech.com>
In-Reply-To: <199904131010.DAA47254@silvia.hip.berkeley.edu> from Satoshi Asami at "Apr 13, 1999  3:10:41 am"

Satoshi Asami wrote...
> Hi Justin, Ken and others,
> 
> What exactly does "timed out while idle" mean?  We're still seeing
> these messages from time to time:
> 
> ===
> Apr  1 18:34:47 m0 /kernel: (da44:ahc2:0:12:0): SCB 0x30 - timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
> Apr  1 18:34:47 m0 /kernel: SEQADDR == 0x8
> Apr  1 18:34:47 m0 /kernel: SSTAT1 == 0xa
> Apr  1 18:34:47 m0 /kernel: (da44:ahc2:0:12:0): Queuing a BDR SCB
> Apr  1 18:34:47 m0 /kernel: (da44:ahc2:0:12:0): Bus Device Reset Message Sent
> Apr  1 18:34:47 m0 /kernel: (da44:ahc2:0:12:0): no longer in timeout, status = 34b
> Apr  1 18:34:47 m0 /kernel: ahc2: Bus Device Reset on A:12. 1 SCBs aborted

The "timed out while idle" message means that the drive took longer than
the timeout (60 seconds) to respond to a read or write request, and the
bus was idle (bus free) when the timeout fired.  In other words, your drive
went out to lunch, and we hit it with a BDR (Bus Device Reset) to get it to
come back.


> Some of these eventually lead to panics or hangs.
> 
> These are the same IBM disks we asked about a while ago.
> 
> ===
> da34 at ahc2 bus 0 target 2 lun 0
> da34: <IBM OEM DCHS09Y 2424> Fixed Direct Access SCSI2 device 
> da34: 20.0MB/s transfers (10.0MHz, offset 8, 16bit), Tagged Queueing Enabled
> da34: 8689MB (17796077 512 byte sectors: 255H 63S/T 1107C)
> ===
> 
> I traced the message to ahc_timeout() in aic7xxx.c but not being a
> kernel hacker myself I can't really tell where it's called from.  Is
> this like one of those alarm clocks ("wake me up in 5msecs if nothing
> happens")?

Yep.  There's a timeout for each transaction.  If the transaction doesn't
complete in the specified period of time (60 seconds for disk
reads/writes), the timeout fires, a BDR is sent and all transactions that
were queued to the disk are requeued.
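
As a rough illustration of that alarm-clock behaviour, here is a minimal
sketch in C.  It is not the driver's code: struct xfer, xfer_start(),
xfer_done() and check_timeouts() are hypothetical names made up for this
example, time is a plain integer number of seconds, and the "BDR" is only
a printf.  The one value taken from the explanation above is the 60 second
timeout.

```c
#include <stdbool.h>
#include <stdio.h>

#define XFER_TIMEOUT_SECS 60	/* disk read/write timeout mentioned above */

/* Hypothetical per-transaction record; NOT the driver's real SCB layout. */
struct xfer {
	int	id;
	int	deadline;	/* absolute time (secs) when the timeout fires */
	bool	active;		/* true while the device owes us a completion */
	int	requeues;	/* times this transaction has been requeued */
};

/* Arm the timeout when the transaction is handed to the device. */
static void
xfer_start(struct xfer *x, int now)
{
	x->deadline = now + XFER_TIMEOUT_SECS;
	x->active = true;
}

/* A normal completion cancels the pending timeout. */
static void
xfer_done(struct xfer *x)
{
	x->active = false;
}

/*
 * Periodic check.  If any active transaction is past its deadline,
 * pretend to send a BDR and requeue every transaction still outstanding
 * on the device.  Returns the number of transactions requeued.
 */
static int
check_timeouts(struct xfer *q, int n, int now)
{
	bool fired = false;
	int i, requeued = 0;

	for (i = 0; i < n; i++)
		if (q[i].active && now >= q[i].deadline)
			fired = true;
	if (!fired)
		return (0);

	printf("timed out, sending BDR (simulated)\n");
	for (i = 0; i < n; i++) {
		if (!q[i].active)
			continue;
		q[i].requeues++;
		xfer_start(&q[i], now);	/* requeue with a fresh timeout */
		requeued++;
	}
	return (requeued);
}
```

Note that in this sketch one expired transaction causes everything still
outstanding on the device to be requeued, which is the behaviour described
above; how the real driver tracks and re-issues the work is not shown here.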

> Also, I see 7 phases in case statements:
> 
> ===
> 	case P_DATAOUT:
> 		printf("in dataout phase");
> 		break;
> 	case P_DATAIN:
> 		printf("in datain phase");
> 		break;
> 	case P_COMMAND:
> 		printf("in command phase");
> 		break;
> 	case P_MESGOUT:
> 		printf("in message out phase");
> 		break;
> 	case P_STATUS:
> 		printf("in status phase");
> 		break;
> 	case P_MESGIN:
> 		printf("in message in phase");
> 		break;
> 	case P_BUSFREE:
> 		printf("while idle, LASTPHASE == 0x%x",
> 			bus_state);
> 		break;
> ===
> 
> Is there some place that explains roughly what these correspond to?
> The ones we see most often are P_BUSFREE, P_COMMAND and P_DATAIN.  I
> see that you refer to Adaptec databooks in aic7xxx.reg but since we
> don't have those, any web page or other on-line documentation that we
> can refer to will be great.

Those are SCSI bus phases.  If you're seeing timeouts in datain phase or
command phase, that often indicates a termination or cabling problem.

Just look at the SCSI specs if you want to find out about the different
SCSI bus phases.
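
For a rough idea of what the specs say: the six information transfer
phases are distinguished by three bus signals, MSG, C/D and I/O.  The
sketch below decodes them in C.  The bit positions (SIG_CD, SIG_IO,
SIG_MSG) and the function phase_name() are assumptions made for this
example; check aic7xxx.reg for the values the driver really uses.  Bus
free is not one of these combinations; it is the state where nothing is
driving the bus at all, which is why the driver reports it as "idle".

```c
#include <stdint.h>

/*
 * Illustrative bit positions for the three phase signals.  These are
 * assumptions for the example, not values copied from the driver.
 */
#define SIG_CD		0x80	/* C/D: control (1) or data (0) */
#define SIG_IO		0x40	/* I/O: to initiator (1) or to target (0) */
#define SIG_MSG		0x20	/* MSG: message phase */
#define PHASE_BITS	(SIG_CD | SIG_IO | SIG_MSG)

/* Map a snapshot of the bus signals to a SCSI phase name. */
static const char *
phase_name(uint8_t sig)
{
	switch (sig & PHASE_BITS) {
	case 0:
		return ("data out");	/* initiator sends data */
	case SIG_IO:
		return ("data in");	/* target sends data */
	case SIG_CD:
		return ("command");	/* initiator sends the command */
	case SIG_CD | SIG_IO:
		return ("status");	/* target sends the status byte */
	case SIG_CD | SIG_MSG:
		return ("message out");	/* initiator sends a message */
	case SIG_CD | SIG_IO | SIG_MSG:
		return ("message in");	/* target sends a message */
	default:
		return ("reserved");	/* MSG set with C/D clear */
	}
}
```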

Ken
-- 
Kenneth Merry
ken@plutotech.com

