Date:      Fri, 8 Aug 1997 09:34:30 +0000 (GMT)
From:      Steve Kargl <sgk@troutmask.apl.washington.edu>
To:        jwd@unx.sas.com (John W. DeBoskey)
Cc:        freebsd-current@FreeBSD.ORG, freebsd-smp@FreeBSD.ORG
Subject:   Re: scsi time-out & lockup under smp
Message-ID:  <199708080934.JAA03485@troutmask.apl.washington.edu>
In-Reply-To: <199708081208.AA04989@iluvatar.unx.sas.com> from "John W. DeBoskey" at "Aug 8, 97 08:08:50 am"

According to John W. DeBoskey:
> Hello,
> 
>    I'm wondering if anyone might have some information relating to the
> following problem.
> 
>    I have the 3.0-970731-SNAP installed on a Dell PowerEdge 6100/200,
> four processor machine. The problem occurs on either of the two
> aic7880 onboard scsi devices, or a 2940 adapter board, when in 
> multi-processor mode. Anywhere from 5 to 60 minutes after booting
> the machine, it freezes with the following messages on the console:
> 
> sd0: SCB 0x1 - timed out in command phase, SCSISIGI == 0x86
> SEQADDR = 0x8c SCSISEQ = 0x12 SSTAT0 = 0x7 SSTAT1 = 0x3
> sd0: abort message in message buffer
> sd0: SCB 1 - Abort Completed.
> sd0: no longer in timeout
> sd0: SCB 0x1 - timed out while idle, LASTPHASE == 0x1, SCSISIGI = 0x0
> SEQADDR = 0xb SCSISEQ = 0x12 SSTAT0 = 0x5 SSTAT1 = 0x2
> sd0: Queueing an Abort SCB
> 
> 
>    It only seems to occur when I start heavy disk I/O. It
> does not happen in the uni-processor case. The complete output
> from dmesg is appended to this mail. If anyone can help me track this
> down, I'd really appreciate it.
> 
> Thanks,
> John
> 


It occurs on a uni-processor system, too.  If I use dump(8) to back up
my system, I eventually get the following:

st0(ahc0:2:0): SCB 0x0 - timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
SEQADDR = 0x5 SCSISEQ = 0x12 SSTAT0 = 0x5 SSTAT1 = 0xa
st0(ahc0:2:0): Queueing an Abort SCB
st0(ahc0:2:0): Abort Message Sent
st0(ahc0:2:0): SCB 0 - Abort Completed.
st0(ahc0:2:0): no longer in timeout
st0(ahc0:2:0): SCB 0x0 - timed out in dataout phase, SCSISIGI == 0xc6
SEQADDR = 0x42 SCSISEQ = 0x12 SSTAT0 = 0x7 SSTAT1 = 0x13
sd0(ahc0:0:0): abort message in message buffer
sd0(ahc0:0:0): SCB 3 - Abort Completed.
sd0(ahc0:0:0): no longer in timeout
st0(ahc0:2:0): SCB 0x0 - timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
SEQADDR = 0x5 SCSISEQ = 0x12 SSTAT0 = 0x5 SSTAT1 = 0xa
st0(ahc0:2:0): SCB 0: Immediate reset.  Flags = 0x1
ahc0: Issued Channel A Bus Reset. 3 SCBs aborted
Clearing bus reset
Clearing 'in-reset' flag
st0(ahc0:2:0): no longer in timeout
sd0(ahc0:0:0): UNIT ATTENTION asc:29,0
sd0(ahc0:0:0):  Power on, reset, or bus device reset occurred, retries:3
sd1(ahc0:1:0): UNIT ATTENTION asc:29,0
sd1(ahc0:1:0):  Power on, reset, or bus device reset occurred, retries:4


From dmesg:

ahc0: <Adaptec 2940 SCSI host adapter> rev 0x03 int a irq 11 on pci0.12.0
ahc0: aic7870 Single Channel, SCSI Id=7, 16 SCBs
ahc0: waiting for scsi devices to settle
scbus0 at ahc0 bus 0
sd0 at scbus0 target 0 lun 0
sd0: <SEAGATE ST51080N 0943> type 0 fixed SCSI 2
sd0: Direct-Access 1030MB (2109840 512 byte sectors)
st0 at scbus0 target 2 lun 0
st0: <HP HP35480A 1109> type 1 removable SCSI 2
st0: Sequential-Access density code 0x13,  drive empty

-- 
Steve

finger kargl@troutmask.apl.washington.edu
http://troutmask.apl.washington.edu/~kargl/sgk.html


