Date: Wed, 20 May 1998 21:45:06 +0000
From: "Greg Rowe" <greg@uswest.net>
To: "Justin T. Gibbs" <gibbs@plutotech.com>
Cc: scsi@FreeBSD.ORG
Subject: Re: CAM and Adaptec 2940UW Rev E
Message-ID: <9805202145.ZM12757@psv.oss.uswest.net>
In-Reply-To: "Justin T. Gibbs" <gibbs@plutotech.com> "Re: CAM and Adaptec 2940UW Rev E" (May 20, 11:41am)
References: <199805201745.LAA16065@pluto.plutotech.com>

Justin,

The cards we are having problems with have a white barcode label over the
7880 chip and are marked:

AHA-2940UW
945300-01 E  (We have 50 or 60 marked "C" and 150 or so marked "D")
9749         (We have a dozen or so marked "E" that all exhibit the problem)

Under the label, the chip is marked:

AIC-7880P BQWA740 740111 BC9564.1

We added the printf statement as you requested, built and installed a new
kernel, and made a number of Iozone runs, but I don't think we ever received
the debug message you were looking for. This is from the last run we did
before the system crashed:

File '/home/Bonnie.204', size: 1048576000
Writing with putc()...May 20 16:11:21 test6 /kernel: (da1:ahc1:0:0:0): SCB 0x0 - timed out in data t phase, SCSISIGI == 0xe6
May 20 16:11:39 test6 /kernel: SEQADDR == 0x115
May 20 16:11:39 test6 /kernel: SSTAT1 == 0x13
May 20 16:11:39 test6 /kernel: (da1:ahc1:0:0:0): BDR message in message buffer
May 20 16:11:39 test6 /kernel: (da1:ahc1:0:0:0): SCB 0x0 - timed out in dataout phase, SCSISIGI == xf6
May 20 16:11:39 test6 /kernel: SEQADDR == 0x115
May 20 16:11:39 test6 /kernel: SSTAT1 == 0x13
May 20 16:11:39 test6 /kernel: (da1:ahc1:0:0:0): no longer in timeout
May 20 16:11:39 test6 /kernel: ahc1: Issued Channel A Bus Reset. 128 SCBs aborted
May 20 16:11:39 test6 /kernel: (da2:ahc1:0:1:0): SCB 0x0 - timed out while idle, LASTPHASE == 0x1, CSISIGI == 0x0
May 20 16:11:39 test6 /kernel: SEQADDR == 0x18b
May 20 16:11:39 test6 /kernel: SSTAT1 == 0x0
May 20 16:11:39 test6 /kernel: (da2:ahc1:0:1:0): Queuing a BDR SCB
May 20 16:11:39 test6 /kernel: (da2:ahc1:0:1:0): SCB 0x0 - timed out while idle, LASTPHASE == 0x1, CSISIGI == 0x0
May 20 16:11:39 test6 /kernel: SEQADDR == 0x18b
May 20 16:11:39 test6 /kernel: SSTAT1 == 0x0
May 20 16:11:39 test6 /kernel: (da2:ahc1:0:1:0): no longer in timeout
May 20 16:11:39 test6 /kernel: ahc1: Issued Channel A Bus Reset. 65 SCBs aborted
May 20 16:11:39 test6 /kernel: (da2:ahc1:0:1:0): WRITE(06). CDB: a 1e 85 10 80 0
May 20 16:11:39 test6 /kernel: (da2:ahc1:0:1:0): UNIT ATTENTION asc:29,2
May 20 16:11:39 test6 /kernel: (da2:ahc1:0:1:0): Scsi bus reset occurred field replaceable unit: 2
May 20 16:11:39 test6 /kernel: (da1:ahc1:0:0:0): WRITE(06). CDB: a 1e 85 10 80 0
May 20 16:11:39 test6 /kernel: (da1:ahc1:0:0:0): UNIT ATTENTION asc:29,2
May 20 16:11:39 test6 /kernel: (da1:ahc1:0:0:0): Scsi bus reset occurred field replaceable unit: 2
done

If we drop the transfer rate down to 10 MB/s in the card setup, we don't see
any problems. Note that da1 and da2 are ccd devices, but we see similar
errors on non-ccd devices. We can give you access to the system if that
would help.

Thanks,

Greg

On May 20, 11:41am, Justin T. Gibbs wrote:
> Subject: Re: CAM and Adaptec 2940UW Rev E
> >Justin,
> >
> > We finally got a chance to test the Adaptec 2940UW Revision E cards with CAM
> >and the problems still exist. Our configuration is a Tyan MB, 2 Adaptec
> >2940UW's , 3 - 4.4 Gig Seagate ST34572W drives. OS is 3.0 current as of 05/15
> >and CAM-980513. The revision D cards work with no problem, but running bonnie
> >or iozone with the Rev E cards produce the following (and a system crash):
>
> I know about one system crash during recovery and have fixed that locally,
> but I'm working now to try and reproduce your error. I have to plead
> ignorance to what you mean by rev D and rev E cards. I have cards with
> both rev 0 and 1 aic7880s on them, but I don't know where you are getting
> the D and E letter.
>
> This is the card I'm going to work with:
>
> ahc0: <Adaptec 2940 Ultra SCSI adapter> rev 0x01 int a irq 9 on pci0.9.0
> ahc0: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
>
> As you can see, it has a rev 1 aic7880.
>
> I have access to ultra-narrow versions of the 4gig Barracuda you're using
> (same firmware rev. even), but if this turns out to be a wide problem, I
> won't be able to reproduce it here.
>
> From the debugging messages you've sent me, it's clear that the chip is
> hanging up attempting to turn off the DMA fifo, and I don't know why this
> is the case. It would be useful for you to add a printf in ahc_timeout()
> that give the contents of a register:
>
> printf("DFCNTRL = %x\n", ahc_inb(ahc, DFCNTRL));
>
> Place it down near the printf for SEQADDR and friends.
>
> --
> Justin
>
>-- End of excerpt from Justin T. Gibbs

--
Greg Rowe <greg@uswest.net>
US WEST - !NTERACT Internet Services

"To err is human, to really foul up requires the root password."

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message