From owner-freebsd-scsi Sun May 31 08:10:19 1998
Return-Path:
Received: (from majordom@localhost)
    by hub.freebsd.org (8.8.8/8.8.8) id IAA02530
    for freebsd-scsi-outgoing; Sun, 31 May 1998 08:10:19 -0700 (PDT)
    (envelope-from owner-freebsd-scsi@FreeBSD.ORG)
Received: from dragon.awen.com (dragon.awen.com [207.33.155.10])
    by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id IAA02525
    for ; Sun, 31 May 1998 08:10:17 -0700 (PDT)
    (envelope-from mburgett@dragon.awen.com)
Received: (from mburgett@localhost)
    by dragon.awen.com (8.8.8/8.8.7) id IAA06108;
    Sun, 31 May 1998 08:10:17 -0700 (PDT)
Message-ID:
X-Mailer: XFMail 1.2 [p0] on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
Date: Sun, 31 May 1998 08:09:26 -0700 (PDT)
Reply-To: Mike Burgett
From: Mike Burgett
To: freebsd-scsi@FreeBSD.ORG
Subject: wide scsi woes...
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

I'm trying to bring up a new machine, and having a world of grief. I
really don't think this is a FreeBSD problem, but hope maybe someone here
has dealt with this before. The only system on the machine is FreeBSD, so
testing with other systems would be problematic. I hope this is the right
place to ask these questions...

I've been bringing up a new machine, a P6 with an Adaptec 2940UW
controller, with two drives on it. Drive 0 is an IBM DCAS 32160W, drive 1
is a DCAS 34330. I'm using the cable that came with the controller
currently, but I have tried another cable.

I get the same error running 2.2.6-RELEASE or the 980523-SNAP of current
(generic and custom kernels on both), but I've mainly been working with
current, since that's what I intend to run on this machine, and more
importantly right now, current recovers from the error without a hard
reset. :)

First, here are my probes:

Copyright (c) 1992-1998 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.
All rights reserved.
FreeBSD 3.0-980523-SNAP #0: Sat May 23 10:51:06 GMT 1998
    root@make.ican.net:/usr/src/sys/compile/GENERIC
Timecounter "i8254" frequency 1193182 Hz cost 3329 ns
Timecounter "TSC" frequency 199432992 Hz cost 216 ns
CPU: Pentium Pro (199.43-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x617 Stepping=7
  Features=0xfbff
real memory = 134217728 (131072K bytes)
avail memory = 127688704 (124696K bytes)
Probing for devices on PCI bus 0:
Correcting Natoma config for non-SMP
chip0: rev 0x02 on pci0.0.0
chip1: rev 0x01 on pci0.1.0
ide_pci0: rev 0x00 on pci0.1.1
vga0: rev 0x01 int a irq 0 on pci0.9.0
ncr0: rev 0x12 int a irq 9 on pci0.10.0
ncr0: waiting for scsi devices to settle
scbus0 at ncr0 bus 0
st0 at scbus0 target 5 lun 0
st0: type 1 removable SCSI 2
st0: Sequential-Access
st0: 10.0 MB/s (100 ns, offset 8)
density code 0x13, variable blocks, write-enabled
cd0 at scbus0 target 6 lun 0
cd0: type 5 removable SCSI 2
cd0: CD-ROM
cd0: asynchronous.
can't get the size
fxp0: rev 0x04 int a irq 10 on pci0.11.0
fxp0: Ethernet address 00:a0:c9:b7:e3:87
ahc0: rev 0x01 int a irq 11 on pci0.12.0
ahc0: aic7880 Wide Channel, SCSI Id=7, 16 SCBs
ahc0: waiting for scsi devices to settle
scbus1 at ahc0 bus 0
sd0 at scbus1 target 0 lun 0
sd0: type 0 fixed SCSI 2
sd0: Direct-Access 2063MB (4226725 512 byte sectors)
Sending SDTR!!
sd1 at scbus1 target 1 lun 0
sd1: type 0 fixed SCSI 2
sd1: Direct-Access 4134MB (8467200 512 byte sectors)

[isa probing deleted ....
]

Then, after logging in, I can generate the error below at will by running
something like bonnie, or a large make (world, buildworld):

sd1: SCB 0x1 - timed out in dataout phase, SCSISIGI == 0xe6
SEQADDR = 0x129 SCSISEQ = 0x12 SSTAT0 = 0x2 SSTAT1 = 0x13
sd1: abort message in message buffer
sd1: SCB 0x0 timedout while recovery in progress
sd0: SCB 0x2 timedout while recovery in progress
sd1: SCB 0x1 - timed out in dataout phase, SCSISIGI == 0xf6
SEQADDR = 0x129 SCSISEQ = 0x12 SSTAT0 = 0x2 SSTAT1 = 0x13
sd1: no longer in timeout
ahc0: Issued Channel A Bus Reset. 4 SCBs aborted
sd1: SCB 0x0 - timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
SEQADDR = 0x15f SCSISEQ = 0x12 SSTAT0 = 0x2 SSTAT1 = 0x0
sd1: Queueing an Abort SCB
sd1: SCB 0x1 timedout while recovery in progress
sd0: SCB 0x2 timedout while recovery in progress
sd0: SCB 0x3 timedout while recovery in progress
sd1: SCB 0x0 - timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
SEQADDR = 0x15f SCSISEQ = 0x12 SSTAT0 = 0x2 SSTAT1 = 0x0
sd1: no longer in timeout
ahc0: Issued Channel A Bus Reset. 4 SCBs aborted
sd0: UNIT ATTENTION asc:29,0
sd0: Power on, reset, or bus device reset occurred , retries:2
Sending SDTR!!

The last line is related to how I have drive 1 jumpered currently, so that
it doesn't generate unit attention, and initiates wide negotiation after a
reset. I've tried it without those jumpers as well, and the errors still
occur.

What I've tried:

  Different cables.
  Both drives as the end of the chain with termination enabled.
  Unit Atten on POR disabled.
  Initiate Sync/Wide negotiation on reset enabled.
  Manually setting HBA termination to 'ON/ON' as per manual. (HBA BIOS setup)
    (there is nothing hooked to the external or internal 50-pin connectors)
  Limiting speed to 10MHz. (HBA BIOS setup)
  Different kernels (2.2.6-Rel, 980523-SNAP), generic and custom.
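
For reading the SCSISIGI values in the timeout messages above, a small decoder may help. This is only a sketch: it assumes the bit layout I remember from the aic7xxx register definitions (CDI 0x80, IOI 0x40, MSGI 0x20, ATNI 0x10, SELI 0x08, BSYI 0x04, REQI 0x02, ACKI 0x01), which should be checked against the driver source, and the name decode_scsisigi is made up for illustration.

```python
# Assumed SCSISIGI bit layout (recalled from the aic7xxx register
# definitions); verify against your driver source before relying on it.
SCSISIGI_BITS = [
    (0x80, "CDI"),   # C/D line
    (0x40, "IOI"),   # I/O line
    (0x20, "MSGI"),  # MSG line
    (0x10, "ATNI"),  # ATN line
    (0x08, "SELI"),  # SEL line
    (0x04, "BSYI"),  # BSY line
    (0x02, "REQI"),  # REQ line
    (0x01, "ACKI"),  # ACK line
]


def decode_scsisigi(value):
    """Return the names of the bus signals set in a SCSISIGI value."""
    return [name for mask, name in SCSISIGI_BITS if value & mask]


if __name__ == "__main__":
    # The three values that appear in the error log.
    for raw in (0xE6, 0xF6, 0x00):
        names = " ".join(decode_scsisigi(raw)) or "(none)"
        print(f"SCSISIGI == {raw:#04x}: {names}")
```

Under that assumed layout, the two dataout timeouts (0xe6 and 0xf6) differ only in the 0x10 bit, and the idle timeouts (0x0) show no signals set at all.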

This is a first experience for me on several fronts:

  FreeBSD/2940 combination
  Ultra-wide SCSI with any OS
  FreeBSD/P6 combination

I'm seeing this error on both drives (both new), so I don't think it's a
problem with the drives themselves. It 'feels' like a termination problem,
though I've tried each of the drives in the last position with termination
enabled. I suppose it could be an HBA problem, but it's also new, and this
seems like an odd failure mode.

Any suggestions would be helpful; I'm kind of at the end of my interrupt
chain here.

Thanks,
Mike

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message