From owner-freebsd-hackers Tue Apr 1 12:54:35 1997
Return-Path:
Received: (from root@localhost)
        by freefall.freebsd.org (8.8.5/8.8.5) id MAA29469
        for hackers-outgoing; Tue, 1 Apr 1997 12:54:35 -0800 (PST)
Received: from awfulhak.demon.co.uk (awfulhak.demon.co.uk [158.152.17.1])
        by freefall.freebsd.org (8.8.5/8.8.5) with ESMTP id MAA29447
        for ; Tue, 1 Apr 1997 12:54:30 -0800 (PST)
Received: from awfulhak.demon.co.uk (localhost.lan.awfulhak.org [127.0.0.1])
        by awfulhak.demon.co.uk (8.8.5/8.8.5) with ESMTP id VAA01628;
        Tue, 1 Apr 1997 21:17:02 +0100 (BST)
Message-Id: <199704012017.VAA01628@awfulhak.demon.co.uk>
X-Mailer: exmh version 1.6.9 8/22/96
To: Greg Rowe
cc: Thomas David Rivers, hackers@FreeBSD.ORG
Subject: Re: aha2940 problems on 2.1.7.1.
In-reply-to: Your message of "Tue, 01 Apr 1997 08:28:28 MDT."
        <9704010828.ZM8938@nevis.oss.uswest.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Tue, 01 Apr 1997 21:17:02 +0100
From: Brian Somers
Sender: owner-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

FWIW, I'm getting similar problems in -current. I'm a bit naive when it
comes to this stuff, and I can't confirm that the hardware works 100%; all
I can say is that it worked until October last year and then stopped.
Being lazy, I just stopped doing backups (this is on my home machine) and
hoped the problem would go away. Then, last month, I lost a 4Gb drive!
All my company documents etc. I was lucky that I'd backed up my company
accounts just that day!

Anyway, I'm getting the following:

    sd0: SCB 0x6 - timed out in command phase, SCSISIGI == 0x84
    SEQADDR == 0x42
    st0: abort message in message buffer
    sd0: SCB 0x7 timedout while recovery in progress
    st0: SCB 2 - Abort Completed
    Panic: Couldn't find next SCB
    Debugger ("panic")

The trace was

    in panic()
    in ahc_reset_device()
    in ahc_handle_scsiint()
    in ahc_intr()
    in Xresume10()

This is reproducible by trying to write to a tape - even a brand new one.

The hardware when that happened was a 2940W. Since then, I've got myself
a new 1542C - when this drive worked before, it was on a 1542 (albeit not
the same one). The tape drive (a 4/8 Gb DAT drive) is now alone on the
1542 and produces the same results (without the sd0 problem of course).
No panics happen with the DAT alone on its own controller, but lots of
nasty printfs from the kernel (as above) rear their ugly heads.

To be more specific, I can read and write a variable amount of data from
or to the drive using tar or dump/restore before this happens. I'm pretty
sure it's not getting to the end of any tapes (all 120M 4mm).

I don't think I've missed any details in my ramblings. Now that I've got
a scenario where I can test without panicking (I hate panicking when the
syncs don't work; I have no backup), I'm willing to try things out if
anyone can suggest anything. I can also arrange for an account on the
machine if anyone needs to reproduce stuff on demand.

Cheers.

$ dmesg
Copyright (c) 1992-1997 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.

FreeBSD 3.0-CURRENT #0: Tue Apr 1 02:21:39 BST 1997
    brian@awfulhak.demon.co.uk:/usr/src/sys/compile/AWFULHAK
CPU: Pentium Pro (199.43-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x617  Stepping=7
  Features=0xf9ff,MTRR,PGE,MCA,CMOV>
real memory  = 83886080 (81920K bytes)
avail memory = 78352384 (76516K bytes)
DEVFS: ready for devices
bdevsw_add_generic: adding D_DISK flag for device 15
Probing for devices on PCI bus 0:
chip0 rev 2 on pci0:0:0
chip1 rev 1 on pci0:7:0
chip2 rev 0 on pci0:7:1
vga0 rev 0 on pci0:11:0
ahc0 rev 0 int a irq 10 on pci0:15:0
ahc0: aic7870 Wide Channel, SCSI Id=7, 16 SCBs
scbus1 at ahc0 bus 0
sd0 at scbus1 target 0 lun 0
sd0: type 0 fixed SCSI 2
sd0: Direct-Access 4153MB (8506782 512 byte sectors)
sd0: with 3421 cyls, 18 heads, and an average 138 sectors/track
sd1 at scbus1 target 1 lun 0
sd1: type 0 fixed SCSI 2
sd1: Direct-Access 507MB (1039329 512 byte sectors)
sd1: with 2380 cyls, 6 heads, and an average 72 sectors/track
sd2 at scbus1 target 2 lun 0
sd2: type 0 fixed SCSI 2
sd2: Direct-Access 2048MB (4194685 512 byte sectors)
sd2: with 2621 cyls, 19 heads, and an average 84 sectors/track
sd3 at scbus1 target 3 lun 0
sd3: type 0 fixed SCSI 2
sd3: Direct-Access 234MB (479350 512 byte sectors)
sd3: with 1818 cyls, 4 heads, and an average 65 sectors/track
sd4 at scbus1 target 4 lun 0
sd4: type 0 fixed SCSI 2
sd4: Direct-Access 1908MB (3907911 512 byte sectors)
sd4: with 2621 cyls, 21 heads, and an average 71 sectors/track
de0 rev 17 int a irq 9 on pci0:17:0
de0: 21041 [10Mb/s] pass 1.1
de0: address 00:00:c0:ff:e9:ce
Probing for devices on the ISA bus:
sc0 at 0x60-0x6f irq 1 on motherboard
sc0: VGA color <12 virtual consoles, flags=0x0>
sio0 at 0x3f8-0x3ff irq 4 on isa
sio0: type 16550A
sio1 at 0x2f8-0x2ff irq 3 on isa
sio1: type 16550A
sio2 at 0x3e8-0x3ef irq 5 on isa
sio2: type 16550A
sio3 not found at 0x2e8    <- de0 stole the IRQ instead of 15 :(
lpt0 at 0x378-0x37f irq 7 on isa
lpt0: Interrupt-driven port
lp0: TCP/IP capable interface
psm0 at 0x60-0x64 irq 12 on motherboard
psm0: device ID 0, 2 buttons
100 nSEC ok, using 150 nSEC
aha0 at 0x330-0x333 irq 11 drq 5 on isa
scbus0 at aha0 bus 0
st0 at scbus0 target 5 lun 0
st0: type 1 removable SCSI 2
st0: Sequential-Access density code 0x24, variable blocks, write-enabled
cd0 at scbus0 target 6 lun 0
cd0: type 5 removable SCSI 2
cd0: CD-ROM can't get the size
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: NEC 72065B
fd0: 1.44MB 3.5in
wdc0 at 0x1f0-0x1f7 irq 14 on isa
wdc0: unit 0 (wd0):
wd0: 1549MB (3173184 sectors), 3148 cyls, 16 heads, 63 S/T, 512 B/S
npx0 on motherboard
npx0: INT 16 interface
changing root device to wd0a
DEVFS: ready to run
IP packet filtering initialized, divert enabled, logging disabled
de0: enabling 10baseT port

-- 
Brian
Don't _EVER_ lose your sense of humour....
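
A quick sanity check on the disk sizes in the dmesg output above, as a minimal Python sketch. It assumes the kernel reports whole megabytes of 2**20 bytes, rounded down; that convention is inferred from the numbers, not stated in the output. The names DISKS, size_mb and WD0_CHS_SECTORS are made up for illustration. If the figures are self-consistent, the probe itself is probably reading the drives correctly, which would point away from a capacity misdetection.

```python
# Sector counts and reported sizes, copied from the dmesg output above.
SECTOR_BYTES = 512

DISKS = {
    # name: (sector count, size in MB as printed by the kernel)
    "sd0": (8506782, 4153),
    "sd1": (1039329, 507),
    "sd2": (4194685, 2048),
    "sd3": (479350, 234),
    "sd4": (3907911, 1908),
    "wd0": (3173184, 1549),
}


def size_mb(sectors, sector_bytes=SECTOR_BYTES):
    """Capacity in whole megabytes (2**20 bytes), rounded down.

    The rounding convention is an assumption, not taken from kernel source.
    """
    return sectors * sector_bytes // (1 << 20)


# wd0 is the only drive that reports an exact geometry
# (3148 cyls, 16 heads, 63 S/T); the sd drives report an
# *average* sectors/track, so their product is only approximate.
WD0_CHS_SECTORS = 3148 * 16 * 63

if __name__ == "__main__":
    for name, (sectors, reported) in DISKS.items():
        status = "ok" if size_mb(sectors) == reported else "MISMATCH"
        print(f"{name}: {sectors} sectors -> {size_mb(sectors)}MB "
              f"(dmesg says {reported}MB) {status}")
    print("wd0 CHS product:", WD0_CHS_SECTORS)
```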