From owner-freebsd-stable Tue May 22 2:25:52 2001
Delivered-To: freebsd-stable@freebsd.org
Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67])
	by hub.freebsd.org (Postfix) with ESMTP id 704EF37B422
	for ; Tue, 22 May 2001 02:25:45 -0700 (PDT)
	(envelope-from dillon@earth.backplane.com)
Received: (from dillon@localhost)
	by earth.backplane.com (8.11.3/8.11.2) id f4M9Paf00409;
	Tue, 22 May 2001 02:25:36 -0700 (PDT)
	(envelope-from dillon)
Date: Tue, 22 May 2001 02:25:36 -0700 (PDT)
From: Matt Dillon
Message-Id: <200105220925.f4M9Paf00409@earth.backplane.com>
To: "Justin T. Gibbs"
Cc: stable@FreeBSD.ORG
Subject: Continuing ahc problems - also cause fxp failure
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

This is getting weirder and weirder.

4.2 or 4.3-RC
    AHC failure once or twice a month (as previously posted last month)
    FXP Ethernet (appeared to be) working perfectly

RELENG_4 (After Justin's adaptec fix)
    FXP failure one week (old FXP driver)
    FXP failure the next week (old FXP driver)
    FXP *and* AHC failures tonight (new FXP driver)

What I got tonight was basically a system lockup with the kernel
generating console messages every few seconds from both the FXP and
the AHC drivers. I *was* able to break into the debugger, but with
ahc dead I couldn't generate a core.

I think the system itself is fine and the problem is somewhere in the
AHC or FXP drivers. I had failures with the old FXP driver as well as
the new, and the old driver hasn't changed in months so the problem is
either a PCI bug (cycle timer issues?) or there are still AHC bugs.

Note the time. Not fun, but at least I managed to play with the
console before someone else came in and rebooted the system :-)
dmesg output is at the end.
Here is what I was seeing on the console:

    fxp0: SCB timeout: 0xe0, 0, 0x90, 0x400
    (other SCB timeout messages)
    fxp0: DMA timeout
    fxp0: command queue timeout
    fxp0: device timeout
    ... various repetitions

    ahc0: issued channel A bus reset, 4 SCB's aborted
    pci error interrupt at seqaddr 2
    scb 0x40 timed out while IDLE seqaddr 0x181
    stack 0x17e, e, e, e
    SXFRCTL0 = 0x80
    Dumping card state:
    SCSISEQ = 0x12, SBLKCTL = 0xA, SSTAT0 = 0x0,
    SCB Count = 250
    Kernel NEXTQSCB = 17
    Card NEXTQSCB = 64

(I squiggled this down from the console so it is not an exact
representation, but I think I got the meat).

As I said, I was able to break into the debugger, and apart from ahc
and fxp having completely failed, nothing else was wrong.

The failure occurred during the nightly dump. The network was under a
medium load (the backup runs over a T1) and the hard drives were
probably under a heavy load. All previous failures seem to have
occurred in the wee hours of the morning during our nightly dumps.
The disks do not have an appreciable load during the day.

						-Matt

Here is my dmesg output:

Copyright (c) 1992-2001 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD 4.3-STABLE #2: Fri May 18 11:36:08 PDT 2001
    dillon@ns1.backplane.com:/usr/src/sys/compile/EARTH
Timecounter "i8254" frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (531.65-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x681 Stepping = 1
  Features=0x383fbff
real memory = 536862720 (524280K bytes)
avail memory = 519012352 (506848K bytes)
Preloaded elf kernel "kernel" at 0xc0350000.
Pentium Pro MTRR support enabled
md0: Malloc disk
npx0: on motherboard
npx0: INT 16 interface
pcib0: on motherboard
pci0: on pcib0
pcib1: at device 2.0 on pci0
pci1: on pcib1
ahc0: port 0xfc00-0xfcff mem 0xfcfff000-0xfcffffff irq 14 at device 4.0 on pci1
aic7890/91: Wide Channel A, SCSI Id=7, 32/255 SCBs
ahc1: port 0xf800-0xf8ff mem 0xfcffe000-0xfcffefff irq 10 at device 6.0 on pci1
aic7880: Single Channel A, SCSI Id=7, 16/255 SCBs
fxp0: port 0xecc0-0xecff mem 0xfe000000-0xfe0fffff,0xfe101000-0xfe101fff irq 11 at device 8.0 on pci0
fxp0: Ethernet address 00:b0:d0:22:fb:03
inphy0: on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pci0: at 14.0
isab0: at device 15.0 on pci0
isa0: on isab0
pcib2: on motherboard
pci2: on pcib2
fxp1: port 0xdcc0-0xdcff mem 0xf6100000-0xf61fffff,0xf6201000-0xf6201fff irq 5 at device 6.0 on pci2
fxp1: Ethernet address 00:d0:b7:7e:75:c3
inphy1: on miibus1
inphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp2: port 0xdc80-0xdcbf mem 0xf6000000-0xf60fffff,0xf6200000-0xf6200fff irq 14 at device 8.0 on pci2
fxp2: Ethernet address 00:d0:b7:7e:77:31
inphy2: on miibus2
inphy2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fdc0: at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: at port 0x60,0x64 on isa0
atkbd0: irq 1 on atkbdc0
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
IP packet filtering initialized, divert enabled, rule-based forwarding enabled, default to deny, logging disabled
IPsec: Initialized Security Association Processing.
IP Filter: v3.4.16 initialized.
Default = pass all, Logging = disabled
Waiting 5 seconds for SCSI devices to settle
pass4 at ahc0 bus 0 target 6 lun 0
pass4: Fixed Processor SCSI-2 device
pass4: 3.300MB/s transfers
da2 at ahc0 bus 0 target 2 lun 0
da2: Fixed Direct Access SCSI-3 device
da2: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da2: 34732MB (71132998 512 byte sectors: 255H 63S/T 4427C)
da3 at ahc0 bus 0 target 3 lun 0
da3: Fixed Direct Access SCSI-3 device
da3: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da3: 8683MB (17783249 512 byte sectors: 255H 63S/T 1106C)
da0 at ahc0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-3 device
da0: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 34732MB (71132960 512 byte sectors: 255H 63S/T 4427C)
da1 at ahc0 bus 0 target 1 lun 0
da1: Fixed Direct Access SCSI-3 device
da1: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da1: 34732MB (71132960 512 byte sectors: 255H 63S/T 4427C)
cd0 at ahc1 bus 0 target 5 lun 0
cd0: Removable CD-ROM SCSI-2 device
cd0: 20.000MB/s transfers (20.000MHz, offset 15)
cd0: Attempt to query device size failed: NOT READY, Medium not present
Mounting root from ufs:/dev/da0s1a
WARNING: / was not properly dismounted
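
A minimal sketch for cross-referencing the attach lines in the dmesg output above: it groups the ahc and fxp devices by IRQ and PCI bus. The helper name `devices_by_irq`, the regular expression, and the abridged sample are assumptions made for illustration. It only tabulates what the dmesg reports and does not by itself diagnose the failure.

```python
import re
from collections import defaultdict

# Attach lines copied from the dmesg output in this message (abridged to
# the ahc and fxp devices).
DMESG = """\
ahc0: port 0xfc00-0xfcff mem 0xfcfff000-0xfcffffff irq 14 at device 4.0 on pci1
ahc1: port 0xf800-0xf8ff mem 0xfcffe000-0xfcffefff irq 10 at device 6.0 on pci1
fxp0: port 0xecc0-0xecff mem 0xfe000000-0xfe0fffff,0xfe101000-0xfe101fff irq 11 at device 8.0 on pci0
fxp1: port 0xdcc0-0xdcff mem 0xf6100000-0xf61fffff,0xf6201000-0xf6201fff irq 5 at device 6.0 on pci2
fxp2: port 0xdc80-0xdcbf mem 0xf6000000-0xf60fffff,0xf6200000-0xf6200fff irq 14 at device 8.0 on pci2
"""

# Hypothetical pattern: "<dev>: ... irq <n> at device <slot> on <bus>".
ATTACH = re.compile(r"^(\w+): .*\birq (\d+) at device [\d.]+ on (\w+)$")


def devices_by_irq(text):
    """Return {irq: [(device, bus), ...]} for every matching attach line."""
    by_irq = defaultdict(list)
    for line in text.splitlines():
        match = ATTACH.match(line.strip())
        if match:
            name, irq, bus = match.groups()
            by_irq[int(irq)].append((name, bus))
    return dict(by_irq)


if __name__ == "__main__":
    # Print one line per IRQ, lowest first.
    for irq, devs in sorted(devices_by_irq(DMESG).items()):
        print(irq, devs)
```

On this sample, IRQ 14 lists both ahc0 (pci1) and fxp2 (pci2), while fxp0 is alone on IRQ 11. Whether any of that is related to the failures described above is not established here.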