Date:      Tue, 22 May 2001 02:25:36 -0700 (PDT)
From:      Matt Dillon <dillon@earth.backplane.com>
To:        "Justin T. Gibbs" <gibbs@scsiguy.com>
Cc:        stable@FreeBSD.ORG
Subject:   Continuing ahc problems - also cause fxp failure
Message-ID:  <200105220925.f4M9Paf00409@earth.backplane.com>

    This is getting weirder and weirder.

    4.2 or 4.3-RC
	AHC failure once or twice a month (as previously posted last month)
	FXP Ethernet (appeared to be) working perfectly

    RELENG_4	(After Justin's adaptec fix)
	FXP failure	one week       (old FXP driver)
	FXP failure	the next week  (old FXP driver)
	FXP *and* AHC failures tonight (new FXP driver)


    What I got tonight was basically a system lockup with the kernel 
    generating console messages every few seconds from both the FXP
    and the AHC drivers.  I *was* able to break into the debugger, but
    with ahc dead I couldn't generate a core.  I think the system itself
    is fine and the problem is somewhere in the AHC or FXP drivers.

    I had failures with both the old and the new FXP driver, and the
    old driver hasn't changed in months, so the problem is either a PCI
    bug (cycle timer issues?) or there are still AHC bugs.

    Note the time.  Not fun, but at least I managed to play with the console
    before someone else came in and rebooted the system :-)

    dmesg output is at the end.  Here is what I was seeing on the console:

    fxp0: SCB timeout: 0xe0, 0, 0x90, 0x400
    (other SCB timeout messages)
    fxp0: DMA timeout
    fxp0: command queue timeout
    fxp0: device timeout
	... various repetitions

    ahc0: issued channel A bus reset, 4 SCB's aborted
	  pci error interrupt at seqaddr 2
	  scb 0x40 timed out while IDLE seqaddr 0x181

	  stack 0x17e, e, e, e
	  SXFRCTL0 = 0x80
	  Dumping card state: SCSISEQ = 0x12, SBLKCTL = 0xA, SSTAT0 = 0x0,
	  SCB Count = 250

	  Kernel NEXTQSCB = 17
	  Card NEXTQSCB = 64

	  (I scribbled this down from the console so it is not an
	  exact representation, but I think I got the meat).

    As I said, I was able to break into the debugger, and apart from ahc
    and fxp having failed completely, nothing else appeared to be wrong.

    The failure occurred during the nightly dump.  The network was
    under a medium load (the backup is running over a T1) and the hard
    drives were probably under a heavy load.  All previous failures also
    seem to have occurred in the wee hours of the morning during our
    nightly dumps.  The disks do not have an appreciable load during the day.

						-Matt

    Here is my dmesg output:

Copyright (c) 1992-2001 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD 4.3-STABLE #2: Fri May 18 11:36:08 PDT 2001
    dillon@ns1.backplane.com:/usr/src/sys/compile/EARTH
Timecounter "i8254"  frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (531.65-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x681  Stepping = 1
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
real memory  = 536862720 (524280K bytes)
avail memory = 519012352 (506848K bytes)
Preloaded elf kernel "kernel" at 0xc0350000.
Pentium Pro MTRR support enabled
md0: Malloc disk
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <ServerWorks NB6635 3.0LE host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
pcib1: <PCI to PCI bridge (vendor=8086 device=0962)> at device 2.0 on pci0
pci1: <PCI bus> on pcib1
ahc0: <Adaptec aic7890/91 Ultra2 SCSI adapter> port 0xfc00-0xfcff mem 0xfcfff000-0xfcffffff irq 14 at device 4.0 on pci1
aic7890/91: Wide Channel A, SCSI Id=7, 32/255 SCBs
ahc1: <Adaptec aic7880 Ultra SCSI adapter> port 0xf800-0xf8ff mem 0xfcffe000-0xfcffefff irq 10 at device 6.0 on pci1
aic7880: Single Channel A, SCSI Id=7, 16/255 SCBs
fxp0: <Intel Pro 10/100B/100+ Ethernet> port 0xecc0-0xecff mem 0xfe000000-0xfe0fffff,0xfe101000-0xfe101fff irq 11 at device 8.0 on pci0
fxp0: Ethernet address 00:b0:d0:22:fb:03
inphy0: <i82555 10/100 media interface> on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pci0: <ATI model 4759 graphics accelerator> at 14.0
isab0: <ServerWorks IB6566 PCI to ISA bridge> at device 15.0 on pci0
isa0: <ISA bus> on isab0
pcib2: <ServerWorks NB6635 3.0LE host to PCI bridge> on motherboard
pci2: <PCI bus> on pcib2
fxp1: <Intel Pro 10/100B/100+ Ethernet> port 0xdcc0-0xdcff mem 0xf6100000-0xf61fffff,0xf6201000-0xf6201fff irq 5 at device 6.0 on pci2
fxp1: Ethernet address 00:d0:b7:7e:75:c3
inphy1: <i82555 10/100 media interface> on miibus1
inphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp2: <Intel Pro 10/100B/100+ Ethernet> port 0xdc80-0xdcbf mem 0xf6000000-0xf60fffff,0xf6200000-0xf6200fff irq 14 at device 8.0 on pci2
fxp2: Ethernet address 00:d0:b7:7e:77:31
inphy2: <i82555 10/100 media interface> on miibus2
inphy2:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
IP packet filtering initialized, divert enabled, rule-based forwarding enabled, default to deny, logging disabled
IPsec: Initialized Security Association Processing.
IP Filter: v3.4.16 initialized.  Default = pass all, Logging = disabled
Waiting 5 seconds for SCSI devices to settle
pass4 at ahc0 bus 0 target 6 lun 0
pass4: <DELL 1x4 U2W SCSI BP 5.35> Fixed Processor SCSI-2 device 
pass4: 3.300MB/s transfers
da2 at ahc0 bus 0 target 2 lun 0
da2: <QUANTUM ATLAS V 36 SCA 0201> Fixed Direct Access SCSI-3 device 
da2: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da2: 34732MB (71132998 512 byte sectors: 255H 63S/T 4427C)
da3 at ahc0 bus 0 target 3 lun 0
da3: <QUANTUM ATLAS V  9 SCA 0201> Fixed Direct Access SCSI-3 device 
da3: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da3: 8683MB (17783249 512 byte sectors: 255H 63S/T 1106C)
da0 at ahc0 bus 0 target 0 lun 0
da0: <SEAGATE ST336704LC 0004> Fixed Direct Access SCSI-3 device 
da0: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 34732MB (71132960 512 byte sectors: 255H 63S/T 4427C)
da1 at ahc0 bus 0 target 1 lun 0
da1: <SEAGATE ST336704LC 0004> Fixed Direct Access SCSI-3 device 
da1: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da1: 34732MB (71132960 512 byte sectors: 255H 63S/T 4427C)
cd0 at ahc1 bus 0 target 5 lun 0
cd0: <NEC CD-ROM DRIVE:466 1.06> Removable CD-ROM SCSI-2 device 
cd0: 20.000MB/s transfers (20.000MHz, offset 15)
cd0: Attempt to query device size failed: NOT READY, Medium not present
Mounting root from ufs:/dev/da0s1a
WARNING: / was not properly dismounted





