Date:      Fri, 12 Sep 2003 17:01:20 -0400
From:      Mike Tancsa <mike@sentex.net>
To:        John Polstra <jdp@polstra.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: recent stability problems with fxp driver
Message-ID:  <6.0.0.22.0.20030912165704.0353cbf0@209.112.4.2>
In-Reply-To: <XFMail.20030912135156.jdp@polstra.com>
References:  <6.0.0.22.0.20030912134112.05891060@209.112.4.2> <XFMail.20030912135156.jdp@polstra.com>


I recall jlemon fixed similar fxp problems with certain integrated 815E 
NICs. The fix was to disable "dynamic standby mode".  Perhaps these problem 
chips need the same fix?

         ---Mike


fxp0: <Intel Pro/100 Ethernet> port 0xc400-0xc43f mem 0xd5001000-0xd5001fff irq 11 at device 8.0 on pci1
fxp0: *** DISABLING DYNAMIC STANDBY MODE IN EEPROM ***
fxp0: New EEPROM ID: 0x49a0
fxp0: EEPROM checksum @ 0xff: 0xe441 -> 0xe443
fxp0: *** PLEASE REBOOT THE SYSTEM NOW FOR CORRECT OPERATION ***
fxp0: Ethernet address 00:01:80:02:d0:34
inphy0: <i82562EM 10/100 media interface> on miibus1
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
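
For what it's worth, the two checksum values in that output are consistent 
with a single bit worth 0x02 being cleared in one EEPROM word. A minimal 
sketch of the arithmetic, assuming (from my recollection of the fxp driver, 
not checked against any particular revision) that the last EEPROM word is 
chosen so that all 16-bit words sum to 0xBABA, and that the standby flag is 
bit 0x02 of the ID word. The helper name is made up for illustration:

```python
MASK16 = 0xFFFF
EEPROM_SUM = 0xBABA  # assumed target: all 16-bit EEPROM words sum to this


def checksum_word(sum_of_other_words):
    """Value the final EEPROM word must hold under the assumed 0xBABA rule."""
    return (EEPROM_SUM - sum_of_other_words) & MASK16


new_id = 0x49A0          # "New EEPROM ID" from the dmesg output
old_id = new_id | 0x02   # assumed: same word with the standby bit still set

old_cksum = 0xE441       # old checksum from the dmesg output
# Sum of all the other words before the change, implied by the old checksum.
old_sum = (EEPROM_SUM - old_cksum) & MASK16
# Clearing bit 0x02 lowers that sum by exactly 2.
new_sum = (old_sum - (old_id - new_id)) & MASK16
new_cksum = checksum_word(new_sum)

print(hex(old_id), hex(new_cksum))  # prints: 0x49a2 0xe443
```

The result does not actually depend on the 0xBABA constant: any "constant 
minus sum" rule gives a checksum that goes up by 2 when the sum goes down 
by 2, which matches 0xe441 -> 0xe443.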


At 04:51 PM 12/09/2003, John Polstra wrote:
>     Sep 12 10:18:22 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x90 0x0
>     Sep 12 10:18:31 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x90 0x0
>     Sep 12 10:18:32 thin su: jdp to root on /dev/ttyp1
>     Sep 12 10:18:39 thin /kernel: fxp0: DMA timeout
>     Sep 12 10:18:39 thin last message repeated 2 times
>     Sep 12 10:18:49 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x50 0x0
>     Sep 12 10:18:51 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x50 0x0
>     Sep 12 10:18:54 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x50 0x0
>     Sep 12 10:18:56 thin /kernel: fxp0: device timeout
>     Sep 12 10:18:56 thin /kernel: fxp0: DMA timeout
>     Sep 12 10:19:10 thin last message repeated 5 times
>     Sep 12 10:19:10 thin /kernel: fxp0: SCB timeout: 0x1 0x20 0x80 0x0
>     Sep 12 10:19:13 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x50 0x0
>     Sep 12 10:19:14 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x50 0x0
>     Sep 12 10:19:15 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x50 0x0
>     Sep 12 10:19:16 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x50 0x0
>     Sep 12 10:19:36 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x50 0x0
>     Sep 12 10:19:38 thin /kernel: fxp0: device timeout
>     Sep 12 10:19:38 thin /kernel: fxp0: DMA timeout
>     Sep 12 10:19:38 thin last message repeated 2 times
>     Sep 12 10:19:52 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x50 0x0
>     Sep 12 10:19:54 thin /kernel: fxp0: device timeout
>     Sep 12 10:19:54 thin /kernel: fxp0: DMA timeout
>     Sep 12 10:19:54 thin last message repeated 2 times
>     Sep 12 10:20:00 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x50 0x0
>     Sep 12 10:20:21 thin /kernel: fxp0: device timeout
>     Sep 12 10:20:21 thin /kernel: fxp0: DMA timeout
>     Sep 12 10:20:21 thin last message repeated 2 times
>     Sep 12 10:20:29 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x50 0x0
>     Sep 12 10:20:35 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x50 0x0
>     Sep 12 10:20:35 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x50 0x0
>     Sep 12 10:21:04 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x90 0x0
>     Sep 12 10:21:09 thin /kernel: fxp0: device timeout
>     Sep 12 10:21:09 thin /kernel: fxp0: DMA timeout
>     Sep 12 10:21:09 thin last message repeated 2 times
>     Sep 12 10:21:09 thin /kernel: fxp0: command queue timeout
>     Sep 12 10:21:12 thin shutdown: reboot by jdp:
>
>This morning I tried regressing the driver to earlier versions in an
>attempt to find the commit that broke it.  Not good news:
>
>     RELENG_4_8_0_RELEASE        bad
>     RELENG_4_7_0_RELEASE        bad
>     RELENG_4_6_0_RELEASE        bad
>     RELENG_4_4_0_RELEASE        bad
>     RELENG_4_2_0_RELEASE        bad
>     RELENG_4_1_0_RELEASE        bad
>
>The problem is easier to reproduce in recent versions of the
>driver than in older versions.  With the current -stable driver, I
>can almost always kill the chips with a single transfer of that 560
>MB file.  With the 4.7.0 driver, it takes about 5 transfers before
>it fails.  With the 4.2.0 driver, it took 15+ transfers.
>
>The devices are Intel 82559 chips.  Here's their pciconf output:
>
>none0@pci0:1:0: class=0x020000 card=0x00da1028 chip=0x12298086 rev=0x08 hdr=0x00
>     vendor   = 'Intel Corporation'
>     device   = '82557/8/9 EtherExpress PRO/100(B) Ethernet Adapter'
>     class    = network
>     subclass = ethernet
>none1@pci0:2:0: class=0x020000 card=0x00da1028 chip=0x12298086 rev=0x08 hdr=0x00
>     vendor   = 'Intel Corporation'
>     device   = '82557/8/9 EtherExpress PRO/100(B) Ethernet Adapter'
>     class    = network
>     subclass = ethernet
>
>Maybe the problem really is in the Dell 1550.  I have various flavors
>of fxp card in several other machines, and I never have trouble with
>them.  I did check my firmware and BIOS versions, though, and they're
>fully up-to-date.  I have a suspicion that our driver may not be
>dealing properly with Dell's power management or IPMI stuff, but it's
>just a vague suspicion without any real evidence.
>
>John

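As a side note on the pciconf output quoted above: the chip= and card= 
values each pack two 16-bit IDs, with the vendor in the low half. A small 
sketch of the decoding (the function name is just for illustration):

```python
def split_pci_id(value):
    """Split a 32-bit pciconf ID into (vendor, device).

    The vendor ID is the low 16 bits, the device ID the high 16 bits.
    """
    return value & 0xFFFF, (value >> 16) & 0xFFFF


chip_vendor, chip_device = split_pci_id(0x12298086)
card_vendor, card_device = split_pci_id(0x00DA1028)

print(f"chip: vendor=0x{chip_vendor:04x} device=0x{chip_device:04x}")
print(f"card: vendor=0x{card_vendor:04x} device=0x{card_device:04x}")
# prints:
# chip: vendor=0x8086 device=0x1229
# card: vendor=0x1028 device=0x00da
```

0x8086 is Intel and 0x1028 is Dell, so these are Intel parts with a Dell 
subsystem ID, i.e. on-board or Dell-supplied NICs rather than retail cards.
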

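If it helps when comparing driver versions, the fxp messages in a syslog 
excerpt like the one quoted above can be tallied by type. This is only a 
rough sketch: the message names are the ones that appear in the quoted log, 
the function name is made up, and "last message repeated N times" lines are 
deliberately not counted, so the totals are a lower bound.

```python
import re
from collections import Counter

# Matches lines such as "... /kernel: fxp0: SCB timeout: 0x80 0x0 0x90 0x0".
FXP_MSG = re.compile(
    r"fxp\d+: (SCB timeout|DMA timeout|device timeout|command queue timeout)"
)


def tally_fxp_timeouts(lines):
    """Count fxp timeout messages by type; other lines are ignored."""
    counts = Counter()
    for line in lines:
        match = FXP_MSG.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "Sep 12 10:18:22 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x90 0x0",
    "Sep 12 10:18:31 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x90 0x0",
    "Sep 12 10:18:39 thin /kernel: fxp0: DMA timeout",
    "Sep 12 10:18:39 thin last message repeated 2 times",
    "Sep 12 10:18:56 thin /kernel: fxp0: device timeout",
]

for kind, count in sorted(tally_fxp_timeouts(sample).items()):
    print(f"{kind}: {count}")
```
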
