Date: Fri, 12 Sep 2003 17:01:20 -0400
From: Mike Tancsa
To: John Polstra
Cc: Info Account
Cc: freebsd-stable@freebsd.org
Subject: Re: recent stability problems with fxp driver
Message-Id: <6.0.0.22.0.20030912165704.0353cbf0@209.112.4.2>
References: <6.0.0.22.0.20030912134112.05891060@209.112.4.2>

I recall jlemon fixed similar fxp problems with certain integrated 815E
NICs. The fix was to disable "dynamic standby mode". Perhaps these
problem versions need the same fix?

         ---Mike

fxp0: port 0xc400-0xc43f mem 0xd5001000-0xd5001fff irq 11 at device 8.0 on pci1
fxp0: *** DISABLING DYNAMIC STANDBY MODE IN EEPROM ***
fxp0: New EEPROM ID: 0x49a0
fxp0: EEPROM checksum @ 0xff: 0xe441 -> 0xe443
fxp0: *** PLEASE REBOOT THE SYSTEM NOW FOR CORRECT OPERATION ***
fxp0: Ethernet address 00:01:80:02:d0:34
inphy0: on miibus1
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto

At 04:51 PM 12/09/2003, John Polstra wrote:
> Sep 12 10:18:22 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x90 0x0
> Sep 12 10:18:31 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x90 0x0
> Sep 12 10:18:32 thin su: jdp to root on /dev/ttyp1
> Sep 12 10:18:39 thin /kernel: fxp0: DMA timeout
> Sep 12 10:18:39 thin last message repeated 2 times
> Sep 12 10:18:49 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x50 0x0
> Sep 12 10:18:51 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x50 0x0
> Sep 12 10:18:54 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x50 0x0
> Sep 12 10:18:56 thin /kernel: fxp0: device timeout
> Sep 12 10:18:56 thin /kernel: fxp0: DMA timeout
> Sep 12 10:19:10 thin last message repeated 5 times
> Sep 12 10:19:10 thin /kernel: fxp0: SCB timeout: 0x1 0x20 0x80 0x0
> Sep 12 10:19:13 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x50 0x0
> Sep 12 10:19:14 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x50 0x0
> Sep 12 10:19:15 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x50 0x0
> Sep 12 10:19:16 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x50 0x0
> Sep 12 10:19:36 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x50 0x0
> Sep 12 10:19:38 thin /kernel: fxp0: device timeout
> Sep 12 10:19:38 thin /kernel: fxp0: DMA timeout
> Sep 12 10:19:38 thin last message repeated 2 times
> Sep 12 10:19:52 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x50 0x0
> Sep 12 10:19:54 thin /kernel: fxp0: device timeout
> Sep 12 10:19:54 thin /kernel: fxp0: DMA timeout
> Sep 12 10:19:54 thin last message repeated 2 times
> Sep 12 10:20:00 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x50 0x0
> Sep 12 10:20:21 thin /kernel: fxp0: device timeout
> Sep 12 10:20:21 thin /kernel: fxp0: DMA timeout
> Sep 12 10:20:21 thin last message repeated 2 times
> Sep 12 10:20:29 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x50 0x0
> Sep 12 10:20:35 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x50 0x0
> Sep 12 10:20:35 thin /kernel: fxp0: SCB timeout: 0x80 0x0 0x50 0x0
> Sep 12 10:21:04 thin /kernel: fxp0: SCB timeout: 0x70 0x0 0x90 0x0
> Sep 12 10:21:09 thin /kernel: fxp0: device timeout
> Sep 12 10:21:09 thin /kernel: fxp0: DMA timeout
> Sep 12 10:21:09 thin last message repeated 2 times
> Sep 12 10:21:09 thin /kernel: fxp0: command queue timeout
> Sep 12 10:21:12 thin shutdown: reboot by jdp:
>
>This morning I tried regressing the driver to earlier versions in an
>attempt to find the commit that broke it. Not good news:
>
>    RELENG_4_8_0_RELEASE    bad
>    RELENG_4_7_0_RELEASE    bad
>    RELENG_4_6_0_RELEASE    bad
>    RELENG_4_4_0_RELEASE    bad
>    RELENG_4_2_0_RELEASE    bad
>    RELENG_4_1_0_RELEASE    bad
>
>The problem is easier to reproduce in recent versions of the
>driver than in older versions. With the current -stable driver, I
>can almost always kill the chips with a single transfer of that 560
>MB file. With the 4.7.0 driver, it takes about 5 transfers before
>it fails. With the 4.2.0 driver, it took 15+ transfers.
>
>The devices are Intel 82559 chips. Here's their pciconf output:
>
>none0@pci0:1:0: class=0x020000 card=0x00da1028 chip=0x12298086 rev=0x08 hdr=0x00
>    vendor   = 'Intel Corporation'
>    device   = '82557/8/9 EtherExpress PRO/100(B) Ethernet Adapter'
>    class    = network
>    subclass = ethernet
>none1@pci0:2:0: class=0x020000 card=0x00da1028 chip=0x12298086 rev=0x08 hdr=0x00
>    vendor   = 'Intel Corporation'
>    device   = '82557/8/9 EtherExpress PRO/100(B) Ethernet Adapter'
>    class    = network
>    subclass = ethernet
>
>Maybe the problem really is in the Dell 1550. I have various flavors
>of fxp card in several other machines, and I never have trouble with
>them. I did check my firmware and BIOS versions, though, and they're
>fully up-to-date. I have a suspicion that our driver may not be
>dealing properly with Dell's power management or IPMI stuff, but it's
>just a vague suspicion without any real evidence.
>
>John
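
For anyone puzzling over the "New EEPROM ID" and "EEPROM checksum" boot lines in the reply above, here is a rough sketch of the arithmetic that seems to be behind them. It rests on two assumptions that are not stated in this thread: that the EEPROM's 16-bit words are meant to sum to 0xBABA (mod 2^16), and that the standby setting is bit 0x02 of the ID word. The previous ID value 0x49a2 is inferred from those assumptions, not taken from any log, and the function names are made up for illustration.

```python
# Sketch only: models the checksum bookkeeping, not the driver's EEPROM access.
EEPROM_SUM_TARGET = 0xBABA   # assumption: all 16-bit words sum to this value
STANDBY_BIT = 0x0002         # assumption: "dynamic standby" bit in the ID word


def eeprom_checksum(words):
    """Checksum word that makes sum(words) + checksum == 0xBABA (mod 2**16)."""
    return (EEPROM_SUM_TARGET - sum(words)) & 0xFFFF


def checksum_after_clearing(old_checksum, old_id, bit=STANDBY_BIT):
    """Clear `bit` in the ID word and adjust the checksum to compensate."""
    new_id = old_id & ~bit
    delta = old_id - new_id              # how much the word sum dropped
    return new_id, (old_checksum + delta) & 0xFFFF


# 0xE441 is the old checksum from the boot messages; 0x49A2 is the inferred old ID.
new_id, new_cksum = checksum_after_clearing(0xE441, 0x49A2)
print(hex(new_id), hex(new_cksum))
```

Under those assumptions the result lines up with the "0x49a0" and "0xe441 -> 0xe443" values printed at boot: clearing a bit worth 2 lowers the word sum by 2, so the checksum has to rise by 2.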