Date: Tue, 5 Mar 2002 13:06:30 -0800
From: Bill Swingle
To: stable@freebsd.org
Cc: ops@swansystems.com
Subject: DMA/SCB timeouts with fxp driver
Message-ID: <20020305210629.GA93389@dub.net>

I have a farm full of Rackable Systems 1U machines. Recently I've had a
few machines that were crashing on an all-too-frequent basis.
Just prior to crashing, the machine would barf the lines below into the
logs several thousand times.

Sometimes this appears just before the timeouts:

Mar 5 12:10:29 and-app /kernel: NMI ISA 2c, EISA ff
Mar 5 12:10:29 and-app /kernel:
Mar 5 12:10:29 and-app /kernel: NMI ISA 2c, EISA ff

Then 1000's of these:

Mar 5 12:10:45 and-app /kernel: fxp0: device timeout
Mar 5 12:10:45 and-app /kernel: fxp0: SCB timeout: 0xff 0xff 0xff 0xffff
Mar 5 12:10:45 and-app /kernel: fxp0: DMA timeout
Mar 5 12:10:45 and-app /kernel: fxp0: SCB timeout: 0xff 0xff 0xff 0xffff
Mar 5 12:10:45 and-app /kernel: fxp0: DMA timeout
Mar 5 12:10:45 and-app /kernel: fxp0: SCB timeout: 0xff 0xff 0xff 0xffff
Mar 5 12:10:45 and-app /kernel: fxp0: DMA timeout
Mar 5 12:10:45 and-app /kernel: fxp0: SCB timeout: 0xff 0xff 0xff 0xffff

I've seen this on machines running 4.4-STABLE from early December and
again this morning on a machine running 4.5-STABLE from yesterday. None
of these machines get large amounts of any sort of traffic.

All the machines in question are identical hardware-wise. They're all
dual PIII/900's, 512MB RAM on a Tyan Thunder 251x motherboard.

dmesg and kernel config included below.

Anyone have any ideas? Let me know if you need more info. I may also be
able to provide a place for someone to take a look at the machine if
need be.

-Bill

--
-=| Bill Swingle -
-=| Every message PGP signed
-=| Fingerprint: C1E3 49D1 EFC9 3EE0 EA6E 6414 5200 1C95 8E09 0223
-=| "Computers are useless. They can only give you answers" Pablo Picasso

------ START DMESG ---------------
Copyright (c) 1992-2002 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD 4.5-STABLE #1: Mon Mar 4 16:07:24 PST 2002
    bswingle@and-app.sf.swansystems.com:/usr/obj/usr/src/sys/AND-APP
Timecounter "i8254" frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (863.93-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x686  Stepping = 6
  Features=0x387fbff
real memory  = 536870912 (524288K bytes)
avail memory = 519016448 (506852K bytes)
Programming 16 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
Programming 16 pins in IOAPIC #1
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id: 0, version: 0x00040011, at 0xfee00000
 cpu1 (AP): apic id: 1, version: 0x00040011, at 0xfee00000
 io0 (APIC): apic id: 4, version: 0x000f0011, at 0xfec00000
 io1 (APIC): apic id: 5, version: 0x000f0011, at 0xfec01000
Preloaded elf kernel "kernel" at 0xc02e5000.
Pentium Pro MTRR support enabled
Using $PIR table, 10 entries at 0xc00f51e0
npx0: on motherboard
npx0: INT 16 interface
pcib0: on motherboard
IOAPIC #1 intpin 6 -> irq 2
IOAPIC #1 intpin 4 -> irq 9
IOAPIC #1 intpin 5 -> irq 10
IOAPIC #0 intpin 10 -> irq 11
pci0: on pcib0
pci0: at 1.0 irq 2
fxp0: port 0xd400-0xd43f mem 0xfe900000-0xfe9fffff,0xfeafe000-0xfeafefff irq 9 at device 4.0 on pci0
fxp0: Ethernet address 00:e0:81:01:85:a7
inphy0: on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp1: port 0xd000-0xd03f mem 0xfe700000-0xfe7fffff,0xfeafd000-0xfeafdfff irq 10 at device 5.0 on pci0
fxp1: Ethernet address 00:e0:81:01:85:a8
inphy1: on miibus1
inphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
isab0: at device 15.0 on pci0
isa0: on isab0
atapci0: port 0xffa0-0xffaf at device 15.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: at 15.2 irq 11
pcib1: on motherboard
pci1: on pcib1
orm0: