From owner-freebsd-bugs@FreeBSD.ORG Sun Apr 20 10:40:02 2008
Return-Path:
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id B92541065671
    for ; Sun, 20 Apr 2008 10:40:02 +0000 (UTC)
    (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
    by mx1.freebsd.org (Postfix) with ESMTP id 945CB8FC21
    for ; Sun, 20 Apr 2008 10:40:02 +0000 (UTC)
    (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
    by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id m3KAe2r8066365
    for ; Sun, 20 Apr 2008 10:40:02 GMT
    (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.14.2/8.14.1/Submit) id m3KAe28F066364;
    Sun, 20 Apr 2008 10:40:02 GMT
    (envelope-from gnats)
Resent-Date: Sun, 20 Apr 2008 10:40:02 GMT
Resent-Message-Id: <200804201040.m3KAe28F066364@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Neil Hoggarth
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 54374106566B
    for ; Sun, 20 Apr 2008 10:36:51 +0000 (UTC)
    (envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
    by mx1.freebsd.org (Postfix) with ESMTP id 40F688FC0C
    for ; Sun, 20 Apr 2008 10:36:51 +0000 (UTC)
    (envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
    by www.freebsd.org (8.14.2/8.14.2) with ESMTP id m3KAaWa4095135
    for ; Sun, 20 Apr 2008 10:36:32 GMT
    (envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
    by www.freebsd.org (8.14.2/8.14.1/Submit) id m3KAaWHV095134;
    Sun, 20 Apr 2008 10:36:32 GMT
    (envelope-from nobody)
Message-Id: <200804201036.m3KAaWHV095134@www.freebsd.org>
Date: Sun, 20 Apr 2008 10:36:32 GMT
From: Neil Hoggarth
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
Cc:
Subject: kern/122928: em net interface watchdog timeouts and stops receiving packets
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 20 Apr 2008 10:40:02 -0000

>Number: 122928
>Category: kern
>Synopsis: em net interface watchdog timeouts and stops receiving packets
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sun Apr 20 10:40:02 UTC 2008
>Closed-Date:
>Last-Modified:
>Originator: Neil Hoggarth
>Release: 7.0-STABLE
>Organization:
>Environment:
FreeBSD neilhoggarth-2.dsl.easynet.co.uk 7.0-STABLE FreeBSD 7.0-STABLE #0: Thu Apr 3 00:57:21 BST 2008 root@neilhoggarth-2.dsl.easynet.co.uk:/usr/obj/usr/src/sys/GENERIC amd64
>Description:
I have a recently built system using an ASRock AM2NF3-VSTA motherboard
and an AMD Athlon X2 BE-2400 dual-core processor, running 7-STABLE
(amd64), with the GENERIC kernel. This system uses an Intel PRO/1000 MT
PCI Gigabit Ethernet adaptor (with an 82540EM chip). The same adaptor was
used for several years in my previous system, a uniprocessor AthlonXP
system running 6-STABLE (i386), where it worked without any visible
issues.

The Intel PCI card provides interface em0, which is connected to my home
LAN (a 3Com 3C1670500 OfficeConnect Gigabit switch). The motherboard's
built-in ethernet provides interface nfe0, which is connected to my ADSL
router. The system uses ipf(5) for filtering incoming packets on nfe0,
and ipnat(5) to NAT connections from other machines on the LAN to the
public Internet.

The machine frequently gets into a state where traffic no longer
reliably passes through the em0 interface (devices on the LAN can no
longer communicate with the machine itself or with the public Internet),
and the kernel starts logging an endless succession of messages of the
form:

Apr 12 07:38:50 neilhoggarth-2 kernel: em0: watchdog timeout -- resetting
Apr 12 07:38:50 neilhoggarth-2 kernel: em0: link state changed to DOWN
Apr 12 07:38:52 neilhoggarth-2 kernel: em0: link state changed to UP
Apr 12 07:38:57 neilhoggarth-2 kernel: em0: watchdog timeout -- resetting
Apr 12 07:38:57 neilhoggarth-2 kernel: em0: link state changed to DOWN
Apr 12 07:39:01 neilhoggarth-2 kernel: em0: link state changed to UP

As far as I can tell from using tcpdump on the machine itself and a
laptop connected to the internal LAN, the machine still sends packets
out through the em0 interface, but does not receive incoming packets.
The only way that I have found of recovering normal function is to
reboot the machine.

The same problem was present when I first installed 7.0-RELEASE.

I have tried a variety of different things in an attempt to gather more
information or find a workaround, including building a kernel with
WITNESS and INVARIANTS (which didn't seem to provide any extra
diagnostics), building a kernel with DEVICE_POLLING and enabling polling
on the em0 interface, building a kernel without PREEMPTION, and building
a kernel without the SMP option. I have also experimented with running
the interface at various other speeds (all the way down to 10baseT
half-duplex) rather than allowing it to autonegotiate 1000baseTX. The
problem still occurs under all the circumstances that I've tried.

Any suggestions on how I can try to debug this further would be
appreciated.

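For anyone comparing log excerpts like the one quoted above, here is a minimal Python sketch (not part of the original report) that counts the watchdog resets for an interface and measures how long the link stayed down after each one. The regular expression, the function name `summarize`, and the assumed year (syslog timestamps carry none) are illustrative choices rather than anything taken from the driver or from syslogd.

```python
import re
from datetime import datetime

# The six kernel messages quoted in the report.
LOG = """\
Apr 12 07:38:50 neilhoggarth-2 kernel: em0: watchdog timeout -- resetting
Apr 12 07:38:50 neilhoggarth-2 kernel: em0: link state changed to DOWN
Apr 12 07:38:52 neilhoggarth-2 kernel: em0: link state changed to UP
Apr 12 07:38:57 neilhoggarth-2 kernel: em0: watchdog timeout -- resetting
Apr 12 07:38:57 neilhoggarth-2 kernel: em0: link state changed to DOWN
Apr 12 07:39:01 neilhoggarth-2 kernel: em0: link state changed to UP
"""

# Assumed layout: "<Mon> <day> <hh:mm:ss> <host> kernel: <ifname>: <message>"
LINE_RE = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ kernel: (\w+): (.*)$")


def summarize(log_text, ifname="em0", year=2008):
    """Return (watchdog_resets, link_down_seconds) for one interface."""
    resets = 0
    down_since = None
    down_seconds = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m or m.group(2) != ifname:
            continue
        # syslog timestamps have no year, so the caller supplies one.
        when = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        msg = m.group(3)
        if msg.startswith("watchdog timeout"):
            resets += 1
        elif msg.endswith("changed to DOWN"):
            down_since = when
        elif msg.endswith("changed to UP") and down_since is not None:
            down_seconds.append((when - down_since).total_seconds())
            down_since = None
    return resets, down_seconds


resets, down_seconds = summarize(LOG)
print(resets, down_seconds)  # prints: 2 [2.0, 4.0]
```

Run over a full /var/log/messages, the same function would show whether the reset interval is regular, which may help when comparing kernels built with different options.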
pciconf -lv:
============
hostb0@pci0:0:0:0: class=0x060000 card=0x00e11849 chip=0x00e110de rev=0xa1 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce3 250 Host/PCI Bridge'
    class      = bridge
    subclass   = HOST-PCI
isab0@pci0:0:1:0: class=0x060100 card=0x00e01849 chip=0x00e010de rev=0xa2 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce3 250 LPC Interface Bridge'
    class      = bridge
    subclass   = PCI-ISA
none0@pci0:0:1:1: class=0x0c0500 card=0x00e41849 chip=0x00e410de rev=0xa1 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce3 250 PCI System Management'
    class      = serial bus
    subclass   = SMBus
ohci0@pci0:0:2:0: class=0x0c0310 card=0x00e71849 chip=0x00e710de rev=0xa1 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce3 250 OpenHCD USB Controller'
    class      = serial bus
    subclass   = USB
ohci1@pci0:0:2:1: class=0x0c0310 card=0x00e71849 chip=0x00e710de rev=0xa1 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce3 250 OpenHCD USB Controller'
    class      = serial bus
    subclass   = USB
ehci0@pci0:0:2:2: class=0x0c0320 card=0x00e81849 chip=0x00e810de rev=0xa2 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce3 250 Enhanced PCI to USB Controller'
    class      = serial bus
    subclass   = USB
nfe0@pci0:0:5:0: class=0x068000 card=0x00df1849 chip=0x00df10de rev=0xa2 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'Marvell 88E1111 Network adapter'
    class      = bridge
atapci0@pci0:0:8:0: class=0x01018a card=0x00e51849 chip=0x00e510de rev=0xa2 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce3 250 Parallel ATA Controller'
    class      = mass storage
    subclass   = ATA
atapci1@pci0:0:10:0: class=0x010185 card=0x00e31849 chip=0x00e310de rev=0xa2 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce3 250 Serial ATA Controller'
    class      = mass storage
    subclass   = ATA
pcib1@pci0:0:11:0: class=0x060400 card=0x00000000 chip=0x00e210de rev=0xa2 hdr=0x01
    vendor     = 'Nvidia Corp'
    device     = 'nForce3 250 AGP Host to PCI Bridge'
    class      = bridge
    subclass   = PCI-PCI
pcib2@pci0:0:14:0: class=0x060400 card=0x00000000 chip=0x00ed10de rev=0xa2 hdr=0x01
    vendor     = 'Nvidia Corp'
    device     = 'nForce3 250 PCI-PCI Bridge'
    class      = bridge
    subclass   = PCI-PCI
hostb1@pci0:0:24:0: class=0x060000 card=0x00000000 chip=0x11001022 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices (AMD)'
    device     = '(K8) Athlon 64/Opteron HyperTransport Technology Configuration'
    class      = bridge
    subclass   = HOST-PCI
hostb2@pci0:0:24:1: class=0x060000 card=0x00000000 chip=0x11011022 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices (AMD)'
    device     = '(K8) Athlon 64/Opteron Address Map'
    class      = bridge
    subclass   = HOST-PCI
hostb3@pci0:0:24:2: class=0x060000 card=0x00000000 chip=0x11021022 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices (AMD)'
    device     = '(K8) Athlon 64/Opteron DRAM Controller'
    class      = bridge
    subclass   = HOST-PCI
hostb4@pci0:0:24:3: class=0x060000 card=0x00000000 chip=0x11031022 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices (AMD)'
    device     = '(K8) Athlon 64/Opteron Miscellaneous Control'
    class      = bridge
    subclass   = HOST-PCI
vgapci0@pci0:1:0:0: class=0x030000 card=0x2062148c chip=0x59611002 rev=0x01 hdr=0x00
    vendor     = 'ATI Technologies Inc'
    device     = 'RV280 ATI RADEON 9200 se agp'
    class      = display
    subclass   = VGA
vgapci1@pci0:1:0:1: class=0x038000 card=0x2063148c chip=0x59411002 rev=0x01 hdr=0x00
    vendor     = 'ATI Technologies Inc'
    device     = 'RV280 ATI Radeon 9200 - Secondary'
    class      = display
atapci2@pci0:2:8:0: class=0x010400 card=0x00011103 chip=0x00081103 rev=0x07 hdr=0x00
    vendor     = 'Triones Technologies Inc. (HighPoint)'
    device     = 'HPT374 Rocket 154x/1640, RocketRAID 154x/1640 RAID EIDE Controller'
    class      = mass storage
    subclass   = RAID
atapci3@pci0:2:8:1: class=0x010400 card=0x00011103 chip=0x00081103 rev=0x07 hdr=0x00
    vendor     = 'Triones Technologies Inc. (HighPoint)'
    device     = 'HPT374 Rocket 154x/1640, RocketRAID 154x/1640 RAID EIDE Controller'
    class      = mass storage
    subclass   = RAID
em0@pci0:2:9:0: class=0x020000 card=0x002e8086 chip=0x100e8086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82540EM Gigabit Ethernet Controller'
    class      = network
    subclass   = ethernet

vmstat -i:
==========
interrupt                          total       rate
irq14: ata0                          525          0
irq15: ata1                          246          0
irq16: drm0                       172525         48
irq17: atapci2+                   132203         36
irq18: em0                         72904         20
irq20: nfe0 ohci0                  13790          3
irq21: ohci1+                          1          0
irq22: ehci0                        1149          0
cpu0: timer                      7146437       1999
cpu1: timer                      7138376       1997
Total                           14678156       4106

dmesg:
======
Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.0-STABLE #0: Thu Apr 3 00:57:21 BST 2008
    root@neilhoggarth-2.dsl.easynet.co.uk:/usr/obj/usr/src/sys/GENERIC
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) X2 Dual Core Processor BE-2400 (2331.98-MHz K8-class CPU)
  Origin = "AuthenticAMD" Id = 0x60fb2 Stepping = 2
  Features=0x178bfbff
  Features2=0x2001
  AMD Features=0xea500800
  AMD Features2=0x11f
  Cores per package: 2
usable memory = 8575258624 (8178 MB)
avail memory = 8289517568 (7905 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
ioapic0: Changing APIC ID to 2
ioapic0 irqs 0-23 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
acpi0: on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, d5f00000 (3) failed
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
cpu0: on acpi0
powernow0: on cpu0
device_attach: powernow0 attach returned 6
cpu1: on acpi0
powernow1: on cpu1
device_attach: powernow1 attach returned 6
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
agp0: on hostb0
isab0: at device 1.0 on pci0
isa0: on isab0
pci0: at device 1.1 (no driver attached)
ohci0: mem 0xfebff000-0xfebfffff irq 20 at device 2.0 on pci0
ohci0: [GIANT-LOCKED]
ohci0: [ITHREAD]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: on ohci0
usb0: USB revision 1.0
uhub0: on usb0
uhub0: 4 ports with 4 removable, self powered
ohci1: mem 0xfebfe000-0xfebfefff irq 21 at device 2.1 on pci0
ohci1: [GIANT-LOCKED]
ohci1: [ITHREAD]
usb1: OHCI version 1.0, legacy support
usb1: SMM does not respond, resetting
usb1: on ohci1
usb1: USB revision 1.0
uhub1: on usb1
uhub1: 4 ports with 4 removable, self powered
ehci0: mem 0xfebfdc00-0xfebfdcff irq 22 at device 2.2 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb2: EHCI version 1.0
usb2: companion controllers, 4 ports each: usb0 usb1
usb2: on ehci0
usb2: USB revision 2.0
uhub2: on usb2
uhub2: 8 ports with 8 removable, self powered
uhub3: on uhub2
uhub3: single transaction translator
uhub3: 4 ports with 4 removable, self powered
uscanner0: on uhub3
uhid0: on uhub3
ukbd0: on uhub3
kbd2 at ukbd0
uhid1: on uhub3
nfe0: port 0xec00-0xec07 mem 0xfebfc000-0xfebfcfff irq 20 at device 5.0 on pci0
miibus0: on nfe0
rlphy0: PHY 1 on miibus0
rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
nfe0: Ethernet address: 00:19:66:47:21:9d
nfe0: [FILTER]
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 8.0 on pci0
ata0: on atapci0
ata0: [ITHREAD]
ata1: on atapci0
ata1: [ITHREAD]
atapci1: port 0xf80-0xf87,0xf00-0xf03,0xe80-0xe87,0xe00-0xe03,0xe000-0xe00f,0xd800-0xd87f irq 21 at device 10.0 on pci0
atapci1: [ITHREAD]
ata2: on atapci1
ata2: [ITHREAD]
ata3: on atapci1
ata3: [ITHREAD]
pcib1: at device 11.0 on pci0
pci1: on pcib1
vgapci0: port 0xa000-0xa0ff mem 0xe8000000-0xefffffff,0xfe9f0000-0xfe9fffff irq 16 at device 0.0 on pci1
vgapci1: mem 0xe0000000-0xe7ffffff,0xfe9e0000-0xfe9effff at device 0.1 on pci1
pcib2: at device 14.0 on pci0
pci2: on pcib2
atapci2: port 0xcc00-0xcc07,0xc880-0xc883,0xc800-0xc807,0xc480-0xc483,0xc000-0xc0ff irq 17 at device 8.0 on pci2
atapci2: [ITHREAD]
ata4: on atapci2
ata4: [ITHREAD]
ata5: on atapci2
ata5: [ITHREAD]
atapci3: port 0xc400-0xc407,0xbc00-0xbc03,0xb880-0xb887,0xb800-0xb803,0xb400-0xb4ff irq 17 at device 8.1 on pci2
atapci3: [ITHREAD]
ata6: on atapci3
ata6: [ITHREAD]
ata7: on atapci3
ata7: [ITHREAD]
em0: port 0xb080-0xb0bf mem 0xfeac0000-0xfeadffff,0xfeaa0000-0xfeabffff irq 18 at device 9.0 on pci2
em0: Ethernet address: 00:0e:0c:06:c2:3a
em0: [FILTER]
acpi_button0: on acpi0
sio0: configured irq 3 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: configured irq 3 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: port 0x2f8-0x2ff irq 3 flags 0x10 on acpi0
sio0: type 16550A
sio0: [FILTER]
ppc0: port 0x378-0x37f,0x778-0x77f irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
ppbus0: on ppc0
ppbus0: [ITHREAD]
plip0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
ppc0: [GIANT-LOCKED]
ppc0: [ITHREAD]
sio1: configured irq 4 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: configured irq 4 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 on acpi0
sio1: type 16550A
sio1: [FILTER]
orm0: at iomem 0xc0000-0xccfff,0xcd000-0xce7ff on isa0
atkbdc0: at port 0x60,0x64 on isa0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
uaudio0: on uhub0
uaudio0: audio rev 1.00
pcm0: on uaudio0
uhid2: on uhub0
WARNING: ZFS is considered to be an experimental feature in FreeBSD.
Timecounters tick every 1.000 msec
ad0: DMA limited to UDMA33, device found non-ATA66 cable
ad0: FAILURE - SET_MULTI status=51 error=4
ad0: 976MB at ata0-master UDMA33
ZFS filesystem version 6
ZFS storage pool version 6
acd0: DVDR at ata1-master UDMA66
ad8: 715404MB at ata4-master UDMA133
ad10: 715404MB at ata5-master UDMA133
acd0: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00 sks=0x48 0x00 0x01
acd0: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00 sks=0x48 0x00 0x01
SMP: AP CPU #1 Launched!
cd0 at ata1 bus 0 target 0 lun 0
cd0: Removable CD-ROM SCSI-0 device
cd0: 66.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present
Trying to mount root from zfs:newtank
IP Filter: v4.1.28 initialized. Default = pass all, Logging = enabled
em0: link state changed to UP
>How-To-Repeat:
The problem happens intermittently, typically two or three times a week,
and I have not found a way to trigger it reliably.
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted: