Date: Fri, 19 Nov 2010 23:01:20 +0100
From: Ivan Voras <ivoras@freebsd.org>
To: freebsd-hardware@freebsd.org
Cc: freebsd-net@freebsd.org
Subject: em card wedging
Message-ID: <ic6s3h$mtn$1@dough.gmane.org>

This problem is separate, on a separate system, from those I've been
reporting the last few days, just in case someone read them all :)

An on-board em card in a server (Supermicro motherboard) wedges after a
couple of minutes of operation. There are continuous "watchdog timeout"
messages on the console, but they don't help: the card stays wedged forever.

When this problem happens, monitoring the network state with "netstat 1"
suddenly starts outputting garbage values (large 64-bit numbers, always
constant) for incoming and outgoing packet counts, as if there were some
kind of kernel memory corruption.

This can be quickly provoked on demand by doing a flood ping (ping -f).

There are two ports on the card, em0 and em1. If I move the Ethernet cable
from em0 to em1 and bring em1 up, then *both* cards indicate in ifconfig
status that they have signal (active), but after a few packets exchanged
over em1 (DHCP) it also hangs.

This is 8-stable amd64 (the behaviour was much worse on 8.0-release and
8.1-release, where the card stopped working after a few seconds) with this
hardware:

em0: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0xdc00-0xdc1f mem 0xfb5e0000-0xfb5fffff,0xfb5dc000-0xfb5dffff irq 16 at device 0.0 on pci3
em0: Using MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:25:90:0b:77:5c
em1: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0xec00-0xec1f mem 0xfb6e0000-0xfb6fffff,0xfb6dc000-0xfb6dffff irq 17 at device 0.0 on pci4
em1: Using MSI interrupt
em1: [FILTER]
em1: Ethernet address: 00:25:90:0b:77:5d

em0@pci0:3:0:0: class=0x020000 card=0x040d15d9 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfb5e0000, size 131072, enabled
    bar   [18] = type I/O Port, range 32, base 0xdc00, size 32, enabled
    bar   [1c] = type Memory, range 32, base 0xfb5dc000, size 16384, enabled
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
    cap 11[a0] = MSI-X supports 5 messages in map 0x1c
em1@pci0:4:0:0: class=0x020000 card=0x040d15d9 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xfb6e0000, size 131072, enabled
    bar   [18] = type I/O Port, range 32, base 0xec00, size 32, enabled
    bar   [1c] = type Memory, range 32, base 0xfb6dc000, size 16384, enabled
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
    cap 11[a0] = MSI-X supports 5 messages in map 0x1c

Interestingly, IPMI, which also works over the same port (and is in fact on
the same subnet as the "main" port), continues working while all this is
happening.

The BIOS configuration doesn't contain anything directly connected to
advanced NIC settings, but it does contain several PCI-E settings, in case
there is a chance that toggling them will help.

While the card is wedged like this, the server cannot be shut down or
restarted by software: the whole machine hangs after flushing vnodes &
buffers and has to be cold-cycled.
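
For anyone trying to reproduce this, the "garbage values" symptom can be told apart from real traffic mechanically. What follows is only a minimal sketch, not part of the original report: it assumes the per-second packet counts have already been extracted from "netstat 1" by some other means, and the names is_plausible and find_garbage are hypothetical. It relies on one fixed fact: a 1 Gbit/s Ethernet link cannot carry more than about 1.49 million minimum-size frames per second, so any per-second count above that cannot be real traffic.

```python
# Sketch only: flag per-second packet counts that a 1 Gbit/s link cannot produce.
# Parsing of "netstat 1" output is deliberately left out; the input is assumed
# to be a list of (packets_in, packets_out) tuples, one per second.

LINK_BITS_PER_SECOND = 1_000_000_000

# A minimum-size Ethernet frame occupies 84 bytes on the wire:
# 64-byte frame + 8 bytes preamble/SFD + 12 bytes inter-frame gap.
MIN_WIRE_BYTES = 84

# Upper bound on frames per second in one direction (1488095 for gigabit).
MAX_PPS = LINK_BITS_PER_SECOND // (MIN_WIRE_BYTES * 8)


def is_plausible(packets_per_second: int) -> bool:
    """True if the count could have been produced by a gigabit link."""
    return 0 <= packets_per_second <= MAX_PPS


def find_garbage(samples):
    """Return the indices of samples where either direction is impossible."""
    return [
        i
        for i, (pkts_in, pkts_out) in enumerate(samples)
        if not (is_plausible(pkts_in) and is_plausible(pkts_out))
    ]


if __name__ == "__main__":
    # Made-up sample values for illustration, not taken from the affected machine.
    samples = [(1200, 1100), (2**63, 2**63), (0, 0)]
    print(find_garbage(samples))
```

A check like this only detects the moment the counters go bad; it says nothing about the cause.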