Date: Thu, 17 Jan 2008 12:15:52 -0500
Message-ID: <7izlv4pe47.wl%gnn@neville-neil.com>
From: gnn@freebsd.org
To: net@freebsd.org
Subject: Are there known issues with multicast on Intel Pro 1000?

Howdy,

At my current gig we find that the network interface locks up if we
subject it to a high rate of multicast traffic. Since the whole
purpose of this box is to do multicast (it absorbs a feed of data over
multicast, manipulates it, and then sends it out again over multicast),
it's a "bad thing" if this does not work.

What I currently know is not complete, but I figured I could start
here.

The symptom is that all network communication stops, but the system
itself is still responsive, so I can get to the console and get
information.

Release: 6.2 and 6.3-PRERELEASE (6.3 as of Wed Jan 16th)

Motherboard:
CPU: 2 x Intel Xeon X5365 3GHz (4 cores each)
Memory: 8G

em0: Intel PRO/1000 6.7.3 port 0x2000-0x201f mem 0xd8320000-0xd833ffff
em1: Intel PRO/1000 6.7.3 port 0x2020-0x203f mem 0xd8320000-0xd833ffff
em2: Intel PRO/1000 6.7.3 port 0x3000-0x303f mem 0xd8240000-0xd825ffff, 0xd8200000-0xd823ffff
em3: Intel PRO/1000 6.7.3 port 0x3040-0x307f mem 0xd8260000-0xd827ffff

Other data:

em2 is the interface that multicasts out our digested data. It is also
receiving a lot of digested multicast traffic, which is being recorded
by a proprietary program.

sysctl dev.em.2.debug=1

em2: CTRL = 0x487c0a01 RCTL=0x8002
em2: Packet buffer = Tx=16k Rx=48k
em2: fifo workaround = 0, fifo_reset_count = 0
em2: hw tdh = 76, hw tdt = 76
em2: hw rdh = 213, hw rdt = 212
em2: Num Tx descriptors avail = 256
em2: Tx Descriptors not avail1 = 0
em2: Tx Descriptors not avail2 = 0
em2: Std mbuf failed = 0
em2: Std mbuf cluster failed = 1247383
     (this number is increasing by about 1 a second)
em2: Driver dropped packets = 0
em2: Driver tx dma failure in encap = 0

sysctl dev.em.2.stats=1 (all are zero except what is recorded)

em2: Missed Packets = 4683
em2: Receive No Buffers = 46905
em2: RX overruns = 83
em2: Good Packets Rcvd = 11416687
em2: Good Packets Xmtd = 146576

em0 is the interface on which we receive the raw data over multicast.

em0: hw tdh = 130, hw tdt = 130
em0: hw rdh = 13, hw rdt = 12
em0: Num Tx descriptors avail = 256
em0: Std mbuf cluster failed = 5111461
     (this number is going up by about 1 a second)

sysctl dev.em.0.stats=1 (all are zero except what is recorded)

em0: Missed Packets = 292778
em0: Receive No Buffers = 96211
em0: RX overruns = 1092
em0: Good Packets Rcvd = 5386001
em0: Good Packets Xmtd = 12418

em3 receives a little data from multicast, and it is recorded using a
proprietary program.

em3: hw tdh = 45, hw tdt = 45
em3: hw rdh = 216, hw rdt = 215
em3: Num Tx descriptors avail = 256
em3: Std mbuf cluster failed = 195951
     (also going up by 1 very slowly)

sysctl dev.em.3.stats=1 (all are zero except what is recorded)

em3: Good Packets Rcvd = 9637851
em3: Good Packets Xmtd = 8237

One odd thing is that when the system boots, em1, which is unused in
this case, complains of:

em1: Using MSI interrupt
em1: Setup of Shared code failed

What more do people need to help debug this?

Best,
George