From: Randall Stewart
Date: Thu, 17 Jan 2008 15:06:19 -0500
To: gnn@freebsd.org
Cc: net@freebsd.org
Subject: Re: Are there known issues with multicast on Intel Pro 1000?
Message-ID: <478FB53B.60808@cisco.com>
In-Reply-To: <7izlv4pe47.wl%gnn@neville-neil.com>

gnn@freebsd.org wrote:
> Howdy,
>
> At my current gig we find that the network interface locks up if we
> subject it to a high rate of multicast traffic.  Since the whole
> purpose of this box is to do multicast (it absorbs a feed of data over
> multicast, manipulates it, and then sends it out again over multicast)
> it's a "bad thing" if this kind of thing does not work.
>
> What I currently know is not complete but I figured I could start
> here.
>
> The symptom is that all network communication stops, but the system
> itself is still responsive, so I can get to the console and get
> information.

If you let it run long enough, does it eventually lock up?

I have seen similar behavior when a lock was not released while I was
breaking things :-) Everything is fine EXCEPT the interface.. for a
while.. then eventually you get a train-wreck :-)

I would drop to ddb and do the show locks.. Also I believe top (or ps)
will tell you what locks are being waited on in a coarse way... I think
the ps in DDB will do this.

R

> Release: 6.2 and 6.3-PRERELEASE (6.3 as of Wed Jan 16th)
>
> Motherboard:
>
> CPU: 2 x Intel Xeon X5365 3GHz (4 cores each)
>
> Memory: 8G
>
> em0: Intel PRO/1000 6.7.3 port 0x2000-0x201f mem 0xd8320000-0xd833ffff
> em1: Intel PRO/1000 6.7.3 port 0x2020-0x203f mem 0xd8320000-0xd833ffff
> em2: Intel PRO/1000 6.7.3 port 0x3000-0x303f mem 0xd8240000-0xd825ffff, 0xd8200000-0xd823ffff
> em3: Intel PRO/1000 6.7.3 port 0x3040-0x307f mem 0xd8260000-0xd827ffff
>
> Other data:
>
> em2 is the interface that multicasts out our digested data and it also
> is receiving a lot of digested multicast traffic, which is being
> recorded by a proprietary program.
>
> sysctl dev.em.2.debug=1
> em2: CTRL = 0x487c0a01 RCTL=0x8002
> em2: Packet buffer = Tx=16k Rx=48k
> em2: fifo workaround = 0, fifo_reset_count = 0
> em2: hw tdh = 76, hw tdt = 76
> em2: hw rdh = 213, hw rdt = 212
> em2: Num Tx descriptors avail = 256
> em2: Tx Descriptors not avail1 = 0
> em2: Tx Descriptors not avail2 = 0
> em2: Std mbuf failed = 0
> em2: Std mbuf cluster failed = 1247383 (this number is increasing by about 1 a second)
> em2: Driver dropped packets = 0
> em2: Driver tx dma failure in encap = 0
>
> sysctl dev.em.2.stats=1
> (all are zero except what is recorded)
> em2: Missed Packets = 4683
> em2: Receive No Buffers = 46905
> em2: RX overruns = 83
> em2: Good Packets Rcvd = 11416687
> em2: Good Packets Xmtd = 146576
>
> em0 is the interface we receive the raw data over multicast on.
>
> em0: hw tdh = 130, hw tdt = 130
> em0: hw rdh = 13, hw rdt = 12
> em0: Num Tx descriptors avail = 256
> em0: Std mbuf cluster failed = 5111461 (this number is going up by about 1 a second)
>
> sysctl dev.em.0.stats=1
> (all are zero except what is recorded)
> em0: Missed Packets = 292778
> em0: Receive No Buffers = 96211
> em0: RX overruns = 1092
> em0: Good Packets Rcvd = 5386001
> em0: Good Packets Xmtd = 12418
>
> em3 receives a little data from multicast and it is recorded using
> a proprietary program.
>
> em3: hw tdh = 45, hw tdt = 45
> em3: hw rdh = 216, hw rdt = 215
> em3: Num Tx descriptors avail = 256
> em3: Std mbuf cluster failed = 195951 (also going up by 1 very slowly)
>
> sysctl dev.em.3.stats=1
> (all are zero except what is recorded)
> em3: Good Packets Rcvd = 9637851
> em3: Good Packets Xmtd = 8237
>
> One odd thing is that when the system boots, em1, which is unused in
> this case, complains of:
>
> em1: Using MSI interrupt
> em1: Setup of Shared code failed
>
> What more do people need to help debug this?
>
> Best,
> George

-- 
Randall Stewart
NSSTG - Cisco Systems Inc.
803-345-0369 803-317-4952 (cell)
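
To compare the per-interface counters quoted in this message, a small
script along the following lines might help. It is only a sketch: the
names parse_counters and missed_ratio are hypothetical, and treating
missed / (missed + good) as a rough receive-loss fraction is an
assumption made here, not something the em driver reports itself.

```python
import re

# Counter lines look like "em0: Missed Packets = 292778". Quoted mail may
# prefix them with one or more "> " markers. Lines whose value is not a
# plain decimal integer (hex registers, "Tx=16k") are skipped.
COUNTER_RE = re.compile(r"^(?:>\s*)*(em\d+):\s*(.+?)\s*=\s*(\d+)\b")


def parse_counters(text):
    """Return {interface: {counter name: integer value}} from pasted output."""
    stats = {}
    for line in text.splitlines():
        match = COUNTER_RE.match(line.strip())
        if match:
            iface, name, value = match.groups()
            stats.setdefault(iface, {})[name] = int(value)
    return stats


def missed_ratio(counters):
    """Assumed metric: missed packets over missed plus good received packets."""
    missed = counters.get("Missed Packets", 0)
    good = counters.get("Good Packets Rcvd", 0)
    total = missed + good
    return missed / total if total else 0.0


# Sample input copied from the figures quoted in this message.
SAMPLE = """\
> em0: Missed Packets = 292778
> em0: Receive No Buffers = 96211
> em0: RX overruns = 1092
> em0: Good Packets Rcvd = 5386001
> em2: Missed Packets = 4683
> em2: Good Packets Rcvd = 11416687
"""

if __name__ == "__main__":
    for iface, counters in sorted(parse_counters(SAMPLE).items()):
        print(f"{iface}: {missed_ratio(counters):.2%} of arriving packets missed")
```

Running the same parse on two snapshots taken a known interval apart
would also give the growth rate of counters such as "Std mbuf cluster
failed", which the report above describes only as "about 1 a second".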