From owner-freebsd-net@FreeBSD.ORG Tue May 17 21:17:13 2005
Message-ID: <428A606C.4070902@seton.org>
Date: Tue, 17 May 2005 16:21:48 -0500
From: Matthew Grooms
Organization: Seton Healthcare Network
To: freebsd-net@freebsd.org
References: <428A0E02.2010607@seton.org>
In-Reply-To: <428A0E02.2010607@seton.org>
Subject: Re: 5.4 amd64 kernel and em ... FIXED
List-Id: Networking and TCP/IP with FreeBSD

I know it's bad form to respond to myself. Anyhow, please disregard the
previous post. The problem has been resolved.

Thanks,

-Matthew

> All,
>
> Has anyone done any extensive testing with the em driver on a 5.4
> release amd64 SMP kernel? I have two boxes in a firewall setup that
> contain 6 em interfaces each. The public interface on both of them
> (em0) will simply stop transmitting and then start working again
> after some time, all by itself.
>
> I did a lot of testing with 5.3 release candidates and did not see
> this behavior. Did anything go into the 5.4 kernel late in the release
> cycle that could have affected this?
>
> My kernel config is GENERIC with the following modifications ...
>
> 1) removal of IPV6 and faith
> 2) removal of USB, USB Ethernet and Firewire
> 3) addition of SMP support
> 4) addition of pf, pflog, pfsync, carp and ALTQ
>
> Detailed description of the problem ...
>
> From the firewall itself I could be pinging www.google.com and it will
> just stop. Anywhere from a few minutes to an hour or so later it will
> just start working again. The really odd thing is that I can always
> ssh into the box on the private em interface. The really, really odd
> thing is that I can run tcpdump on the public interface (the one I
> can't talk out of) while the problem occurs and see traffic on the
> wire like ...
>
> 1) ICMP packets still coming from ping on my firewall to google
>    (maybe BPF picks it up early and the interface is dropping it?)
> 2) ARP requests
> 3) CDP advertisements
> 4) Misc other broadcast traffic
>
> What I have tried so far to diagnose the issue ...
>
> 1) disabling pf using pfctl -d
> 2) disabling SMP in kernel
> 3) disabling carp in kernel
> 4) disabling ALTQ in kernel
> 5) hard coding the link to either half or full duplex
> 6) trimming down my route table
> 7) replacing both network cables
> 8) moving to different ports on the switch
> 9) moving to a different switch altogether
> 10) running with mpsafenet disabled
>
> What I am testing right now ...
>
> 1) disabling pf in the kernel
> 2) disabling HTT in hardware
> 3) disabling USB & Firewire in hardware
> 4) sacrificing a chicken on the altar of the Ethernet gods
>
> Any help is _GREATLY_ appreciated as I have to get these boxes out into
> production quickly. Am I missing something obvious? Could I be having a
> resource conflict somehow? Could I be missing a lock assertion or LOR
> for lack of WITNESS or INVARIANTS? I will do whatever I can to provide
> any info to help diagnose this problem. For starters, here is my kernel
> config and dmesg output.
>
> http://hole.shrew.net/~mgrooms/files/freebsd/custom.txt
> http://hole.shrew.net/~mgrooms/files/freebsd/dmesg.txt
>
> Thanks in advance,
>
> -Matthew
>