From owner-freebsd-net@FreeBSD.ORG Wed May 12 14:28:11 2004
Date: Wed, 12 May 2004 14:28:09 -0700
From: "Scott T. Smith" <scott@gelatinous.com>
To: freebsd-net@freebsd.org
Subject: em driver losing packets
Message-Id: <1084397289.8017.30.camel@tinny.home.foo>
X-Mailer: Ximian Evolution 1.4.6
List-Id: Networking and TCP/IP with FreeBSD

I have a Sun 1U server with 2 built-in Intel Pro/1000 "LOMs" (though I had
the exact same problem with a previous machine using a standalone Intel
NIC). I notice that after the machine has been up for 12-20 hours, the
network card starts dropping packets. Here is the relevant dmesg info:

em0: port 0x2040-0x207f mem 0xfe680000-0xfe69ffff irq 30 at device 7.0 on pci3
em0: Speed:N/A Duplex:N/A
em1: port 0x2000-0x203f mem 0xfe6a0000-0xfe6bffff irq 31 at device 7.1 on pci3
em1: Speed:N/A Duplex:N/A
....
em0: Link is up 100 Mbps Full Duplex
em1: Link is up 1000 Mbps Full Duplex
....
Limiting icmp unreach response from 1770 to 200 packets/sec
  ^^^ Not sure what this is, but I received a bunch of them after
  everything was working and before everything stopped working
....
em1: Excessive collisions = 0
em1: Symbol errors = 0
em1: Sequence errors = 0
em1: Defer count = 0
em1: Missed Packets = 1682
em1: Receive No Buffers = 75
em1: Receive length errors = 0
em1: Receive errors = 0
em1: Crc errors = 0
em1: Alignment errors = 0
em1: Carrier extension errors = 0
em1: XON Rcvd = 0
em1: XON Xmtd = 0
em1: XOFF Rcvd = 0
em1: XOFF Xmtd = 0
em1: Good Packets Rcvd = 119975570
em1: Good Packets Xmtd = 164
em1: Adapter hardware address = 0xc76262ec
em1: tx_int_delay = 66, tx_abs_int_delay = 66
em1: rx_int_delay = 488, rx_abs_int_delay = 977
em1: fifo workaround = 0, fifo_reset = 0
em1: hw tdh = 170, hw tdt = 170
em1: Num Tx descriptors avail = 256
em1: Tx Descriptors not avail1 = 0
em1: Tx Descriptors not avail2 = 0
em1: Std mbuf failed = 0
em1: Std mbuf cluster failed = 0
em1: Driver dropped packets = 0

I was running 5.2.1-RELEASE with em driver version 1.7.19 or 1.7.17 (I
forget which it comes with). I had these problems, so I backported 1.7.25
from 5.2.1-STABLE as of May 10. Same issue.

Notice the "missed packets" and "receive no buffers". I assume that means
the network card ran out of memory? How much memory does it have? If it
uses the mainboard memory, can I make that amount any bigger?

The odd thing (which is why I think this is a driver issue) is that it
works just fine when the machine is first booted. I am driving
approximately 680 Mbit/s of UDP traffic in 1316-byte packets. The only
other traffic is ARP traffic (em1 has a netmask of 255.255.255.255).
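For reference, at 1316 bytes per datagram, 680 Mbit/s works out to roughly
65,000 packets per second hitting em1. If anyone wants to reproduce that
kind of stream, the sender boils down to a sendto() loop along these lines
(a rough sketch only, not my actual traffic generator; the destination
address and port are made up, and there is no pacing, so a real run would
need to be throttled to ~680 Mbit/s):

    /*
     * Sketch of a sender that produces the kind of load described above:
     * back-to-back 1316-byte UDP datagrams to a single destination.
     * The destination address/port below are made up for illustration.
     */
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <string.h>
    #include <stdio.h>
    #include <stdlib.h>

    int
    main(void)
    {
        char payload[1316];
        struct sockaddr_in dst;
        int s;

        memset(payload, 0xa5, sizeof(payload));

        s = socket(AF_INET, SOCK_DGRAM, 0);
        if (s < 0) {
            perror("socket");
            exit(1);
        }

        memset(&dst, 0, sizeof(dst));
        dst.sin_family = AF_INET;
        dst.sin_port = htons(9000);                  /* made-up port */
        dst.sin_addr.s_addr = inet_addr("10.0.0.2"); /* made-up address */

        /* No pacing here; a real sender would throttle to ~680 Mbit/s. */
        for (;;) {
            if (sendto(s, payload, sizeof(payload), 0,
                (struct sockaddr *)&dst, sizeof(dst)) < 0)
                perror("sendto");
        }
        /* NOTREACHED */
        return (0);
    }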
I have this problem whether I use kernel polling (HZ=1000),
rx_abs_int_delay=1000, or rx_abs_int_delay=500. If I shut off the
rx_*int_delay values entirely, CPU load goes to 100% and I still have the
same problem. With the abs delay at 1000, CPU load is 90% (split about
evenly between user and system).

I'm thinking of trying to backport 1.7.31. If you have any ideas I'd
really appreciate it. Thanks!

Scott