From: Rudy
Date: Thu, 27 Sep 2012 11:00:55 -0700
To: freebsd-net@freebsd.org
Subject: Re: ping: sendto: No buffer space available
Message-ID: <50649457.4050701@monkeybrains.net>
In-Reply-To: <50616D5C.705@gmail.com>

On 09/25/2012 01:37 AM, Hooman Fazaeli wrote:
>> dev.em.1.link_irq: 6379725883
>> dev.em.2.link_irq: 6379294926
> Based on the strangely high value of dev.em.1.link_irq (which means too
> many link status changes: down -> up -> down -> ....), I guess the
> problem is the same as
> discussed in these threads:
>
> http://lists.freebsd.org/pipermail/freebsd-net/2011-November/030424.html
> http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031648.html
>
> To confirm, you may run this test:
>
> 1. Start a ping flood: ping -f
> 2. Let it run for a few seconds.
> 3. Disconnect the cable.
> 4. After a while, you should see "no buffer space" error.
> 5. Stop ping flood.
> 6. Re-connect the cable and wait 10 seconds.
> 7. Start a normal ping. Error messages should show up again.
>
> To fix, upgrade to the latest e1000 driver from HEAD.
>
> The very high link_irq may be due to a loose connection.
> Replace the patch cord and see if it helps.

Thanks for the tips. I will test next time I am at the data center.

For now, I rebooted after doubling the default nmbclusters and quadrupling the hw.em.rxd values in loader.conf.

# loader.conf
kern.ipc.nmbclusters=524288
hw.igb.rxd=4096
hw.igb.txd=4096
hw.em.rxd=4096
hw.em.txd=4096

Rebooting and/or the settings change seems to have stopped the errors. Here is a pretty little graph showing the error rate on em1 for the past 3 days.

http://www.monkeybrains.net/images/ErrorRate-em1.png

Rudy
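
One way to tell whether link_irq is still climbing after the reboot is to sample the counters twice and compare. Below is a rough Python sketch of that idea, not something from the thread. It assumes the samples are pasted in as text in the "name: value" form shown in the quoted sysctl output. The function names and the second sample's numbers are made up for illustration.

```python
import re

# Matches lines of the form "dev.em.1.link_irq: 6379725883".
SYSCTL_LINE = re.compile(r"^\s*([\w.]+):\s*(\d+)\s*$")


def parse_sysctl(text):
    """Parse 'name: value' lines into a dict mapping name -> int."""
    counters = {}
    for line in text.splitlines():
        match = SYSCTL_LINE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def rate_per_second(before, after, seconds):
    """Return the per-second increase of each counter present in both samples."""
    return {
        name: (after[name] - before[name]) / seconds
        for name in before
        if name in after
    }


# First sample: the values quoted earlier in this thread.
first = parse_sysctl(
    "dev.em.1.link_irq: 6379725883\n"
    "dev.em.2.link_irq: 6379294926\n"
)

# Second sample: HYPOTHETICAL values, imagined as taken 10 seconds later.
second = parse_sysctl(
    "dev.em.1.link_irq: 6379725883\n"
    "dev.em.2.link_irq: 6379295926\n"
)

rates = rate_per_second(first, second, 10)
for name, rate in sorted(rates.items()):
    print(f"{name}: {rate:.1f} per second")
```

A rate near zero on a connected, idle link would suggest the flapping has stopped. A steadily high rate would point back at the driver or the patch cord, as suggested in the quoted message.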