From: Manish Vachharajani <manishv@lineratesystems.com>
To: alexpalias-bsdnet@yahoo.com
Cc: freebsd-net@freebsd.org, Artis Caune
Date: Fri, 4 Sep 2009 11:41:15 -0600
Subject: Re: em driver input errors
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net@freebsd.org>

Just decided to follow this thread, as it seems to be related to some issues we are seeing as well. It appears that under heavy packet load the kernel cannot pull packets off the NIC fast enough, and is therefore slow to free up descriptors into which the NIC can DMA packets. Once the hardware runs out of descriptors to write packets into, the NIC's internal queue fills up and it drops packets (recording each one as missed).
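[Editor's note: a back-of-the-envelope sketch of the failure mode described above, not from the thread itself. The ring sizes are illustrative assumptions; em(4)-era NICs typically used a few hundred to a few thousand RX descriptors.]

```python
# Sketch: how long a full RX descriptor ring lasts if the kernel stalls
# and stops reclaiming descriptors. Once the ring is exhausted, the NIC's
# internal FIFO fills and further packets are counted as "missed".

def ring_drain_ms(num_descriptors: int, packets_per_sec: float) -> float:
    """Milliseconds until an idle-kernel ring is exhausted at a given rate."""
    return num_descriptors / packets_per_sec * 1000.0

# At the ~50 kpps rates reported later in the thread, a 256-entry ring
# gives the kernel only ~5 ms of slack; even a 4096-entry ring gives ~31 ms.
print(round(ring_drain_ms(256, 50_000), 2))    # 5.12
print(round(ring_drain_ms(4096, 131_000), 2))  # 31.27
```

So any scheduling or interrupt-servicing stall longer than a few milliseconds is enough to produce missed-packet counts at these rates, regardless of average throughput.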
We have this issue with the ixgbe 10 Gb/s card, though the absolute packet rates at which we see a problem are higher than those reported here. In our test scenario the problem gets worse with many simultaneous TCP connections, but the issue is the same: under high packet rates the driver cannot keep up and the NIC reports missed packets. The issue is not related to data throughput, though, as turning on jumbo frames solves our issue for a fixed number of connections, and it seems here that reducing the packet rate makes the misses go away. More importantly, in our tests only the receiver sees a problem; the transmitter is fine. There was also another thread about problems with UDP throughput that I suspect are caused by the same type of packet rate spikes. The question is: why is the kernel stack too slow to handle these packet rates? Doing some back-of-the-envelope calculations, they don't seem too bad. Where is the time going? And are our problem, the UDP issue, and this problem all caused by the same source of slowness, or are they three unrelated issues?

Manish

On Fri, Sep 4, 2009 at 11:14 AM, wrote:
> --- On Fri, 9/4/09, Artis Caune wrote:
>
>> Is it still actual?
>
> Hello.  Yes, this is still actual.
>
> 1> netstat -nbhI em0 ; uptime
> Name    Mtu Network   Address              Ipkts  Ierrs     Ibytes    Opkts  Oerrs     Obytes   Coll
> em0    1500          00:14:22:17:80:dc      31G    93M        18T      36G      0        27T      0
>  7:50PM  up 23 days, 15:40, 1 user, load averages: 0.84, 1.05, 1.16
>
> The huge number of input errors is due to an 80-100 kpps flood we received via that interface, which got the errors/sec numbers up in the 50k/s range for a few minutes.
>
>> You didn't mention if you are using pf or other firewall.
>
> Sorry if I didn't mention it.  I am using pf, but have tried "kldunload pf" and the errors didn't disappear.
>
>> I have a similar problem with two boxes replicating zfs pools, when I
>> noticed input errors.
>> After some investigation it turned out to be pf overhead, even though I
>> was skipping on the interfaces where zfs send/recv runs.
>>
>> With pf enabled (and skip) I can copy 50-80 MB/s with 50-80 kpps and
>> 0-100+ input drops per second.
>> With pf disabled I can copy constantly at 102 or 93 MB/s and
>> 110-131 kpps, with few drops (because 1 CPU is almost eaten).
>
> This is the kind of traffic I am seeing:
>
> Errors/second (5 minute average) per interface:
> http://www.dataxnet.ro/alex/errors.png
> Packets/second (5 minute average) per interface:
> http://www.dataxnet.ro/alex/packets.png
>
> Those graphs were saved a few minutes ago, times are EEST (GMT+3)
>
> I'm sorry I don't have the Mbits/s graphs up, I haven't been collecting that data per interface recently (it's collected per vlan).
>
> Alex
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
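[Editor's note: netstat -h prints humanized counters ("31G" Ipkts, "93M" Ierrs in the output quoted earlier in the thread). The sketch below turns those back into a rough lifetime loss ratio; it assumes decimal (SI) suffixes, and the humanized figures are rounded, so treat the result as approximate.]

```python
# Sketch: parse netstat -h style suffixed counters and compute the
# lifetime input-error ratio from the figures quoted in the thread.

SUFFIXES = {"k": 1e3, "K": 1e3, "M": 1e6, "G": 1e9, "T": 1e12}

def parse_suffixed(value: str) -> float:
    """Parse strings like '93M' or '31G' into plain numbers (SI assumed)."""
    if value and value[-1] in SUFFIXES:
        return float(value[:-1]) * SUFFIXES[value[-1]]
    return float(value)

ipkts = parse_suffixed("31G")  # Ipkts from the quoted netstat line
ierrs = parse_suffixed("93M")  # Ierrs from the same line
print(f"lifetime input-error ratio: {ierrs / ipkts:.2%}")  # 0.30%
```

A ~0.3% lifetime average looks small, which is consistent with Alex's point that the damage came from short bursts (50k errors/s for a few minutes) rather than steady-state loss.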