From: Manish Vachharajani <manishv@lineratesystems.com>
To: alexpalias-bsdnet@yahoo.com
Cc: freebsd-net@freebsd.org, Artis Caune
Date: Fri, 4 Sep 2009 11:41:15 -0600
Subject: Re: em driver input errors
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net@freebsd.org>

Just decided to follow this thread, as it seems to be related to some issues we are seeing as well. It appears that under heavy packet load the kernel cannot pull packets off the NIC fast enough, and is therefore slow to free up descriptors into which the NIC can DMA packets. Once the hardware runs out of descriptors to write packets into, the NIC's internal queue fills up and it drops packets (recording each one as missed).
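[Editor's note: a back-of-the-envelope sketch of the failure mode described above, not from the thread itself. The ring sizes are illustrative assumptions; em(4)-era NICs typically used a few hundred to a few thousand RX descriptors.]

```python
# Sketch: how long a full RX descriptor ring lasts if the kernel stalls
# and stops reclaiming descriptors. Once the ring is exhausted, the NIC's
# internal FIFO fills and further packets are counted as "missed".

def ring_drain_ms(num_descriptors: int, packets_per_sec: float) -> float:
    """Milliseconds until an idle-kernel ring is exhausted at a given rate."""
    return num_descriptors / packets_per_sec * 1000.0

# At the ~50 kpps rates reported later in the thread, a 256-entry ring
# gives the kernel only ~5 ms of slack; even a 4096-entry ring gives ~31 ms.
print(round(ring_drain_ms(256, 50_000), 2))    # 5.12
print(round(ring_drain_ms(4096, 131_000), 2))  # 31.27
```

So any scheduling or interrupt-servicing stall longer than a few milliseconds is enough to produce missed-packet counts at these rates, regardless of average throughput.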
We have this issue with the ixgbe 10 Gb/s card, though the absolute packet rates at which we see a problem are higher than those reported here. In our test scenario the problem gets worse with many simultaneous TCP connections, but the issue is the same: under high packet rates the driver cannot keep up and the NIC reports missed packets. The issue is not related to data throughput, though, as turning on jumbo frames solves our issue for a fixed number of connections, and it seems here that reducing the packet rate makes the misses go away. More importantly, in our tests only the receiver sees a problem; the transmitter is fine. There was also another thread about problems with UDP throughput that I suspect are caused by the same type of packet rate spikes. The question is: why is the kernel stack too slow to handle these packet rates? Doing some back-of-the-envelope calculations, they don't seem too bad. Where is the time going? And are our problem, the UDP issue, and this problem all caused by the same source of slowness, or are they three unrelated issues?

Manish

On Fri, Sep 4, 2009 at 11:14 AM, wrote:
> --- On Fri, 9/4/09, Artis Caune wrote:
>
>> Is it still actual?
>
> Hello.  Yes, this is still actual.
>
> 1> netstat -nbhI em0 ; uptime
> Name    Mtu Network   Address              Ipkts  Ierrs     Ibytes    Opkts  Oerrs     Obytes   Coll
> em0    1500          00:14:22:17:80:dc      31G    93M        18T      36G      0        27T      0
>  7:50PM  up 23 days, 15:40, 1 user, load averages: 0.84, 1.05, 1.16
>
> The huge number of input errors is due to an 80-100 kpps flood we received via that interface, which got the errors/sec numbers up in the 50k/s range for a few minutes.
>
>> You didn't mention if you are using pf or other firewall.
>
> Sorry if I didn't mention it.  I am using pf, but have tried "kldunload pf" and the errors didn't disappear.
>
>> I have a similar problem with two boxes replicating zfs pools, when I
>> noticed input errors.
>> After some investigation it turned out to be pf overhead, even though I
>> was skipping on the interfaces where zfs send/recv runs.
>>
>> With pf enabled (and skip) I can copy 50-80 MB/s with 50-80 kpps and
>> 0-100+ input drops per second.
>> With pf disabled I can copy constantly at 102 or 93 MB/s and
>> 110-131 kpps, with few drops (because 1 CPU is almost eaten).
>
> This is the kind of traffic I am seeing:
>
> Errors/second (5 minute average) per interface:
> http://www.dataxnet.ro/alex/errors.png
> Packets/second (5 minute average) per interface:
> http://www.dataxnet.ro/alex/packets.png
>
> Those graphs were saved a few minutes ago, times are EEST (GMT+3)
>
> I'm sorry I don't have the Mbits/s graphs up, I haven't been collecting that data per interface recently (it's collected per vlan).
>
> Alex
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
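[Editor's note: netstat -h prints humanized counters ("31G" Ipkts, "93M" Ierrs in the output quoted earlier in the thread). The sketch below turns those back into a rough lifetime loss ratio; it assumes decimal (SI) suffixes, and the humanized figures are rounded, so treat the result as approximate.]

```python
# Sketch: parse netstat -h style suffixed counters and compute the
# lifetime input-error ratio from the figures quoted in the thread.

SUFFIXES = {"k": 1e3, "K": 1e3, "M": 1e6, "G": 1e9, "T": 1e12}

def parse_suffixed(value: str) -> float:
    """Parse strings like '93M' or '31G' into plain numbers (SI assumed)."""
    if value and value[-1] in SUFFIXES:
        return float(value[:-1]) * SUFFIXES[value[-1]]
    return float(value)

ipkts = parse_suffixed("31G")  # Ipkts from the quoted netstat line
ierrs = parse_suffixed("93M")  # Ierrs from the same line
print(f"lifetime input-error ratio: {ierrs / ipkts:.2%}")  # 0.30%
```

A ~0.3% lifetime average looks small, which is consistent with Alex's point that the damage came from short bursts (50k errors/s for a few minutes) rather than steady-state loss.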