From owner-freebsd-net@FreeBSD.ORG  Sun Oct 14 13:49:49 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 7B2F53A8;
 Sun, 14 Oct 2012 13:49:49 +0000 (UTC)
 (envelope-from adrian.chadd@gmail.com)
Received: from mail-pa0-f54.google.com (mail-pa0-f54.google.com
 [209.85.220.54])
 by mx1.freebsd.org (Postfix) with ESMTP id 417FC8FC14;
 Sun, 14 Oct 2012 13:49:49 +0000 (UTC)
Received: by mail-pa0-f54.google.com with SMTP id bi1so4351322pad.13
 for <multiple recipients>; Sun, 14 Oct 2012 06:49:49 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type
 :content-transfer-encoding;
 bh=Zq3rSCSY4EAKrnS4QSnj1pJyFlKnPSKdAJfdwX+pjis=;
 b=FuLz4ioUpk3zSV5mE3/DsZDRnZLFuc+JnoJbDRvG4z87EJqIYLZGz2FvOJohRG5DAC
 tCA01MNsEPf9dulbvZZnGLI3dum1MLEdoYKreFs6iXL3ZOUgNWrUAlXmVWUCL9ist+Cn
 W5Izl5xusH7VELw+H4cFoe+rYP4vq1DNNtq8konEtJokwI66xZZO5cn31MKRdzNPBRb1
 IhGwveXjqHxdL9prZxM5IgzEXFjF/EndW4PutqI9zX5yRhNvCan9SiShRjuJd/wdurgS
 pFhztU4uolfH/vq+2M4DD/Bcq/k2Ovch0DAX/RlmcEZpig4CnCuT0smwuB70AyiNiCLC
 LR2w==
MIME-Version: 1.0
Received: by 10.68.218.226 with SMTP id pj2mr29566933pbc.33.1350222589004;
 Sun, 14 Oct 2012 06:49:49 -0700 (PDT)
Sender: adrian.chadd@gmail.com
Received: by 10.68.146.233 with HTTP; Sun, 14 Oct 2012 06:49:48 -0700 (PDT)
In-Reply-To: <2b582820-0095-4dbe-b929-ba5eb9d4e0ee@email.android.com>
References: <20121009154128.GU34622@FreeBSD.org>
 <20121012124640.GW89655@FreeBSD.org>
 <20121012124709.GX89655@FreeBSD.org>
 <CAJ-VmomVRH6gAA5busSVAgCa0As7v=HF41XQSL_BUx=NXRj04w@mail.gmail.com>
 <20121012212151.GB89655@glebius.int.ru>
 <2b582820-0095-4dbe-b929-ba5eb9d4e0ee@email.android.com>
Date: Sun, 14 Oct 2012 06:49:48 -0700
X-Google-Sender-Auth: GZ6TXz2zLdDLiraPr0soyP0cCW8
Message-ID: <CAJ-VmonoRbhNXznKTt=9FcLecVwNRxUzb5_-kGOkn_fAiFvBRA@mail.gmail.com>
Subject: Re: [CFT/Review] net byte order for AF_INET
From: Adrian Chadd <adrian@freebsd.org>
To: Aleksandr Rybalko <ray@ddteam.net>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Cc: net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 14 Oct 2012 13:49:49 -0000

.. sounds like the beginning of a wiki page to me, describing the mini
project, the latest status and the latest patch.

:)


Adrian


On 13 October 2012 11:32, Aleksandr Rybalko <ray@ddteam.net> wrote:
> Gleb Smirnoff <glebius@FreeBSD.org> wrote:
>
>>On Fri, Oct 12, 2012 at 05:06:29PM -0400, Adrian Chadd wrote:
>>A> On 12 October 2012 08:47, Gleb Smirnoff <glebius@freebsd.org> wrote:
>>A> > On Fri, Oct 12, 2012 at 04:46:40PM +0400, Gleb Smirnoff wrote:
>>A> > T>   Latest version of patch for further review and testing
>>A> > T> Changelog:
>>A> > T>  - Fixed TCP checksums
>>A> > T>  - Added comment about raw sockets byte ordering.
>>A> > T>  - More explicit htons(0), when assigning ip_off field.
>>A>
>>A> I've just eyeballed the patch again:
>>A>
>>A> * You've patched SCTP and IGMP - have you done any SCTP and IGMP
>>A> testing at all?
>>A> * This kind of stuff almost begs for some kind of automated test suite
>>A> for testing IPv4, IPv6, TCP/UDP/ICMP, IGMP, SCTP, all the tunneling
>>A> stuff - is there anything out there like this? I know of the IPv6 test
>>A> suites that exist; what about being able to regression test the other
>>A> stuff?
>>
>>Not tested yet:
>>
>>SCTP
>>IGMP
>>IPSEC
>>siftr(4)
>>mrouting
>>pfsync, pf_route()
>>stf(4)
>>ng_ipfw(4)
>
> No, ng_ipfw tested :-)
>
>>
>>Tested:
>>
>>TCP/UDP/ICMP
>>ip_fragment/ip_reass
>>raw socket
>>gre(4) as if_gre and as ng_pptpgre
>>gif(4)
>>pf(4)
>>ipfw(4)
>>divert(4)
>>
>>A> Also whilst I'm nitpicking - do you think there's any performance
>>A> issues that may creep up? Remember that "performance issues" to me
>>A> don't necessarily mean "on a current generation intel", but mean "all
>>A> those cache starved ARM/MIPS/PPC/Atom boards out there that aren't
>>A> natively in network byte order." Making everything use network byte
>>A> order throughout the stack is nice for read-only packet work and nice
>>A> for cache-happy i386s, but what about the rest of the world?
>>
>>Well, there may be unmeasurable impact. Just a few instructions per
>>packet. Some functions may be optimized to store converted length in
>>local variable and perform one or two ntohs() operations less. But
>>better as a separate change. We've got much more fat optimization
>>targets in stack than this.
>>
>>A> (Don't get me wrong, I think this tidy-up is very nice and maybe quite
>>A> needed, I just wonder what other unknown magic is hiding behind the
>>A> existing code..)
>>
>>There is so much magic here, and I want to just wipe it away instead
>>of learning it to depths. The motivation to finally start this work and
>>get it done is several panics due to packet in wrong byte order, which I
>>am failing to parse and model out which codepath could lead to them. Thus
>>I decided to fix that in principle.
>
>
> WBW
> ------
> Aleksandr Rybalko <ray@ddteam.net>
>
>
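
Gleb's remark above, that some functions could keep the converted length in
a local variable and save an ntohs() or two, can be sketched roughly as
follows. This is only an illustration: struct mini_ip, payload_len() and
is_fragment() are hypothetical names for a cut-down header, not the kernel's
struct ip or anything from the patch.

```c
#include <arpa/inet.h>  /* ntohs(), htons() */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cut-down header: both fields stay in network byte order. */
struct mini_ip {
    uint16_t ip_len;    /* total length, network byte order */
    uint16_t ip_off;    /* flags and fragment offset, network byte order */
};

#define MINI_IP_MF      0x2000  /* "more fragments" flag, host-order value */
#define MINI_IP_OFFMASK 0x1fff  /* fragment offset mask, host-order value */

/* Convert ip_len once and reuse the host-order copy for every check. */
int
payload_len(const struct mini_ip *ip, uint16_t hlen)
{
    uint16_t len;

    assert(ip != NULL);
    len = ntohs(ip->ip_len);    /* the only conversion in this function */
    if (len < hlen)
        return (-1);            /* malformed: shorter than its own header */
    return (len - hlen);
}

/*
 * A pure bit test needs no conversion of the field at all: swap the
 * constant mask instead, which the compiler can do at build time.
 */
int
is_fragment(const struct mini_ip *ip)
{
    assert(ip != NULL);
    return ((ip->ip_off & htons(MINI_IP_MF | MINI_IP_OFFMASK)) != 0);
}
```

The second function shows why read-mostly code gets cheaper with fields kept
in network order: only arithmetic on a field needs the swap.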

From owner-freebsd-net@FreeBSD.ORG  Sun Oct 14 13:55:47 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 18CC75CF;
 Sun, 14 Oct 2012 13:55:47 +0000 (UTC)
 (envelope-from adrian.chadd@gmail.com)
Received: from mail-pa0-f54.google.com (mail-pa0-f54.google.com
 [209.85.220.54])
 by mx1.freebsd.org (Postfix) with ESMTP id D5FE38FC08;
 Sun, 14 Oct 2012 13:55:46 +0000 (UTC)
Received: by mail-pa0-f54.google.com with SMTP id bi1so4353414pad.13
 for <multiple recipients>; Sun, 14 Oct 2012 06:55:46 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=42R/G8v5yC7536why14Pqza21s9/btAOdnVaH+UYAjo=;
 b=Jo+OggeNwMVvHcRTzME03lg9wceGYfgrhxQdglQ9y2aOEvVukLYpJHcEbegh/lKEex
 lm3hlFiua2RyKkPrgAtQsh6i6I39Q+sRCwPqSk9ff6MDgYlHJ55CQGV0a34R+FPhgGOg
 JiEWLQ3j6OohxS7jHdMXwowIBbFSjDZqVWTaA/WFqCpz8XmxMgXrretZ8UH5ByBpy322
 d3MlraZPlRQz7mPyRJlhmFDKCabJRHbNhq0Y2yRb1LvhH69k9725wSNszhmwt++8kTYG
 omTmIr9O6ppd/rL3DkFH2IYWbTKxQoHb+LLBX4WW6ZR1VaVq6FMoPfgDKfNM5qIj++pi
 P1UA==
MIME-Version: 1.0
Received: by 10.68.218.226 with SMTP id pj2mr29600629pbc.33.1350222946683;
 Sun, 14 Oct 2012 06:55:46 -0700 (PDT)
Sender: adrian.chadd@gmail.com
Received: by 10.68.146.233 with HTTP; Sun, 14 Oct 2012 06:55:46 -0700 (PDT)
In-Reply-To: <CAFOYbc=N87_OECto7B8jdzmRZA-yoa_JWgvVc8kwpK9umO97rQ@mail.gmail.com>
References: <5079A9A1.4070403@FreeBSD.org>
 <20121013182223.GA73341@onelab2.iet.unipi.it>
 <CAFOYbc=N87_OECto7B8jdzmRZA-yoa_JWgvVc8kwpK9umO97rQ@mail.gmail.com>
Date: Sun, 14 Oct 2012 06:55:46 -0700
X-Google-Sender-Auth: g4l8vmBdocUwV5o4qnVjz_8hsaE
Message-ID: <CAJ-Vmo=Tgc3ZVsmHMzf7wg5rWumLacnp=y12Tv=dX4P23Fo2JA@mail.gmail.com>
Subject: Re: ixgbe & if_igb RX ring locking
From: Adrian Chadd <adrian@freebsd.org>
To: Jack Vogel <jfvogel@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>,
 Luigi Rizzo <rizzo@iet.unipi.it>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 14 Oct 2012 13:55:47 -0000

God, yes please. Please please please please.



Adrian

From owner-freebsd-net@FreeBSD.ORG  Sun Oct 14 16:44:38 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 52D89D8B;
 Sun, 14 Oct 2012 16:44:38 +0000 (UTC)
 (envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135])
 by mx1.freebsd.org (Postfix) with ESMTP id 234B78FC0A;
 Sun, 14 Oct 2012 16:44:38 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9EGicMe002255;
 Sun, 14 Oct 2012 16:44:38 GMT
 (envelope-from linimon@freefall.freebsd.org)
Received: (from linimon@localhost)
 by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9EGic0D002251;
 Sun, 14 Oct 2012 16:44:38 GMT (envelope-from linimon)
Date: Sun, 14 Oct 2012 16:44:38 GMT
Message-Id: <201210141644.q9EGic0D002251@freefall.freebsd.org>
To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-net@FreeBSD.org
From: linimon@FreeBSD.org
Subject: Re: kern/172683: [ip6] Duplicate IPv6 Link Local Addresses
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 14 Oct 2012 16:44:38 -0000

Old Synopsis: Duplicate IPv6 Link Local Addresses
New Synopsis: [ip6] Duplicate IPv6 Link Local Addresses

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Sun Oct 14 16:44:21 UTC 2012
Responsible-Changed-Why: 
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=172683

From owner-freebsd-net@FreeBSD.ORG  Sun Oct 14 23:30:51 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 4360EEA6;
 Sun, 14 Oct 2012 23:30:51 +0000 (UTC)
 (envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135])
 by mx1.freebsd.org (Postfix) with ESMTP id 1395D8FC08;
 Sun, 14 Oct 2012 23:30:51 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9ENUodX037025;
 Sun, 14 Oct 2012 23:30:50 GMT
 (envelope-from linimon@freefall.freebsd.org)
Received: (from linimon@localhost)
 by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9ENUobR037021;
 Sun, 14 Oct 2012 23:30:50 GMT (envelope-from linimon)
Date: Sun, 14 Oct 2012 23:30:50 GMT
Message-Id: <201210142330.q9ENUobR037021@freefall.freebsd.org>
To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-net@FreeBSD.org
From: linimon@FreeBSD.org
Subject: Re: kern/171838: [oce] [patch] Possible lock reversal and duplicate
 locks as reported by Witness
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 14 Oct 2012 23:30:51 -0000

Old Synopsis: Possible lock reversal and duplicate locks as reported by Witness
New Synopsis: [oce] [patch] Possible lock reversal and duplicate locks as reported by Witness

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Sun Oct 14 23:30:23 UTC 2012
Responsible-Changed-Why: 
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=171838

From owner-freebsd-net@FreeBSD.ORG  Sun Oct 14 23:50:01 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 146C646D
 for <freebsd-net@hub.freebsd.org>; Sun, 14 Oct 2012 23:50:01 +0000 (UTC)
 (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135])
 by mx1.freebsd.org (Postfix) with ESMTP id F110E8FC08
 for <freebsd-net@hub.freebsd.org>; Sun, 14 Oct 2012 23:50:00 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9ENo0sg037688
 for <freebsd-net@freefall.freebsd.org>; Sun, 14 Oct 2012 23:50:00 GMT
 (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
 by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9ENo0ZT037687;
 Sun, 14 Oct 2012 23:50:00 GMT (envelope-from gnats)
Date: Sun, 14 Oct 2012 23:50:00 GMT
Message-Id: <201210142350.q9ENo0ZT037687@freefall.freebsd.org>
To: freebsd-net@FreeBSD.org
Cc: 
From: Doug Hardie <bc979@lafn.org>
Subject: Re: kern/172683: [ip6] Duplicate IPv6 Link Local Addresses
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: Doug Hardie <bc979@lafn.org>
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 14 Oct 2012 23:50:01 -0000

The following reply was made to PR kern/172683; it has been noted by GNATS.

From: Doug Hardie <bc979@lafn.org>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/172683: [ip6] Duplicate IPv6 Link Local Addresses
Date: Sun, 14 Oct 2012 16:42:50 -0700

 Here is some more interesting information on the issue.  RFC 4862 states
 that if the link-local address is MAC based then it should bring down
 the link and log the duplicate address error.  Kurt Jaeger pointed out
 in a private email that there is a sysctl that is supposed to control
 this behavior: net.inet6.ip6.dad_count.  The value 1 is to permit the
 interface to continue to operate and the value 2 is to stop operation.
 Sure enough net.inet6.ip6.dad_count = 1 as the default value.  I
 changed it to 2 and ran the tests again.  There was no change in the
 performance.  The interface remained in use and nothing was logged in
 /var/log/messages.  Unfortunately I no longer have a Vista machine to
 test with since it generates non-MAC related link-local addresses.  XP
 and Win 7 both use MAC based addresses.  Using Vista talking to FreeBSD
 7.2, the Neighbor Advertisement was returned by FreeBSD and Vista chose
 another link-local address.
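
For readers wondering why a MAC-based link-local address collides whenever
two interfaces share a MAC address: the interface identifier is derived
directly from the MAC (modified EUI-64, RFC 4291 Appendix A). A minimal
sketch, assuming a 48-bit Ethernet MAC; mac_to_link_local() is a
hypothetical helper name, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Build fe80::/64 plus the modified EUI-64 interface identifier:
 * insert ff:fe between the two MAC halves and flip the
 * universal/local bit (0x02) of the first octet.
 */
void
mac_to_link_local(const uint8_t mac[6], uint8_t addr[16])
{
    assert(mac != NULL && addr != NULL);
    memset(addr, 0, 16);
    addr[0] = 0xfe;             /* link-local prefix fe80::/64 */
    addr[1] = 0x80;
    addr[8] = mac[0] ^ 0x02;    /* flip the universal/local bit */
    addr[9] = mac[1];
    addr[10] = mac[2];
    addr[11] = 0xff;            /* fixed filler ff:fe */
    addr[12] = 0xfe;
    addr[13] = mac[3];
    addr[14] = mac[4];
    addr[15] = mac[5];
}
```

Since nothing but the MAC goes into the result, two hosts with the same MAC
always produce the same address, and only duplicate address detection can
notice the clash.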

From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 01:00:19 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id EF714813;
 Mon, 15 Oct 2012 01:00:19 +0000 (UTC)
 (envelope-from emaste@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135])
 by mx1.freebsd.org (Postfix) with ESMTP id C04228FC0A;
 Mon, 15 Oct 2012 01:00:19 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9F10JMd048813;
 Mon, 15 Oct 2012 01:00:19 GMT
 (envelope-from emaste@freefall.freebsd.org)
Received: (from emaste@localhost)
 by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9F10J0U048808;
 Mon, 15 Oct 2012 01:00:19 GMT (envelope-from emaste)
Date: Mon, 15 Oct 2012 01:00:19 GMT
Message-Id: <201210150100.q9F10J0U048808@freefall.freebsd.org>
To: gigabyte.tmn@gmail.com, emaste@FreeBSD.org, freebsd-net@FreeBSD.org
From: emaste@FreeBSD.org
Subject: Re: kern/140634: [vlan] destroying if_lagg interface with if_vlan
 members causing 100% usage by ifconfig
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 01:00:20 -0000

Synopsis: [vlan] destroying if_lagg interface with if_vlan members causing 100% usage by ifconfig

State-Changed-From-To: open->feedback
State-Changed-By: emaste
State-Changed-When: Mon Oct 15 00:59:14 UTC 2012
State-Changed-Why: 
A quick browse of the source suggests this should be fixed as of 8.0, and I
can confirm that it doesn't happen on 10-CURRENT.  If you're able to test on
a more recent version please confirm.


http://www.freebsd.org/cgi/query-pr.cgi?pr=140634

From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 02:45:21 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id CBCABF55
 for <freebsd-net@freebsd.org>; Mon, 15 Oct 2012 02:45:21 +0000 (UTC)
 (envelope-from lstewart@freebsd.org)
Received: from lauren.room52.net (lauren.room52.net [210.50.193.198])
 by mx1.freebsd.org (Postfix) with ESMTP id 8901E8FC16
 for <freebsd-net@freebsd.org>; Mon, 15 Oct 2012 02:45:21 +0000 (UTC)
Received: from lstewart.caia.swin.edu.au (lstewart.caia.swin.edu.au
 [136.186.229.95])
 by lauren.room52.net (Postfix) with ESMTPSA id CE73D7E820;
 Mon, 15 Oct 2012 13:45:12 +1100 (EST)
Message-ID: <507B78B8.2000707@freebsd.org>
Date: Mon, 15 Oct 2012 13:45:12 +1100
From: Lawrence Stewart <lstewart@freebsd.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:14.0) Gecko/20120814 Thunderbird/14.0
MIME-Version: 1.0
To: "Eggert, Lars" <lars@netapp.com>
Subject: Re: FreeBSD & bufferbloat?
References: <D4D47BCFFE5A004F95D707546AC0D7E91855C8C9@SACEXCMBX01-PRD.hq.netapp.com>
In-Reply-To: <D4D47BCFFE5A004F95D707546AC0D7E91855C8C9@SACEXCMBX01-PRD.hq.netapp.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=0.0 required=5.0 tests=UNPARSEABLE_RELAY
 autolearn=unavailable version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on lauren.room52.net
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 02:45:21 -0000

On 10/12/12 03:25, Eggert, Lars wrote:
> Hi,
>
> is anyone in BSD-land working on de-bufferbloating the kernel, similar to what the Linux folks are currently doing?

I'll be committing the CAIA Delay-Gradient (CDG) TCP congestion control 
algorithm shortly. It's still experimental, but it has some useful 
characteristics in terms of keeping buffer utilisation minimal whilst 
achieving acceptable goodput even in the face of competition from 
loss-based algorithms like NewReno. I've included a few relevant links 
at the end for anyone who wants to know more.

On the larger topic of de-bufferbloating the kernel, I'm not aware of 
anyone who is systematically identifying buffer points and "fixing" them 
if they are found to suffer from bloat problems.

Cheers,
Lawrence

http://caia.swin.edu.au/cv/dahayes/content/networking2011-cdg-preprint.pdf

www.ietf.org/proceedings/84/slides/slides-84-iccrg-2

http://caia.swin.edu.au/urp/newtcp/tools.html

From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 06:52:56 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id C25D64AE;
 Mon, 15 Oct 2012 06:52:56 +0000 (UTC)
 (envelope-from christian@errxtx.net)
Received: from stakka.errxtx.net (stakka.errxtx.net [94.23.249.66])
 by mx1.freebsd.org (Postfix) with ESMTP id 83AEF8FC08;
 Mon, 15 Oct 2012 06:52:56 +0000 (UTC)
Received: from ip-109-84-0-66.web.vodafone.de ([109.84.0.66]
 helo=[10.70.99.66])
 by stakka.errxtx.net with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16)
 (Exim 4.72) (envelope-from <christian@errxtx.net>)
 id 1TNe8O-0000sg-SN; Mon, 15 Oct 2012 08:26:46 +0200
References: <201210121213.11152.jhb@freebsd.org>
 <CAAAm0r3JGv3n8fX-GUpoS8CD2k9_mUBJxJ398__EH-y7SX_xrw@mail.gmail.com>
Mime-Version: 1.0 (1.0)
In-Reply-To: <CAAAm0r3JGv3n8fX-GUpoS8CD2k9_mUBJxJ398__EH-y7SX_xrw@mail.gmail.com>
Content-Type: text/plain;
	charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Message-Id: <CF46ABB9-23A4-43E8-A2BB-DE42E993B551@errxtx.net>
X-Mailer: iPhone Mail (10A403)
From: Christian Meutes <christian@errxtx.net>
Subject: Re: Dropping TCP options from retransmitted SYNs considered harmful
Date: Mon, 15 Oct 2012 08:26:36 +0200
To: Jason Wolfe <nitroboost@gmail.com>
Cc: John Baldwin <jhb@freebsd.org>, "net@freebsd.org" <net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 06:52:56 -0000

I find the "hack" more than just strange. Because of other OSes' bugs, FreeBSD
breaks its own stack. I don't want to know how many connections have suffered
from this.

(Sorry for top-posting)
--
   Christian

On 14.10.2012, at 00:19, Jason Wolfe <nitroboost@gmail.com> wrote:

> On Fri, Oct 12, 2012 at 9:13 AM, John Baldwin <jhb@freebsd.org> wrote:
>> Back in 2001 FreeBSD added a hack to strip TCP options from retransmitted SYNs
>> starting with the 3rd SYN in this block in tcp_timer.c:
>>
>>        /*
>>         * Disable rfc1323 if we haven't got any response to
>>         * our third SYN to work-around some broken terminal servers
>>         * (most of which have hopefully been retired) that have bad VJ
>>         * header compression code which trashes TCP segments containing
>>         * unknown-to-them TCP options.
>>         */
>>        if ((tp->t_state == TCPS_SYN_SENT) && (tp->t_rxtshift == 3))
>>                tp->t_flags &= ~(TF_REQ_SCALE|TF_REQ_TSTMP);
>>
>> There is even a PR for the original bug report: kern/1689
>>
>> [..snip..]
>>
>> The original motivation of this change is to work around broken terminal
>> servers that were old when this change was added in 2001.  Over 10 years later
>> I think we should at least have an option to turn this work-around off, and
>> possibly disable it by default.
>>
>> Thoughts?
>>
>> --
>> John Baldwin
>
> Not that it alone merits keeping the code in, but there are some cases
> where this comes in handy.  I ran into an issue with heavily
> trafficked Linux <-> FBSD boxes here -
> http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031881.html.
>
> Linux would deny the connection because in FBSD the in and outbound
> timestamp randomization isn't sync'd to the same base, so when FBSD
> would hit a 2MSL connection Linux would simply ignore the SYN.  After
> the 3rd SYN FBSD would drop support, and Linux would finally honor the
> request.  I doubt this is too widespread, but it would probably break
> things for a few folks.
>
> Jason
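
John's proposal, an option to turn the work-around off, could look roughly
like the sketch below. Everything here is a stand-in: struct mini_tcpcb is
not the kernel's struct tcpcb, the constants are defined locally for
illustration, and the strip_opts_enabled knob is hypothetical (no such
sysctl is named in this thread).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the kernel names used in the quoted code. */
#define TCPS_SYN_SENT   2
#define TF_REQ_SCALE    0x0020
#define TF_REQ_TSTMP    0x0080

struct mini_tcpcb {
    int      t_state;       /* connection state */
    int      t_rxtshift;    /* number of retransmits so far */
    uint32_t t_flags;       /* requested options, among other flags */
};

/*
 * Apply the 2001 work-around only when the (hypothetical) knob is on:
 * stop requesting window scaling and timestamps from the third
 * retransmitted SYN onwards.
 */
void
syn_rexmit_fixup(struct mini_tcpcb *tp, int strip_opts_enabled)
{
    assert(tp != NULL);
    if (!strip_opts_enabled)
        return;             /* knob off: keep options on every SYN */
    if (tp->t_state == TCPS_SYN_SENT && tp->t_rxtshift == 3)
        tp->t_flags &= ~(uint32_t)(TF_REQ_SCALE | TF_REQ_TSTMP);
}
```

With the knob off, a case like Jason's would no longer be rescued by the
third SYN, which is the trade-off being discussed.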

From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 07:51:56 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id B941BD6D
 for <net@freebsd.org>; Mon, 15 Oct 2012 07:51:56 +0000 (UTC)
 (envelope-from eugene@imedia.ru)
Received: from mx2.imedia.ru (mx2.imedia.ru [91.230.26.134])
 by mx1.freebsd.org (Postfix) with ESMTP id 06E9A8FC16
 for <net@freebsd.org>; Mon, 15 Oct 2012 07:51:55 +0000 (UTC)
X-All-Recipients: <net@freebsd.org>
X-DKIM: OpenDKIM Filter v2.5.0 mx2.imedia.ru q9F7ploJ043338
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=imedia.ru; s=common;
 t=1350287507; bh=WjlcbDHoOCY7wEXrQ8EVumx2PW809d7e+LddX5z1fek=;
 h=From:Reply-To:To:Subject:Date;
 b=vP9BwBS3sMq6EF2Js8AksTpLb62mnwjJO30eqm6e6iaimpnGibsHeQB2Pm7cptCY0
 nNDMuc6HkVLAshKsqIuFR0rGQeQNUnqK7wWegcHtXM5lS9aMeLbniPoUSg4yl2fWka
 XDntNCP14IGhBtldLhH1PwFr8pvnLykWbZ2bsBb8=
Received: from badger.imedia.ru (root@badger.imedia.ru [10.167.1.243])
 by mx2.imedia.ru (8.14.3/8.14.3/TWINS7_LDAP) with ESMTP id q9F7ploJ043338
 for <net@freebsd.org>; Mon, 15 Oct 2012 11:51:47 +0400 (MSK)
 (envelope-from eugene@imedia.ru)
Received: from badger.imedia.ru (eugene@localhost [127.0.0.1])
 by badger.imedia.ru (8.14.5/8.14.4) with ESMTP id q9F7plln012346
 for <net@freebsd.org>; Mon, 15 Oct 2012 11:51:47 +0400 (MSK)
 (envelope-from eugene@imedia.ru)
Received: from localhost (localhost [[UNIX: localhost]])
 by badger.imedia.ru (8.14.5/8.14.4/Submit) id q9F7plLh012345
 for net@freebsd.org; Mon, 15 Oct 2012 11:51:47 +0400 (MSK)
 (envelope-from eugene@imedia.ru)
X-Authentication-Warning: badger.imedia.ru: eugene set sender to
 eugene@imedia.ru using -f
From: Eugene Mitrofanov <eugene@imedia.ru>
Organization: Sanoma Independent Media
To: net@freebsd.org
Subject: dev.bce.3.mbuf_alloc_failed_count increases permanently
Date: Mon, 15 Oct 2012 11:51:47 +0400
User-Agent: KMail/1.9.10
X-Origin: badger.imedia.ru
MIME-Version: 1.0
Content-Disposition: inline
Message-Id: <201210151151.47161.eugene@imedia.ru>
X-Length: 2179
X-UID: 5642
Content-Type: text/plain;
  charset="koi8-r"
Content-Transfer-Encoding: 7bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.0.1
 (mx2.imedia.ru [10.167.0.252]); Mon, 15 Oct 2012 11:51:47 +0400 (MSK)
X-Virus-Scanned: clamav-milter 0.97.4-exp at lynx.imedia.ru
X-Virus-Status: Clean
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: Eugene Mitrofanov <eugene@imedia.ru>
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 07:51:56 -0000

Hello list!

I am running FreeBSD 8.2-p3 and observe strange behaviour:

sysctl -a | g bce.3|g -vE '(%|stat)'; echo; sleep 10;  sysctl -a | g bce.3|g -vE '(%|stat)'; echo; netstat -m

dev.bce.3.l2fhdr_error_count: 0
dev.bce.3.mbuf_alloc_failed_count: 2098854
dev.bce.3.mbuf_frag_count: 2655285
dev.bce.3.dma_map_addr_rx_failed_count: 0
dev.bce.3.dma_map_addr_tx_failed_count: 57
dev.bce.3.unexpected_attention_count: 0
dev.bce.3.com_no_buffers: 0

dev.bce.3.l2fhdr_error_count: 0
dev.bce.3.mbuf_alloc_failed_count: 2098856
dev.bce.3.mbuf_frag_count: 2655288
dev.bce.3.dma_map_addr_rx_failed_count: 0
dev.bce.3.dma_map_addr_tx_failed_count: 57
dev.bce.3.unexpected_attention_count: 0
dev.bce.3.com_no_buffers: 0

3022/18143/21165 mbufs in use (current/cache/total)
2039/9179/11218/65536 mbuf clusters in use (current/cache/total/max)
1678/3731 mbuf+clusters out of packet secondary zone in use (current/cache)
0/1672/1672/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/1763/1763/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
4833K/45448K/50282K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
59058137 requests for I/O initiated by sendfile
0 calls to protocol drain routines

Any suggestions? Could you advise me on what causes this?
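
To put the two samples above in proportion: they were taken 10 seconds
apart, and mbuf_alloc_failed_count rose by 2 while mbuf_frag_count rose by
3 in that interval. A small sketch of the arithmetic, with counter_rate()
as a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Events per second between two readings of an ever-increasing counter. */
double
counter_rate(uint64_t before, uint64_t after, unsigned int seconds)
{
    assert(after >= before && seconds > 0);
    return ((double)(after - before) / (double)seconds);
}
```

For example, counter_rate(2098854, 2098856, 10) gives the allocation
failure rate over that 10 second window.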

-- 
EVM7-RIPE

From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 10:11:55 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 5BF89E34
 for <net@freebsd.org>; Mon, 15 Oct 2012 10:11:55 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2])
 by mx1.freebsd.org (Postfix) with ESMTP id CD6028FC0C
 for <net@freebsd.org>; Mon, 15 Oct 2012 10:11:54 +0000 (UTC)
Received: from v6.mpls.in ([2a02:978:2::5] helo=ws.su29.net)
 by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256)
 (Exim 4.76 (FreeBSD)) (envelope-from <melifaro@FreeBSD.org>)
 id 1TNhhe-000Pex-Dl; Mon, 15 Oct 2012 14:15:18 +0400
Message-ID: <507C1960.6050500@FreeBSD.org>
Date: Mon, 15 Oct 2012 18:10:40 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:9.0) Gecko/20120121 Thunderbird/9.0
MIME-Version: 1.0
To: Jack Vogel <jfvogel@gmail.com>
Subject: Re: ixgbe & if_igb RX ring locking
References: <5079A9A1.4070403@FreeBSD.org>
 <20121013182223.GA73341@onelab2.iet.unipi.it>
 <CAFOYbc=N87_OECto7B8jdzmRZA-yoa_JWgvVc8kwpK9umO97rQ@mail.gmail.com>
In-Reply-To: <CAFOYbc=N87_OECto7B8jdzmRZA-yoa_JWgvVc8kwpK9umO97rQ@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Luigi Rizzo <rizzo@iet.unipi.it>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 10:11:55 -0000

On 13.10.2012 23:24, Jack Vogel wrote:
> On Sat, Oct 13, 2012 at 11:22 AM, Luigi Rizzo<rizzo@iet.unipi.it>  wrote:

>>
>> one option could be (same as it is done in the timer
>> routine in dummynet) to build a list of all the packets
>> that need to be sent to if_input(), and then call
>> if_input with the entire list outside the lock.
>>
>> It would be even easier if we modify the various *_input()
>> routines to handle a list of mbufs instead of just one.

Bulk processing is generally a good idea that we should probably implement,
perhaps starting from the driver queue and ending with marked mbufs
(OURS/forward/legacy processing (appletalk and similar))?

This can minimize the impact of all locks on the RX side:
L2
* rx PFIL hook
L3 (both IPv4 and IPv6)
* global IF_ADDR_RLOCK (currently commented out)
* Per-interface ADDR_RLOCK
* PFIL hook

At first glance, there can be problems with:
* Increased latency (we should have some kind of rx_process_limit), but still
* reader locks being held for a much longer amount of time
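
A minimal userspace sketch of the batching idea quoted above: detach a bounded
batch from the ring while holding the RX lock, then hand it to the stack with
no lock held. Everything here is hypothetical illustration, not driver code:
struct pkt stands in for an mbuf, the atomic_flag spinlock stands in for the
RX ring mutex, and sum_input() plays the role of if_input().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf: an id and a list link. */
struct pkt {
	int		 id;
	struct pkt	*next;
};

/* Hypothetical RX ring: a lock plus a queue of received packets. */
struct rx_ring {
	atomic_flag	 lock;	/* stand-in for the RX ring mutex */
	struct pkt	*head;
};

static void
ring_lock(struct rx_ring *r)
{
	while (atomic_flag_test_and_set(&r->lock))
		;	/* spin until the flag was previously clear */
}

static void
ring_unlock(struct rx_ring *r)
{
	atomic_flag_clear(&r->lock);
}

/*
 * Detach up to 'limit' packets (the rx_process_limit analogue) while
 * holding the lock.  The batch is returned in arrival order and the
 * lock is released before the caller touches any packet.
 */
static struct pkt *
rx_collect(struct rx_ring *r, int limit)
{
	struct pkt *list = NULL, **tailp = &list;

	assert(limit > 0);
	ring_lock(r);
	while (limit-- > 0 && r->head != NULL) {
		struct pkt *p = r->head;

		r->head = p->next;
		p->next = NULL;
		*tailp = p;
		tailp = &p->next;
	}
	ring_unlock(r);
	return (list);
}

/* Deliver the batch with no ring lock held; returns the packet count. */
static int
rx_deliver(struct pkt *list, void (*input)(struct pkt *))
{
	int n = 0;

	while (list != NULL) {
		struct pkt *next = list->next;

		list->next = NULL;
		input(list);
		list = next;
		n++;
	}
	return (n);
}

/* Example consumer for illustration only: sums the packet ids. */
static int delivered_sum;

static void
sum_input(struct pkt *p)
{
	delivered_sum += p->id;
}
```

The bound on the batch size is what keeps the latency concern above in check:
a larger limit amortizes the lock better but delays the first packet longer.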

>>
>> cheers
>> luigi
>>
> Very interesting idea Luigi, will have to give that some thought.
>
> Jack

Returning to the original post topic:

Given
1) we are currently binding ixgbe ithreads to CPU cores
2) the RX queue lock is used (indirectly) in only 2 places:
a) the ISR routine (msix or legacy irq)
b) the taskqueue routine, which is scheduled if some packets remain in the
RX queue and rx_process_limit is exhausted OR we need to TX something

3) in practice the taskqueue routine is a nightmare for many people since
there is no way to stop the "kernel {ix0 que}" thread eating 100% cpu after
a traffic burst happens: once it is called it starts to schedule
itself more and more, replacing the original ISR routine. Additionally,
increasing rx_process_limit does not help since the taskqueue is called with
the same limit. Finally, netisr taskq threads are currently not bound to
any CPU, which makes the process even more uncontrollable.

Maybe we can rethink taskqueue usage for RX processing?
I mean, the taskq is called if the host fails to process packets in the ring
fast enough, which can happen when:
* a traffic burst happens on some (or all) queues
* the traffic rate is too high.

In the former case we have the ring buffer size, which can be tuned by the
administrator to a fairly big value.
For the latter case:
If all system CPUs are used for RX processing, moving some uncontrolled
percentage of the load to a random CPU definitely does no good (especially
given that ixgbe has AIM and the RX indirection table for that purpose, which
can give much more predictable results).

It does even more harm in special setups like
rx_queues=CPU_COUNT-1, where the last CPU is used by all other processes,
including control plane ones (routing software, various keepalives).

If the system has more CPUs (24 vs 16 queues, for example) there is a standard
way of distributing load: netisr and deferred processing.
Netisr threads are already bound to CPUs and, more importantly, splitting
packets between threads can be done by computing some (say, L3+L4)
hash, which will not lead to out-of-order packet processing.
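
A rough sketch of why hashing on L3+L4 fields preserves ordering: every packet
of a flow carries the same addresses and ports, so it always maps to the same
worker. The struct, the function names and the choice of FNV-1a are
assumptions made for illustration; this is not the hash that netisr or the
NIC's RSS logic actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key: the L3 addresses plus the L4 ports. */
struct flow_key {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
};

/* Fold one 32-bit word into an FNV-1a hash, one byte at a time. */
static uint32_t
fnv1a_word(uint32_t h, uint32_t w)
{
	for (int i = 0; i < 4; i++) {
		h ^= (w >> (8 * i)) & 0xff;
		h *= 16777619u;		/* FNV-1a 32-bit prime */
	}
	return (h);
}

/*
 * Pick a worker thread index for a flow.  The result depends only on
 * the key, so packets of one flow are never split across workers.
 */
static unsigned
flow_to_worker(const struct flow_key *k, unsigned nworkers)
{
	uint32_t h = 2166136261u;	/* FNV-1a 32-bit offset basis */

	assert(nworkers > 0);
	h = fnv1a_word(h, k->src_ip);
	h = fnv1a_word(h, k->dst_ip);
	h = fnv1a_word(h, ((uint32_t)k->src_port << 16) | k->dst_port);
	return (h % nworkers);
}
```

Different flows may still collide on one worker; the hash only guarantees that
a single flow is not reordered by being spread over several threads.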


>
>> So my questions are:
>>>
>>> Can any real LORs happen in some complex setup? (I can't imagine any).
>>> If so: maybe we can somehow avoid/workaround such cases? (and consider
>>> removing those locks).
>>>
>>>
>>>
>>> --
>>> WBR, Alexander
>>>
>>> _______________________________________________
>>> freebsd-net@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>


From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 11:06:13 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 67EE7511
 for <freebsd-net@FreeBSD.org>; Mon, 15 Oct 2012 11:06:13 +0000 (UTC)
 (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135])
 by mx1.freebsd.org (Postfix) with ESMTP id 35DED8FC2E
 for <freebsd-net@FreeBSD.org>; Mon, 15 Oct 2012 11:06:13 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9FB6D5f011550
 for <freebsd-net@FreeBSD.org>; Mon, 15 Oct 2012 11:06:13 GMT
 (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
 by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9FB6DS9011549
 for freebsd-net@FreeBSD.org; Mon, 15 Oct 2012 11:06:13 GMT
 (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 15 Oct 2012 11:06:13 GMT
Message-Id: <201210151106.q9FB6DS9011549@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
 owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster <bugmaster@FreeBSD.org>
To: freebsd-net@FreeBSD.org
Subject: Current problem reports assigned to freebsd-net@FreeBSD.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 11:06:13 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).


From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 13:04:30 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id F2C2717E;
 Mon, 15 Oct 2012 13:04:29 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net
 [IPv6:2001:470:1f10:75::2])
 by mx1.freebsd.org (Postfix) with ESMTP id B24B08FC17;
 Mon, 15 Oct 2012 13:04:29 +0000 (UTC)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
 by bigwig.baldwin.cx (Postfix) with ESMTPSA id F0DF7B984;
 Mon, 15 Oct 2012 09:04:28 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-net@freebsd.org
Subject: Re: ixgbe & if_igb RX ring locking
Date: Mon, 15 Oct 2012 09:04:27 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p20; KDE/4.5.5; amd64; ; )
References: <5079A9A1.4070403@FreeBSD.org>
 <CAFOYbc=N87_OECto7B8jdzmRZA-yoa_JWgvVc8kwpK9umO97rQ@mail.gmail.com>
 <507C1960.6050500@FreeBSD.org>
In-Reply-To: <507C1960.6050500@FreeBSD.org>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201210150904.27567.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
 (bigwig.baldwin.cx); Mon, 15 Oct 2012 09:04:29 -0400 (EDT)
Cc: Luigi Rizzo <rizzo@iet.unipi.it>,
 "Alexander V. Chernikov" <melifaro@freebsd.org>,
 Jack Vogel <jfvogel@gmail.com>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 13:04:30 -0000

On Monday, October 15, 2012 10:10:40 am Alexander V. Chernikov wrote:
> On 13.10.2012 23:24, Jack Vogel wrote:
> > On Sat, Oct 13, 2012 at 11:22 AM, Luigi Rizzo<rizzo@iet.unipi.it>  wrote:
> 
> >>
> >> one option could be (same as it is done in the timer
> >> routine in dummynet) to build a list of all the packets
> >> that need to be sent to if_input(), and then call
> >> if_input with the entire list outside the lock.
> >>
> >> It would be even easier if we modify the various *_input()
> >> routines to handle a list of mbufs instead of just one.
> 
> Bulk processing is generally a good idea we probably should implement.
> Probably starting from driver queue ending with marked mbufs 
> (OURS/forward/legacy processing (appletalk and similar))?
> 
> This can minimize an impact for all
> locks on RX side:
> L2
> * rx PFIL hook
> L3 (both IPv4 and IPv6)
> * global IF_ADDR_RLOCK (currently commented out)
> * Per-interface ADDR_RLOCK
> * PFIL hook
> 
>  From the first glance, there can be problems with:
> * Increased latency (we should have some kind of rx_process_limit), but 
> still
> * reader locks being acquired for much longer amount of time
> 
> >>
> >> cheers
> >> luigi
> >>
> >> Very interesting idea Luigi, will have to get that some thought.
> >
> > Jack
> 
> Returning to original post topic:
> 
> Given
> 1) we are currently binding ixgbe ithreads to CPU cores
> 2) RX queue lock is used by (indirectly) in only 2 places:
> a) ISR routine (msix or legacy irq)
> b) taskqueue routine which is scheduled if some packets remains in RX 
> queue and rx_process_limit ended OR we need something to TX
> 
> 3) in practice taskqueue routine is a nightmare for many people since 
> there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after 
> some traffic burst happens: once it is called it starts to schedule 
> itself more and more replacing original ISR routine. Additionally, 
> increasing rx_process_limit does not help since taskqueue is called with 
> the same limit. Finally, currently netisr taskq threads are not bound to 
> any CPU which makes the process even more uncontrollable.

I think part of the problem here is that the taskqueue in ixgbe(4) is
bogusly rescheduled for TX handling.  Instead, ixgbe_msix_que() should
just start transmitting packets directly.

I fixed this in igb(4) here:

http://svnweb.freebsd.org/base?view=revision&revision=233708

You can try this for ixgbe(4).  It also comments out a spurious taskqueue 
reschedule from the watchdog handler that might also lower the taskqueue 
usage.  You can try changing that #if 0 to an #if 1 to test just the txeof 
changes:

Index: ixgbe.c
===================================================================
--- ixgbe.c	(revision 241579)
+++ ixgbe.c	(working copy)
@@ -149,7 +149,7 @@
 static void     ixgbe_enable_intr(struct adapter *);
 static void     ixgbe_disable_intr(struct adapter *);
 static void     ixgbe_update_stats_counters(struct adapter *);
-static bool	ixgbe_txeof(struct tx_ring *);
+static void	ixgbe_txeof(struct tx_ring *);
 static bool	ixgbe_rxeof(struct ix_queue *, int);
 static void	ixgbe_rx_checksum(u32, struct mbuf *, u32);
 static void     ixgbe_set_promisc(struct adapter *);
@@ -1439,8 +1439,9 @@
 	struct adapter	*adapter = que->adapter;
 	struct ixgbe_hw	*hw = &adapter->hw;
 	struct 		tx_ring *txr = adapter->tx_rings;
-	bool		more_tx, more_rx;
-	u32       	reg_eicr, loop = MAX_LOOP;
+	struct ifnet    *ifp = adapter->ifp;
+	bool		more_rx;
+	u32       	reg_eicr;
 
 
 	reg_eicr = IXGBE_READ_REG(hw, IXGBE_EICR);
@@ -1454,14 +1455,16 @@
 	more_rx = ixgbe_rxeof(que, adapter->rx_process_limit);
 
 	IXGBE_TX_LOCK(txr);
-	do {
-		more_tx = ixgbe_txeof(txr);
-	} while (loop-- && more_tx);
+	ixgbe_txeof(txr);
+#if __FreeBSD_version >= 800000
+	if (!drbr_empty(ifp, txr->br))
+		ixgbe_mq_start_locked(ifp, txr, NULL);
+#else
+	if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
+		ixgbe_start_locked(txr, ifp);
+#endif
 	IXGBE_TX_UNLOCK(txr);
 
-	if (more_rx || more_tx)
-		taskqueue_enqueue(que->tq, &que->que_task);
-
 	/* Check for fan failure */
 	if ((hw->phy.media_type == ixgbe_media_type_copper) &&
 	    (reg_eicr & IXGBE_EICR_GPI_SDP1)) {
@@ -1474,7 +1477,10 @@
 	if (reg_eicr & IXGBE_EICR_LSC)
 		taskqueue_enqueue(adapter->tq, &adapter->link_task);
 
-	ixgbe_enable_intr(adapter);
+	if (more_rx)
+		taskqueue_enqueue(que->tq, &que->que_task);
+	else
+		ixgbe_enable_intr(adapter);
 	return;
 }
 
@@ -1491,7 +1497,8 @@
 	struct adapter  *adapter = que->adapter;
 	struct tx_ring	*txr = que->txr;
 	struct rx_ring	*rxr = que->rxr;
-	bool		more_tx, more_rx;
+	struct ifnet    *ifp = adapter->ifp;
+	bool		more_rx;
 	u32		newitr = 0;
 
 	ixgbe_disable_queue(adapter, que->msix);
@@ -1500,18 +1507,14 @@
 	more_rx = ixgbe_rxeof(que, adapter->rx_process_limit);
 
 	IXGBE_TX_LOCK(txr);
-	more_tx = ixgbe_txeof(txr);
-	/*
-	** Make certain that if the stack 
-	** has anything queued the task gets
-	** scheduled to handle it.
-	*/
-#if __FreeBSD_version < 800000
-	if (!IFQ_DRV_IS_EMPTY(&adapter->ifp->if_snd))
+	ixgbe_txeof(txr);
+#if __FreeBSD_version >= 800000
+	if (!drbr_empty(ifp, txr->br))
+		ixgbe_mq_start_locked(ifp, txr, NULL);
 #else
-	if (!drbr_empty(adapter->ifp, txr->br))
+	if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd))
+		ixgbe_start_locked(txr, ifp);
 #endif
-		more_tx = 1;
 	IXGBE_TX_UNLOCK(txr);
 
 	/* Do AIM now? */
@@ -1565,7 +1568,7 @@
         rxr->packets = 0;
 
 no_calc:
-	if (more_tx || more_rx)
+	if (more_rx)
 		taskqueue_enqueue(que->tq, &que->que_task);
 	else /* Reenable this interrupt */
 		ixgbe_enable_queue(adapter, que->msix);
@@ -2049,8 +2052,10 @@
 			++hung;
 		if (txr->queue_status & IXGBE_QUEUE_DEPLETED)
 			++busy;
+#if 0
 		if ((txr->queue_status & IXGBE_QUEUE_IDLE) == 0)
 			taskqueue_enqueue(que->tq, &que->que_task);
+#endif
         }
 	/* Only truely watchdog if all queues show hung */
         if (hung == adapter->num_queues)
@@ -3548,7 +3556,7 @@
  *  tx_buffer is put back on the free queue.
  *
  **********************************************************************/
-static bool
+static void
 ixgbe_txeof(struct tx_ring *txr)
 {
 	struct adapter	*adapter = txr->adapter;
@@ -3597,13 +3605,13 @@
 			IXGBE_CORE_UNLOCK(adapter);
 			IXGBE_TX_LOCK(txr);
 		}
-		return FALSE;
+		return;
 	}
 #endif /* DEV_NETMAP */
 
 	if (txr->tx_avail == adapter->num_tx_desc) {
 		txr->queue_status = IXGBE_QUEUE_IDLE;
-		return FALSE;
+		return;
 	}
 
 	processed = 0;
@@ -3613,7 +3621,7 @@
 	tx_desc = (struct ixgbe_legacy_tx_desc *)&txr->tx_base[first];
 	last = tx_buffer->eop_index;
 	if (last == -1)
-		return FALSE;
+		return;
 	eop_desc = (struct ixgbe_legacy_tx_desc *)&txr->tx_base[last];
 
 	/*
@@ -3693,12 +3701,8 @@
 	if (txr->tx_avail > IXGBE_TX_CLEANUP_THRESHOLD)
 		txr->queue_status &= ~IXGBE_QUEUE_DEPLETED;
 
-	if (txr->tx_avail == adapter->num_tx_desc) {
+	if (txr->tx_avail == adapter->num_tx_desc)
 		txr->queue_status = IXGBE_QUEUE_IDLE;
-		return (FALSE);
-	}
-
-	return TRUE;
 }
 
 /*********************************************************************

-- 
John Baldwin

From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 15:17:27 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 6564781B;
 Mon, 15 Oct 2012 15:17:27 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net
 [IPv6:2001:470:1f10:75::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 364EE8FC0A;
 Mon, 15 Oct 2012 15:17:27 +0000 (UTC)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
 by bigwig.baldwin.cx (Postfix) with ESMTPSA id 917D4B911;
 Mon, 15 Oct 2012 11:17:26 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-net@freebsd.org
Subject: Re: Dropping TCP options from retransmitted SYNs considered harmful
Date: Mon, 15 Oct 2012 09:08:36 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p20; KDE/4.5.5; amd64; ; )
References: <201210121213.11152.jhb@freebsd.org>
In-Reply-To: <201210121213.11152.jhb@freebsd.org>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201210150908.36498.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
 (bigwig.baldwin.cx); Mon, 15 Oct 2012 11:17:26 -0400 (EDT)
Cc: net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 15:17:27 -0000

On Friday, October 12, 2012 12:13:11 pm John Baldwin wrote:
> Back in 2001 FreeBSD added a hack to strip TCP options from retransmitted SYNs 
> starting with the 3rd SYN in this block in tcp_timer.c:
> 
> 	/*
> 	 * Disable rfc1323 if we haven't got any response to
> 	 * our third SYN to work-around some broken terminal servers
> 	 * (most of which have hopefully been retired) that have bad VJ
> 	 * header compression code which trashes TCP segments containing
> 	 * unknown-to-them TCP options.
> 	 */
> 	if ((tp->t_state == TCPS_SYN_SENT) && (tp->t_rxtshift == 3))
> 		tp->t_flags &= ~(TF_REQ_SCALE|TF_REQ_TSTMP);
> 
> There is even a PR for the original bug report: kern/1689
> 
> However, there is an unintended consequence of this change that can be 
> disastrous.  Specifically, suppose you have a FreeBSD client connecting to a 
> server, and that the SYNs are arriving at the server successfully, but the 
> first few return SYN/ACKs are dropped.  Eventually a SYN/ACK makes it through 
> and the connection is established.
> 
> The server (based on the first SYN it saw) believes it has negotiated window 
> scaling with the client.  The client, however, has broken what it promised in 
> that first SYN and believes it is not using any window scaling at all.  This 
> causes two forms of breakage:
> 
>  1) When the server advertises a scaled window (e.g. '8' for a 64k window
>     scaled at 13), the client thinks it is an unscaled window ('8') and
>     sends data to the server very slowly.
>  
>  2) When the client advertises an unscaled window (e.g. '65535' for a 64k
>     window), the server thinks it has a huge window (65535 << 13 == 511MB)
>     to send into.
> 
> I'm not sure that 2) is a problem per se, but I have definitely seen instances 
> of 1) (and examined the 'struct tcpcb' in kgdb on both the server and client 
> end of the connections to verify they disagreed on the scaling).
> 
> The original motivation of this change is to work around broken terminal 
> servers that were old when this change was added in 2001.  Over 10 years later 
> I think we should at least have an option to turn this work-around off, and 
> possibly disable it by default.
> 
> Thoughts?
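
To make the mismatch in the quoted report concrete, here is a small worked
sketch of the window arithmetic. The helper name is hypothetical; the only
fact assumed beyond the report is that RFC 1323 limits the shift count to 14.
Each side multiplies the 16-bit window field by 2^shift, using the shift it
believes was negotiated.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes a peer believes it may send, given the 16-bit window field from
 * the TCP header and the shift count that peer thinks is in effect.
 */
static uint32_t
effective_window(uint16_t wnd_field, unsigned shift)
{
	assert(shift <= 14);	/* RFC 1323 caps the shift count at 14 */
	return ((uint32_t)wnd_field << shift);
}
```

With the numbers from the report: the server sends a field of 8 meaning
effective_window(8, 13) = 65536 bytes, but the client applies no scaling and
reads effective_window(8, 0) = 8 bytes. In the other direction the client
sends 65535 meaning 65535 bytes, while the server computes
effective_window(65535, 13) = 536862720 bytes, just under 512 MB.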

How about this:

Index: tcp_timer.c
===================================================================
--- tcp_timer.c	(revision 241579)
+++ tcp_timer.c	(working copy)
@@ -118,6 +118,11 @@ SYSCTL_INT(_net_inet_tcp, OID_AUTO, keepcnt, CTLFL
 	/* max idle probes */
 int	tcp_maxpersistidle;
 
+static int	tcp_rexmit_drop_options = 0;
+SYSCTL_INT(_net_inet_tcp, OID_AUTO, rexmit_drop_options, CTLFLAG_RW,
+    &tcp_rexmit_drop_options, 0,
+    "Drop TCP options from 3rd and later retransmitted SYN");
+
 static int	per_cpu_timers = 0;
 SYSCTL_INT(_net_inet_tcp, OID_AUTO, per_cpu_timers, CTLFLAG_RW,
     &per_cpu_timers , 0, "run tcp timers on all cpus");
@@ -578,7 +583,8 @@ tcp_timer_rexmt(void * xtp)
 	 * header compression code which trashes TCP segments containing
 	 * unknown-to-them TCP options.
 	 */
-	if ((tp->t_state == TCPS_SYN_SENT) && (tp->t_rxtshift == 3))
+	if (tcp_rexmit_drop_options && (tp->t_state == TCPS_SYN_SENT) &&
+	    (tp->t_rxtshift == 3))
 		tp->t_flags &= ~(TF_REQ_SCALE|TF_REQ_TSTMP);
 	/*
 	 * If we backed off this far, our srtt estimate is probably bogus.

Any other suggestions on the sysctl name?

-- 
John Baldwin

From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 15:17:27 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 6564781B;
 Mon, 15 Oct 2012 15:17:27 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net
 [IPv6:2001:470:1f10:75::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 364EE8FC0A;
 Mon, 15 Oct 2012 15:17:27 +0000 (UTC)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
 by bigwig.baldwin.cx (Postfix) with ESMTPSA id 917D4B911;
 Mon, 15 Oct 2012 11:17:26 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-net@freebsd.org
Subject: Re: Dropping TCP options from retransmitted SYNs considered harmful
Date: Mon, 15 Oct 2012 09:08:36 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p20; KDE/4.5.5; amd64; ; )
References: <201210121213.11152.jhb@freebsd.org>
In-Reply-To: <201210121213.11152.jhb@freebsd.org>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201210150908.36498.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
 (bigwig.baldwin.cx); Mon, 15 Oct 2012 11:17:26 -0400 (EDT)
Cc: net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 15:17:27 -0000

On Friday, October 12, 2012 12:13:11 pm John Baldwin wrote:
> Back in 2001 FreeBSD added a hack to strip TCP options from retransmitted SYNs 
> starting with the 3rd SYN in this block in tcp_timer.c:
> 
> 	/*
> 	 * Disable rfc1323 if we haven't got any response to
> 	 * our third SYN to work-around some broken terminal servers
> 	 * (most of which have hopefully been retired) that have bad VJ
> 	 * header compression code which trashes TCP segments containing
> 	 * unknown-to-them TCP options.
> 	 */
> 	if ((tp->t_state == TCPS_SYN_SENT) && (tp->t_rxtshift == 3))
> 		tp->t_flags &= ~(TF_REQ_SCALE|TF_REQ_TSTMP);
> 
> There is even a PR for the original bug report: kern/1689
> 
> However, there is an unintended consequence of this change that can be 
> disastrous.  Specifically, suppose you have a FreeBSD client connecting to a 
> server, and that the SYNs are arriving at the server successfully, but the 
> first few return SYN/ACKs are dropped.  Eventually a SYN/ACK makes it through 
> and the connection is established.
> 
> The server (based on the first SYN it saw) believes it has negotiated window 
> scaling with the client.  The client, however, has broken what it promised in 
> that first SYN and believes it is not using any window scaling at all.  This 
> causes two forms of breakage:
> 
>  1) When the server advertises a scaled window (e.g. '8' for a 64k window
>     scaled at 13), the client thinks it is an unscaled window ('8') and
>     sends data to the server very slowly.
>  
>  2) When the client advertises an unscaled window (e.g. '65535' for a 64k
>     window), the server thinks it has a huge window (65535 << 13 == 511MB)
>     to send into.
> 
> I'm not sure that 2) is a problem per se, but I have definitely seen instances 
> of 1) (and examined the 'struct tcpcb' in kgdb on both the server and client 
> end of the connections to verify they disagreed on the scaling).
> 
> The original motivation of this change is to work around broken terminal 
> servers that were old when this change was added in 2001.  Over 10 years later 
> I think we should at least have an option to turn this work-around off, and 
> possibly disable it by default.
> 
> Thoughts?
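
To make the two forms of breakage concrete, here is a minimal sketch of the arithmetic. It assumes only that the 16-bit window field from the TCP header is shifted left by whatever scale factor the receiver of that segment believes was negotiated; effective_window() is a made-up helper for illustration, not a function from the TCP stack.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the FreeBSD sources): the window a peer
 * believes it may send into, given the 16-bit value from the TCP header
 * and the shift that peer thinks was negotiated.
 */
uint64_t
effective_window(uint16_t advertised, unsigned int shift)
{
	return ((uint64_t)advertised << shift);
}

/*
 * Case 1: the server advertises 8 and means 8 << 13 bytes, but the
 *         client applies shift 0 and sees a window of 8 bytes.
 * Case 2: the client advertises 65535 and means 65535 bytes, but the
 *         server applies shift 13 and sees 65535 << 13 bytes.
 */
```

Under those assumptions, effective_window(8, 13) is 65536 while effective_window(8, 0) is 8, and effective_window(65535, 13) is 536862720 bytes, i.e. just under 512MB.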

How about this:

Index: tcp_timer.c
===================================================================
--- tcp_timer.c	(revision 241579)
+++ tcp_timer.c	(working copy)
@@ -118,6 +118,11 @@ SYSCTL_INT(_net_inet_tcp, OID_AUTO, keepcnt, CTLFL
 	/* max idle probes */
 int	tcp_maxpersistidle;
 
+static int	tcp_rexmit_drop_options = 0;
+SYSCTL_INT(_net_inet_tcp, OID_AUTO, rexmit_drop_options, CTLFLAG_RW,
+    &tcp_rexmit_drop_options, 0,
+    "Drop TCP options from 3rd and later retransmitted SYN");
+
 static int	per_cpu_timers = 0;
 SYSCTL_INT(_net_inet_tcp, OID_AUTO, per_cpu_timers, CTLFLAG_RW,
     &per_cpu_timers , 0, "run tcp timers on all cpus");
@@ -578,7 +583,8 @@ tcp_timer_rexmt(void * xtp)
 	 * header compression code which trashes TCP segments containing
 	 * unknown-to-them TCP options.
 	 */
-	if ((tp->t_state == TCPS_SYN_SENT) && (tp->t_rxtshift == 3))
+	if (tcp_rexmit_drop_options && (tp->t_state == TCPS_SYN_SENT) &&
+	    (tp->t_rxtshift == 3))
 		tp->t_flags &= ~(TF_REQ_SCALE|TF_REQ_TSTMP);
 	/*
 	 * If we backed off this far, our srtt estimate is probably bogus.

Any other suggestions on the sysctl name?
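
For anyone who wants to see the effect of the knob outside the kernel, the patched condition can be modelled in userland roughly as follows. This is only a sketch: struct ex_tcpcb and the EX_* constants are simplified stand-ins whose values are assumptions for the example, and only the shape of the condition mirrors the diff.

```c
/* Stand-in values for this example; the real ones live in kernel headers. */
#define EX_TCPS_SYN_SENT	2
#define EX_TF_REQ_SCALE		0x0020
#define EX_TF_REQ_TSTMP		0x0080

/* Simplified stand-in for the few tcpcb fields the condition looks at. */
struct ex_tcpcb {
	int		t_state;
	int		t_rxtshift;
	unsigned int	t_flags;
};

/*
 * Mirrors the patched test: options are stripped on the third
 * retransmit of a SYN only when the knob is non-zero.
 */
void
ex_rexmt_adjust(struct ex_tcpcb *tp, int rexmit_drop_options)
{
	if (rexmit_drop_options && (tp->t_state == EX_TCPS_SYN_SENT) &&
	    (tp->t_rxtshift == 3))
		tp->t_flags &= ~(EX_TF_REQ_SCALE | EX_TF_REQ_TSTMP);
}
```

With the patch as posted the knob would show up as net.inet.tcp.rexmit_drop_options and default to 0, so the old behaviour would have to be requested explicitly by setting it to 1.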

-- 
John Baldwin

From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 16:29:28 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 8262FC8E;
 Mon, 15 Oct 2012 16:29:28 +0000 (UTC)
 (envelope-from glebius@FreeBSD.org)
Received: from cell.glebius.int.ru (glebius.int.ru [81.19.64.117])
 by mx1.freebsd.org (Postfix) with ESMTP id ED7628FC14;
 Mon, 15 Oct 2012 16:29:27 +0000 (UTC)
Received: from cell.glebius.int.ru (localhost [127.0.0.1])
 by cell.glebius.int.ru (8.14.5/8.14.5) with ESMTP id q9FGTQx0020725;
 Mon, 15 Oct 2012 20:29:26 +0400 (MSK)
 (envelope-from glebius@FreeBSD.org)
Received: (from glebius@localhost)
 by cell.glebius.int.ru (8.14.5/8.14.5/Submit) id q9FGTQCf020724;
 Mon, 15 Oct 2012 20:29:26 +0400 (MSK)
 (envelope-from glebius@FreeBSD.org)
X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to
 glebius@FreeBSD.org using -f
Date: Mon, 15 Oct 2012 20:29:26 +0400
From: Gleb Smirnoff <glebius@FreeBSD.org>
To: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
Subject: Re: ixgbe & if_igb RX ring locking
Message-ID: <20121015162926.GV89655@FreeBSD.org>
References: <5079A9A1.4070403@FreeBSD.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline
In-Reply-To: <5079A9A1.4070403@FreeBSD.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: Jack Vogel <jfvogel@gmail.com>, net@FreeBSD.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 16:29:28 -0000

On Sat, Oct 13, 2012 at 09:49:21PM +0400, Alexander V. Chernikov wrote:
A> The packet receive code for both ixgbe and if_igb looks like the following:
A> ixgbe_msix_que
A> 
A> -- ixgbe_rxeof()
A>     {
A>        IXGBE_RX_LOCK(rxr);
A>          while
A>          {
A>             get_packet;
A> 
A>             -- ixgbe_rx_input()
A>                {
A>                   ++ IXGBE_RX_UNLOCK(rxr);
A>                   if_input(packet);
A>                   ++ IXGBE_RX_LOCK(rxr);
A>                }
A> 
A>          }
A>        IXGBE_RX_UNLOCK(rxr);
A>      }
A> 
A> Lines marked with ++ appeared in r209068(igb) and r217593(ixgbe).
A> 
A> These lines probably mask any LORs (lock order reversals) well.
A> However, this change introduces quite a significant performance drop:
A> 
A> On my routing setup (nearly the same as in the previous Intel 10G thread
A> on -net), adding the lock/unlock causes a decrease from 2.8 Mpps to
A> 2.3 Mpps, which is roughly 18%.
A> 
A> So my questions are:
A> 
A> Can any real LORs happen in some complex setup? (I can't imagine any).
A> If so: maybe we can somehow avoid/workaround such cases? (and consider 
A> removing those locks).
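
The two variants of the receive loop in the pseudo-code above can be modelled in a few lines to show how many extra lock operations the "++" lines add per packet. This is a toy model, not driver code: the ring, the flag that stands in for the mutex and all ex_* names are hypothetical, and it counts operations only, so it says nothing about their actual cost.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an RX ring; 'locked' models the ring mutex. */
struct ex_rx_ring {
	bool		locked;
	unsigned long	lock_ops;	/* lock + unlock calls */
	unsigned long	delivered;	/* packets handed to the stack */
};

void
ex_rx_lock(struct ex_rx_ring *r)
{
	assert(!r->locked);
	r->locked = true;
	r->lock_ops++;
}

void
ex_rx_unlock(struct ex_rx_ring *r)
{
	assert(r->locked);
	r->locked = false;
	r->lock_ops++;
}

/* relock != 0 models the variant with the lines marked "++". */
void
ex_rxeof(struct ex_rx_ring *r, int npackets, int relock)
{
	ex_rx_lock(r);
	while (npackets-- > 0) {
		if (relock)
			ex_rx_unlock(r);
		r->delivered++;		/* stands in for if_input(packet) */
		if (relock)
			ex_rx_lock(r);
	}
	ex_rx_unlock(r);
}
```

For a batch of N packets the model performs 2 lock operations without the re-locking and 2 + 2N with it.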

To me this unlock/lock looks like a legacy from the times when the driver
had a single mutex for both the TX and RX parts.

Removing this re-locking in foo_rxeof() was one of the aims of separate
TX/RX locking.

Indeed, digging through the history shows that once the driver had split its
locking into separate RX and TX parts, this unlock/lock was removed. However,
it was later added back:

http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_igb.c?revision=209068&view=markup

without any comment on the reason it was added back.

-- 
Totus tuus, Glebius.

From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 16:32:12 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 951D4F8D;
 Mon, 15 Oct 2012 16:32:12 +0000 (UTC)
 (envelope-from glebius@FreeBSD.org)
Received: from cell.glebius.int.ru (glebius.int.ru [81.19.64.117])
 by mx1.freebsd.org (Postfix) with ESMTP id 02F0C8FC0C;
 Mon, 15 Oct 2012 16:32:11 +0000 (UTC)
Received: from cell.glebius.int.ru (localhost [127.0.0.1])
 by cell.glebius.int.ru (8.14.5/8.14.5) with ESMTP id q9FGWA3Y020752;
 Mon, 15 Oct 2012 20:32:10 +0400 (MSK)
 (envelope-from glebius@FreeBSD.org)
Received: (from glebius@localhost)
 by cell.glebius.int.ru (8.14.5/8.14.5/Submit) id q9FGWAT9020751;
 Mon, 15 Oct 2012 20:32:10 +0400 (MSK)
 (envelope-from glebius@FreeBSD.org)
X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to
 glebius@FreeBSD.org using -f
Date: Mon, 15 Oct 2012 20:32:10 +0400
From: Gleb Smirnoff <glebius@FreeBSD.org>
To: John Baldwin <jhb@FreeBSD.org>
Subject: Re: ixgbe & if_igb RX ring locking
Message-ID: <20121015163210.GW89655@FreeBSD.org>
References: <5079A9A1.4070403@FreeBSD.org>
 <CAFOYbc=N87_OECto7B8jdzmRZA-yoa_JWgvVc8kwpK9umO97rQ@mail.gmail.com>
 <507C1960.6050500@FreeBSD.org> <201210150904.27567.jhb@freebsd.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline
In-Reply-To: <201210150904.27567.jhb@freebsd.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-net@FreeBSD.org, "Alexander V. Chernikov" <melifaro@FreeBSD.org>,
 Luigi Rizzo <rizzo@iet.unipi.it>, Jack Vogel <jfvogel@gmail.com>,
 net@FreeBSD.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 16:32:12 -0000

On Mon, Oct 15, 2012 at 09:04:27AM -0400, John Baldwin wrote:
J> > 3) in practice taskqueue routine is a nightmare for many people since 
J> > there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after 
J> > some traffic burst happens: once it is called it starts to schedule 
J> > itself more and more replacing original ISR routine. Additionally, 
J> > increasing rx_process_limit does not help since taskqueue is called with 
J> > the same limit. Finally, currently netisr taskq threads are not bound to 
J> > any CPU which makes the process even more uncontrollable.
J> 
J> I think part of the problem here is that the taskqueue in ixgbe(4) is
J> bogusly rescheduled for TX handling.  Instead, ixgbe_msix_que() should
J> just start transmitting packets directly.
J> 
J> I fixed this in igb(4) here:
J> 
J> http://svnweb.freebsd.org/base?view=revision&revision=233708

The problem Alexander describes in 3) definitely wasn't fixed in r233708.

It is still present in head/, and it prevents me from doing good benchmarking
of pf(4) on igb(4).

The problem is related to RX handling, so I don't see how r233708 could
fix it.

-- 
Totus tuus, Glebius.

From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 16:39:26 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 007C22DB;
 Mon, 15 Oct 2012 16:39:25 +0000 (UTC)
 (envelope-from jfvogel@gmail.com)
Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com
 [209.85.212.54])
 by mx1.freebsd.org (Postfix) with ESMTP id 768448FC1B;
 Mon, 15 Oct 2012 16:39:25 +0000 (UTC)
Received: by mail-vb0-f54.google.com with SMTP id v11so6975292vbm.13
 for <multiple recipients>; Mon, 15 Oct 2012 09:39:24 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=fCEc0qdOMh4veUmcKP6NL+cBTZTtL7osxYL4tQsWYPo=;
 b=DB8n0AD7ueC51bAjJKoNBrLpFeiRKj/7AZhVe4AFs/fcymNDWrpmmCq8qgoPv9nOBA
 Ar/duWzq+N7AQMp83PTIe95gs2HE5iTZiSo3WOmjaN1SrPY6VGbOqrJiyrZea+HpX3Cw
 8lG31GqcUaAizpQtQjsEoAUrfC1knuOsrTFs6V9CSaQHbsEB6Vi/GfLXL7VBy2UIrAJd
 mzOFfkpolsr00DMNgUhGM3TqnSq6UczUY8I8Re6lY8QBFcV+v8+kArMGrnaFJqxlq93f
 3WF7uJAviu/U2r1Szl6qQ3aZSzwujQukr52mG0zUcHz4/8UcSInsP4c7+8DUyzME9QBI
 VHrg==
MIME-Version: 1.0
Received: by 10.58.1.101 with SMTP id 5mr7344899vel.40.1350319164396; Mon, 15
 Oct 2012 09:39:24 -0700 (PDT)
Received: by 10.58.68.8 with HTTP; Mon, 15 Oct 2012 09:39:24 -0700 (PDT)
In-Reply-To: <20121015162926.GV89655@FreeBSD.org>
References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org>
Date: Mon, 15 Oct 2012 09:39:24 -0700
Message-ID: <CAFOYbcmkt+3-f7BzfaStUHBFEO0TzJXhYES=42ovkencPoKHJA@mail.gmail.com>
Subject: Re: ixgbe & if_igb RX ring locking
From: Jack Vogel <jfvogel@gmail.com>
To: Gleb Smirnoff <glebius@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 16:39:26 -0000

On Mon, Oct 15, 2012 at 9:29 AM, Gleb Smirnoff <glebius@freebsd.org> wrote:

> On Sat, Oct 13, 2012 at 09:49:21PM +0400, Alexander V. Chernikov wrote:
> A> The packet receive code for both ixgbe and if_igb looks like the following:
> A> ixgbe_msix_que
> A>
> A> -- ixgbe_rxeof()
> A>     {
> A>        IXGBE_RX_LOCK(rxr);
> A>          while
> A>          {
> A>             get_packet;
> A>
> A>             -- ixgbe_rx_input()
> A>                {
> A>                   ++ IXGBE_RX_UNLOCK(rxr);
> A>                   if_input(packet);
> A>                   ++ IXGBE_RX_LOCK(rxr);
> A>                }
> A>
> A>          }
> A>        IXGBE_RX_UNLOCK(rxr);
> A>      }
> A>
> A> Lines marked with ++ appeared in r209068(igb) and r217593(ixgbe).
> A>
> A> These lines probably mask any LORs (lock order reversals) well.
> A> However, this change introduces quite a significant performance drop:
> A>
> A> On my routing setup (nearly the same as in the previous Intel 10G thread
> A> on -net), adding the lock/unlock causes a decrease from 2.8 Mpps to
> A> 2.3 Mpps, which is roughly 18%.
> A>
> A> So my questions are:
> A>
> A> Can any real LORs happen in some complex setup? (I can't imagine any).
> A> If so: maybe we can somehow avoid/workaround such cases? (and consider
> A> removing those locks).
>
> To me this unlock/lock looks like a legacy from the times when the driver
> had a single mutex for both the TX and RX parts.
>
> Removing this re-locking in foo_rxeof() was one of the aims of separate
> TX/RX locking.
>
> Indeed, digging through the history shows that once the driver had split
> its locking into separate RX and TX parts, this unlock/lock was removed.
> However, it was later added back:
>
> http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_igb.c?revision=209068&view=markup
>
> without any comment on the reason it was added back.

I did not want to add it back; there were problems that constrained me to
do so. Although it's been some time, I'd be happy to do some testing again
without it and see.

Jack

From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 16:50:20 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53])
 by hub.freebsd.org (Postfix) with ESMTP id 0C81C5F0;
 Mon, 15 Oct 2012 16:50:20 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from dhcp170-36-red.yandex.net (freefall.freebsd.org [8.8.178.135])
 by mx2.freebsd.org (Postfix) with ESMTP id 226B83B655C;
 Mon, 15 Oct 2012 16:50:18 +0000 (UTC)
Message-ID: <507C3E8B.1000307@FreeBSD.org>
Date: Mon, 15 Oct 2012 20:49:15 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:13.0) Gecko/20120627 Thunderbird/13.0.1
MIME-Version: 1.0
To: Jack Vogel <jfvogel@gmail.com>
Subject: Re: ixgbe & if_igb RX ring locking
References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org>
 <CAFOYbcmkt+3-f7BzfaStUHBFEO0TzJXhYES=42ovkencPoKHJA@mail.gmail.com>
In-Reply-To: <CAFOYbcmkt+3-f7BzfaStUHBFEO0TzJXhYES=42ovkencPoKHJA@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 16:50:20 -0000

On 15.10.2012 20:39, Jack Vogel wrote:
>
>
>
> I did not want to add it back; there were problems that constrained me
> to do so. Although it's been some time, I'd be happy to do some testing
> again without it and see.
>
We've got more than a hundred routers/firewalls running under heavy load
without this lock (pre-2.3.8 version, modified drivers) on both ixgbe and
igb.


> Jack
>
>


-- 
WBR, Alexander



From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 16:58:30 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 895EE87F;
 Mon, 15 Oct 2012 16:58:30 +0000 (UTC)
 (envelope-from glebius@FreeBSD.org)
Received: from cell.glebius.int.ru (glebius.int.ru [81.19.64.117])
 by mx1.freebsd.org (Postfix) with ESMTP id F31658FC08;
 Mon, 15 Oct 2012 16:58:29 +0000 (UTC)
Received: from cell.glebius.int.ru (localhost [127.0.0.1])
 by cell.glebius.int.ru (8.14.5/8.14.5) with ESMTP id q9FGwS3d020954;
 Mon, 15 Oct 2012 20:58:28 +0400 (MSK)
 (envelope-from glebius@FreeBSD.org)
Received: (from glebius@localhost)
 by cell.glebius.int.ru (8.14.5/8.14.5/Submit) id q9FGwSTn020953;
 Mon, 15 Oct 2012 20:58:28 +0400 (MSK)
 (envelope-from glebius@FreeBSD.org)
X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to
 glebius@FreeBSD.org using -f
Date: Mon, 15 Oct 2012 20:58:28 +0400
From: Gleb Smirnoff <glebius@FreeBSD.org>
To: Jack Vogel <jfvogel@gmail.com>
Subject: Re: ixgbe & if_igb RX ring locking
Message-ID: <20121015165828.GX89655@glebius.int.ru>
References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org>
 <CAFOYbcmkt+3-f7BzfaStUHBFEO0TzJXhYES=42ovkencPoKHJA@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline
In-Reply-To: <CAFOYbcmkt+3-f7BzfaStUHBFEO0TzJXhYES=42ovkencPoKHJA@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: "Alexander V. Chernikov" <melifaro@FreeBSD.org>, net@FreeBSD.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 16:58:30 -0000

On Mon, Oct 15, 2012 at 09:39:24AM -0700, Jack Vogel wrote:
J> > To me this unlock/lock looks like a legacy from the times when the driver
J> > had a single mutex for both the TX and RX parts.
J> >
J> > Removing this re-locking in foo_rxeof() was one of the aims of separate
J> > TX/RX locking.
J> >
J> > Indeed, digging through the history shows that once the driver had split
J> > its locking into separate RX and TX parts, this unlock/lock was removed.
J> > However, it was later added back:
J> >
J> > http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_igb.c?revision=209068&view=markup
J> >
J> > without any comment on the reason it was added back.
J> 
J> I did not want to add it back; there were problems that constrained me to
J> do so. Although it's been some time, I'd be happy to do some testing again
J> without it and see.

Can you please dig through mail archives to identify these problems? I
can't imagine any.

-- 
Totus tuus, Glebius.

From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 17:27:13 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 9CD2BBF6;
 Mon, 15 Oct 2012 17:27:13 +0000 (UTC)
 (envelope-from jfvogel@gmail.com)
Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com
 [209.85.212.54])
 by mx1.freebsd.org (Postfix) with ESMTP id 24B7B8FC1A;
 Mon, 15 Oct 2012 17:27:13 +0000 (UTC)
Received: by mail-vb0-f54.google.com with SMTP id v11so7047457vbm.13
 for <multiple recipients>; Mon, 15 Oct 2012 10:27:12 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=4MNchQ5xQdhoam5MbxDbY7jC8+4pI2r+OrtoDo+4/0c=;
 b=wb2tnoQoG2bVaTe/QkKUArXmvc0KY+WKYdF4Glo/OAfBZx0zXSRJpizyYF7VvpklTA
 hnGncBqxXcvnr22r9yevNNKEa6BtcrlEmg5Fu6Kt/G9/CEWm5RQF45lz0xdYFNG/7Xe7
 ykKs1XzpgvdmU8XydjGX1tcMmtmrD/esJq5YTRffnt2GhErsHVdi1rtgyUdAUvyhMWvS
 zS0SDd/EccH/GaTzrY9gmMoHH/+CYntyLM6FWKFO6C3Dl1tE2PXCTVBjmpQeY57oAnH+
 dE3bqWLQwkM1yov3sBwEIoujg7n2mS9RMGBKD6J60lXoMIglVGFPdOJdHuo4IR3pJz3Z
 zXNw==
MIME-Version: 1.0
Received: by 10.52.65.147 with SMTP id x19mr5858273vds.113.1350322032593; Mon,
 15 Oct 2012 10:27:12 -0700 (PDT)
Received: by 10.58.68.8 with HTTP; Mon, 15 Oct 2012 10:27:12 -0700 (PDT)
In-Reply-To: <20121015165828.GX89655@glebius.int.ru>
References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org>
 <CAFOYbcmkt+3-f7BzfaStUHBFEO0TzJXhYES=42ovkencPoKHJA@mail.gmail.com>
 <20121015165828.GX89655@glebius.int.ru>
Date: Mon, 15 Oct 2012 10:27:12 -0700
Message-ID: <CAFOYbcm20CcA1Lxs6KhG+myu4-+a7vt5h+BSD4FF03ZsqbnhKA@mail.gmail.com>
Subject: Re: ixgbe & if_igb RX ring locking
From: Jack Vogel <jfvogel@gmail.com>
To: Gleb Smirnoff <glebius@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 17:27:13 -0000

On Mon, Oct 15, 2012 at 9:58 AM, Gleb Smirnoff <glebius@freebsd.org> wrote:

> On Mon, Oct 15, 2012 at 09:39:24AM -0700, Jack Vogel wrote:
> J> > To me this unlock/lock looks like a legacy from the times when the
> J> > driver had a single mutex for both the TX and RX parts.
> J> >
> J> > Removing this re-locking in foo_rxeof() was one of the aims of
> J> > separate TX/RX locking.
> J> >
> J> > Indeed, digging through the history shows that once the driver had
> J> > split its locking into separate RX and TX parts, this unlock/lock was
> J> > removed. However, it was later added back:
> J> >
> J> > http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_igb.c?revision=209068&view=markup
> J> >
> J> > without any comment on the reason it was added back.
> J>
> J> I did not want to add it back; there were problems that constrained me
> J> to do so. Although it's been some time, I'd be happy to do some testing
> J> again without it and see.
>
> Can you please dig through mail archives to identify these problems? I
> can't imagine any.
>
>
It may not be in email; there were tests going on internally here that I
was often working with...  At this point it doesn't matter: Alexander says
it's running without it, so I will do some more testing on the current code
and go from there.

Jack

From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 19:23:26 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id E7535319;
 Mon, 15 Oct 2012 19:23:26 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net
 [IPv6:2001:470:1f10:75::2])
 by mx1.freebsd.org (Postfix) with ESMTP id B7F6A8FC0A;
 Mon, 15 Oct 2012 19:23:26 +0000 (UTC)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
 by bigwig.baldwin.cx (Postfix) with ESMTPSA id 1BBEBB911;
 Mon, 15 Oct 2012 15:23:26 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: Gleb Smirnoff <glebius@freebsd.org>
Subject: Re: ixgbe & if_igb RX ring locking
Date: Mon, 15 Oct 2012 14:14:27 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p20; KDE/4.5.5; amd64; ; )
References: <5079A9A1.4070403@FreeBSD.org>
 <201210150904.27567.jhb@freebsd.org> <20121015163210.GW89655@FreeBSD.org>
In-Reply-To: <20121015163210.GW89655@FreeBSD.org>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="koi8-r"
Content-Transfer-Encoding: 7bit
Message-Id: <201210151414.27318.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
 (bigwig.baldwin.cx); Mon, 15 Oct 2012 15:23:26 -0400 (EDT)
Cc: freebsd-net@freebsd.org, "Alexander V. Chernikov" <melifaro@freebsd.org>,
 Luigi Rizzo <rizzo@iet.unipi.it>, Jack Vogel <jfvogel@gmail.com>,
 net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 19:23:27 -0000

On Monday, October 15, 2012 12:32:10 pm Gleb Smirnoff wrote:
> On Mon, Oct 15, 2012 at 09:04:27AM -0400, John Baldwin wrote:
> J> > 3) in practice taskqueue routine is a nightmare for many people since 
> J> > there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after 
> J> > some traffic burst happens: once it is called it starts to schedule 
> J> > itself more and more replacing original ISR routine. Additionally, 
> J> > increasing rx_process_limit does not help since taskqueue is called with 
> J> > the same limit. Finally, currently netisr taskq threads are not bound to 
> J> > any CPU which makes the process even more uncontrollable.
> J> 
> J> I think part of the problem here is that the taskqueue in ixgbe(4) is
> J> bogusly rescheduled for TX handling.  Instead, ixgbe_msix_que() should
> J> just start transmitting packets directly.
> J> 
> J> I fixed this in igb(4) here:
> J> 
> J> http://svnweb.freebsd.org/base?view=revision&revision=233708
> 
> The problem Alexander describes in 3) definitely wasn't fixed in r233708.
> 
> It is still present in head/, and it prevents me to do good benchmarking
> of pf(4) on igb(4).
> 
> The problem is related to RX handling, so I don't see how r233708 could
> fix it.

Before 233708, if you had a single TX packet waiting to go out and an RX
interrupt arrived, the task queue would be constantly rescheduled, causing
it to effectively spin at 100% until the TX packet was completely transmitted
and the hardware had updated the descriptor to mark it as complete.  In fact,
as long as you have any pending TX packets at all it will keep spinning until
it gets into a state where you have no pending TX packets (so a steady stream
of TX packets, including, say, ACKs, would cause the taskqueue to run forever).

In general I think that with MSI-X you should just use an RX processing limit
of -1.  Anything else is just adding overhead in the form of extra context
switches.  Neither the task nor the MSI-X interrupt handler is on a thread
that is shared with any other tasks or handlers, so all that scheduling (or
rescheduling) the task will do is result in the task being immediately run
(after either a context switch or returning back to the main loop of the
taskqueue thread).
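
The rescheduling behaviour described above can be made concrete with a
toy model.  This is only a sketch under stated assumptions, not driver
code: toy_task_runs() and its parameters are hypothetical names, the
hardware is simulated by a counter, and "resched_on_tx" stands in for
the pre-233708 habit of rescheduling the task while any TX descriptor
is still pending.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model (hypothetical, not driver code): count how many times a
 * queue task runs before it stops rescheduling itself.
 *
 * rx_pending      packets sitting in the RX ring when the interrupt fires
 * rx_limit        packets handled per pass, or -1 for "no limit"
 * tx_busy_passes  passes during which one TX descriptor is still not
 *                 marked complete by the simulated hardware
 * resched_on_tx   true models rescheduling while TX is still pending
 */
static int
toy_task_runs(int rx_pending, int rx_limit, int tx_busy_passes,
    bool resched_on_tx)
{
	int runs = 0;
	bool more;

	assert(rx_pending >= 0 && tx_busy_passes >= 0);
	assert(rx_limit == -1 || rx_limit > 0);

	do {
		runs++;

		/* RX: consume up to rx_limit packets on this pass. */
		if (rx_limit < 0 || rx_pending <= rx_limit)
			rx_pending = 0;
		else
			rx_pending -= rx_limit;

		/* TX: the descriptor completes after tx_busy_passes passes. */
		if (tx_busy_passes > 0)
			tx_busy_passes--;

		/* Decide whether the task would reschedule itself. */
		more = rx_pending > 0 ||
		    (resched_on_tx && tx_busy_passes > 0);
	} while (more);

	return (runs);
}
```

In this model, 300 RX packets with a limit of 100 and a TX descriptor
that stays busy for 1000 passes give 1000 runs when the task reschedules
on pending TX, 3 runs when it does not, and a single run with a limit
of -1.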

If you look at the drivers, when a burst of RX traffic ends, the taskqueue
should stop running and stop polling the hardware.  It is only the TX side
that gets stuck needlessly polling.  The watchdog timer rescheduling the
handler once a second when there is no watchdog condition doesn't help
matters either, but I think that is unique to ixgbe(4).

It would be good if you could determine exactly why igb thinks it needs to
reschedule the taskqueue in your test case on igb(4) post 233708.

-- 
John Baldwin

From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 20:48:35 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id AE5D6F29;
 Mon, 15 Oct 2012 20:48:35 +0000 (UTC)
 (envelope-from rysto32@gmail.com)
Received: from mail-vc0-f182.google.com (mail-vc0-f182.google.com
 [209.85.220.182])
 by mx1.freebsd.org (Postfix) with ESMTP id 3ACF38FC08;
 Mon, 15 Oct 2012 20:48:34 +0000 (UTC)
Received: by mail-vc0-f182.google.com with SMTP id fw7so8162445vcb.13
 for <multiple recipients>; Mon, 15 Oct 2012 13:48:34 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=9vwLYR0bQ5U/NIC0hJlBagJoyCUEI16v7Zr36M4Sugk=;
 b=No1NjvzBl1SSS9FhzOz9Ad3Ce0hr8ff9bN6LjgsWns9+NWylt3/j/2w48GH3ukwMNA
 n5GRYDsVeTkEQRKup1qhoPQItKjYTcFTq17qG88Hxy/GQil/HUNfrdB5MN3vkBW4vMVy
 rliXgPoT5fgGkKjXlWRHOi5225mcULdLKH/Na+udYLlv+maa1Sm7z1BnfUsEtJRltMof
 8f9wK0nVciDqTkRu+MVbgJmq7BxAT8RDQx3nDVx1l3bPmFFIjEABjrb/3RjJhRpiYq+s
 X4VwZ4nKF2ZeDSBhx34I9G9+tleaatGXPo2lvn7nIXpLIVWVHS4FV9qjiKKOYOtoZkEv
 yAHQ==
MIME-Version: 1.0
Received: by 10.52.155.199 with SMTP id vy7mr6121885vdb.54.1350334114196; Mon,
 15 Oct 2012 13:48:34 -0700 (PDT)
Received: by 10.58.207.114 with HTTP; Mon, 15 Oct 2012 13:48:34 -0700 (PDT)
In-Reply-To: <20121015162926.GV89655@FreeBSD.org>
References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org>
Date: Mon, 15 Oct 2012 16:48:34 -0400
Message-ID: <CAFMmRNxT=GWxc6r7B81ENjzwJmfea3016Sh-DxJEGBwybM0QwQ@mail.gmail.com>
Subject: Re: ixgbe & if_igb RX ring locking
From: Ryan Stone <rysto32@gmail.com>
To: Gleb Smirnoff <glebius@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>,
 Jack Vogel <jfvogel@gmail.com>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 20:48:35 -0000

On Mon, Oct 15, 2012 at 12:29 PM, Gleb Smirnoff <glebius@freebsd.org> wrote:
> To me this unlock/lock looks like a legacy from times, when the driver
> had a single mutex for both TX and RX parts.
>
> And removing this re-locking in foo_rxeof() was one of the aims for separate
> TX/RX locking.
>
> Really, lurking through history shows that once driver had split its locking
> to separate RX and TX part, these unlock/lock was removed. However, later
> this unlock/lock was added back:
>
> http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_igb.c?revision=209068&view=markup
>
> , without any comments for the reason it is added back.

There's a convoluted LOR if you call into the stack with the RX lock
held which is described here:

http://lists.freebsd.org/pipermail/freebsd-net/2012-September/033371.html

From owner-freebsd-net@FreeBSD.ORG  Mon Oct 15 22:36:58 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 18D8B52B;
 Mon, 15 Oct 2012 22:36:58 +0000 (UTC)
 (envelope-from adrian.chadd@gmail.com)
Received: from mail-pb0-f54.google.com (mail-pb0-f54.google.com
 [209.85.160.54])
 by mx1.freebsd.org (Postfix) with ESMTP id C171B8FC0A;
 Mon, 15 Oct 2012 22:36:57 +0000 (UTC)
Received: by mail-pb0-f54.google.com with SMTP id rp8so5740118pbb.13
 for <multiple recipients>; Mon, 15 Oct 2012 15:36:57 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=NBeUqkxXqFANk8PmvxmkSkAK7q/Xh5Ri1zXO4hpRjz4=;
 b=q4DUF1Hs0bpzF44xS+NaYIkjvnNknLLQkoUlWn7lEQsaw7NsbCjv5iQz3pyqVpiJS3
 MqEishawiHO4OCp11L/jcojcbDAtxuYNicKNPbkKr2kAgsgYZlTj6BM0tXnohiZtzTFU
 mCdqfXFPLuFM1oPuSKRVqsWyth1ugSyYAYFW5IRXeLGtEN26fuY+ir2SnD3iJozCPgAq
 9wukRk5pI/1ZyKdJwcmyzkk5obX1oR1yydgwj9/TXlCBRctFCx4xhIOgPRvPjPNikZbh
 4WdW7XVYMGqc+dXMRJKCESvoVJV3FKVPuC1HNJAJ/m48INUsNJ316D+5oJf8cZKaOIG0
 4NXg==
MIME-Version: 1.0
Received: by 10.66.80.133 with SMTP id r5mr10706792pax.24.1350340617342; Mon,
 15 Oct 2012 15:36:57 -0700 (PDT)
Sender: adrian.chadd@gmail.com
Received: by 10.68.146.233 with HTTP; Mon, 15 Oct 2012 15:36:57 -0700 (PDT)
In-Reply-To: <201210151414.27318.jhb@freebsd.org>
References: <5079A9A1.4070403@FreeBSD.org> <201210150904.27567.jhb@freebsd.org>
 <20121015163210.GW89655@FreeBSD.org>
 <201210151414.27318.jhb@freebsd.org>
Date: Mon, 15 Oct 2012 15:36:57 -0700
X-Google-Sender-Auth: KyTp3ym2n1JjTzY8n01J-YeRRuM
Message-ID: <CAJ-Vmo=qMJXwYDUEPHRn49SrAOP+Nt3v9FaxFXq8wB6Q_uCmPg@mail.gmail.com>
Subject: Re: ixgbe & if_igb RX ring locking
From: Adrian Chadd <adrian@freebsd.org>
To: John Baldwin <jhb@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>, freebsd-net@freebsd.org,
 Jack Vogel <jfvogel@gmail.com>, net@freebsd.org,
 Luigi Rizzo <rizzo@iet.unipi.it>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 15 Oct 2012 22:36:58 -0000

The reason why I've started moving net80211 and ath _away_ from using
direct dispatch (for now) and to using a taskqueue for TX (and RX) is
because it's too freaking annoying right now to deal with all the
crazy long-held locks to guarantee consistency between multiple
transmitting threads.

Considering that the driver and net80211 stack:

* is sometimes PCI, sometimes USB (with all the differing thread
models that exist there);
* sometimes bridges traffic, sometimes routes traffic, sometimes sources
or terminates TCP/UDP connections;
* sometimes has one sender, sometimes has multiple senders, with some
other modules in between (bridge, pf, ipfw, etc) with locks being held
here and there;
* very often sees RX cause TX to occur, since the stack(s) like doing
direct dispatch, which for some drivers will block on a long-held driver
lock (with all the LORs that occur) - and drivers that do this (eg
iwn) will simply drop the lock before passing the packet up. Dropping
the lock before passing the packet to net80211_input*() is just plain
silly.

Now, I'd _like_ to eventually make net80211/ath support direct
dispatch, but that also requires making sure only -one- transmitter is
working at once. I'd like to not have the extra context switch
overhead, but I haven't seen a better way of doing it yet.

It's fun to see the gige/10ge drivers have lots of long-held locks with
lots of concurrent sender processes possibly blocking until TX
completes, so I wonder if that has scaling issues for lots of
connections/sending processes.



Adrian

From owner-freebsd-net@FreeBSD.ORG  Tue Oct 16 02:11:38 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 5AC6F212;
 Tue, 16 Oct 2012 02:11:38 +0000 (UTC)
 (envelope-from gnn@neville-neil.com)
Received: from vps.hungerhost.com (vps.hungerhost.com [216.38.53.176])
 by mx1.freebsd.org (Postfix) with ESMTP id 152158FC0A;
 Tue, 16 Oct 2012 02:11:36 +0000 (UTC)
Received: from pool-96-250-5-62.nycmny.fios.verizon.net ([96.250.5.62]:58711
 helo=minion.home)
 by vps.hungerhost.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.80)
 (envelope-from <gnn@neville-neil.com>)
 id 1TNwd6-0008Nm-Jt; Mon, 15 Oct 2012 22:11:36 -0400
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\))
Subject: Re: Dropping TCP options from retransmitted SYNs considered harmful
From: George Neville-Neil <gnn@neville-neil.com>
In-Reply-To: <201210150908.36498.jhb@freebsd.org>
Date: Mon, 15 Oct 2012 22:11:41 -0400
Content-Transfer-Encoding: quoted-printable
Message-Id: <70397C6E-202A-4FAE-AF53-6A5A1D89FAAC@neville-neil.com>
References: <201210121213.11152.jhb@freebsd.org>
 <201210150908.36498.jhb@freebsd.org>
To: John Baldwin <jhb@freebsd.org>
X-Mailer: Apple Mail (2.1499)
X-AntiAbuse: This header was added to track abuse,
 please include it with any abuse report
X-AntiAbuse: Primary Hostname - vps.hungerhost.com
X-AntiAbuse: Original Domain - freebsd.org
X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse: Sender Address Domain - neville-neil.com
Cc: freebsd-net@freebsd.org, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2012 02:11:38 -0000


On Oct 15, 2012, at 09:08 , John Baldwin <jhb@freebsd.org> wrote:

> On Friday, October 12, 2012 12:13:11 pm John Baldwin wrote:
>> Back in 2001 FreeBSD added a hack to strip TCP options from retransmitted SYNs
>> starting with the 3rd SYN in this block in tcp_timer.c:
>>
>> 	/*
>> 	 * Disable rfc1323 if we haven't got any response to
>> 	 * our third SYN to work-around some broken terminal servers
>> 	 * (most of which have hopefully been retired) that have bad VJ
>> 	 * header compression code which trashes TCP segments containing
>> 	 * unknown-to-them TCP options.
>> 	 */
>> 	if ((tp->t_state == TCPS_SYN_SENT) && (tp->t_rxtshift == 3))
>> 		tp->t_flags &= ~(TF_REQ_SCALE|TF_REQ_TSTMP);
>>
>> There is even a PR for the original bug report: kern/1689
>>
>> However, there is an unintended consequence of this change that can be
>> disastrous.  Specifically, suppose you have a FreeBSD client connecting to a
>> server, and that the SYNs are arriving at the server successfully, but the
>> first few return SYN/ACKs are dropped.  Eventually a SYN/ACK makes it through
>> and the connection is established.
>>
>> The server (based on the first SYN it saw) believes it has negotiated window
>> scaling with the client.  The client, however, has broken what it promised in
>> that first SYN and believes it is not using any window scaling at all.  This
>> causes two forms of breakage:
>>
>> 1) When the server advertises a scaled window (e.g. '8' for a 64k window
>>    scaled at 13), the client thinks it is an unscaled window ('8') and
>>    sends data to the server very slowly.
>>
>> 2) When the client advertises an unscaled window (e.g. '65535' for a 64k
>>    window), the server thinks it has a huge window (65535 << 13 == 511MB)
>>    to send into.
>>
>> I'm not sure that 2) is a problem per se, but I have definitely seen instances
>> of 1) (and examined the 'struct tcpcb' in kgdb on both the server and client
>> end of the connections to verify they disagreed on the scaling).
>>
>> The original motivation of this change is to work around broken terminal
>> servers that were old when this change was added in 2001.  Over 10 years later
>> I think we should at least have an option to turn this work-around off, and
>> possibly disable it by default.
>>
>> Thoughts?
>
> How about this:
>
> Index: tcp_timer.c
> ===================================================================
> --- tcp_timer.c	(revision 241579)
> +++ tcp_timer.c	(working copy)
> @@ -118,6 +118,11 @@ SYSCTL_INT(_net_inet_tcp, OID_AUTO, keepcnt, CTLFL
> 	/* max idle probes */
> int	tcp_maxpersistidle;
>
> +static int	tcp_rexmit_drop_options = 0;
> +SYSCTL_INT(_net_inet_tcp, OID_AUTO, rexmit_drop_options, CTLFLAG_RW,
> +    &tcp_rexmit_drop_options, 0,
> +    "Drop TCP options from 3rd and later retransmitted SYN");
> +
> static int	per_cpu_timers = 0;
> SYSCTL_INT(_net_inet_tcp, OID_AUTO, per_cpu_timers, CTLFLAG_RW,
>     &per_cpu_timers , 0, "run tcp timers on all cpus");
> @@ -578,7 +583,8 @@ tcp_timer_rexmt(void * xtp)
> 	 * header compression code which trashes TCP segments containing
> 	 * unknown-to-them TCP options.
> 	 */
> -	if ((tp->t_state == TCPS_SYN_SENT) && (tp->t_rxtshift == 3))
> +	if (tcp_rexmit_drop_options && (tp->t_state == TCPS_SYN_SENT) &&
> +	    (tp->t_rxtshift == 3))
> 		tp->t_flags &= ~(TF_REQ_SCALE|TF_REQ_TSTMP);
> 	/*
> 	 * If we backed off this far, our srtt estimate is probably bogus.
>
> Any other suggestions on the sysctl name?

The name's fine.  Commit that sucker and turn it off.
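
The scaling mismatch in the quoted report comes down to each end
applying a different shift to the same 16-bit window field.  A minimal
sketch of that arithmetic, assuming nothing beyond the numbers in the
report; effective_window() is a hypothetical helper, not a function
from the FreeBSD TCP stack:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the window in bytes that a host believes it has
 * been offered, given the 16-bit window field from the TCP header and
 * the shift count that host thinks was negotiated.
 */
static uint64_t
effective_window(uint16_t wire_window, unsigned int shift)
{
	/* RFC 1323 limits the window scale shift count to 14. */
	assert(shift <= 14);
	return ((uint64_t)wire_window << shift);
}
```

With the values from the report, a field of 8 read with a shift of 13
is 65536 bytes, but read with a shift of 0 it is 8 bytes, which is case
1).  A field of 65535 read with a shift of 13 is 536862720 bytes, the
"511MB" of case 2), where the sender meant 65535 bytes.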

Best,
George


From owner-freebsd-net@FreeBSD.ORG  Tue Oct 16 10:06:25 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 516A049B
 for <freebsd-net@freebsd.org>; Tue, 16 Oct 2012 10:06:25 +0000 (UTC)
 (envelope-from ozkan.kirik@gmail.com)
Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com
 [209.85.212.54])
 by mx1.freebsd.org (Postfix) with ESMTP id EFAFF8FC1A
 for <freebsd-net@freebsd.org>; Tue, 16 Oct 2012 10:06:24 +0000 (UTC)
Received: by mail-vb0-f54.google.com with SMTP id v11so7980738vbm.13
 for <freebsd-net@freebsd.org>; Tue, 16 Oct 2012 03:06:24 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=LZBs9dsfL8c7GiE5aOo8ya46oQOR3I6OEpiLpdCxO1s=;
 b=yd4zi5/Nbqk6Vzbu2C/3Jt8l7S8swkf3Tno/owe3Wt+Nc6NpS8mqxWhLfCRf+5atP0
 WaXTPogfirB6SXAFDIvAoP7lVbbe57ORpLbXRLu7y/8/Awa56jqtkvSWMgTB50AHo3lg
 /Fy7/KQCMACBFjVURYuQ9SpYnjqWyrq9btoptXQuCsvSwP8kIIu/CL1vn5jNn4mbVOqL
 AV/gzl8pZA1Ui4EscyQMbc0gf3kGHjD1/gyQnEe4EmEGp4gKlyM8t40VR/TENnPZ2fdX
 /DgIcCebiEhPRU84N2GnSbpkjaHswKLsoUmRu2r//2XySNJs3vOiCJa1wg29QZvL47yN
 C70Q==
MIME-Version: 1.0
Received: by 10.220.150.14 with SMTP id w14mr8366656vcv.13.1350381984272; Tue,
 16 Oct 2012 03:06:24 -0700 (PDT)
Received: by 10.58.56.135 with HTTP; Tue, 16 Oct 2012 03:06:24 -0700 (PDT)
In-Reply-To: <506EEE46.1000604@airnet.opole.pl>
References: <2DE61B0869B7484997BCA012845482C7EBE8E280DB@WIN2008.Domnt.abi.ca>
 <5068AC17.8020704@FreeBSD.org>
 <2DE61B0869B7484997BCA012845482C7EBE8E280DC@WIN2008.Domnt.abi.ca>
 <5068ADCC.5030105@FreeBSD.org>
 <2DE61B0869B7484997BCA012845482C7EBE8E280DD@WIN2008.Domnt.abi.ca>
 <5068B48E.2070303@FreeBSD.org> <20121004160240.GA1967@funkthat.com>
 <506DC933.7080307@airnet.opole.pl>
 <20121004222327.GA40357@in-addr.com>
 <506E7BE7.2080104@airnet.opole.pl>
 <CAAEEwq0OCdVn3Fy4Cc7SXMgzo5F9KAxeu=6nW8h3AbcVs+OMpA@mail.gmail.com>
 <506ED2AD.8000408@airnet.opole.pl>
 <2DE61B0869B7484997BCA012845482C7EBE8E28162@WIN2008.Domnt.abi.ca>
 <506EEE46.1000604@airnet.opole.pl>
Date: Tue, 16 Oct 2012 13:06:24 +0300
Message-ID: <CAAcX-AEL5TqUzY=4kUdffXLdQGf=KVfCpJgCd-pwz4D0_GYMog@mail.gmail.com>
Subject: Re: Default route destination changing without warning follow-up
From: =?ISO-8859-1?Q?=D6zkan_KIRIK?= <ozkan.kirik@gmail.com>
To: Krzysztof Barcikowski <krzysiek@airnet.opole.pl>
Content-Type: text/plain; charset=ISO-8859-1
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2012 10:06:25 -0000

I reported this behaviour before.

http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/157796



On Fri, Oct 5, 2012 at 5:27 PM, Krzysztof Barcikowski
<krzysiek@airnet.opole.pl> wrote:
> On 2012-10-05 16:22, Dominic Blais wrote:
>
>> Hi,
>>
>> I'm using GENERIC. Everything else is added as loaded module.
>>
>> Here's my kldstat:
>>
>>
>
> I forgot about modules, here they are:
>
> Id Refs Address            Size     Name
>  1   13 0xffffffff80200000 12200c8  kernel
>  2    1 0xffffffff81421000 215f8    geom_mirror.ko
>  3    1 0xffffffff81443000 29e8     coretemp.ko
>  4    1 0xffffffff81446000 17450    dummynet.ko

From owner-freebsd-net@FreeBSD.ORG  Tue Oct 16 12:13:08 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53])
 by hub.freebsd.org (Postfix) with ESMTP id 6A9293A5;
 Tue, 16 Oct 2012 12:13:08 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from dhcp170-36-red.yandex.net (freefall.freebsd.org [8.8.178.135])
 by mx2.freebsd.org (Postfix) with ESMTP id C7F9A3B5C58;
 Tue, 16 Oct 2012 12:13:06 +0000 (UTC)
Message-ID: <507D4F11.2030704@FreeBSD.org>
Date: Tue, 16 Oct 2012 16:12:01 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:13.0) Gecko/20120627 Thunderbird/13.0.1
MIME-Version: 1.0
To: Ryan Stone <rysto32@gmail.com>
Subject: Re: ixgbe & if_igb RX ring locking
References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org>
 <CAFMmRNxT=GWxc6r7B81ENjzwJmfea3016Sh-DxJEGBwybM0QwQ@mail.gmail.com>
In-Reply-To: <CAFMmRNxT=GWxc6r7B81ENjzwJmfea3016Sh-DxJEGBwybM0QwQ@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Jack Vogel <jfvogel@gmail.com>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2012 12:13:08 -0000

On 16.10.2012 00:48, Ryan Stone wrote:
> On Mon, Oct 15, 2012 at 12:29 PM, Gleb Smirnoff <glebius@freebsd.org> wrote:
>> To me this unlock/lock looks like a legacy from times, when the driver
>> had a single mutex for both TX and RX parts.
>>
>> And removing this re-locking in foo_rxeof() was one of the aims for separate
>> TX/RX locking.
>>
>> Indeed, looking through the history shows that once the driver had split
>> its locking into separate RX and TX parts, this unlock/lock was removed.
>> However, it was later added back:
>>
>> http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_igb.c?revision=209068&view=markup
>>
>> without any comment on why it was added back.
>
> There's a convoluted LOR if you call into the stack with the RX lock
> held which is described here:
>
> http://lists.freebsd.org/pipermail/freebsd-net/2012-September/033371.html

Are you using stock ixgbe driver?

lock order reversal:
  1st 0xffffff800153c138 ix:rx (ix:rx) @ src/sys/dev/ixgbe/ixgbe.c:7113
  2nd 0xffffffff80af9c48 udp (udp) @ src/sys/netinet/udp_usrreq.c:471

It seems to me that ixgbe.c has always been about 5.5k lines of code, so
line 7113 seems a bit suspicious.

  2nd 0xffffff8001539400 ixgbe0 (IXGBE Core Lock) @ src/sys/dev/ixgbe/ixgbe.c:1725

Nearest IXGBE_CORE_LOCK() in r217917 (8.2-R) resides at line 905.


Maybe I'm missing something obvious?


>


-- 
WBR, Alexander



From owner-freebsd-net@FreeBSD.ORG  Tue Oct 16 12:40:49 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 8F1E1BFB;
 Tue, 16 Oct 2012 12:40:49 +0000 (UTC)
 (envelope-from rysto32@gmail.com)
Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com
 [209.85.212.54])
 by mx1.freebsd.org (Postfix) with ESMTP id 043958FC19;
 Tue, 16 Oct 2012 12:40:48 +0000 (UTC)
Received: by mail-vb0-f54.google.com with SMTP id v11so8170582vbm.13
 for <multiple recipients>; Tue, 16 Oct 2012 05:40:48 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=zwyWvomooeuEE0Zoo4Y3zt4wEDvk3xDhL397lId9CSg=;
 b=zy5Ts+0hjnsjtWYIKd+nkIMsFCIaqXW2lMk4s7bAXO04QSkgCRkxhHCsUJ63umtrTz
 MLNljwjZXTtPFVUF4VT8Of8vo+BJQShCPD20C4Kplrs3zNGCCHiQlqe8osRiQSTrkqBu
 9Qooc35yC/zRFaS0oB3/cHa24IooV21g/JplzObiE58Sin7nrNWbE/ELvYKr7OhF96OE
 DxY/eSnMcU6Ay9DraJFFcGZ7wWhFCktexU6ZqQAyUVRUqf5M7s6eKFjOm9BBZXYFGgAL
 Bz6O+QrzspHMDV7ZQ/UIkO4NyiVPXRjvj6mIE7SNnH9f3KWZoRPtXylGLPG9rTlgTdEg
 z3sw==
MIME-Version: 1.0
Received: by 10.220.154.6 with SMTP id m6mr8478035vcw.51.1350391247952; Tue,
 16 Oct 2012 05:40:47 -0700 (PDT)
Received: by 10.58.207.114 with HTTP; Tue, 16 Oct 2012 05:40:47 -0700 (PDT)
In-Reply-To: <507D4F11.2030704@FreeBSD.org>
References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org>
 <CAFMmRNxT=GWxc6r7B81ENjzwJmfea3016Sh-DxJEGBwybM0QwQ@mail.gmail.com>
 <507D4F11.2030704@FreeBSD.org>
Date: Tue, 16 Oct 2012 08:40:47 -0400
Message-ID: <CAFMmRNy1h86ZXAH4oK+LOK6_VeTsbo=d6oYk669bisKfHbJ56A@mail.gmail.com>
Subject: Re: ixgbe & if_igb RX ring locking
From: Ryan Stone <rysto32@gmail.com>
To: "Alexander V. Chernikov" <melifaro@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Cc: Jack Vogel <jfvogel@gmail.com>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2012 12:40:49 -0000

On Tue, Oct 16, 2012 at 8:12 AM, Alexander V. Chernikov
<melifaro@freebsd.org> wrote:
> Are you using stock ixgbe driver?

Pay no attention to the line numbers behind the curtain. :)

I don't believe that I've changed the locking order at all in the
driver, but you are right, that wasn't taken from the stock driver.

From owner-freebsd-net@FreeBSD.ORG  Tue Oct 16 12:47:36 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 737D7EB3;
 Tue, 16 Oct 2012 12:47:36 +0000 (UTC)
 (envelope-from glebius@FreeBSD.org)
Received: from cell.glebius.int.ru (glebius.int.ru [81.19.64.117])
 by mx1.freebsd.org (Postfix) with ESMTP id DE4F28FC0C;
 Tue, 16 Oct 2012 12:47:35 +0000 (UTC)
Received: from cell.glebius.int.ru (localhost [127.0.0.1])
 by cell.glebius.int.ru (8.14.5/8.14.5) with ESMTP id q9GClYep033268;
 Tue, 16 Oct 2012 16:47:34 +0400 (MSK)
 (envelope-from glebius@FreeBSD.org)
Received: (from glebius@localhost)
 by cell.glebius.int.ru (8.14.5/8.14.5/Submit) id q9GClXZh033267;
 Tue, 16 Oct 2012 16:47:33 +0400 (MSK)
 (envelope-from glebius@FreeBSD.org)
X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to
 glebius@FreeBSD.org using -f
Date: Tue, 16 Oct 2012 16:47:33 +0400
From: Gleb Smirnoff <glebius@FreeBSD.org>
To: Ryan Stone <rysto32@gmail.com>
Subject: Re: ixgbe & if_igb RX ring locking
Message-ID: <20121016124733.GC89655@glebius.int.ru>
References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org>
 <CAFMmRNxT=GWxc6r7B81ENjzwJmfea3016Sh-DxJEGBwybM0QwQ@mail.gmail.com>
 <507D4F11.2030704@FreeBSD.org>
 <CAFMmRNy1h86ZXAH4oK+LOK6_VeTsbo=d6oYk669bisKfHbJ56A@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline
In-Reply-To: <CAFMmRNy1h86ZXAH4oK+LOK6_VeTsbo=d6oYk669bisKfHbJ56A@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: "Alexander V. Chernikov" <melifaro@FreeBSD.org>,
 Jack Vogel <jfvogel@gmail.com>, net@FreeBSD.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2012 12:47:36 -0000

On Tue, Oct 16, 2012 at 08:40:47AM -0400, Ryan Stone wrote:
R> > Are you using stock ixgbe driver?
R> 
R> Pay no attention to the line numbers behind the curtain. :)
R> 
R> I don't believe that I've changed the locking order at all in the
R> driver, but you are right, that wasn't taken from the stock driver.

Can you please give a hint as to how SIOCADDMULTI can lead to obtaining
the RX lock in the stock driver?

Sorry if I am missing something obvious.

-- 
Totus tuus, Glebius.

From owner-freebsd-net@FreeBSD.ORG  Tue Oct 16 12:47:57 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53])
 by hub.freebsd.org (Postfix) with ESMTP id 1AC86F4B;
 Tue, 16 Oct 2012 12:47:57 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from dhcp170-36-red.yandex.net (freefall.freebsd.org [8.8.178.135])
 by mx2.freebsd.org (Postfix) with ESMTP id 80DEC3B4C86;
 Tue, 16 Oct 2012 12:47:55 +0000 (UTC)
Message-ID: <507D5739.70509@FreeBSD.org>
Date: Tue, 16 Oct 2012 16:46:49 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:13.0) Gecko/20120627 Thunderbird/13.0.1
MIME-Version: 1.0
To: John Baldwin <jhb@freebsd.org>
Subject: Re: ixgbe & if_igb RX ring locking
References: <5079A9A1.4070403@FreeBSD.org>
 <201210150904.27567.jhb@freebsd.org> <20121015163210.GW89655@FreeBSD.org>
 <201210151414.27318.jhb@freebsd.org>
In-Reply-To: <201210151414.27318.jhb@freebsd.org>
Content-Type: text/plain; charset=KOI8-R; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, Luigi Rizzo <rizzo@iet.unipi.it>,
 Jack Vogel <jfvogel@gmail.com>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2012 12:47:57 -0000

On 15.10.2012 22:14, John Baldwin wrote:
> On Monday, October 15, 2012 12:32:10 pm Gleb Smirnoff wrote:
>> On Mon, Oct 15, 2012 at 09:04:27AM -0400, John Baldwin wrote:
>> J> > 3) in practice taskqueue routine is a nightmare for many people since
>> J> > there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after
>> J> > some traffic burst happens: once it is called it starts to schedule
>> J> > itself more and more replacing original ISR routine. Additionally,
>> J> > increasing rx_process_limit does not help since taskqueue is called with
>> J> > the same limit. Finally, currently netisr taskq threads are not bound to
>> J> > any CPU which makes the process even more uncontrollable.
>> J>
>> J> I think part of the problem here is that the taskqueue in ixgbe(4) is
>> J> bogusly rescheduled for TX handling.  Instead, ixgbe_msix_que() should
>> J> just start transmitting packets directly.
>> J>
>> J> I fixed this in igb(4) here:
>> J>
>> J> http://svnweb.freebsd.org/base?view=revision&revision=233708
>>
>> The problem Alexander describes in 3) definitely wasn't fixed in r233708.
>>
>> It is still present in head/, and it prevents me from doing good
>> benchmarking of pf(4) on igb(4).
>>
>> The problem is related to RX handling, so I don't see how r233708 could
>> fix it.
>
> Before 233708, if you had a single TX packet waiting to go out and an RX
> interrupt arrived, the task queue would be constantly rescheduled, causing
> it to effectively spin at 100% until the TX packet was completely transmitted
> and the hardware had updated the descriptor to mark it as complete.  In fact,
> as long as you have any pending TX packets at all it will keep spinning until
> it gets into a state where you have no pending TX packets (so a steady stream
> of TX packets, including, say ACKs would cause the taskqueue to run forever).
>
> In general I think that with MSI-X you should just use an RX processing limit
> of -1.  Anything else is just adding overhead in the form of extra context

Yes, this is the obvious next step after binding threads to CPUs.

> switches.  Neither the task nor the MSI-X interrupt handler is on a thread
> that is shared with any other tasks or handlers, so all that scheduling (or
> rescheduling) the task will do is result in the task being immediately run
> (after either a context switch or returning back to the main loop of the
> taskqueue thread).
>
> If you look at the drivers, if a burst of RX traffic ends, the taskqueue

It is questionable whether this behavior is good during a burst:

1) Due to RX locking, the taskqueue consumes a significant share (if not all)
of the RX packets from the given queue.
2) The taskqueue can run on any CPU, which can introduce out-of-order
packets within a connection. That is bad for forwarding (and there were
some problems in our TCP stack in the past). Additionally, this behavior
is totally uncontrollable and unscalable (we run _one_ task _instead_ of
the RX handler) and leads to significant performance flapping on
heavily loaded forwarding setups.

> should stop running and stop polling the hardware.  It is only the TX side
> that gets stuck needlessly polling.  The watchdog timer rescheduling the

Unfortunately, as long as at least one call from the driver to this function
remains, a traffic burst can still be consumed by the taskqueue (especially
if a large rx_processing_limit is set).

If there are reasons not to change the taskqueue RX processing behavior, maybe
adding an additional sysctl like:
ix.0.loop_forever = 1 can be a compromise?

That is, the main processing loop does not decrease the 'count' variable if
loop_forever is set, and the taskqueue invocation limit remains controlled by
the current rx_processing_limit.

Nothing is changed by default, but people wishing to get predictable
results simply set loop_forever to 1 and rx_processing_limit to 1 (or 0).
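
To make the proposed semantics concrete, here is a rough userland sketch of
the main-loop half of the idea. It is not driver code: struct rx_ring_sketch
and rxeof_sketch() are hypothetical names, the ring is reduced to a counter
of ready descriptors, and the budget handling is simplified (a count of 0 or
less simply means the budget is used up).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an RX ring: only the number of ready descriptors. */
struct rx_ring_sketch {
	int	pending;
};

/*
 * Simplified RX cleanup loop.  'count' is the per-call budget (what
 * rx_processing_limit would supply).  When loop_forever is set the budget
 * is never decreased, so the loop runs until the ring is empty.
 * Returns true if descriptors are still pending, i.e. the caller would
 * have a reason to reschedule the taskqueue.
 */
static bool
rxeof_sketch(struct rx_ring_sketch *rxr, int count, bool loop_forever,
    int *processed)
{
	assert(rxr != NULL && processed != NULL);
	*processed = 0;
	while (rxr->pending > 0) {
		if (!loop_forever && count <= 0)
			break;			/* budget exhausted */
		rxr->pending--;			/* "receive" one packet */
		(*processed)++;
		if (!loop_forever)
			count--;		/* only charged in the default mode */
	}
	return (rxr->pending > 0);
}
```

In the default mode a budget of 2 against 5 pending descriptors handles 2 and
reports that work remains; with loop_forever set, the same call drains all 5
and reports nothing left to reschedule for.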



> handler once a second when there is no watchdog condition doesn't help
> matters either, but I think that is unique to ixgbe(4).
>
> It would be good if you could determine exactly why igb thinks it needs to
> reschedule the taskqueue in your test case on igb(4) post 233708.
>


-- 
WBR, Alexander



From owner-freebsd-net@FreeBSD.ORG  Tue Oct 16 12:49:19 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 7455311C;
 Tue, 16 Oct 2012 12:49:19 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net
 [IPv6:2001:470:1f10:75::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 42D758FC14;
 Tue, 16 Oct 2012 12:49:19 +0000 (UTC)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
 by bigwig.baldwin.cx (Postfix) with ESMTPSA id 8BF06B980;
 Tue, 16 Oct 2012 08:49:18 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: Adrian Chadd <adrian@freebsd.org>
Subject: Re: ixgbe & if_igb RX ring locking
Date: Tue, 16 Oct 2012 08:38:17 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p20; KDE/4.5.5; amd64; ; )
References: <5079A9A1.4070403@FreeBSD.org> <201210151414.27318.jhb@freebsd.org>
 <CAJ-Vmo=qMJXwYDUEPHRn49SrAOP+Nt3v9FaxFXq8wB6Q_uCmPg@mail.gmail.com>
In-Reply-To: <CAJ-Vmo=qMJXwYDUEPHRn49SrAOP+Nt3v9FaxFXq8wB6Q_uCmPg@mail.gmail.com>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201210160838.17741.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
 (bigwig.baldwin.cx); Tue, 16 Oct 2012 08:49:18 -0400 (EDT)
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>, freebsd-net@freebsd.org,
 Jack Vogel <jfvogel@gmail.com>, net@freebsd.org,
 Luigi Rizzo <rizzo@iet.unipi.it>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2012 12:49:19 -0000

On Monday, October 15, 2012 6:36:57 pm Adrian Chadd wrote:
> The reason why I've started moving net80211 and ath _away_ from using
> direct dispatch (for now) and to using a taskqueue for TX (and RX) is
> because it's too freaking annoying right now to deal with all the
> crazy long-held locks to guarantee consistency between multiple
> transmitting threads.
> 
> Considering that the driver and net80211 stack:
> 
> * sometimes is PCI, sometimes is USB (with all the differing thread
> models that exist there);
> * sometimes bridge traffic, sometimes route traffic, sometimes source
> or terminate TCP/UDP connections;
> * sometimes has one sender, sometimes has multiple senders, with some
> other modules in between (bridge, pf, ipfw, etc) with locks being held
> here and there;
> * since the stack(s) like doing direct dispatch, RX very often causes
> TX to occur, which for some drivers will block on a long-held driver
> lock (with all the LORs that occur) - and drivers that do this (eg
> iwn) will simply drop the lock before passing the packet up. Dropping
> the lock before passing net80211_input*() .. is just plain silly.
> 
> Now, I'd _like_ to eventually make net80211/ath support direct
> dispatch, but that also requires making sure only -one- transmitter is
> working at once. I'd like to not have the extra context switch
> overhead, but I haven't seen a better way of doing it yet.
> 
> It's fun to see the gige/10ge driver have lots of long held locks with
> lots of concurrent sender processes possibly blocking until TX
> completes.. so I wonder if that has scaling issues for lots of
> connections/sending processes.

I don't follow how this is related to this thread at all (which has more to do 
with ixgbe scheduling duplicate work).  However, is your issue that the stack 
locks (e.g. socket and protocol layer locks) are held across 
if_start/if_transmit?

-- 
John Baldwin

From owner-freebsd-net@FreeBSD.ORG  Tue Oct 16 13:20:05 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id D591ADD5;
 Tue, 16 Oct 2012 13:20:05 +0000 (UTC)
 (envelope-from rysto32@gmail.com)
Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com
 [209.85.212.54])
 by mx1.freebsd.org (Postfix) with ESMTP id 531228FC0C;
 Tue, 16 Oct 2012 13:20:05 +0000 (UTC)
Received: by mail-vb0-f54.google.com with SMTP id v11so8234287vbm.13
 for <multiple recipients>; Tue, 16 Oct 2012 06:20:04 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=1bT0WmRPa07jbTKhIxzJLWQzJTt2QP7ibYzX+c+Afc8=;
 b=FclWLbXi1zlsY+fNTQ8D8SNHM7dlHiQk1TxSVacObLCpzo1YPBJ03dLpuLxbgC8TQB
 qk/1IYFWnHU+CN2NkoCmYghG/GRXrm9oLVxwzX7MGsG2CdLB5pz+bT/XW13liMYyIvT5
 /P7qCwICj46d6o369cmR14Vu894XFDmHdspmGWAHwn5T48ZeDcLCoh4YPI6uPDQFr/mI
 a4rgS2KLck44C8dYlEOx/yizJnGKuX0kK33kK5Srzk3Z5rx87h041vCZo66oLLZUhAnH
 lnVC8wxoOsKAmlIIMLU4X8/CXaRYkkH+jns4sNpYGTDzUn/CbSx7IQekxM5eXugvjzJR
 Vl2w==
MIME-Version: 1.0
Received: by 10.52.68.7 with SMTP id r7mr7101958vdt.96.1350393604422; Tue, 16
 Oct 2012 06:20:04 -0700 (PDT)
Received: by 10.58.207.114 with HTTP; Tue, 16 Oct 2012 06:20:04 -0700 (PDT)
In-Reply-To: <20121016124733.GC89655@glebius.int.ru>
References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org>
 <CAFMmRNxT=GWxc6r7B81ENjzwJmfea3016Sh-DxJEGBwybM0QwQ@mail.gmail.com>
 <507D4F11.2030704@FreeBSD.org>
 <CAFMmRNy1h86ZXAH4oK+LOK6_VeTsbo=d6oYk669bisKfHbJ56A@mail.gmail.com>
 <20121016124733.GC89655@glebius.int.ru>
Date: Tue, 16 Oct 2012 09:20:04 -0400
Message-ID: <CAFMmRNzA23r_a0nrnsUN8ZoHGL5c7OF1-nCqWpN42FLc50d7eg@mail.gmail.com>
Subject: Re: ixgbe & if_igb RX ring locking
From: Ryan Stone <rysto32@gmail.com>
To: Gleb Smirnoff <glebius@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>,
 Jack Vogel <jfvogel@gmail.com>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2012 13:20:05 -0000

On Tue, Oct 16, 2012 at 8:47 AM, Gleb Smirnoff <glebius@freebsd.org> wrote:
> Can you please provide hints how can SIOCADDMULTI lead to obtaining RX
> lock in the stock driver?

It doesn't.  But it does acquire the core lock, and the core lock is
acquired before the RX lock (in ixgbe_init, for instance).

From owner-freebsd-net@FreeBSD.ORG  Tue Oct 16 15:27:34 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id B1896E1D
 for <freebsd-net@freebsd.org>; Tue, 16 Oct 2012 15:27:34 +0000 (UTC)
 (envelope-from s.khanchi@gmail.com)
Received: from mail-ie0-f182.google.com (mail-ie0-f182.google.com
 [209.85.223.182])
 by mx1.freebsd.org (Postfix) with ESMTP id 6F82E8FC17
 for <freebsd-net@freebsd.org>; Tue, 16 Oct 2012 15:27:34 +0000 (UTC)
Received: by mail-ie0-f182.google.com with SMTP id k10so13112009iea.13
 for <freebsd-net@freebsd.org>; Tue, 16 Oct 2012 08:27:33 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:from:date:x-google-sender-auth:message-id
 :subject:to:content-type;
 bh=eAbprt1hU2bXTmYJ3M2Sf1efHtGNNBI92a5uMrb483A=;
 b=FFdmbyOCko1Cy2yH3WDdJ+J2cIjjV232UzVU56H/jfV4ZAkGQ9ApQeoelbT6vCeBtj
 HDoBo2gid0+6d54RiFIqKpdvHiDn7RtW5lEhJEG9FOakDp9CQIseOT8bCEekL9lSnZER
 cn5j/lzmMc9XDG8EW2AXWBfv6qsd9aZ/Zbf3k8hz0M1HMiaToom5r9zdbDg9Lo+z/Dw5
 64n/aJl9OVHbI8YWqjwflHV8pwoUMJYSmoq3dJ6lDMtNWVJIAVUkBEwF1t0xF+wk9Ln5
 7HD3vEDBAO0IdL1jzCaVTSnUeWmV+KWwKSHAu7IKQ4oG1rbvhrkDwxr+xJDYyOeqp2ip
 DfEw==
Received: by 10.50.171.5 with SMTP id aq5mr680433igc.36.1350401250787; Tue, 16
 Oct 2012 08:27:30 -0700 (PDT)
MIME-Version: 1.0
Sender: s.khanchi@gmail.com
Received: by 10.64.51.234 with HTTP; Tue, 16 Oct 2012 08:27:10 -0700 (PDT)
From: h bagade <bagadeh@gmail.com>
Date: Tue, 16 Oct 2012 18:57:10 +0330
X-Google-Sender-Auth: Q6-xiu0KfuqcM3BJWPqnE6-Vhos
Message-ID: <CAARSjE15=zkw0V3hWFgmt0drnAOzB+UZ9TGZo+4Z9UcgNLPG4A@mail.gmail.com>
Subject: TCP_DROP_SYNFIN kernel option side effects?!
To: freebsd-net@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2012 15:27:34 -0000

Hi all,

I need to add this option to the kernel in order to defeat Nmap
OS fingerprinting. My system runs as a web server and is also the
gateway on the network.
I want to know whether setting this option has any side effects on other
parts of the system. Is there any situation in which both the SYN and FIN
bits are set in a TCP packet? Is that a normal situation?

Any help or comments are really appreciated.
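
For reference, the condition the option's name refers to is both flag bits
being set in the same TCP header. A minimal sketch of that test in C,
assuming the standard TCP flag bit values (FIN = 0x01, SYN = 0x02,
ACK = 0x10); the SIM_ macros and tcp_flags_synfin() are hypothetical names
made up for illustration, and this is not the kernel's code:

```c
#include <assert.h>	/* assert() is used in the usage note below */
#include <stdbool.h>
#include <stdint.h>

/* Standard TCP header flag bits, renamed to avoid clashing with system headers. */
#define SIM_TH_FIN 0x01
#define SIM_TH_SYN 0x02
#define SIM_TH_ACK 0x10

/* Returns true only when SYN and FIN are both set in the flags byte. */
static bool
tcp_flags_synfin(uint8_t flags)
{
	const uint8_t both = SIM_TH_SYN | SIM_TH_FIN;

	return ((flags & both) == both);
}
```

For example, assert(tcp_flags_synfin(SIM_TH_SYN | SIM_TH_FIN)) holds, while
a plain SYN or a SYN|ACK segment does not match.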

From owner-freebsd-net@FreeBSD.ORG  Tue Oct 16 16:17:31 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id C99F3D02;
 Tue, 16 Oct 2012 16:17:31 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net
 [IPv6:2001:470:1f10:75::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 701C28FC17;
 Tue, 16 Oct 2012 16:17:31 +0000 (UTC)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
 by bigwig.baldwin.cx (Postfix) with ESMTPSA id CF5A2B911;
 Tue, 16 Oct 2012 12:17:30 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: "Alexander V. Chernikov" <melifaro@freebsd.org>
Subject: Re: ixgbe & if_igb RX ring locking
Date: Tue, 16 Oct 2012 12:09:55 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p20; KDE/4.5.5; amd64; ; )
References: <5079A9A1.4070403@FreeBSD.org>
 <201210151414.27318.jhb@freebsd.org> <507D5739.70509@FreeBSD.org>
In-Reply-To: <507D5739.70509@FreeBSD.org>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="koi8-r"
Content-Transfer-Encoding: 7bit
Message-Id: <201210161209.55979.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
 (bigwig.baldwin.cx); Tue, 16 Oct 2012 12:17:30 -0400 (EDT)
Cc: freebsd-net@freebsd.org, Luigi Rizzo <rizzo@iet.unipi.it>,
 Jack Vogel <jfvogel@gmail.com>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2012 16:17:31 -0000

On Tuesday, October 16, 2012 8:46:49 am Alexander V. Chernikov wrote:
> On 15.10.2012 22:14, John Baldwin wrote:
> > On Monday, October 15, 2012 12:32:10 pm Gleb Smirnoff wrote:
> >> On Mon, Oct 15, 2012 at 09:04:27AM -0400, John Baldwin wrote:
> >> J> > 3) in practice taskqueue routine is a nightmare for many people since
> >> J> > there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after
> >> J> > some traffic burst happens: once it is called it starts to schedule
> >> J> > itself more and more replacing original ISR routine. Additionally,
> >> J> > increasing rx_process_limit does not help since taskqueue is called with
> >> J> > the same limit. Finally, currently netisr taskq threads are not bound to
> >> J> > any CPU which makes the process even more uncontrollable.
> >> J>
> >> J> I think part of the problem here is that the taskqueue in ixgbe(4) is
> >> J> bogusly rescheduled for TX handling.  Instead, ixgbe_msix_que() should
> >> J> just start transmitting packets directly.
> >> J>
> >> J> I fixed this in igb(4) here:
> >> J>
> >> J> http://svnweb.freebsd.org/base?view=revision&revision=233708
> >>
> >> The problem Alexander describes in 3) definitely wasn't fixed in r233708.
> >>
> >> It is still present in head/, and it prevents me to do good benchmarking
> >> of pf(4) on igb(4).
> >>
> >> The problem is related to RX handling, so I don't see how r233708 could
> >> fix it.
> >
> > Before 233708, if you had a single TX packet waiting to go out and an RX
> > interrupt arrived, the task queue would be constantly reschedule causing
> > it to effectively spin at 100% until the TX packet was completely transmitted
> > and the hardware had updated the descriptor to mark it as complete.  In fact,
> > as long as you have any pending TX packets at all it will keep spinning until
> > it gets into a state where you have no pending TX packets (so a steady stream
> > of TX packets, including, say ACKs would cause the taskqueue to run forever).
> >
> > In general I think that with MSI-X you should just use an RX processing limit
> > of -1.  Anything else is just adding overhead in the form of extra context
> Yes, this is the obvious next step after binding threads to CPUs.
> > switches.  Neither the task or the MSI-X interrupt handler are on a thread
> > that is shared with any other tasks or handlers, so all that scheduling (or
> > rescheduling) the task will do is result in the task being immediately run
> > (after either a context switch or returning back to the main loop of the
> > taskqueue thread).
> 
> >
> > If you look at the drivers, if a burst of RX traffic ends, the taskqueue
> It is questionable if this behavior is good during burst:
> 
> 1) Due to RX locking taskq eats signifficant (if not all) RX packets 
> from given queue
> 2) Tasq can run on any cpu so this introduces possible out-of-order 
> packets within connection which is bad for forwarding (and there were 
> some problems in our TCP stack in the past). Additionally, this behavior 
> is totally uncontrollable and unscalable (we run _one_ task _instead_ of 
> RX handler) and leads to significant performance flapping on 
> heavy-loaded forwarding setups.

The taskqueue and interrupt handler should never run concurrently.  If they
are doing so now, that is a _bug_ and my patch fixes some of those already.
Just as r233708 fixed similar bugs in igb.  Normally the interrupt handler
should disable the specific MSI-X interrupt when it schedules the task, and
the interrupt is not re-enabled until the task decides it doesn't need to
reschedule itself.  If this is done correctly, then you shouldn't see RX
lock contention unless someone is doing 'ifconfig' or something else that
triggers an ioctl.
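
As an illustration only, the handoff described above can be modelled in
userland C roughly like this. Everything here is a simplified assumption:
que_sim, msix_isr_sim() and que_task_sim() are hypothetical names, the
interrupt mask is a plain flag, and the RX ring is just a backlog counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-queue state for the simulation. */
struct que_sim {
	bool irq_enabled;	/* is the queue's MSI-X vector unmasked? */
	bool task_pending;	/* is the task scheduled to run? */
	int  rx_backlog;	/* packets waiting in the ring */
};

/* Interrupt handler: mask the vector, then hand the work to the task. */
static void
msix_isr_sim(struct que_sim *q)
{
	assert(q != NULL);
	assert(q->irq_enabled);	/* a masked vector cannot fire */
	q->irq_enabled = false;
	q->task_pending = true;
}

/* Task: process up to 'limit' packets; unmask only once the ring is empty. */
static void
que_task_sim(struct que_sim *q, int limit)
{
	int n;

	assert(q != NULL);
	assert(q->task_pending && !q->irq_enabled);
	n = (q->rx_backlog < limit) ? q->rx_backlog : limit;
	q->rx_backlog -= n;
	if (q->rx_backlog > 0)
		return;		/* stay scheduled, interrupt stays masked */
	q->task_pending = false;
	q->irq_enabled = true;	/* only now may the handler run again */
}
```

In this model the two assertions encode the invariant: the handler can only
run while the vector is unmasked, and the task only while it is masked, so
the two never overlap on the same queue.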

Anything else is just papering over these bugs (which are quite bad since
they result in out-of-order handling besides the lock contention).  In fact,
my original motivation for using a separate TX-only task for the if_transmit
case for igb was specifically to avoid out-of-order processing on RX, not to
prevent lock contention.

Can you describe the specific situation in which you now see both the task and
the interrupt handler running concurrently?  Do you have KTR traces from
KTR_SCHED perhaps?

-- 
John Baldwin

From owner-freebsd-net@FreeBSD.ORG  Tue Oct 16 20:36:02 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 30CA5AB7
 for <freebsd-net@freebsd.org>; Tue, 16 Oct 2012 20:36:02 +0000 (UTC)
 (envelope-from mariano.cediel@gmail.com)
Received: from mail-ye0-f182.google.com (mail-ye0-f182.google.com
 [209.85.213.182])
 by mx1.freebsd.org (Postfix) with ESMTP id E1DA28FC0A
 for <freebsd-net@freebsd.org>; Tue, 16 Oct 2012 20:36:01 +0000 (UTC)
Received: by mail-ye0-f182.google.com with SMTP id l8so163477yen.13
 for <freebsd-net@freebsd.org>; Tue, 16 Oct 2012 13:35:55 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:date:message-id:subject:from:to:content-type;
 bh=liH4heV9VsDVN0fvTbaoZibpsxBSHJeM7DnI7WLwJFY=;
 b=tr+ptc1ZMzaV09qJuQkyga1oFpcmC4Vj1X+AJAWErw3uidIrfrF/BEvy9g+Qfw+rEH
 JSd22gfQ9TLFVL4Yuro0ezSNxCBzOvexodMHtx2W9z7Sn3ibyqZGp88gM+ZUQJkLAjW4
 N8LtBZrk+9zXHC4P6H2mKC6ydOmwcgrc5X8JTP0UAyBJreihFlnOGxeL0oAviTqfRsMf
 mF96zWsRoE5RDOwlZbqimm1FJjA2r9SuuzeBfxLI+xmqv9Uef7GQZSn49xLYe86Ggcc/
 2Fp7AHcoFBw9qYm5fKLnGQ2Oe539xFS2Mf6+OehYslUzyCtqWMDkRX834SX9Xduwmyyt
 a8bQ==
MIME-Version: 1.0
Received: by 10.52.75.72 with SMTP id a8mr7518537vdw.66.1350419755026; Tue, 16
 Oct 2012 13:35:55 -0700 (PDT)
Received: by 10.58.102.197 with HTTP; Tue, 16 Oct 2012 13:35:55 -0700 (PDT)
Date: Tue, 16 Oct 2012 22:35:55 +0200
Message-ID: <CAB-01r59bep6pt96sYfT=QNV+SRum=1xVESfOU86Ohevd=Zs2A@mail.gmail.com>
Subject: one physical interface -> n virtual interfaces
From: Mariano Cediel <mariano.cediel@gmail.com>
To: freebsd-net@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2012 20:36:02 -0000

How do I create n virtual interfaces from one physical interface, so
that they behave like real interfaces in every respect, each with its
own MAC address, and so that we can do NAT, etc. on each one individually?

I need one external interface to have 2 public IPs, and I will do NAT
over each <interface> individually (with ipfw and divert); each of them
has its own gateway.

A little help to get started with the research would be appreciated.
Greetings.

(sorry for my poor english)

-- 

        [o - -  -   -    -      -
   (\   |                  u d t
   (  \_('>              c c s
   (__(=_)             s o ?
      -"=

From owner-freebsd-net@FreeBSD.ORG  Tue Oct 16 21:54:54 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 0571BE92;
 Tue, 16 Oct 2012 21:54:54 +0000 (UTC)
 (envelope-from eric@vangyzen.net)
Received: from aussmtpmrkpc120.us.dell.com (aussmtpmrkpc120.us.dell.com
 [143.166.82.159])
 by mx1.freebsd.org (Postfix) with ESMTP id BEBE58FC16;
 Tue, 16 Oct 2012 21:54:53 +0000 (UTC)
X-Loopcount0: from 64.238.244.148
X-IronPort-AV: E=Sophos;i="4.80,595,1344229200"; 
   d="scan'208";a="7198298"
Message-ID: <507DD768.7000803@vangyzen.net>
Date: Tue, 16 Oct 2012 16:53:44 -0500
From: Eric van Gyzen <eric@vangyzen.net>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:14.0) Gecko/20120822 Thunderbird/14.0
MIME-Version: 1.0
To: net@FreeBSD.org, "Bjoern A. Zeeb" <bz@FreeBSD.org>
Subject: Tahi "Redirected On-link" Test Case
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2012 21:54:54 -0000

I am currently working on a fix for kern/152791 (Tahi IPv6 Ready Logo 
test case #169: Redirected On-link).  I have a change to add the host 
route, and it works for test case 169.  However, the route never gets 
removed, so all subsequent test cases fail (because they first verify 
that the Node Under Test thinks the destination is off-link).

How/When should I clean up the route?

Each test case runs a common cleanup procedure, which sends a RA with a 
Router Lifetime of zero and a Prefix Information option with a Valid 
Lifetime and Preferred Lifetime of zero.  This deprecates the NUT's only 
global address, by which it reaches the newly-on-link destination.  
However, it doesn't seem rational to use this event to trigger a cleanup 
of the route.

The only other trigger I can imagine is the transition of the 
Destination Cache entry to the Stale state.  That also doesn't make 
complete sense.  (It probably also wouldn't work, since in my testing, 
test case 170 begins immediately after test case 169 ends.)

I'm assuming a certain amount of familiarity (on your part) with these 
tests.  If you'd like, I can explain them in more detail.

Thanks in advance for any advice,

Eric

From owner-freebsd-net@FreeBSD.ORG  Tue Oct 16 22:03:46 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 5CB381B7
 for <freebsd-net@freebsd.org>; Tue, 16 Oct 2012 22:03:46 +0000 (UTC)
 (envelope-from pprocacci@datapipe.com)
Received: from EXFESMQ04.datapipe-corp.net (exfesmq04.datapipe.com
 [64.27.120.68]) by mx1.freebsd.org (Postfix) with ESMTP id 101BC8FC1A
 for <freebsd-net@freebsd.org>; Tue, 16 Oct 2012 22:03:45 +0000 (UTC)
Received: from nat.myhome (192.168.128.103) by EXFESMQ04.datapipe-corp.net
 (192.168.128.29) with Microsoft SMTP Server (TLS) id 14.2.318.1; Tue, 16 Oct
 2012 18:02:35 -0400
Date: Tue, 16 Oct 2012 17:02:59 -0500
From: "Paul A. Procacci" <pprocacci@datapipe.com>
To: Mariano Cediel <mariano.cediel@gmail.com>
Subject: Re: one physical interface -> n virtual interfaces
Message-ID: <20121016220258.GI7125@nat.myhome>
References: <CAB-01r59bep6pt96sYfT=QNV+SRum=1xVESfOU86Ohevd=Zs2A@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <CAB-01r59bep6pt96sYfT=QNV+SRum=1xVESfOU86Ohevd=Zs2A@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Originating-IP: [192.168.128.103]
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2012 22:03:46 -0000

On Tue, Oct 16, 2012 at 10:35:55PM +0200, Mariano Cediel wrote:
> How do I create, from one physical interface, n virtual interfaces
> that behave like real ones, each with its own MAC address, on which
> we can do NAT individually, etc.?
>
> I need one external interface to have 2 public IPs, and I'll do NAT
> on each <interface> individually (with ipfw and divert);
> each of them has its own gateway.
>
> A little help to start researching .....
> Greetings.

http://freebsd.1045724.n5.nabble.com/Virtual-Network-Interface-Card-td4005109.html

The above was posted in late 2010.  It has one example of creating virtual
interfaces using the netgraph module.  3rd post from the top.

I'm not entirely sure if this is the current _correct_ way, but I imagine it
is still accurate and can be used to get you started.

~Paul

From owner-freebsd-net@FreeBSD.ORG  Wed Oct 17 01:26:41 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id F410B436;
 Wed, 17 Oct 2012 01:26:40 +0000 (UTC)
 (envelope-from adrian.chadd@gmail.com)
Received: from mail-pa0-f54.google.com (mail-pa0-f54.google.com
 [209.85.220.54])
 by mx1.freebsd.org (Postfix) with ESMTP id A73438FC0C;
 Wed, 17 Oct 2012 01:26:40 +0000 (UTC)
Received: by mail-pa0-f54.google.com with SMTP id bi1so6983971pad.13
 for <multiple recipients>; Tue, 16 Oct 2012 18:26:40 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=bmRWDQgAacYcgW3VizvX9o+evO04s1b8d0cJgL5So04=;
 b=MHVlLiuSQdqTM3ZE8/tTznTHRf0zABwVBNh9LhNoOozPYrugsgabD9PpyZ4SfsO4ZZ
 ebU5mpNi/VeeHJIAvDHeln4VGQCsEtNYSMr1Eh5DWF0GD4LSJtpBAuPaljYmFIjKZLdc
 Lj+Nzb4fmdKVvfg5Nz8XJn/lo5jDCfKGsQY61DyvqlN7Xf764jvfD9uz1+3qSJdNvHlg
 HBnts6BlDoxG/p7qecU2S0r5XRjlCWFnc0i0qvBwMUXASpiN+MvDQDypaplcDHuPKMW/
 PgiWMqvs5TWt0HQSTjAjDZ2MSSde6dSdieLTHSzNMG92+wzEHK0YqC6C0TAOC7vApwUi
 rfrA==
MIME-Version: 1.0
Received: by 10.66.86.129 with SMTP id p1mr6638350paz.39.1350437200422; Tue,
 16 Oct 2012 18:26:40 -0700 (PDT)
Sender: adrian.chadd@gmail.com
Received: by 10.68.146.233 with HTTP; Tue, 16 Oct 2012 18:26:40 -0700 (PDT)
In-Reply-To: <201210160838.17741.jhb@freebsd.org>
References: <5079A9A1.4070403@FreeBSD.org> <201210151414.27318.jhb@freebsd.org>
 <CAJ-Vmo=qMJXwYDUEPHRn49SrAOP+Nt3v9FaxFXq8wB6Q_uCmPg@mail.gmail.com>
 <201210160838.17741.jhb@freebsd.org>
Date: Tue, 16 Oct 2012 18:26:40 -0700
X-Google-Sender-Auth: GoMxUKHM6DlvJHVOZ6ge9-Pu1BU
Message-ID: <CAJ-Vmo=iYD2N36eq5wzKGh8f+792mBBG9LKvi6fVTjqrEpgCRg@mail.gmail.com>
Subject: Re: ixgbe & if_igb RX ring locking
From: Adrian Chadd <adrian@freebsd.org>
To: John Baldwin <jhb@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>, freebsd-net@freebsd.org,
 Jack Vogel <jfvogel@gmail.com>, net@freebsd.org,
 Luigi Rizzo <rizzo@iet.unipi.it>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2012 01:26:41 -0000

On 16 October 2012 05:38, John Baldwin <jhb@freebsd.org> wrote:

> I don't follow how this is related to this thread at all (which has more to do
> with ixgbe scheduling duplicate work).  However, is your issue that the stack
> locks (e.g. socket and protocol layer locks) are held across
> if_start/if_transmit?

It's a comment on the larger scale architectural problem. Since
if_transmit and if_start are called from multiple thread contexts, the
current ways drivers implement this are:

* support direct dispatch to hardware, but wrap the whole sending
process in one enormous lock, to prevent packet reordering issues; or
* drop TX and TX completion into a TX taskqueue (or multiple, one per
hardware send queue) and push frames into that taskqueue via some
queue and then wake said taskqueue up; or
* some bastardised version of both.

For the intel drivers, the locks are held for a (potentially) very
long time. Both igb and ixgbe hold the locks for the entirety of
the TX process. It's not protecting something like a queue operation,
it's effectively serialising the entirety of the TX and TX completion
process.

That works ok-ish for ethernet drivers which are "send and ignore",
but for wireless drivers where the stack implements a lot more state,
it really does quite suck. And since wireless drivers have a top level
idea of sequence and encryption (ie, it's not per-TCP stream, it's
across multiple sending streams to a given node), I can't model the
locking and serialisation on what the TCP/UDP code does.

I wish we had a better way of implementing "serialisation without
long, long held locks" but short of stuffing everything into a
taskqueue and only locking the send queue involved, I can't really
think of anything.
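
A rough userland model of that last option, for illustration only: this
is Python with threads, not kernel code, and every name in it (send,
tx_task, tx_queue, wire) is made up for the sketch. Senders hold a lock
just long enough to take a sequence number and enqueue the frame; a
single worker, standing in for the TX taskqueue, is the only thing that
touches the "hardware", so ordering is preserved without a lock held
across the whole transmit path.

```python
import queue
import threading

NUM_SENDERS = 4          # hypothetical number of concurrent sending threads
FRAMES_PER_SENDER = 100  # hypothetical workload per sender

tx_queue = queue.Queue()     # software send queue
wire = []                    # stands in for the hardware TX ring
seq_lock = threading.Lock()  # protects sequence assignment + enqueue only
next_seq = 0


def send(payload):
    """Callable from any thread: take a sequence number and enqueue."""
    global next_seq
    with seq_lock:           # short critical section, no hardware access
        seq = next_seq
        next_seq += 1
        tx_queue.put((seq, payload))


def tx_task():
    """Single worker in the role of the TX taskqueue: drains in order."""
    while True:
        frame = tx_queue.get()
        if frame is None:    # sentinel: nothing more to send
            break
        wire.append(frame)   # only this thread touches the "hardware"


def sender(ident):
    for i in range(FRAMES_PER_SENDER):
        send((ident, i))


worker = threading.Thread(target=tx_task)
worker.start()
senders = [threading.Thread(target=sender, args=(n,))
           for n in range(NUM_SENDERS)]
for t in senders:
    t.start()
for t in senders:
    t.join()
tx_queue.put(None)
worker.join()

sequence_numbers = [seq for seq, _ in wire]
print(len(wire), sequence_numbers == sorted(sequence_numbers))
```

The cost is the one already noted: every frame takes a trip through the
queue and a wakeup of the worker, even when the hardware is idle and a
direct dispatch would have been cheaper.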



Adrian

From owner-freebsd-net@FreeBSD.ORG  Wed Oct 17 03:18:38 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 06D37768
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 03:18:38 +0000 (UTC)
 (envelope-from rfg@tristatelogic.com)
Received: from outgoing.tristatelogic.com (segfault.tristatelogic.com
 [69.62.255.118])
 by mx1.freebsd.org (Postfix) with ESMTP id B3BFE8FC14
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 03:18:37 +0000 (UTC)
Received: from segfault-nmh-helo.tristatelogic.com (localhost [127.0.0.1])
 by segfault.tristatelogic.com (Postfix) with ESMTP id 7FA275081A
 for <freebsd-net@freebsd.org>; Tue, 16 Oct 2012 20:18:29 -0700 (PDT)
To: freebsd-net@freebsd.org
Subject: Wireless Networking Bug(s) in 9.1-RC2 (?)
Date: Tue, 16 Oct 2012 20:18:29 -0700
Message-ID: <15066.1350443909@tristatelogic.com>
From: "Ronald F. Guilmette" <rfg@tristatelogic.com>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2012 03:18:38 -0000



Greetings,

I am currently running 9.1-RC2 on my laptop, and I'm wondering what the
proper procedure is for reporting bugs in not-yet-released releases.
Could somebody please tell me?  Should I just file a regular PR?  (I've
never done this before for anything that's not an official -RELEASE,
and I don't want to be busting anybody's chops over something that isn't
considered ready-for-prime-time anyway.)

So anyway, I'll give the issue to you in a nutshell... This laptop has
both wired ethernet and wireless (11{b,g,n}) capabilities.  I have a
Linksys E1000 which I had this thing successfully talking to/with
(using 11n) under 9.0-RELEASE.  (The Linksys is set to speak `N-Only'.)

Now however, it does appear to me that in 9.1-RC2 there may perhaps be
a problem which is causing the iwn0 interface to want to speak to the
Linksys using 11b, of all things.  (I would have thought that if it was
giving up on `N' it would have fallen back to `G' next.)

I include below relevant portions of my /etc/rc.conf file and the output
I am now getting from ifconfig -a.

Guidance would be appreciated.  Should I be filing a PR?  Is my rc.conf
goofed?


Regards,
rfg


P.S.  Actually, I've never tried running _both_ the wired & wireless stuff
on this laptop in parallel before now.  Is that part of the problem?  And
anyway, how exactly does the system establish a default route to 192.168.1.1
when there are two (or more) ways to get there from here?


rc.conf:
=============================================================================
hostname="slim.tristatelogic.com"
ifconfig_re0="inet 192.168.1.23 netmask 255.255.255.0"
defaultrouter="192.168.1.1"
#
wlans_iwn0="wlan0"
ifconfig_wlan0="WPA inet 192.168.1.21 netmask 255.255.255.0 ssid ronair2-1"
=============================================================================

ifconfig -a:
=============================================================================
re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
        ether 00:24:21:65:ad:a0
        inet 192.168.1.23 netmask 0xffffff00 broadcast 192.168.1.255
        inet6 fe80::224:21ff:fe65:ada0%re0 prefixlen 64 scopeid 0x4 
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
iwn0: flags=8803<UP,BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 2290
        ether 00:22:fb:76:6d:18
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: IEEE 802.11 Wireless Ethernet autoselect mode 11b
        status: associated
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128 
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0xa 
        inet 127.0.0.1 netmask 0xff000000 
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
wlan0: flags=8803<UP,BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 00:22:fb:76:6d:18
        inet 192.168.1.21 netmask 0xffffff00 broadcast 192.168.1.255
        inet6 fe80::222:fbff:fe76:6d18%wlan0 prefixlen 64 tentative scopeid 0xb 
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: IEEE 802.11 Wireless Ethernet autoselect (autoselect)
        status: no carrier
        ssid ronair2-1 channel 1 (2412 MHz 11b)
        country US authmode WPA1+WPA2/802.11i privacy OFF txpower 15 bmiss 10
        scanvalid 450 bgscan bgscanintvl 300 bgscanidle 250 roam:rssi 7
        roam:rate 1 wme roaming MANUAL bintval 0
=============================================================================

From owner-freebsd-net@FreeBSD.ORG  Wed Oct 17 04:21:27 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 1763E2EC
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 04:21:27 +0000 (UTC)
 (envelope-from kob6558@gmail.com)
Received: from mail-we0-f182.google.com (mail-we0-f182.google.com
 [74.125.82.182])
 by mx1.freebsd.org (Postfix) with ESMTP id 9EF018FC08
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 04:21:26 +0000 (UTC)
Received: by mail-we0-f182.google.com with SMTP id x43so5203396wey.13
 for <freebsd-net@freebsd.org>; Tue, 16 Oct 2012 21:21:25 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=EVk0+JpVZR97dDKwYc1f/u2nwwRLlFzR5orOm0fhA6U=;
 b=TNr064IOj8i1KDLR2GBc2omcLI0CVueiztYi+RXJv9L6JX+dYEiIKf7msoM6ppDuv3
 kDNE6gNnnphyWDldNrgc5wqVE1hWHTm9acqbgTm0LRl72rNoIF0In1fB7NFjhoWVLEeV
 QI24y6zJdsrmvU/VEU224CsuZNBpoc/Rfm6ztiJMNEg5RdnKAH5acG3UqN5gPO62RA7A
 wqhbcmBfezYoz5kfL5fagMfE4KUNS8ML8b40P3d7og7Vmh2B8yTcxZpgnbK3QGxib9jL
 HbogWnHCrU1oMJaO6v7YhY4CUjpgg8pW5zqY4V14Y1lPiPKeyCNORk25dxLmafyWf46b
 R6Lw==
MIME-Version: 1.0
Received: by 10.216.197.104 with SMTP id s82mr10013089wen.62.1350447685564;
 Tue, 16 Oct 2012 21:21:25 -0700 (PDT)
Received: by 10.223.66.194 with HTTP; Tue, 16 Oct 2012 21:21:25 -0700 (PDT)
In-Reply-To: <15066.1350443909@tristatelogic.com>
References: <15066.1350443909@tristatelogic.com>
Date: Tue, 16 Oct 2012 21:21:25 -0700
Message-ID: <CAN6yY1sxo=YH1CALp-sKDtjyDfC5LZjxN0yEBWKNQFvprdi06A@mail.gmail.com>
Subject: Re: Wireless Networking Bug(s) in 9.1-RC2 (?)
From: Kevin Oberman <kob6558@gmail.com>
To: "Ronald F. Guilmette" <rfg@tristatelogic.com>
Content-Type: text/plain; charset=UTF-8
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2012 04:21:27 -0000

On Tue, Oct 16, 2012 at 8:18 PM, Ronald F. Guilmette
<rfg@tristatelogic.com> wrote:
>
>
> Greetings,
>
> I am currently running 9.1-RC2 on my laptop, and I'm wondering what the
> proper procedure is for reporting bugs in not-yet-released releases.
> Could somebody please tell me?  Should I just file a regular PR?  (I've
> never done this before for anything that's not an official -RELEASE,
> and I don't want to be busting anybody's chops over something that isn't
> considered ready-for-prime-time anyway.)

I think stable@ is probably the best choice. wireless@ would also be
an appropriate place.

> So anyway, I'll give the issue to you in a nutshell... This laptop has
> both wired ethernet and wireless (11{b,g,n}) capabilities.  I have a
> Linksys E1000 which I had this thing successfully talking to/with
> (using 11n) under 9.0-RELEASE.  (The Linksys is set to speak `N-Only'.)
>
> Now however, it does appear to me that in 9.1-RC2 there may perhaps be
> a problem which is causing the iwn0 interface to want to speak to the
> Linksys using 11b, of all things.  (I would have thought that if it was
> giving up on `N' it would have fallen back to `G' next.)
>
> I include below relevant portions of my /etc/rc.conf file and the output
> I am now getting from ifconfig -a.
>
> Guidance would be appreciated.  Should I be filing a PR?  Is my rc.conf
> goofed?
>
>
> Regards,
> rfg
>
>
> P.S.  Actually, I've never tried running _both_ the wired & wireless stuff
> on this laptop in parallel before now.  Is that part of the problem?  And
> anyway, how exactly does the system establish a default route to 192.168.1.1
> when there are two (or more) ways to get there from here?
>
>
> rc.conf:
> =============================================================================
> hostname="slim.tristatelogic.com"
> ifconfig_re0="inet 192.168.1.23 netmask 255.255.255.0"
> defaultrouter="192.168.1.1"
> #
> wlans_iwn0="wlan0"
> ifconfig_wlan0="WPA inet 192.168.1.21 netmask 255.255.255.0 ssid ronair2-1"
> =============================================================================
>
> ifconfig -a:
> =============================================================================
> re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
>         ether 00:24:21:65:ad:a0
>         inet 192.168.1.23 netmask 0xffffff00 broadcast 192.168.1.255
>         inet6 fe80::224:21ff:fe65:ada0%re0 prefixlen 64 scopeid 0x4
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>         media: Ethernet autoselect (100baseTX <full-duplex>)
>         status: active
> iwn0: flags=8803<UP,BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 2290
>         ether 00:22:fb:76:6d:18
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>         media: IEEE 802.11 Wireless Ethernet autoselect mode 11b
>         status: associated
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>         options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
>         inet6 ::1 prefixlen 128
>         inet6 fe80::1%lo0 prefixlen 64 scopeid 0xa
>         inet 127.0.0.1 netmask 0xff000000
>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> wlan0: flags=8803<UP,BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         ether 00:22:fb:76:6d:18
>         inet 192.168.1.21 netmask 0xffffff00 broadcast 192.168.1.255
>         inet6 fe80::222:fbff:fe76:6d18%wlan0 prefixlen 64 tentative scopeid 0xb
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>         media: IEEE 802.11 Wireless Ethernet autoselect (autoselect)
>         status: no carrier
>         ssid ronair2-1 channel 1 (2412 MHz 11b)
>         country US authmode WPA1+WPA2/802.11i privacy OFF txpower 15 bmiss 10
>         scanvalid 450 bgscan bgscanintvl 300 bgscanidle 250 roam:rssi 7
>         roam:rate 1 wme roaming MANUAL bintval 0
> =============================================================================

I don't see any real issue with your configuration, but I do see
something odd and it may be tied to the problem you are seeing. FWIW,
I also have an agn iwn card, but I only have a G access point at this
time and it runs fine in G.

The oddity is that you specify your SSID in the rc.conf file while
using WPA. I've never seen that before. Mine is in my wpa_supplicant.conf
file. That seems more reasonable for a laptop that may need to associate
with a home and a work SSID as well as ones at conferences and, in my
case, alternate work and home SSIDs. When it is in the rc.conf file, it
requires a change with every relocation.

In any case, you might try moving the SSID into the wpa_supplicant.conf
file, but my bet is that it is N specific. Paging Adrian.
-- 
R. Kevin Oberman, Network Engineer
E-mail: kob6558@gmail.com

From owner-freebsd-net@FreeBSD.ORG  Wed Oct 17 07:41:15 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 0C3A58F
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 07:41:15 +0000 (UTC)
 (envelope-from adrian.chadd@gmail.com)
Received: from mail-pb0-f54.google.com (mail-pb0-f54.google.com
 [209.85.160.54])
 by mx1.freebsd.org (Postfix) with ESMTP id D0A2F8FC14
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 07:41:14 +0000 (UTC)
Received: by mail-pb0-f54.google.com with SMTP id rp8so7380627pbb.13
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 00:41:14 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=NDN9LWemtqHNDLHQUZL1CygDjX8I0jBqLtFZX+POpvo=;
 b=u/Yo3xhdYTI+pqV7aIdgsQY282lvPopLUyhiJzFF8ZhTnNjOCeEtYc7vVnAomgTcwR
 ObFoOBCLdDEF2KFARTHil0yLVSLsq6jcXRe2cUAc0CJsXR4VUfTyS7+WwJLryg2Um6xo
 lNBE5Cnc3l/+HD0un1kU6vwX/PMPpHFJe9TD6kxc/9keQQYKIc8OUGn0YqnMKQSaG7Hb
 eXRDQyIawIJuik8YqEEv9bi8KWdq4LM9oi+zz7ozmJqeVMzylnylBWyuTVt4UKWwpW1W
 7N6DgkohxWsFRiT1lLbVLnNETU5iGB8mnz21jCtbktpFgXNCezkr7rnKCqtTgZeURDay
 2Y7w==
MIME-Version: 1.0
Received: by 10.68.218.226 with SMTP id pj2mr54538138pbc.33.1350459674258;
 Wed, 17 Oct 2012 00:41:14 -0700 (PDT)
Sender: adrian.chadd@gmail.com
Received: by 10.68.146.233 with HTTP; Wed, 17 Oct 2012 00:41:14 -0700 (PDT)
In-Reply-To: <CAN6yY1sxo=YH1CALp-sKDtjyDfC5LZjxN0yEBWKNQFvprdi06A@mail.gmail.com>
References: <15066.1350443909@tristatelogic.com>
 <CAN6yY1sxo=YH1CALp-sKDtjyDfC5LZjxN0yEBWKNQFvprdi06A@mail.gmail.com>
Date: Wed, 17 Oct 2012 00:41:14 -0700
X-Google-Sender-Auth: sN8-E_rIn3uQ_tOtbmi5XHWS9zk
Message-ID: <CAJ-Vmonk0xtmqPMFnCZp-YVzmC3-boeu0o9A4DwSeBGYC+5=sg@mail.gmail.com>
Subject: Re: Wireless Networking Bug(s) in 9.1-RC2 (?)
From: Adrian Chadd <adrian@freebsd.org>
To: Kevin Oberman <kob6558@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: freebsd-net@freebsd.org, "Ronald F. Guilmette" <rfg@tristatelogic.com>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2012 07:41:15 -0000

for wifi - you need to configure /etc/wpa_supplicant.conf as well,
right? You don't need the ssid in the ifconfig line; wpa_supplicant
will scan and find your AP.

The driver should fall back to non-n and non-g if need be.

As for the config - erm, you have two interfaces on the same L2.
That's going to confuse things, right? What's 'netstat -rn' show?
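
A quick way to see the overlap, sketched with Python's ipaddress module
and the addresses from the rc.conf quoted earlier in this thread. It
only shows that both interfaces claim the same prefix and both see the
gateway as directly reachable; it says nothing about which one the
kernel will actually use, which is what the routing table will tell you.

```python
import ipaddress

# Addresses and netmasks as given in the posted rc.conf.
re0 = ipaddress.ip_interface("192.168.1.23/255.255.255.0")
wlan0 = ipaddress.ip_interface("192.168.1.21/255.255.255.0")
gateway = ipaddress.ip_address("192.168.1.1")

# Do both interfaces claim the same directly connected prefix?
same_prefix = re0.network == wlan0.network

# Which interfaces consider the default router to be on-link?
gateway_reachable_via = [
    name
    for name, iface in (("re0", re0), ("wlan0", wlan0))
    if gateway in iface.network
]

print(re0.network, same_prefix, gateway_reachable_via)
```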




Adrian

From owner-freebsd-net@FreeBSD.ORG  Wed Oct 17 07:42:07 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 4DA7B13F
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 07:42:07 +0000 (UTC)
 (envelope-from remi.pauchet@netasq.com)
Received: from work.netasq.com (gwlille.netasq.com [91.212.116.1])
 by mx1.freebsd.org (Postfix) with ESMTP id C01798FC16
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 07:42:06 +0000 (UTC)
Received: from [10.2.9.2] (unknown [91.212.116.2])
 by work.netasq.com (Postfix) with ESMTPSA id 8917027053AC;
 Wed, 17 Oct 2012 09:42:04 +0200 (CEST)
Subject: Re: ixgbe and ixgbevf drivers are not working in virtualization
 environment
Mime-Version: 1.0 (Apple Message framework v1283)
Content-Type: multipart/signed;
 boundary="Apple-Mail=_1D8C5446-7BBA-41AB-B251-E7E24D126B93";
 protocol="application/pkcs7-signature"; micalg=sha1
From: =?iso-8859-1?Q?R=E9mi_Pauchet?= <remi.pauchet@netasq.com>
In-Reply-To: <B0DB1464-D65E-4856-99DD-2C688D3CC731@netasq.com>
Date: Wed, 17 Oct 2012 09:42:03 +0200
Message-Id: <C6A18AC9-A87D-4134-BF41-FA0452A076E4@netasq.com>
References: <792D5931-19E7-4239-A3E8-5D2BC90F03FD@netasq.com>
 <CAFOYbcmYn_44dMc0OiTqWpvQvas1biY0AUJvxqZ_+n1ZApEn4g@mail.gmail.com>
 <B0DB1464-D65E-4856-99DD-2C688D3CC731@netasq.com>
To: Jack Vogel <jfvogel@gmail.com>
X-Mailer: Apple Mail (2.1283)
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2012 07:42:07 -0000


--Apple-Mail=_1D8C5446-7BBA-41AB-B251-E7E24D126B93
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=iso-8859-1

Hi

My interface is configured, UP and running and I still can't get a link

Can you help me with this issue ?

Regards,
R=E9mi


Le 12 oct. 2012 =E0 09:38, R=E9mi Pauchet a =E9crit :

> Hi,
>=20
> Unfortunately not:
>=20
> ix0: flags=3D8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu =
1500
> 	=
options=3D401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSU=
M,TSO4,VLAN_HWTSO>
> 	ether 00:e0:ed:1c:99:4e
> 	inet 172.16.255.254 netmask 0xffff0000 broadcast 172.16.255.255
> 	inet6 fe80::2e0:edff:fe1c:994e%ix0 prefixlen 64 scopeid 0x2=20
> 	nd6 options=3D29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> 	media: Ethernet autoselect
> 	status: no carrier
> ix1: flags=3D8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu =
1500
> 	=
options=3D401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSU=
M,TSO4,VLAN_HWTSO>
> 	ether 00:e0:ed:1c:99:4f
> 	inet 172.17.255.254 netmask 0xffff0000 broadcast 172.17.255.255
> 	inet6 fe80::2e0:edff:fe1c:994f%ix1 prefixlen 64 scopeid 0x3=20
> 	nd6 options=3D29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> 	media: Ethernet autoselect
> 	status: no carrier
>=20
> Regards,
> R=E9mi
>=20
> Le 11 oct. 2012 =E0 18:25, Jack Vogel a =E9crit :
>=20
>> The ixgbe device will not get link until you have run init, so assign =
it an address or just do an ifconfig up.
>>=20
>> I have never used the driver using a passthru type setup but I =
believe it's been done successfully if
>> memory serves.
>>=20
>> Jack
>>=20
>>=20
>> On Thu, Oct 11, 2012 at 8:39 AM, R=E9mi Pauchet =
<remi.pauchet@netasq.com> wrote:
>> Hi,
>>=20
>> I'm trying to use the ixgbe (10Gb) driver in a FreeBSD virtual =
machine on an esxi 5 using DirectPath (PCI Passthrough) and the card is =
detected, but I can't get a link (status: no carrier)
>>=20
>> ix0: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - =
2.3.11> mem 0xd2420000-0xd243ffff,0xd2400000-0xd2403fff irq 18 at device =
0.0 on pci3
>>=20
>> ix0: flags=3D8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>         =
options=3D401bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSU=
M,TSO4,VLAN_HWTSO>
>>         ether 00:e0:ed:1c:99:4e
>>         nd6 options=3D29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>>         media: Ethernet autoselect
>>         status: no carrier
>>=20
>> I have also tested with XenServer 6, using SR-IOV (ixgbevf driver) =
with the same result: the driver is loading, but no link detected.
>>=20
>> In both cases (VMWare DirectPath and XenServer SR-IOV), I tested Linux =
with success.
>>=20
>>=20
>> The card is an Intel 82599EB, the motherboard is an Intel X58 =
(supermicro X8ST3) with a Xeon W3680 and I've tested FreeBSD 8.3 and 9.0
>>=20
>> I've found a forum thread with the same issue: =
http://forums.freebsd.org/showthread.php?t=3D29855 and no answer :)
>>=20
>>=20
>> Please find in attachment the dmesg (boot -v) with the ix driver =
compiled with DEBUG flags using vmware.
>>=20
>>=20
>> Can anyone provide feedback about this issue ?
>>=20
>> Regards,
>> R=E9mi Pauchet
>>=20
>>=20
>>=20
>=20


--Apple-Mail=_1D8C5446-7BBA-41AB-B251-E7E24D126B93--

From owner-freebsd-net@FreeBSD.ORG  Wed Oct 17 07:59:18 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 8A5A378B
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 07:59:18 +0000 (UTC)
 (envelope-from rfg@tristatelogic.com)
Received: from outgoing.tristatelogic.com (segfault.tristatelogic.com
 [69.62.255.118])
 by mx1.freebsd.org (Postfix) with ESMTP id 5D5768FC14
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 07:59:18 +0000 (UTC)
Received: from segfault-nmh-helo.tristatelogic.com (localhost [127.0.0.1])
 by segfault.tristatelogic.com (Postfix) with ESMTP id B17725081A;
 Wed, 17 Oct 2012 00:59:15 -0700 (PDT)
To: Kevin Oberman <kob6558@gmail.com>
Subject: Re: Wireless Networking Bug(s) in 9.1-RC2 (?)
In-Reply-To: <CAN6yY1sxo=YH1CALp-sKDtjyDfC5LZjxN0yEBWKNQFvprdi06A@mail.gmail.com>
Date: Wed, 17 Oct 2012 00:59:15 -0700
Message-ID: <16376.1350460755@tristatelogic.com>
From: "Ronald F. Guilmette" <rfg@tristatelogic.com>
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2012 07:59:18 -0000


In message <CAN6yY1sxo=YH1CALp-sKDtjyDfC5LZjxN0yEBWKNQFvprdi06A@mail.gmail.com>
, you wrote:

>I wrote:
>> P.S.  Actually, I've never tried running _both_ the wired & wireless stuff
>> on this laptop in parallel before now.  Is that part of the problem?  And
>> anyway, how exactly does the system establish a default route to 192.168.1.1
>> when there are two (or more) ways to get there from here?
>>...
>I don't see any real issue with your configuration, but I do see
>something odd and it may be tied to the problem you are seeing. FWIW,
>I also have an agn iwn card, but I only have a G access point at this
>time and it runs fine in G.

Yes, as I mentioned, when I was running 9.0-RELEASE, my iwn0 was talking
just fine to my Linksys.  (That was mostly `N', but I think that I may have
had the two playing nice together with `G' also.)

>The oddity is that you specify your ssid in the rc.conf file while
>using WPA. I've never seen that before.

Well, see, the instructions on this page:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/network-wireless.html

are not really all that clear.  Some of the examples have the ssid clause
in the ifconfig_XXX= lines in the rc.conf file, while others don't.  One
example that I would have liked very much to have seen in there would have
been an example showing what to put in rc.conf in the case where one wants
to do WPA, but with static IPs, rather than DHCP.

The closest thing to that is under Section 32.3.3.1.2.4, and in the example
there, as you can see, there is an ssid clause in the ifconfig_wlan0= line.
(I assumed that was necessary in case there were multiple ssid/password pairs
within the wpa_supplicant.conf file, and obviously, in such a case, set up
of the interface has to pick one of them from among the available alternatives.)

What is correct?  Beats the hell out of me!  I am not in any sense an expert
on this stuff.  All I can say is that the examples on this page are confusing.

>It's in my wpa_supplicant.conf file.

Yes, I have the ssid name in there too.

>It seems more reasonable for a laptop that may need to associate
>with a home and a work SSID as well as ones at conferences and...

Well, no.  Actually, at the moment, I *only* have an interest in connecting
to my own local Linksys... nothing else.  (That's part of why I'm using a
static IP... this is effectively just a static connection... minus the
wires and the drilling of holes through the walls.)

>in any case, you might try moving the SSID into the wpa_supplicant.conf file.

That kinda reminds me of that old Ragu spaghetti sauce TV commercial... "It's
in there!" :-)

>but my bet is it is N specific.

I doubt it.  I think I had the same questionable setup when I was running `G'
on 9.0-RELEASE.

But I would like to find out what the Right Answer is also.

>Paging Adrian.

Yes, please.


Regards,
rfg


P.S.  What about my routing question?  If I have one machine and it has
two independent connections to 192.168.1.1 and the rc.conf file says:

   defaultrouter="192.168.1.1"

then how does FreeBSD decide (or figure out) which of the two interfaces
packets going to some random IPv4 address elsewhere will flow out of?

For me at least, this is really puzzling.

From owner-freebsd-net@FreeBSD.ORG  Wed Oct 17 08:19:04 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 3DACCE17;
 Wed, 17 Oct 2012 08:19:04 +0000 (UTC)
 (envelope-from rfg@tristatelogic.com)
Received: from outgoing.tristatelogic.com (segfault.tristatelogic.com
 [69.62.255.118])
 by mx1.freebsd.org (Postfix) with ESMTP id 0D5958FC08;
 Wed, 17 Oct 2012 08:19:03 +0000 (UTC)
Received: from segfault-nmh-helo.tristatelogic.com (localhost [127.0.0.1])
 by segfault.tristatelogic.com (Postfix) with ESMTP id 4DA5F5081B;
 Wed, 17 Oct 2012 01:19:03 -0700 (PDT)
To: Adrian Chadd <adrian@freebsd.org>
Subject: Re: Wireless Networking Bug(s) in 9.1-RC2 (?)
In-Reply-To: <CAJ-Vmonk0xtmqPMFnCZp-YVzmC3-boeu0o9A4DwSeBGYC+5=sg@mail.gmail.com>
Date: Wed, 17 Oct 2012 01:19:03 -0700
Message-ID: <16534.1350461943@tristatelogic.com>
From: "Ronald F. Guilmette" <rfg@tristatelogic.com>
Cc: freebsd-net@freebsd.org, Kevin Oberman <kob6558@gmail.com>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2012 08:19:04 -0000


In message <CAJ-Vmonk0xtmqPMFnCZp-YVzmC3-boeu0o9A4DwSeBGYC+5=sg@mail.gmail.com>
, you wrote:

>for wifi - you need to configure /etc/wpa_supplicant.conf as well,
>right?

Did that.  Yes.

>You don't need the ssid in the ifconfig line;

OK.  If you say so.  (See my prior e-mail where I wondered aloud if there
are circumstances where the ssid might have to appear in both places.)

>wpa_supplicant will scan and find your AP.
>
>The driver should fall back to non-n and non-g if needs be.
>
>As for the config - erm, you have two interfaces on the same L2.
>That's going to confuse things, right?

Well, I can't speak for the hardware, but it sure as hell does confuse
*me*. (1/2 :-)

>What's 'netstat -rn' show?


Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            192.168.1.1        UGS         0   104122    re0
127.0.0.1          link#10            UH          0        0    lo0
192.168.1.0/24     link#4             U           0    23515    re0
192.168.1.21       link#11            UHS         0        0    lo0
192.168.1.23       link#4             UHS         0        0    lo0

Internet6:
Destination                       Gateway                       Flags      Netif Expire
::/96                             ::1                           UGRS        lo0
::1                               link#10                       UH          lo0
::ffff:0.0.0.0/96                 ::1                           UGRS        lo0
fe80::/10                         ::1                           UGRS        lo0
fe80::%re0/64                     link#4                        U           re0
fe80::224:21ff:fe65:ada0%re0      link#4                        UHS         lo0
fe80::%lo0/64                     link#10                       U           lo0
fe80::1%lo0                       link#10                       UHS         lo0
fe80::%wlan0/64                   link#11                       U         wlan0
fe80::222:fbff:fe76:6d18%wlan0    link#11                       UHS         lo0
ff01::%re0/32                     fe80::224:21ff:fe65:ada0%re0  U           re0
ff01::%lo0/32                     ::1                           U           lo0
ff01::%wlan0/32                   fe80::222:fbff:fe76:6d18%wlan0 U         wlan0
ff02::/16                         ::1                           UGRS        lo0
ff02::%re0/32                     fe80::224:21ff:fe65:ada0%re0  U           re0
ff02::%lo0/32                     ::1                           U           lo0
ff02::%wlan0/32                   fe80::222:fbff:fe76:6d18%wlan0 U         wlan0



P.S.  I ain't using IPv6... like not at all.

From owner-freebsd-net@FreeBSD.ORG  Wed Oct 17 13:58:44 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id B6FC01B8;
 Wed, 17 Oct 2012 13:58:44 +0000 (UTC)
 (envelope-from guy.helmer@gmail.com)
Received: from mail-ie0-f182.google.com (mail-ie0-f182.google.com
 [209.85.223.182])
 by mx1.freebsd.org (Postfix) with ESMTP id 585628FC1C;
 Wed, 17 Oct 2012 13:58:44 +0000 (UTC)
Received: by mail-ie0-f182.google.com with SMTP id k10so15445297iea.13
 for <multiple recipients>; Wed, 17 Oct 2012 06:58:43 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=content-type:mime-version:subject:from:in-reply-to:date:cc
 :content-transfer-encoding:message-id:references:to:x-mailer;
 bh=uBkU6J6yz2QtLFjXXnukFNjQ++pfiG7l0wHcQUIVeEU=;
 b=EqBfrGxUafgz1idFOfr+iqdSMibRQdUJBVRzPbz6bk5oTqbJXkfkIwSlTXadzFEH7J
 sRW4tWqoLJFOjJr2Pj447JWP0E6bAhd0rVEI+EQKrdvSLpW90vKVWimNvPyK3CC6XAu6
 TUmaNWN1lGyhz0IaQku2ALThX1wnbjQP9dRudkcwbcvdfv+vauG102Ob7f+5IkaLtVUk
 5FafdIvi3y/szeR8VrCv7vBzdrOgrFHqlX4c+C8HisiYPQt4wGDCSGFmZ4DNYusNl79Y
 0Q5eZC2OGZfkUoO5xTGA0QkZX2cQskEIzs6bjujnKLUZDiedSp2Bq4xP5aobf68KPwSt
 GRQg==
Received: by 10.50.190.232 with SMTP id gt8mr1578197igc.69.1350482321182;
 Wed, 17 Oct 2012 06:58:41 -0700 (PDT)
Received: from guysmbp.dyn.palisadesys.com ([216.81.189.9])
 by mx.google.com with ESMTPS id az4sm3015212igb.2.2012.10.17.06.58.39
 (version=TLSv1/SSLv3 cipher=OTHER);
 Wed, 17 Oct 2012 06:58:40 -0700 (PDT)
Content-Type: text/plain; charset=windows-1252
Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\))
Subject: Re: 8.3: kernel panic in bpf.c catchpacket()
From: Guy Helmer <guy.helmer@gmail.com>
In-Reply-To: <1EDA1615-2CDE-405A-A725-AF7CC7D3E273@gmail.com>
Date: Wed, 17 Oct 2012 08:58:42 -0500
Content-Transfer-Encoding: quoted-printable
Message-Id: <381E3EEC-7EDB-428B-A724-434443E51A53@gmail.com>
References: <4B5399BF-4EE0-4182-8297-3BB97C4AA884@gmail.com>
 <59F9A36E-3DB2-4F6F-BB2A-A4C9DA76A70C@gmail.com>
 <5075C05E.9070800@FreeBSD.org>
 <1EDA1615-2CDE-405A-A725-AF7CC7D3E273@gmail.com>
To: "Alexander V. Chernikov" <melifaro@freebsd.org>
X-Mailer: Apple Mail (2.1499)
Cc: freebsd-net@freebsd.org, FreeBSD Stable <freebsd-stable@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2012 13:58:44 -0000

On Oct 12, 2012, at 8:54 AM, Guy Helmer <guy.helmer@gmail.com> wrote:

>=20
> On Oct 10, 2012, at 1:37 PM, Alexander V. Chernikov =
<melifaro@freebsd.org> wrote:
>=20
>> On 10.10.2012 00:36, Guy Helmer wrote:
>>>=20
>>> On Oct 8, 2012, at 8:09 AM, Guy Helmer <guy.helmer@gmail.com> wrote:
>>>=20
>>>> I'm seeing a consistent new kernel panic in FreeBSD 8.3:
>>>> I'm not seeing how bd_sbuf would be NULL here. Any ideas?
>>>=20
>>> Since I've not had any replies, I hope nobody minds if I reply with =
more information.
>>>=20
>>> This panic seems to be occasionally triggered now that my user land =
code is changing the packet filter a while after the bpf device has been =
opened and an initial packet filter was set (previously, my code did not =
change the filter after it was initially set).
>>>=20
>>> I'm focusing on bpf_setf() since that seems to be the place that =
could be tickling a problem, and I see that bpf_setf() calls reset_d(d) =
to clear the hold buffer. I have manually verified that the BPFD lock is =
held during the call to reset_d(), and the lock is held every other =
place that the buffers are manipulated, so I haven't been able to find =
any place that seems vulnerable to losing one of the bpf buffers. Still =
searching, but any help would be appreciated.
>>=20
>> Can you please check this code on -current?
>> Locking has changed quite significantly some time ago, so there is =
good chance that you can get rid of this panic (or discover different =
one which is really "new") :).
>=20
> I'm not ready to run this app on current, so I have merged revs =
229898, 233937, 233938, 233946, 235744, 235745, 235746, 235747, 236231, =
236251, 236261, 236262, 236559, and 236806 to my 8.3 checkout to get =
code that should be virtually identical to current without the timestamp =
changes.
>=20
> Unfortunately, I have only been able to trigger the panic in my test =
lab once -- so I'm not sure whether a lack of problems with the updated =
code will be indicative of likely success in the field where this has =
been triggered regularly at some sites=85
>=20
> Thanks,
> Guy
>=20


FWIW, I was able to trigger the panic with the original 8.3 code again =
in my test lab. With these changes resulting from merging the revs =
mentioned above, I have not seen any panics in my test lab setup in two =
days of load testing, and AFAIK, packet capturing seems to be working =
fine.

I've included the diffs for reference for anyone encountering the issue.

Thanks, Alexander!

Guy

Index: net/bpf.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
--- net/bpf.c	(revision 239830)
+++ net/bpf.c	(working copy)
@@ -43,6 +43,8 @@
=20
 #include <sys/types.h>
 #include <sys/param.h>
+#include <sys/lock.h>
+#include <sys/rwlock.h>
 #include <sys/systm.h>
 #include <sys/conf.h>
 #include <sys/fcntl.h>
@@ -66,6 +68,7 @@
 #include <sys/socket.h>
=20
 #include <net/if.h>
+#define	BPF_INTERNAL
 #include <net/bpf.h>
 #include <net/bpf_buffer.h>
 #ifdef BPF_JITTER
@@ -139,6 +142,7 @@
=20
 static void	bpf_attachd(struct bpf_d *, struct bpf_if *);
 static void	bpf_detachd(struct bpf_d *);
+static void	bpf_detachd_locked(struct bpf_d *);
 static void	bpf_freed(struct bpf_d *);
 static int	bpf_movein(struct uio *, int, struct ifnet *, struct =
mbuf **,
 		    struct sockaddr *, int *, struct bpf_insn *);
@@ -150,7 +154,7 @@
 		    void (*)(struct bpf_d *, caddr_t, u_int, void *, =
u_int),
 		    struct timeval *);
 static void	reset_d(struct bpf_d *);
-static int	 bpf_setf(struct bpf_d *, struct bpf_program *, u_long =
cmd);
+static int	bpf_setf(struct bpf_d *, struct bpf_program *, u_long =
cmd);
 static int	bpf_getdltlist(struct bpf_d *, struct bpf_dltlist *);
 static int	bpf_setdlt(struct bpf_d *, u_int);
 static void	filt_bpfdetach(struct knote *);
@@ -168,6 +172,12 @@
 SYSCTL_NODE(_net_bpf, OID_AUTO, stats, CTLFLAG_MPSAFE | CTLFLAG_RW,
     bpf_stats_sysctl, "bpf statistics portal");
=20
+static VNET_DEFINE(int, bpf_optimize_writers) =3D 0;
+#define	V_bpf_optimize_writers VNET(bpf_optimize_writers)
+SYSCTL_VNET_INT(_net_bpf, OID_AUTO, optimize_writers,
+    CTLFLAG_RW, &VNET_NAME(bpf_optimize_writers), 0,
+    "Do not send packets until BPF program is set");
+
 static	d_open_t	bpfopen;
 static	d_read_t	bpfread;
 static	d_write_t	bpfwrite;
@@ -189,7 +199,38 @@
 static struct filterops bpfread_filtops =3D
 	{ 1, NULL, filt_bpfdetach, filt_bpfread };
=20
+eventhandler_tag	bpf_ifdetach_cookie =3D NULL;
+
 /*
+ * LOCKING MODEL USED BY BPF:
+ * Locks:
+ * 1) global lock (BPF_LOCK). Mutex, used to protect interface =
addition/removal,
+ * some global counters and every bpf_if reference.
+ * 2) Interface lock. Rwlock, used to protect list of BPF descriptors =
and their filters.
+ * 3) Descriptor lock. Mutex, used to protect BPF buffers and various =
structure fields
+ *   used by bpf_mtap code.
+ *
+ * Lock order:
+ *
+ * Global lock, interface lock, descriptor lock
+ *
+ * We have to acquire interface lock before descriptor main lock due to =
BPF_MTAP[2]
+ * working model. In many places (like bpf_detachd) we start with BPF =
descriptor
+ * (and we need to at least rlock it to get reliable interface =
pointer). This
+ * gives us potential LOR. As a result, we use global lock to protect =
from bpf_if
+ * change in every such place.
+ *
+ * Changing d->bd_bif is protected by 1) global lock, 2) interface lock =
and
+ * 3) descriptor main wlock.
+ * Reading bd_bif can be protected by any of these locks, typically =
global lock.
+ *
+ * Changing read/write BPF filter is protected by the same three locks,
+ * the same applies for reading.
+ *
+ * Sleeping in global lock is not allowed due to bpfdetach() using it.
+ */
+
+/*
  * Wrapper functions for various buffering methods.  If the set of =
buffer
  * modes expands, we will probably want to introduce a switch data =
structure
  * similar to protosw, et.
@@ -282,7 +323,6 @@
 static int
 bpf_canwritebuf(struct bpf_d *d)
 {
-
 	BPFD_LOCK_ASSERT(d);
=20
 	switch (d->bd_bufmode) {
@@ -561,18 +601,93 @@
 static void
 bpf_attachd(struct bpf_d *d, struct bpf_if *bp)
 {
+	int op_w;
+
+	BPF_LOCK_ASSERT();
+
 	/*
-	 * Point d at bp, and add d to the interface's list of =
listeners.
-	 * Finally, point the driver's bpf cookie at the interface so
-	 * it will divert packets to bpf.
+	 * Save sysctl value to protect from sysctl change
+	 * between reads
 	 */
-	BPFIF_LOCK(bp);
+	op_w =3D V_bpf_optimize_writers;
+
+	if (d->bd_bif !=3D NULL)
+		bpf_detachd_locked(d);
+	/*
+	 * Point d at bp, and add d to the interface's list.
+	 * Since there are many applications using BPF for
+	 * sending raw packets only (dhcpd, cdpd are good examples)
+	 * we can delay adding d to the list of active listeners until
+	 * some filter is configured.
+	 */
+
+	BPFIF_WLOCK(bp);
+	BPFD_LOCK(d);
+
 	d->bd_bif =3D bp;
-	LIST_INSERT_HEAD(&bp->bif_dlist, d, bd_next);
=20
+	if (op_w !=3D 0) {
+		/* Add to writers-only list */
+		LIST_INSERT_HEAD(&bp->bif_wlist, d, bd_next);
+		/*
+		 * We decrement bd_writer on every filter set operation.
+		 * First BIOCSETF is done by pcap_open_live() to set up
+		 * snap length. After that application usually sets its =
own filter
+		 */
+		d->bd_writer =3D 2;
+	} else
+		LIST_INSERT_HEAD(&bp->bif_dlist, d, bd_next);
+
+	BPFD_UNLOCK(d);
+	BPFIF_WUNLOCK(bp);
+
 	bpf_bpfd_cnt++;
-	BPFIF_UNLOCK(bp);
=20
+	CTR3(KTR_NET, "%s: bpf_attach called by pid %d, adding to %s =
list",
+	    __func__, d->bd_pid, d->bd_writer ? "writer" : "active");
+
+	if (op_w =3D=3D 0)
+		EVENTHANDLER_INVOKE(bpf_track, bp->bif_ifp, bp->bif_dlt, =
1);
+}
+
+/*
+ * Add d to the list of active bp filters.
+ * Requires bpf_attachd() to be called before
+ */
+static void
+bpf_upgraded(struct bpf_d *d)
+{
+	struct bpf_if *bp;
+
+	BPF_LOCK_ASSERT();
+
+	bp =3D d->bd_bif;
+
+	/*
+	 * Filter can be set several times without specifying interface.
+	 * Mark d as reader and exit.
+	 */
+	if (bp =3D=3D NULL) {
+		BPFD_LOCK(d);
+		d->bd_writer =3D 0;
+		BPFD_UNLOCK(d);
+		return;
+	}
+
+	BPFIF_WLOCK(bp);
+	BPFD_LOCK(d);
+
+	/* Remove from writers-only list */
+	LIST_REMOVE(d, bd_next);
+	LIST_INSERT_HEAD(&bp->bif_dlist, d, bd_next);
+	/* Mark d as reader */
+	d->bd_writer =3D 0;
+
+	BPFD_UNLOCK(d);
+	BPFIF_WUNLOCK(bp);
+
+	CTR2(KTR_NET, "%s: upgrade required by pid %d", __func__, =
d->bd_pid);
+
 	EVENTHANDLER_INVOKE(bpf_track, bp->bif_ifp, bp->bif_dlt, 1);
 }
=20
@@ -582,27 +697,48 @@
 static void
 bpf_detachd(struct bpf_d *d)
 {
+	BPF_LOCK();
+	bpf_detachd_locked(d);
+	BPF_UNLOCK();
+}
+
+static void
+bpf_detachd_locked(struct bpf_d *d)
+{
 	int error;
 	struct bpf_if *bp;
 	struct ifnet *ifp;
=20
-	bp =3D d->bd_bif;
-	BPFIF_LOCK(bp);
+	CTR2(KTR_NET, "%s: detach required by pid %d", __func__, =
d->bd_pid);
+
+	BPF_LOCK_ASSERT();
+
+	/* Check if descriptor is attached */
+	if ((bp =3D d->bd_bif) =3D=3D NULL)
+		return;
+
+	BPFIF_WLOCK(bp);
 	BPFD_LOCK(d);
-	ifp =3D d->bd_bif->bif_ifp;
=20
+	/* Save bd_writer value */
+	error =3D d->bd_writer;
+
 	/*
 	 * Remove d from the interface's descriptor list.
 	 */
 	LIST_REMOVE(d, bd_next);
=20
-	bpf_bpfd_cnt--;
+	ifp =3D bp->bif_ifp;
 	d->bd_bif =3D NULL;
 	BPFD_UNLOCK(d);
-	BPFIF_UNLOCK(bp);
+	BPFIF_WUNLOCK(bp);
=20
-	EVENTHANDLER_INVOKE(bpf_track, ifp, bp->bif_dlt, 0);
+	bpf_bpfd_cnt--;
=20
+	/* Call event handler iff d is attached */
+	if (error =3D=3D 0)
+		EVENTHANDLER_INVOKE(bpf_track, ifp, bp->bif_dlt, 0);
+
 	/*
 	 * Check if this descriptor had requested promiscuous mode.
 	 * If so, turn it off.
@@ -640,10 +776,7 @@
 	d->bd_state =3D BPF_IDLE;
 	BPFD_UNLOCK(d);
 	funsetown(&d->bd_sigio);
-	mtx_lock(&bpf_mtx);
-	if (d->bd_bif)
-		bpf_detachd(d);
-	mtx_unlock(&bpf_mtx);
+	bpf_detachd(d);
 #ifdef MAC
 	mac_bpfdesc_destroy(d);
 #endif /* MAC */
@@ -663,7 +796,7 @@
 bpfopen(struct cdev *dev, int flags, int fmt, struct thread *td)
 {
 	struct bpf_d *d;
-	int error;
+	int error, size;
=20
 	d =3D malloc(sizeof(*d), M_BPF, M_WAITOK | M_ZERO);
 	error =3D devfs_set_cdevpriv(d, bpf_dtor);
@@ -681,15 +814,19 @@
 	d->bd_bufmode =3D BPF_BUFMODE_BUFFER;
 	d->bd_sig =3D SIGIO;
 	d->bd_direction =3D BPF_D_INOUT;
-	d->bd_pid =3D td->td_proc->p_pid;
+	BPF_PID_REFRESH(d, td);
 #ifdef MAC
 	mac_bpfdesc_init(d);
 	mac_bpfdesc_create(td->td_ucred, d);
 #endif
-	mtx_init(&d->bd_mtx, devtoname(dev), "bpf cdev lock", MTX_DEF);
-	callout_init_mtx(&d->bd_callout, &d->bd_mtx, 0);
-	knlist_init_mtx(&d->bd_sel.si_note, &d->bd_mtx);
+	mtx_init(&d->bd_lock, devtoname(dev), "bpf cdev lock", MTX_DEF);
+	callout_init_mtx(&d->bd_callout, &d->bd_lock, 0);
+	knlist_init_mtx(&d->bd_sel.si_note, &d->bd_lock);
=20
+	/* Allocate default buffers */
+	size =3D d->bd_bufsize;
+	bpf_buffer_ioctl_sblen(d, &size);
+
 	return (0);
 }
=20
@@ -718,7 +855,7 @@
 	non_block =3D ((ioflag & O_NONBLOCK) !=3D 0);
=20
 	BPFD_LOCK(d);
-	d->bd_pid =3D curthread->td_proc->p_pid;
+	BPF_PID_REFRESH_CUR(d);
 	if (d->bd_bufmode !=3D BPF_BUFMODE_BUFFER) {
 		BPFD_UNLOCK(d);
 		return (EOPNOTSUPP);
@@ -764,7 +901,7 @@
 			BPFD_UNLOCK(d);
 			return (EWOULDBLOCK);
 		}
-		error =3D msleep(d, &d->bd_mtx, PRINET|PCATCH,
+		error =3D msleep(d, &d->bd_lock, PRINET|PCATCH,
 		     "bpf", d->bd_rtout);
 		if (error =3D=3D EINTR || error =3D=3D ERESTART) {
 			BPFD_UNLOCK(d);
@@ -881,8 +1018,9 @@
 	if (error !=3D 0)
 		return (error);
=20
-	d->bd_pid =3D curthread->td_proc->p_pid;
+	BPF_PID_REFRESH_CUR(d);
 	d->bd_wcount++;
+	/* XXX: locking required */
 	if (d->bd_bif =3D=3D NULL) {
 		d->bd_wdcount++;
 		return (ENXIO);
@@ -903,6 +1041,7 @@
 	bzero(&dst, sizeof(dst));
 	m =3D NULL;
 	hlen =3D 0;
+	/* XXX: bpf_movein() can sleep */
 	error =3D bpf_movein(uio, (int)d->bd_bif->bif_dlt, ifp,
 	    &m, &dst, &hlen, d->bd_wfilter);
 	if (error) {
@@ -962,7 +1101,7 @@
 reset_d(struct bpf_d *d)
 {
=20
-	mtx_assert(&d->bd_mtx, MA_OWNED);
+	BPFD_LOCK_ASSERT(d);
=20
 	if ((d->bd_hbuf !=3D NULL) &&
 	    (d->bd_bufmode !=3D BPF_BUFMODE_ZBUF || bpf_canfreebuf(d))) =
{
@@ -1028,7 +1167,7 @@
 	 * Refresh PID associated with this descriptor.
 	 */
 	BPFD_LOCK(d);
-	d->bd_pid =3D td->td_proc->p_pid;
+	BPF_PID_REFRESH(d, td);
 	if (d->bd_state =3D=3D BPF_WAITING)
 		callout_stop(&d->bd_callout);
 	d->bd_state =3D BPF_IDLE;
@@ -1079,7 +1218,9 @@
 	case BIOCGDLTLIST32:
 	case BIOCGRTIMEOUT32:
 	case BIOCSRTIMEOUT32:
+		BPFD_LOCK(d);
 		d->bd_compat32 =3D 1;
+		BPFD_UNLOCK(d);
 	}
 #endif
=20
@@ -1124,7 +1265,9 @@
 	 * Get buffer len [for read()].
 	 */
 	case BIOCGBLEN:
+		BPFD_LOCK(d);
 		*(u_int *)addr =3D d->bd_bufsize;
+		BPFD_UNLOCK(d);
 		break;
=20
 	/*
@@ -1179,10 +1322,12 @@
 	 * Get current data link type.
 	 */
 	case BIOCGDLT:
+		BPF_LOCK();
 		if (d->bd_bif =3D=3D NULL)
 			error =3D EINVAL;
 		else
 			*(u_int *)addr =3D d->bd_bif->bif_dlt;
+		BPF_UNLOCK();
 		break;
=20
 	/*
@@ -1197,6 +1342,7 @@
 			list32 =3D (struct bpf_dltlist32 *)addr;
 			dltlist.bfl_len =3D list32->bfl_len;
 			dltlist.bfl_list =3D PTRIN(list32->bfl_list);
+			BPF_LOCK();
 			if (d->bd_bif =3D=3D NULL)
 				error =3D EINVAL;
 			else {
@@ -1204,31 +1350,37 @@
 				if (error =3D=3D 0)
 					list32->bfl_len =3D =
dltlist.bfl_len;
 			}
+			BPF_UNLOCK();
 			break;
 		}
 #endif
=20
 	case BIOCGDLTLIST:
+		BPF_LOCK();
 		if (d->bd_bif =3D=3D NULL)
 			error =3D EINVAL;
 		else
 			error =3D bpf_getdltlist(d, (struct bpf_dltlist =
*)addr);
+		BPF_UNLOCK();
 		break;
=20
 	/*
 	 * Set data link type.
 	 */
 	case BIOCSDLT:
+		BPF_LOCK();
 		if (d->bd_bif =3D=3D NULL)
 			error =3D EINVAL;
 		else
 			error =3D bpf_setdlt(d, *(u_int *)addr);
+		BPF_UNLOCK();
 		break;
=20
 	/*
 	 * Get interface name.
 	 */
 	case BIOCGETIF:
+		BPF_LOCK();
 		if (d->bd_bif =3D=3D NULL)
 			error =3D EINVAL;
 		else {
@@ -1238,13 +1390,16 @@
 			strlcpy(ifr->ifr_name, ifp->if_xname,
 			    sizeof(ifr->ifr_name));
 		}
+		BPF_UNLOCK();
 		break;
=20
 	/*
 	 * Set interface.
 	 */
 	case BIOCSETIF:
+		BPF_LOCK();
 		error =3D bpf_setif(d, (struct ifreq *)addr);
+		BPF_UNLOCK();
 		break;
=20
 	/*
@@ -1327,7 +1482,9 @@
 	 * Set immediate mode.
 	 */
 	case BIOCIMMEDIATE:
+		BPFD_LOCK(d);
 		d->bd_immediate =3D *(u_int *)addr;
+		BPFD_UNLOCK(d);
 		break;
=20
 	case BIOCVERSION:
@@ -1343,21 +1500,27 @@
 	 * Get "header already complete" flag
 	 */
 	case BIOCGHDRCMPLT:
+		BPFD_LOCK(d);
 		*(u_int *)addr =3D d->bd_hdrcmplt;
+		BPFD_UNLOCK(d);
 		break;
=20
 	/*
 	 * Set "header already complete" flag
 	 */
 	case BIOCSHDRCMPLT:
+		BPFD_LOCK(d);
 		d->bd_hdrcmplt =3D *(u_int *)addr ? 1 : 0;
+		BPFD_UNLOCK(d);
 		break;
=20
 	/*
 	 * Get packet direction flag
 	 */
 	case BIOCGDIRECTION:
+		BPFD_LOCK(d);
 		*(u_int *)addr =3D d->bd_direction;
+		BPFD_UNLOCK(d);
 		break;
=20
 	/*
@@ -1372,7 +1535,9 @@
 			case BPF_D_IN:
 			case BPF_D_INOUT:
 			case BPF_D_OUT:
+				BPFD_LOCK(d);
 				d->bd_direction =3D direction;
+				BPFD_UNLOCK(d);
 				break;
 			default:
 				error =3D EINVAL;
@@ -1381,26 +1546,38 @@
 		break;
=20
 	case BIOCFEEDBACK:
+		BPFD_LOCK(d);
 		d->bd_feedback =3D *(u_int *)addr;
+		BPFD_UNLOCK(d);
 		break;
=20
 	case BIOCLOCK:
+		BPFD_LOCK(d);
 		d->bd_locked =3D 1;
+		BPFD_UNLOCK(d);
 		break;
=20
 	case FIONBIO:		/* Non-blocking I/O */
 		break;
=20
 	case FIOASYNC:		/* Send signal on receive packets */
+		BPFD_LOCK(d);
 		d->bd_async =3D *(int *)addr;
+		BPFD_UNLOCK(d);
 		break;
=20
 	case FIOSETOWN:
+		/*
+		 * XXX: Add some sort of locking here?
+		 * fsetown() can sleep.
+		 */
 		error =3D fsetown(*(int *)addr, &d->bd_sigio);
 		break;
=20
 	case FIOGETOWN:
+		BPFD_LOCK(d);
 		*(int *)addr =3D fgetown(&d->bd_sigio);
+		BPFD_UNLOCK(d);
 		break;
=20
 	/* This is deprecated, FIOSETOWN should be used instead. */
@@ -1421,16 +1598,23 @@
=20
 			if (sig >=3D NSIG)
 				error =3D EINVAL;
-			else
+			else {
+				BPFD_LOCK(d);
 				d->bd_sig =3D sig;
+				BPFD_UNLOCK(d);
+			}
 			break;
 		}
 	case BIOCGRSIG:
+		BPFD_LOCK(d);
 		*(u_int *)addr =3D d->bd_sig;
+		BPFD_UNLOCK(d);
 		break;
=20
 	case BIOCGETBUFMODE:
+		BPFD_LOCK(d);
 		*(u_int *)addr =3D d->bd_bufmode;
+		BPFD_UNLOCK(d);
 		break;
=20
 	case BIOCSETBUFMODE:
@@ -1485,95 +1669,130 @@
 /*
  * Set d's packet filter program to fp.  If this file already has a =
filter,
  * free it and replace it.  Returns EINVAL for bogus requests.
+ *
+ * Note we need global lock here to serialize bpf_setf() and =
bpf_setif() calls
+ * since reading d->bd_bif can't be protected by d or interface lock =
due to
+ * lock order.
+ *
+ * Additionally, we have to acquire the interface write lock because =
bpf_mtap() uses
+ * the interface read lock to read all filters.
+ *
  */
 static int
 bpf_setf(struct bpf_d *d, struct bpf_program *fp, u_long cmd)
 {
+#ifdef COMPAT_FREEBSD32
+	struct bpf_program fp_swab;
+	struct bpf_program32 *fp32;
+#endif
 	struct bpf_insn *fcode, *old;
-	u_int wfilter, flen, size;
 #ifdef BPF_JITTER
-	bpf_jit_filter *ofunc;
+	bpf_jit_filter *jfunc, *ofunc;
 #endif
+	size_t size;
+	u_int flen;
+	int need_upgrade;
+
 #ifdef COMPAT_FREEBSD32
-	struct bpf_program32 *fp32;
-	struct bpf_program fp_swab;
-
-	if (cmd =3D=3D BIOCSETWF32 || cmd =3D=3D BIOCSETF32 || cmd =3D=3D =
BIOCSETFNR32) {
+	switch (cmd) {
+	case BIOCSETF32:
+	case BIOCSETWF32:
+	case BIOCSETFNR32:
 		fp32 =3D (struct bpf_program32 *)fp;
 		fp_swab.bf_len =3D fp32->bf_len;
 		fp_swab.bf_insns =3D (struct bpf_insn =
*)(uintptr_t)fp32->bf_insns;
 		fp =3D &fp_swab;
-		if (cmd =3D=3D BIOCSETWF32)
+		switch (cmd) {
+		case BIOCSETF32:
+			cmd =3D BIOCSETF;
+			break;
+		case BIOCSETWF32:
 			cmd =3D BIOCSETWF;
+			break;
+		}
+		break;
 	}
 #endif
-	if (cmd =3D=3D BIOCSETWF) {
-		old =3D d->bd_wfilter;
-		wfilter =3D 1;
+
+	fcode =3D NULL;
 #ifdef BPF_JITTER
-		ofunc =3D NULL;
+	jfunc =3D ofunc =3D NULL;
 #endif
-	} else {
-		wfilter =3D 0;
-		old =3D d->bd_rfilter;
-#ifdef BPF_JITTER
-		ofunc =3D d->bd_bfilter;
-#endif
-	}
-	if (fp->bf_insns =3D=3D NULL) {
-		if (fp->bf_len !=3D 0)
+	need_upgrade =3D 0;
+
+	/*
+	 * Check new filter validity before acquiring any locks.
+	 * Allocate memory for new filter, if needed.
+	 */
+	flen =3D fp->bf_len;
+	if (flen > bpf_maxinsns || (fp->bf_insns =3D=3D NULL && flen !=3D =
0))
+		return (EINVAL);
+	size =3D flen * sizeof(*fp->bf_insns);
+	if (size > 0) {
+		/* We're setting up new filter.  Copy and check actual =
data. */
+		fcode =3D malloc(size, M_BPF, M_WAITOK);
+		if (copyin(fp->bf_insns, fcode, size) !=3D 0 ||
+		    !bpf_validate(fcode, flen)) {
+			free(fcode, M_BPF);
 			return (EINVAL);
-		BPFD_LOCK(d);
-		if (wfilter)
-			d->bd_wfilter =3D NULL;
-		else {
-			d->bd_rfilter =3D NULL;
-#ifdef BPF_JITTER
-			d->bd_bfilter =3D NULL;
-#endif
-			if (cmd =3D=3D BIOCSETF)
-				reset_d(d);
 		}
-		BPFD_UNLOCK(d);
-		if (old !=3D NULL)
-			free((caddr_t)old, M_BPF);
 #ifdef BPF_JITTER
-		if (ofunc !=3D NULL)
-			bpf_destroy_jit_filter(ofunc);
+		/* Filter is copied inside fcode and is perfectly valid. =
*/
+		jfunc =3D bpf_jitter(fcode, flen);
 #endif
-		return (0);
 	}
-	flen =3D fp->bf_len;
-	if (flen > bpf_maxinsns)
-		return (EINVAL);
=20
-	size =3D flen * sizeof(*fp->bf_insns);
-	fcode =3D (struct bpf_insn *)malloc(size, M_BPF, M_WAITOK);
-	if (copyin((caddr_t)fp->bf_insns, (caddr_t)fcode, size) =3D=3D 0 =
&&
-	    bpf_validate(fcode, (int)flen)) {
-		BPFD_LOCK(d);
-		if (wfilter)
-			d->bd_wfilter =3D fcode;
-		else {
-			d->bd_rfilter =3D fcode;
+	BPF_LOCK();
+
+	/*
+	 * Set up new filter.
+	 * Protect filter change by interface lock.
+	 * Additionally, we are protected by global lock here.
+	 */
+	if (d->bd_bif !=3D NULL)
+		BPFIF_WLOCK(d->bd_bif);
+	BPFD_LOCK(d);
+	if (cmd =3D=3D BIOCSETWF) {
+		old =3D d->bd_wfilter;
+		d->bd_wfilter =3D fcode;
+	} else {
+		old =3D d->bd_rfilter;
+		d->bd_rfilter =3D fcode;
 #ifdef BPF_JITTER
-			d->bd_bfilter =3D bpf_jitter(fcode, flen);
+		ofunc =3D d->bd_bfilter;
+		d->bd_bfilter =3D jfunc;
 #endif
-			if (cmd =3D=3D BIOCSETF)
-				reset_d(d);
+		if (cmd =3D=3D BIOCSETF)
+			reset_d(d);
+
+		if (fcode !=3D NULL) {
+			/*
+			 * Do not require upgrade by first BIOCSETF
+			 * (used to set snaplen) by pcap_open_live().
+			 */
+			if (d->bd_writer !=3D 0 && --d->bd_writer =3D=3D =
0)
+				need_upgrade =3D 1;
+			CTR4(KTR_NET, "%s: filter function set by pid =
%d, "
+			    "bd_writer counter %d, need_upgrade %d",
+			    __func__, d->bd_pid, d->bd_writer, =
need_upgrade);
 		}
-		BPFD_UNLOCK(d);
-		if (old !=3D NULL)
-			free((caddr_t)old, M_BPF);
+	}
+	BPFD_UNLOCK(d);
+	if (d->bd_bif !=3D NULL)
+		BPFIF_WUNLOCK(d->bd_bif);
+	if (old !=3D NULL)
+		free(old, M_BPF);
 #ifdef BPF_JITTER
-		if (ofunc !=3D NULL)
-			bpf_destroy_jit_filter(ofunc);
+	if (ofunc !=3D NULL)
+		bpf_destroy_jit_filter(ofunc);
 #endif
=20
-		return (0);
-	}
-	free((caddr_t)fcode, M_BPF);
-	return (EINVAL);
+	/* Move d to active readers list. */
+	if (need_upgrade)
+		bpf_upgraded(d);
+
+	BPF_UNLOCK();
+	return (0);
 }
=20
 /*
@@ -1587,28 +1806,30 @@
 	struct bpf_if *bp;
 	struct ifnet *theywant;
=20
+	BPF_LOCK_ASSERT();
+
 	theywant =3D ifunit(ifr->ifr_name);
 	if (theywant =3D=3D NULL || theywant->if_bpf =3D=3D NULL)
 		return (ENXIO);
=20
 	bp =3D theywant->if_bpf;
=20
+	/* Check if interface is not being detached from BPF */
+	BPFIF_RLOCK(bp);
+	if (bp->flags & BPFIF_FLAG_DYING) {
+		BPFIF_RUNLOCK(bp);
+		return (ENXIO);
+	}
+	BPFIF_RUNLOCK(bp);
+
 	/*
 	 * Behavior here depends on the buffering model.  If we're using
 	 * kernel memory buffers, then we can allocate them here.  If =
we're
 	 * using zero-copy, then the user process must have registered
 	 * buffers by the time we get here.  If not, return an error.
-	 *
-	 * XXXRW: There are locking issues here with multi-threaded use: =
what
-	 * if two threads try to set the interface at once?
 	 */
 	switch (d->bd_bufmode) {
 	case BPF_BUFMODE_BUFFER:
-		if (d->bd_sbuf =3D=3D NULL)
-			bpf_buffer_alloc(d);
-		KASSERT(d->bd_sbuf !=3D NULL, ("bpf_setif: bd_sbuf =
NULL"));
-		break;
-
 	case BPF_BUFMODE_ZBUF:
 		if (d->bd_sbuf =3D=3D NULL)
 			return (EINVAL);
@@ -1617,15 +1838,8 @@
 	default:
 		panic("bpf_setif: bufmode %d", d->bd_bufmode);
 	}
-	if (bp !=3D d->bd_bif) {
-		if (d->bd_bif)
-			/*
-			 * Detach if attached to something else.
-			 */
-			bpf_detachd(d);
-
+	if (bp !=3D d->bd_bif)
 		bpf_attachd(d, bp);
-	}
 	BPFD_LOCK(d);
 	reset_d(d);
 	BPFD_UNLOCK(d);
@@ -1653,7 +1867,7 @@
 	 */
 	revents =3D events & (POLLOUT | POLLWRNORM);
 	BPFD_LOCK(d);
-	d->bd_pid =3D td->td_proc->p_pid;
+	BPF_PID_REFRESH(d, td);
 	if (events & (POLLIN | POLLRDNORM)) {
 		if (bpf_ready(d))
 			revents |=3D events & (POLLIN | POLLRDNORM);
@@ -1688,7 +1902,7 @@
 	 * Refresh PID associated with this descriptor.
 	 */
 	BPFD_LOCK(d);
-	d->bd_pid =3D curthread->td_proc->p_pid;
+	BPF_PID_REFRESH_CUR(d);
 	kn->kn_fop =3D &bpfread_filtops;
 	kn->kn_hook =3D d;
 	knlist_add(&d->bd_sel.si_note, kn, 1);
@@ -1744,9 +1958,19 @@
 	struct timeval tv;
=20
 	gottime =3D 0;
-	BPFIF_LOCK(bp);
+
+	BPFIF_RLOCK(bp);
+
 	LIST_FOREACH(d, &bp->bif_dlist, bd_next) {
-		BPFD_LOCK(d);
+		/*
+		 * We are not using any locks for d here because:
+		 * 1) any filter change is protected by interface
+		 * write lock
+		 * 2) destroying/detaching d is protected by interface
+		 * write lock, too
+		 */
+
+		/* XXX: Do not protect counter for the sake of =
performance. */
 		++d->bd_rcount;
 		/*
 		 * NB: We dont call BPF_CHECK_DIRECTION() here since =
there is no
@@ -1762,6 +1986,11 @@
 #endif
 		slen =3D bpf_filter(d->bd_rfilter, pkt, pktlen, pktlen);
 		if (slen !=3D 0) {
+			/*
+			 * Filter matches. Let's acquire write lock.
+			 */
+			BPFD_LOCK(d);
+
 			d->bd_fcount++;
 			if (!gottime) {
 				microtime(&tv);
@@ -1772,10 +2001,10 @@
 #endif
 				catchpacket(d, pkt, pktlen, slen,
 				    bpf_append_bytes, &tv);
+			BPFD_UNLOCK(d);
 		}
-		BPFD_UNLOCK(d);
 	}
-	BPFIF_UNLOCK(bp);
+	BPFIF_RUNLOCK(bp);
 }
=20
 #define	BPF_CHECK_DIRECTION(d, r, i)				=
\
@@ -1784,6 +2013,7 @@
=20
 /*
  * Incoming linkage from device drivers, when packet is in an mbuf =
chain.
+ * Locking model is explained in bpf_tap().
  */
 void
 bpf_mtap(struct bpf_if *bp, struct mbuf *m)
@@ -1806,11 +2036,11 @@
=20
 	pktlen =3D m_length(m, NULL);
=20
-	BPFIF_LOCK(bp);
+	BPFIF_RLOCK(bp);
+
 	LIST_FOREACH(d, &bp->bif_dlist, bd_next) {
 		if (BPF_CHECK_DIRECTION(d, m->m_pkthdr.rcvif, =
bp->bif_ifp))
 			continue;
-		BPFD_LOCK(d);
 		++d->bd_rcount;
 #ifdef BPF_JITTER
 		bf =3D bpf_jitter_enable !=3D 0 ? d->bd_bfilter : NULL;
@@ -1821,6 +2051,8 @@
 #endif
 		slen =3D bpf_filter(d->bd_rfilter, (u_char *)m, pktlen, =
0);
 		if (slen !=3D 0) {
+			BPFD_LOCK(d);
+
 			d->bd_fcount++;
 			if (!gottime) {
 				microtime(&tv);
@@ -1831,10 +2063,10 @@
 #endif
 				catchpacket(d, (u_char *)m, pktlen, =
slen,
 				    bpf_append_mbuf, &tv);
+			BPFD_UNLOCK(d);
 		}
-		BPFD_UNLOCK(d);
 	}
-	BPFIF_UNLOCK(bp);
+	BPFIF_RUNLOCK(bp);
 }
=20
 /*
@@ -1869,14 +2101,17 @@
 	mb.m_len =3D dlen;
 	pktlen +=3D dlen;
=20
-	BPFIF_LOCK(bp);
+
+	BPFIF_RLOCK(bp);
+
 	LIST_FOREACH(d, &bp->bif_dlist, bd_next) {
 		if (BPF_CHECK_DIRECTION(d, m->m_pkthdr.rcvif, =
bp->bif_ifp))
 			continue;
-		BPFD_LOCK(d);
 		++d->bd_rcount;
 		slen =3D bpf_filter(d->bd_rfilter, (u_char *)&mb, =
pktlen, 0);
 		if (slen !=3D 0) {
+			BPFD_LOCK(d);
+
 			d->bd_fcount++;
 			if (!gottime) {
 				microtime(&tv);
@@ -1887,10 +2122,10 @@
 #endif
 				catchpacket(d, (u_char *)&mb, pktlen, =
slen,
 				    bpf_append_mbuf, &tv);
+			BPFD_UNLOCK(d);
 		}
-		BPFD_UNLOCK(d);
 	}
-	BPFIF_UNLOCK(bp);
+	BPFIF_RUNLOCK(bp);
 }
=20
 #undef	BPF_CHECK_DIRECTION
@@ -2040,7 +2275,7 @@
 	}
 	if (d->bd_wfilter !=3D NULL)
 		free((caddr_t)d->bd_wfilter, M_BPF);
-	mtx_destroy(&d->bd_mtx);
+	mtx_destroy(&d->bd_lock);
 }
=20
 /*
@@ -2070,15 +2305,16 @@
 		panic("bpfattach");
=20
 	LIST_INIT(&bp->bif_dlist);
+	LIST_INIT(&bp->bif_wlist);
 	bp->bif_ifp =3D ifp;
 	bp->bif_dlt =3D dlt;
-	mtx_init(&bp->bif_mtx, "bpf interface lock", NULL, MTX_DEF);
+	rw_init(&bp->bif_lock, "bpf interface lock");
 	KASSERT(*driverp =3D=3D NULL, ("bpfattach2: driverp already =
initialized"));
 	*driverp =3D bp;
=20
-	mtx_lock(&bpf_mtx);
+	BPF_LOCK();
 	LIST_INSERT_HEAD(&bpf_iflist, bp, bif_next);
-	mtx_unlock(&bpf_mtx);
+	BPF_UNLOCK();
=20
 	/*
 	 * Compute the length of the bpf header.  This is not =
necessarily
@@ -2093,10 +2329,9 @@
 }
=20
 /*
- * Detach bpf from an interface.  This involves detaching each =
descriptor
- * associated with the interface, and leaving bd_bif NULL.  Notify each
- * descriptor as it's detached so that any sleepers wake up and get
- * ENXIO.
+ * Detach bpf from an interface. This involves detaching each =
descriptor
+ * associated with the interface. Notify each descriptor as it's =
detached
+ * so that any sleepers wake up and get ENXIO.
  */
 void
 bpfdetach(struct ifnet *ifp)
@@ -2109,31 +2344,45 @@
 	ndetached =3D 0;
 #endif
=20
+	BPF_LOCK();
 	/* Find all bpf_if struct's which reference ifp and detach them. =
*/
 	do {
-		mtx_lock(&bpf_mtx);
 		LIST_FOREACH(bp, &bpf_iflist, bif_next) {
 			if (ifp =3D=3D bp->bif_ifp)
 				break;
 		}
 		if (bp !=3D NULL)
 			LIST_REMOVE(bp, bif_next);
-		mtx_unlock(&bpf_mtx);
=20
 		if (bp !=3D NULL) {
 #ifdef INVARIANTS
 			ndetached++;
 #endif
 			while ((d =3D LIST_FIRST(&bp->bif_dlist)) !=3D =
NULL) {
-				bpf_detachd(d);
+				bpf_detachd_locked(d);
 				BPFD_LOCK(d);
 				bpf_wakeup(d);
 				BPFD_UNLOCK(d);
 			}
-			mtx_destroy(&bp->bif_mtx);
-			free(bp, M_BPF);
+			/* Free writer-only descriptors */
+			while ((d =3D LIST_FIRST(&bp->bif_wlist)) !=3D =
NULL) {
+				bpf_detachd_locked(d);
+				BPFD_LOCK(d);
+				bpf_wakeup(d);
+				BPFD_UNLOCK(d);
+			}
+
+			/*
+			 * Delay freeing bp till interface is detached
+			 * and all routes through this interface are =
removed.
+			 * Mark bp as detached to restrict new =
consumers.
+			 */
+			BPFIF_WLOCK(bp);
+			bp->flags |=3D BPFIF_FLAG_DYING;
+			BPFIF_WUNLOCK(bp);
 		}
 	} while (bp !=3D NULL);
+	BPF_UNLOCK();
=20
 #ifdef INVARIANTS
 	if (ndetached =3D=3D 0)
@@ -2142,6 +2391,37 @@
 }
=20
 /*
+ * Interface departure handler.
+ * Note departure event does not guarantee interface is going down.
+ */
+static void
+bpf_ifdetach(void *arg __unused, struct ifnet *ifp)
+{
+	struct bpf_if *bp;
+
+	BPF_LOCK();
+	if ((bp =3D ifp->if_bpf) =3D=3D NULL) {
+		BPF_UNLOCK();
+		return;
+	}
+
+	/* Check if bpfdetach() was called previously */
+	if ((bp->flags & BPFIF_FLAG_DYING) =3D=3D 0) {
+		BPF_UNLOCK();
+		return;
+	}
+
+	CTR3(KTR_NET, "%s: freeing BPF instance %p for interface %p",
+	    __func__, bp, ifp);
+
+	ifp->if_bpf =3D NULL;
+	BPF_UNLOCK();
+
+	rw_destroy(&bp->bif_lock);
+	free(bp, M_BPF);
+}
+
+/*
  * Get a list of available data link type of the interface.
  */
 static int
@@ -2151,24 +2431,22 @@
 	struct ifnet *ifp;
 	struct bpf_if *bp;
=20
+	BPF_LOCK_ASSERT();
+
 	ifp =3D d->bd_bif->bif_ifp;
 	n =3D 0;
 	error =3D 0;
-	mtx_lock(&bpf_mtx);
 	LIST_FOREACH(bp, &bpf_iflist, bif_next) {
 		if (bp->bif_ifp !=3D ifp)
 			continue;
 		if (bfl->bfl_list !=3D NULL) {
-			if (n >=3D bfl->bfl_len) {
-				mtx_unlock(&bpf_mtx);
+			if (n >=3D bfl->bfl_len)
 				return (ENOMEM);
-			}
 			error =3D copyout(&bp->bif_dlt,
 			    bfl->bfl_list + n, sizeof(u_int));
 		}
 		n++;
 	}
-	mtx_unlock(&bpf_mtx);
 	bfl->bfl_len =3D n;
 	return (error);
 }
@@ -2183,18 +2461,19 @@
 	struct ifnet *ifp;
 	struct bpf_if *bp;
=20
+	BPF_LOCK_ASSERT();
+
 	if (d->bd_bif->bif_dlt =3D=3D dlt)
 		return (0);
 	ifp =3D d->bd_bif->bif_ifp;
-	mtx_lock(&bpf_mtx);
+
 	LIST_FOREACH(bp, &bpf_iflist, bif_next) {
 		if (bp->bif_ifp =3D=3D ifp && bp->bif_dlt =3D=3D dlt)
 			break;
 	}
-	mtx_unlock(&bpf_mtx);
+
 	if (bp !=3D NULL) {
 		opromisc =3D d->bd_promisc;
-		bpf_detachd(d);
 		bpf_attachd(d, bp);
 		BPFD_LOCK(d);
 		reset_d(d);
@@ -2223,6 +2502,11 @@
 	dev =3D make_dev(&bpf_cdevsw, 0, UID_ROOT, GID_WHEEL, 0600, =
"bpf");
 	/* For compatibility */
 	make_dev_alias(dev, "bpf0");
+
+	/* Register interface departure handler */
+	bpf_ifdetach_cookie =3D EVENTHANDLER_REGISTER(
+		    ifnet_departure_event, bpf_ifdetach, NULL,
+		    EVENTHANDLER_PRI_ANY);
 }
=20
 /*
@@ -2236,9 +2520,9 @@
 	struct bpf_if *bp;
 	struct bpf_d *bd;
=20
-	mtx_lock(&bpf_mtx);
+	BPF_LOCK();
 	LIST_FOREACH(bp, &bpf_iflist, bif_next) {
-		BPFIF_LOCK(bp);
+		BPFIF_RLOCK(bp);
 		LIST_FOREACH(bd, &bp->bif_dlist, bd_next) {
 			BPFD_LOCK(bd);
 			bd->bd_rcount =3D 0;
@@ -2249,11 +2533,14 @@
 			bd->bd_zcopy =3D 0;
 			BPFD_UNLOCK(bd);
 		}
-		BPFIF_UNLOCK(bp);
+		BPFIF_RUNLOCK(bp);
 	}
-	mtx_unlock(&bpf_mtx);
+	BPF_UNLOCK();
 }
=20
+/*
+ * Fill filter statistics
+ */
 static void
 bpfstats_fill_xbpf(struct xbpf_d *d, struct bpf_d *bd)
 {
@@ -2261,6 +2548,7 @@
 	bzero(d, sizeof(*d));
 	BPFD_LOCK_ASSERT(bd);
 	d->bd_structsize =3D sizeof(*d);
+	/* XXX: reading should be protected by global lock */
 	d->bd_immediate =3D bd->bd_immediate;
 	d->bd_promisc =3D bd->bd_promisc;
 	d->bd_hdrcmplt =3D bd->bd_hdrcmplt;
@@ -2285,6 +2573,9 @@
 	d->bd_bufmode =3D bd->bd_bufmode;
 }
=20
+/*
+ * Handle `netstat -B' stats request
+ */
 static int
 bpf_stats_sysctl(SYSCTL_HANDLER_ARGS)
 {
@@ -2322,24 +2613,31 @@
 	if (bpf_bpfd_cnt =3D=3D 0)
 		return (SYSCTL_OUT(req, 0, 0));
 	xbdbuf =3D malloc(req->oldlen, M_BPF, M_WAITOK);
-	mtx_lock(&bpf_mtx);
+	BPF_LOCK();
 	if (req->oldlen < (bpf_bpfd_cnt * sizeof(*xbd))) {
-		mtx_unlock(&bpf_mtx);
+		BPF_UNLOCK();
 		free(xbdbuf, M_BPF);
 		return (ENOMEM);
 	}
 	index =3D 0;
 	LIST_FOREACH(bp, &bpf_iflist, bif_next) {
-		BPFIF_LOCK(bp);
+		BPFIF_RLOCK(bp);
+		/* Send writers-only first */
+		LIST_FOREACH(bd, &bp->bif_wlist, bd_next) {
+			xbd =3D &xbdbuf[index++];
+			BPFD_LOCK(bd);
+			bpfstats_fill_xbpf(xbd, bd);
+			BPFD_UNLOCK(bd);
+		}
 		LIST_FOREACH(bd, &bp->bif_dlist, bd_next) {
 			xbd =3D &xbdbuf[index++];
 			BPFD_LOCK(bd);
 			bpfstats_fill_xbpf(xbd, bd);
 			BPFD_UNLOCK(bd);
 		}
-		BPFIF_UNLOCK(bp);
+		BPFIF_RUNLOCK(bp);
 	}
-	mtx_unlock(&bpf_mtx);
+	BPF_UNLOCK();
 	error =3D SYSCTL_OUT(req, xbdbuf, index * sizeof(*xbd));
 	free(xbdbuf, M_BPF);
 	return (error);
Index: net/bpf.h
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
--- net/bpf.h	(revision 239830)
+++ net/bpf.h	(working copy)
@@ -917,14 +917,21 @@
=20
 /*
  * Descriptor associated with each attached hardware interface.
+ * FIXME: this structure is exposed to external callers to speed up
+ * bpf_peers_present() call. However we cover all fields not needed by
+ * this function via BPF_INTERNAL define
  */
 struct bpf_if {
 	LIST_ENTRY(bpf_if)	bif_next;	/* list of all =
interfaces */
 	LIST_HEAD(, bpf_d)	bif_dlist;	/* descriptor list */
+#ifdef BPF_INTERNAL
 	u_int bif_dlt;				/* link layer type */
 	u_int bif_hdrlen;		/* length of header (with =
padding) */
 	struct ifnet *bif_ifp;		/* corresponding interface */
-	struct mtx	bif_mtx;	/* mutex for interface */
+	struct rwlock bif_lock;		/* interface lock */
+	LIST_HEAD(, bpf_d)	bif_wlist;	/* writer-only list */
+	int flags;			/* Interface flags */
+#endif
 };
=20
 void	 bpf_bufheld(struct bpf_d *d);
Index: net/bpf_buffer.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
--- net/bpf_buffer.c	(revision 239830)
+++ net/bpf_buffer.c	(working copy)
@@ -93,21 +93,6 @@
 SYSCTL_INT(_net_bpf, OID_AUTO, maxbufsize, CTLFLAG_RW,
     &bpf_maxbufsize, 0, "Default capture buffer in bytes");
=20
-void
-bpf_buffer_alloc(struct bpf_d *d)
-{
-
-	KASSERT(d->bd_fbuf =3D=3D NULL, ("bpf_buffer_alloc: bd_fbuf !=3D =
NULL"));
-	KASSERT(d->bd_sbuf =3D=3D NULL, ("bpf_buffer_alloc: bd_sbuf !=3D =
NULL"));
-	KASSERT(d->bd_hbuf =3D=3D NULL, ("bpf_buffer_alloc: bd_hbuf !=3D =
NULL"));
-
-	d->bd_fbuf =3D (caddr_t)malloc(d->bd_bufsize, M_BPF, M_WAITOK);
-	d->bd_sbuf =3D (caddr_t)malloc(d->bd_bufsize, M_BPF, M_WAITOK);
-	d->bd_hbuf =3D NULL;
-	d->bd_slen =3D 0;
-	d->bd_hlen =3D 0;
-}
-
 /*
  * Simple data copy to the current kernel buffer.
  */
@@ -183,18 +168,42 @@
 bpf_buffer_ioctl_sblen(struct bpf_d *d, u_int *i)
 {
 	u_int size;
+	caddr_t fbuf, sbuf;
=20
+	size =3D *i;
+	if (size > bpf_maxbufsize)
+		*i =3D size =3D bpf_maxbufsize;
+	else if (size < BPF_MINBUFSIZE)
+		*i =3D size =3D BPF_MINBUFSIZE;
+
+	/* Allocate buffers immediately */
+	fbuf =3D (caddr_t)malloc(size, M_BPF, M_WAITOK);
+	sbuf =3D (caddr_t)malloc(size, M_BPF, M_WAITOK);
+
 	BPFD_LOCK(d);
 	if (d->bd_bif !=3D NULL) {
+		/* Interface already attached, unable to change buffers =
*/
 		BPFD_UNLOCK(d);
+		free(fbuf, M_BPF);
+		free(sbuf, M_BPF);
 		return (EINVAL);
 	}
-	size =3D *i;
-	if (size > bpf_maxbufsize)
-		*i =3D size =3D bpf_maxbufsize;
-	else if (size < BPF_MINBUFSIZE)
-		*i =3D size =3D BPF_MINBUFSIZE;
+
+	/* Free old buffers if set */
+	if (d->bd_fbuf !=3D NULL)
+		free(d->bd_fbuf, M_BPF);
+	if (d->bd_sbuf !=3D NULL)
+		free(d->bd_sbuf, M_BPF);
+
+	/* Fill in new data */
 	d->bd_bufsize =3D size;
+	d->bd_fbuf =3D fbuf;
+	d->bd_sbuf =3D sbuf;
+
+	d->bd_hbuf =3D NULL;
+	d->bd_slen =3D 0;
+	d->bd_hlen =3D 0;
+
 	BPFD_UNLOCK(d);
 	return (0);
 }
Index: net/bpf_buffer.h
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
--- net/bpf_buffer.h	(revision 239830)
+++ net/bpf_buffer.h	(working copy)
@@ -36,7 +36,6 @@
 #error "no user-serviceable parts inside"
 #endif
=20
-void	bpf_buffer_alloc(struct bpf_d *d);
 void	bpf_buffer_append_bytes(struct bpf_d *d, caddr_t buf, u_int =
offset,
 	    void *src, u_int len);
 void	bpf_buffer_append_mbuf(struct bpf_d *d, caddr_t buf, u_int =
offset,
Index: net/bpfdesc.h
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
--- net/bpfdesc.h	(revision 239830)
+++ net/bpfdesc.h	(working copy)
@@ -79,6 +79,7 @@
 	u_char		bd_promisc;	/* true if listening =
promiscuously */
 	u_char		bd_state;	/* idle, waiting, or timed out =
*/
 	u_char		bd_immediate;	/* true to return on packet =
arrival */
+	u_char		bd_writer;	/* non-zero if d is writer-only =
*/
 	int		bd_hdrcmplt;	/* false to fill in src lladdr =
automatically */
 	int		bd_direction;	/* select packet direction */
 	int		bd_feedback;	/* true to feed back sent =
packets */
@@ -86,7 +87,7 @@
 	int		bd_sig;		/* signal to send upon packet =
reception */
 	struct sigio *	bd_sigio;	/* information for async I/O */
 	struct selinfo	bd_sel;		/* bsd select info */
-	struct mtx	bd_mtx;		/* mutex for this descriptor */
+	struct mtx	bd_lock;	/* per-descriptor lock */
 	struct callout	bd_callout;	/* for BPF timeouts with select =
*/
 	struct label	*bd_label;	/* MAC label for descriptor */
 	u_int64_t	bd_fcount;	/* number of packets which =
matched filter */
@@ -105,10 +106,16 @@
 #define BPF_WAITING	1		/* waiting for read timeout in =
select */
 #define BPF_TIMED_OUT	2		/* read timeout has expired in =
select */
=20
-#define BPFD_LOCK(bd)		mtx_lock(&(bd)->bd_mtx)
-#define BPFD_UNLOCK(bd)		mtx_unlock(&(bd)->bd_mtx)
-#define BPFD_LOCK_ASSERT(bd)	mtx_assert(&(bd)->bd_mtx, MA_OWNED)
+#define BPFD_LOCK(bd)		mtx_lock(&(bd)->bd_lock)
+#define BPFD_UNLOCK(bd)		mtx_unlock(&(bd)->bd_lock)
+#define BPFD_LOCK_ASSERT(bd)	mtx_assert(&(bd)->bd_lock, MA_OWNED)
=20
+#define BPF_PID_REFRESH(bd, td)	(bd)->bd_pid =3D =
(td)->td_proc->p_pid
+#define BPF_PID_REFRESH_CUR(bd)	(bd)->bd_pid =3D =
curthread->td_proc->p_pid
+
+#define BPF_LOCK()		mtx_lock(&bpf_mtx)
+#define BPF_UNLOCK()		mtx_unlock(&bpf_mtx)
+#define BPF_LOCK_ASSERT()	mtx_assert(&bpf_mtx, MA_OWNED)
 /*
  * External representation of the bpf descriptor
  */
@@ -143,7 +150,11 @@
 	u_int64_t	bd_spare[4];
 };
=20
-#define BPFIF_LOCK(bif)		mtx_lock(&(bif)->bif_mtx)
-#define BPFIF_UNLOCK(bif)	mtx_unlock(&(bif)->bif_mtx)
+#define BPFIF_RLOCK(bif)	rw_rlock(&(bif)->bif_lock)
+#define BPFIF_RUNLOCK(bif)	rw_runlock(&(bif)->bif_lock)
+#define BPFIF_WLOCK(bif)	rw_wlock(&(bif)->bif_lock)
+#define BPFIF_WUNLOCK(bif)	rw_wunlock(&(bif)->bif_lock)
=20
+#define BPFIF_FLAG_DYING	1	/* Reject new bpf consumers */
+
 #endif
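
The reworked bpf_buffer_ioctl_sblen() above allocates the new buffers before
taking the descriptor lock and only installs them under the lock. A minimal
userland sketch of that ordering follows. It assumes POSIX threads and
malloc(3); struct capdesc, EX_MINBUF and EX_MAXBUF are hypothetical names
standing in for struct bpf_d, BPF_MINBUFSIZE and bpf_maxbufsize, and the
attached flag stands in for the d->bd_bif != NULL check.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

#define EX_MINBUF	32	/* hypothetical lower bound */
#define EX_MAXBUF	4096	/* hypothetical upper bound */

/* Hypothetical stand-in for the buffer-related part of struct bpf_d. */
struct capdesc {
	pthread_mutex_t	 lock;
	int		 attached;	/* non-zero once bound to an interface */
	size_t		 bufsize;
	char		*fbuf;
	char		*sbuf;
};

static void
capdesc_init(struct capdesc *d)
{
	pthread_mutex_init(&d->lock, NULL);
	d->attached = 0;
	d->bufsize = 0;
	d->fbuf = d->sbuf = NULL;
}

/* Returns 0 on success; *sizep is updated with the clamped size. */
static int
capdesc_set_buflen(struct capdesc *d, size_t *sizep)
{
	char *fbuf, *sbuf;
	size_t size;

	assert(d != NULL && sizep != NULL);
	size = *sizep;
	if (size > EX_MAXBUF)
		size = EX_MAXBUF;
	else if (size < EX_MINBUF)
		size = EX_MINBUF;
	*sizep = size;

	/* Allocation may block, so do it before taking the lock. */
	fbuf = malloc(size);
	sbuf = malloc(size);
	if (fbuf == NULL || sbuf == NULL) {
		free(fbuf);
		free(sbuf);
		return (ENOMEM);
	}

	pthread_mutex_lock(&d->lock);
	if (d->attached) {
		/* Too late to change buffers: undo the allocation. */
		pthread_mutex_unlock(&d->lock);
		free(fbuf);
		free(sbuf);
		return (EINVAL);
	}
	/* Drop any previous buffers and install the new pair. */
	free(d->fbuf);
	free(d->sbuf);
	d->fbuf = fbuf;
	d->sbuf = sbuf;
	d->bufsize = size;
	pthread_mutex_unlock(&d->lock);
	return (0);
}
```

The cost of this ordering is a wasted allocation when the descriptor turns
out to be attached already; in exchange, nothing that can sleep runs while
the lock is held.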


From owner-freebsd-net@FreeBSD.ORG  Wed Oct 17 16:50:05 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id DAD3154C
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 16:50:05 +0000 (UTC)
 (envelope-from ming.fu@netsweeper.com)
Received: from mail.netsweeper.com (mail.netsweeper.com [216.171.98.87])
 by mx1.freebsd.org (Postfix) with ESMTP id A81658FC0C
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 16:50:05 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by mail.netsweeper.com (Postfix) with ESMTP id 8EC8F1E410FB
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 12:41:45 -0400 (EDT)
X-Virus-Scanned: amavisd-new at mail.netsweeper.com
Received: from mail.netsweeper.com ([127.0.0.1])
 by localhost (mail.netsweeper.com [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id YEur7bOfJTI7 for <freebsd-net@freebsd.org>;
 Wed, 17 Oct 2012 12:41:45 -0400 (EDT)
Received: from [192.168.4.202] (unknown [216.171.98.93])
 by mail.netsweeper.com (Postfix) with ESMTPSA id 66A661E410A6
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 12:41:45 -0400 (EDT)
Message-ID: <507EDFCE.3060702@netsweeper.com>
Date: Wed, 17 Oct 2012 12:41:50 -0400
From: Ming Fu <ming.fu@netsweeper.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:16.0) Gecko/20121010 Thunderbird/16.0.1
MIME-Version: 1.0
To: freebsd-net@freebsd.org
Subject: netmap NETMAP_SW_RING or NETMAP_HW_RING
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2012 16:50:05 -0000

Hi,

What is the difference between NETMAP_SW_RING and NETMAP_HW_RING?
When using netmap_open() in the example code to create a netmap fdesc, 
one of these two needs to be ORed into the queue ID in order to bind only 
one RX queue.

The netmap code is updated from FreeBSD RELENG_9.

Regards,
Ming



From owner-freebsd-net@FreeBSD.ORG  Wed Oct 17 17:33:07 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 081E0337
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 17:33:07 +0000 (UTC)
 (envelope-from ming.fu@netsweeper.com)
Received: from mail.netsweeper.com (mail.netsweeper.com [216.171.98.87])
 by mx1.freebsd.org (Postfix) with ESMTP id C206C8FC08
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 17:33:06 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by mail.netsweeper.com (Postfix) with ESMTP id EEFED1E40E75
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 13:32:54 -0400 (EDT)
X-Virus-Scanned: amavisd-new at mail.netsweeper.com
Received: from mail.netsweeper.com ([127.0.0.1])
 by localhost (mail.netsweeper.com [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id ZCdPBCdv4-zz for <freebsd-net@freebsd.org>;
 Wed, 17 Oct 2012 13:32:54 -0400 (EDT)
Received: from [192.168.4.202] (unknown [216.171.98.93])
 by mail.netsweeper.com (Postfix) with ESMTPSA id BD9901E40DD9
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 13:32:54 -0400 (EDT)
Message-ID: <507EEBD1.3020703@netsweeper.com>
Date: Wed, 17 Oct 2012 13:33:05 -0400
From: Ming Fu <ming.fu@netsweeper.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:16.0) Gecko/20121010 Thunderbird/16.0.1
MIME-Version: 1.0
To: freebsd-net@freebsd.org
Subject: Re: netmap NETMAP_SW_RING or NETMAP_HW_RING
References: <507EDFCE.3060702@netsweeper.com>
In-Reply-To: <507EDFCE.3060702@netsweeper.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2012 17:33:07 -0000

After a second look at the netmap_open() code, I believe 
NETMAP_HW_RING is the right choice.

My next problem is receiving packets.

The program spins off 8 threads, and each thread tries to bind to one of 
the queues (0-7) on an igb card. The result depends on how I call 
netmap_open(). If I call
     netmap_open(&me, id /*| NETMAP_HW_RING */, 1);

the program is able to receive packets, but of course each thread 
receives packets from all 8 queues.

If I call
     netmap_open(&me, id | NETMAP_HW_RING, 1);
none of the threads is able to receive any packets.

I came across the following lines of code in nm_util.c:
     } else if (ringid & NETMAP_HW_RING) {
         D("XXX check multiple threads");
What does this suggest? Is there any special requirement for 
multi-threaded programs?
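
To spell out how I currently read the ringid argument, here is a small 
sketch. It is my assumption, not verified against the kernel: the EX_ 
constants mirror what I believe netmap.h defines for NETMAP_HW_RING, 
NETMAP_SW_RING and NETMAP_RING_MASK, and ex_decode_ringid() and 
ex_ringid_for_thread() are hypothetical helpers that are not part of netmap.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; check them against netmap.h in your own tree. */
#define EX_HW_RING	0x4000	/* low bits select one hardware ring */
#define EX_SW_RING	0x2000	/* select the host (software) ring */
#define EX_RING_MASK	0x0fff	/* ring number */

enum ex_binding { EX_BIND_ALL_HW, EX_BIND_ONE_HW, EX_BIND_HOST };

struct ex_ringsel {
	enum ex_binding	kind;
	unsigned	ring;	/* meaningful only for EX_BIND_ONE_HW */
};

/* Decode a ringid value the way I think the flags are meant to be used. */
static struct ex_ringsel
ex_decode_ringid(uint16_t ringid)
{
	struct ex_ringsel sel = { EX_BIND_ALL_HW, 0 };

	if (ringid & EX_SW_RING) {
		sel.kind = EX_BIND_HOST;
	} else if (ringid & EX_HW_RING) {
		sel.kind = EX_BIND_ONE_HW;
		sel.ring = ringid & EX_RING_MASK;
	}
	/* With neither flag set, the descriptor covers all hardware rings. */
	return (sel);
}

/* What each of the 8 threads would pass to bind to its own queue. */
static uint16_t
ex_ringid_for_thread(unsigned id)
{
	assert(id <= EX_RING_MASK);
	return ((uint16_t)(id | EX_HW_RING));
}
```

If this reading is right, it would explain why passing the bare id makes 
every thread see packets from all 8 queues: without a flag the low bits 
are not treated as a ring number.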

Regards,
Ming


On 10/17/2012 12:41 PM, Ming Fu wrote:
> Hi,
>
> What is the difference between NETMAP_SW_RING and NETMAP_HW_RING?
> When using netmap_open() in the example code to create a netmap fdesc, 
> one of these two needs to be ORed into the queue ID in order to bind 
> only one RX queue.
>
> The netmap code is updated from FreeBSD RELENG_9.
>
> Regards,
> Ming


From owner-freebsd-net@FreeBSD.ORG  Wed Oct 17 18:46:11 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id DD99FBA;
 Wed, 17 Oct 2012 18:46:10 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net
 [IPv6:2001:470:1f10:75::2])
 by mx1.freebsd.org (Postfix) with ESMTP id A71D38FC0A;
 Wed, 17 Oct 2012 18:46:10 +0000 (UTC)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
 by bigwig.baldwin.cx (Postfix) with ESMTPSA id 0FF4DB91E;
 Wed, 17 Oct 2012 14:46:10 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: freebsd-net@freebsd.org
Subject: Re: ixgbe & if_igb RX ring locking
Date: Wed, 17 Oct 2012 10:06:51 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p20; KDE/4.5.5; amd64; ; )
References: <5079A9A1.4070403@FreeBSD.org> <507C1960.6050500@FreeBSD.org>
 <201210150904.27567.jhb@freebsd.org>
In-Reply-To: <201210150904.27567.jhb@freebsd.org>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201210171006.51214.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
 (bigwig.baldwin.cx); Wed, 17 Oct 2012 14:46:10 -0400 (EDT)
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>,
 Luigi Rizzo <rizzo@iet.unipi.it>, Jack Vogel <jfvogel@gmail.com>,
 net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2012 18:46:11 -0000

On Monday, October 15, 2012 9:04:27 am John Baldwin wrote:
> On Monday, October 15, 2012 10:10:40 am Alexander V. Chernikov wrote:
> > On 13.10.2012 23:24, Jack Vogel wrote:
> > > On Sat, Oct 13, 2012 at 11:22 AM, Luigi Rizzo<rizzo@iet.unipi.it>  wrote:
> > 
> > >>
> > >> one option could be (same as it is done in the timer
> > >> routine in dummynet) to build a list of all the packets
> > >> that need to be sent to if_input(), and then call
> > >> if_input with the entire list outside the lock.
> > >>
> > >> It would be even easier if we modify the various *_input()
> > >> routines to handle a list of mbufs instead of just one.
> > 
> > Bulk processing is generally a good idea we probably should implement.
> > Probably starting from driver queue ending with marked mbufs 
> > (OURS/forward/legacy processing (appletalk and similar))?
> > 
> > This can minimize an impact for all
> > locks on RX side:
> > L2
> > * rx PFIL hook
> > L3 (both IPv4 and IPv6)
> > * global IF_ADDR_RLOCK (currently commented out)
> > * Per-interface ADDR_RLOCK
> > * PFIL hook
> > 
> >  From the first glance, there can be problems with:
> > * Increased latency (we should have some kind of rx_process_limit), but 
> > still
> > * reader locks being acquired for much longer amount of time
> > 
> > >>
> > >> cheers
> > >> luigi
> > >>
> > >> Very interesting idea Luigi, will have to get that some thought.
> > >
> > > Jack
> > 
> > Returning to original post topic:
> > 
> > Given
> > 1) we are currently binding ixgbe ithreads to CPU cores
> > 2) RX queue lock is used by (indirectly) in only 2 places:
> > a) ISR routine (msix or legacy irq)
> > b) taskqueue routine which is scheduled if some packets remains in RX 
> > queue and rx_process_limit ended OR we need something to TX
> > 
> > 3) in practice taskqueue routine is a nightmare for many people since 
> > there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after 
> > some traffic burst happens: once it is called it starts to schedule 
> > itself more and more replacing original ISR routine. Additionally, 
> > increasing rx_process_limit does not help since taskqueue is called with 
> > the same limit. Finally, currently netisr taskq threads are not bound to 
> > any CPU which makes the process even more uncontrollable.
> 
> I think part of the problem here is that the taskqueue in ixgbe(4) is
> bogusly rescheduled for TX handling.  Instead, ixgbe_msix_que() should
> just start transmitting packets directly.
> 
> I fixed this in igb(4) here:
> 
> http://svnweb.freebsd.org/base?view=revision&revision=233708
> 
> You can try this for ixgbe(4).  It also comments out a spurious taskqueue 
> reschedule from the watchdog handler that might also lower the taskqueue 
> usage.  You can try changing that #if 0 to an #if 1 to test just the txeof 
> changes:

Is anyone able to test this btw to see if it improves things on ixgbe at all?
(I don't have any ixgbe hardware.)
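
A minimal userland sketch in C of the batching idea quoted above (collect
packets while holding the RX lock, then call the input routine once the lock
has been dropped). This is not driver code. struct pkt, struct rx_ring,
stack_input() and rx_drain_batched() are hypothetical names; a pthread mutex
stands in for the RX ring lock and stack_input() stands in for if_input().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf: only the link field matters here. */
struct pkt {
	struct pkt *next;
	int         id;
};

/* Hypothetical RX ring: a lock plus a queue of received packets. */
struct rx_ring {
	pthread_mutex_t lock;
	struct pkt     *head;	/* packets waiting in the ring */
};

/* Stand-in for if_input(); it only counts what it was handed. */
int delivered;

void
stack_input(struct pkt *p)
{
	(void)p;
	delivered++;
}

/*
 * Detach up to `limit` packets while holding the ring lock, then hand
 * them to the stack after the lock is dropped.  Returns the number of
 * packets delivered.
 */
int
rx_drain_batched(struct rx_ring *r, int limit)
{
	struct pkt *batch = NULL, **tail = &batch, *p;
	int n = 0;

	pthread_mutex_lock(&r->lock);
	while (n < limit && (p = r->head) != NULL) {
		r->head = p->next;
		p->next = NULL;
		*tail = p;		/* append to the private batch list */
		tail = &p->next;
		n++;
	}
	pthread_mutex_unlock(&r->lock);

	/* The ring lock is not held while the stack processes the batch. */
	while ((p = batch) != NULL) {
		batch = p->next;
		p->next = NULL;
		stack_input(p);
	}
	return (n);
}

/* Queue five packets, then drain twice with a limit of three. */
int
rx_demo(void)
{
	struct pkt pkts[5];
	struct rx_ring r;
	int i, first, second;

	pthread_mutex_init(&r.lock, NULL);
	r.head = NULL;
	for (i = 4; i >= 0; i--) {
		pkts[i].id = i;
		pkts[i].next = r.head;
		r.head = &pkts[i];
	}
	first = rx_drain_batched(&r, 3);
	second = rx_drain_batched(&r, 3);
	assert(r.head == NULL);
	pthread_mutex_destroy(&r.lock);
	return (first * 10 + second);
}
```

The limit argument plays the role of rx_process_limit: it bounds how long
the lock is held per pass and how much latency a single batch can add, which
are the two concerns raised in the quoted discussion.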

-- 
John Baldwin

From owner-freebsd-net@FreeBSD.ORG  Wed Oct 17 19:29:52 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id E56CAF0;
 Wed, 17 Oct 2012 19:29:52 +0000 (UTC)
 (envelope-from eric@vangyzen.net)
Received: from aussmtpmrkpc120.us.dell.com (aussmtpmrkpc120.us.dell.com
 [143.166.82.159])
 by mx1.freebsd.org (Postfix) with ESMTP id A79AB8FC16;
 Wed, 17 Oct 2012 19:29:52 +0000 (UTC)
X-Loopcount0: from 64.238.244.148
X-IronPort-AV: E=Sophos;i="4.80,602,1344229200"; 
   d="scan'208";a="7329248"
Message-ID: <507F072F.6080707@vangyzen.net>
Date: Wed, 17 Oct 2012 14:29:51 -0500
From: Eric van Gyzen <eric@vangyzen.net>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:14.0) Gecko/20120822 Thunderbird/14.0
MIME-Version: 1.0
To: freebsd-net@freebsd.org, "Bjoern A. Zeeb" <bz@FreeBSD.org>
Subject: Re: Tahi "Redirected On-link" Test Case
References: <507DD768.7000803@vangyzen.net>
In-Reply-To: <507DD768.7000803@vangyzen.net>
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2012 19:29:53 -0000

On 10/16/2012 16:53, Eric van Gyzen wrote:
> I am currently working on a fix for kern/152791 (Tahi IPv6 Ready Logo 
> test case #169: Redirected On-link).  I have a change to add the host 
> route, and it works for test case 169.  However, the route never gets 
> removed, so all subsequent test cases fail (because they first verify 
> that the Node Under Test thinks the destination is off-link).
>
> How/When should I clean up the route?
>
> Each test case runs a common cleanup procedure, which sends a RA with 
> a Router Lifetime of zero and a Prefix Information option with a Valid 
> Lifetime and Preferred Lifetime of zero.  This deprecates the NUT's 
> only global address, by which it reaches the newly-on-link 
> destination.  However, it doesn't seem rational to use this event to 
> trigger a cleanup of the route.
>
> The only other trigger I can imagine is the transition of the 
> Destination Cache entry to the Stale state.  That also doesn't make 
> complete sense.  (It probably also wouldn't work, since in my testing, 
> test case 170 begins immediately after test case 169 ends.)
>
> I'm assuming a certain amount of familiarity (on your part) with these 
> tests.  If you'd like, I can explain them in more detail.
>
> Thanks in advance for any advice,
>
> Eric

Ignore me.  I was working with incomplete information.  The common 
cleanup procedure also includes packets that trigger NUD to delete the 
entry from the Neighbor Cache.  So, the now-obvious answer to my 
question is to delete the route on this event.

Eric

From owner-freebsd-net@FreeBSD.ORG  Wed Oct 17 21:55:40 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 5EDD7532;
 Wed, 17 Oct 2012 21:55:40 +0000 (UTC)
 (envelope-from guy.helmer@gmail.com)
Received: from mail-ie0-f182.google.com (mail-ie0-f182.google.com
 [209.85.223.182])
 by mx1.freebsd.org (Postfix) with ESMTP id EF2968FC0A;
 Wed, 17 Oct 2012 21:55:39 +0000 (UTC)
Received: by mail-ie0-f182.google.com with SMTP id k10so16544767iea.13
 for <multiple recipients>; Wed, 17 Oct 2012 14:55:39 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=content-type:mime-version:subject:from:in-reply-to:date:cc
 :content-transfer-encoding:message-id:references:to:x-mailer;
 bh=+VPzoL2TvdeKPfNnawln5jqNkmyBemf0SdUXnawLnFU=;
 b=dc9CcfcZkWorCHQz0jYbalZ/Abbkc1ThndzcXkgem+YVc4iCI2B87NSN0g2ncB9fBt
 t2zo45pXmpONFezcy73vn36oz5G1iqwehaRg6iH/9fAGSe8bmiZskZgCslLlBdaxVVp3
 DWuZ23DVaTP1n4JU4Px3HpLAequHW3R5VUEb+pWAkw0rtswaIfpdlxykxuIqwh9ds68P
 hMYYp+C+zIy6SpReSmBf7YdJpPVQldTJ9s8L3O6Ax8YeL4jcs8nm2+Ew082UDTUzg6Fr
 l2PHgC+2tN3wbFpGAMjiJDs6eVGk4n8aR2s31172dKdeQyma4IwQIgNMaldmb9JGhcrO
 IqMg==
Received: by 10.50.135.38 with SMTP id pp6mr3069810igb.36.1350510939583;
 Wed, 17 Oct 2012 14:55:39 -0700 (PDT)
Received: from [192.168.221.99] ([216.81.189.9])
 by mx.google.com with ESMTPS id yf6sm12681211igb.0.2012.10.17.14.55.37
 (version=TLSv1/SSLv3 cipher=OTHER);
 Wed, 17 Oct 2012 14:55:38 -0700 (PDT)
Content-Type: text/plain; charset=windows-1252
Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\))
Subject: Re: 8.3: kernel panic in bpf.c catchpacket()
From: Guy Helmer <guy.helmer@gmail.com>
In-Reply-To: <381E3EEC-7EDB-428B-A724-434443E51A53@gmail.com>
Date: Wed, 17 Oct 2012 16:55:40 -0500
Content-Transfer-Encoding: quoted-printable
Message-Id: <FA1F07D4-C6F3-4F55-B084-749366C0DAE6@gmail.com>
References: <4B5399BF-4EE0-4182-8297-3BB97C4AA884@gmail.com>
 <59F9A36E-3DB2-4F6F-BB2A-A4C9DA76A70C@gmail.com>
 <5075C05E.9070800@FreeBSD.org>
 <1EDA1615-2CDE-405A-A725-AF7CC7D3E273@gmail.com>
 <381E3EEC-7EDB-428B-A724-434443E51A53@gmail.com>
To: "Alexander V. Chernikov" <melifaro@freebsd.org>
X-Mailer: Apple Mail (2.1499)
Cc: freebsd-net@freebsd.org, FreeBSD Stable <freebsd-stable@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 17 Oct 2012 21:55:40 -0000

On Oct 17, 2012, at 8:58 AM, Guy Helmer <guy.helmer@gmail.com> wrote:

> On Oct 12, 2012, at 8:54 AM, Guy Helmer <guy.helmer@gmail.com> wrote:
>
>>
>> On Oct 10, 2012, at 1:37 PM, Alexander V. Chernikov
>> <melifaro@freebsd.org> wrote:
>>
>>> On 10.10.2012 00:36, Guy Helmer wrote:
>>>>
>>>> On Oct 8, 2012, at 8:09 AM, Guy Helmer <guy.helmer@gmail.com> wrote:
>>>>
>>>>> I'm seeing a consistent new kernel panic in FreeBSD 8.3:
>>>>> I'm not seeing how bd_sbuf would be NULL here. Any ideas?
>>>>
>>>> Since I've not had any replies, I hope nobody minds if I reply with
>>>> more information.
>>>>
>>>> This panic seems to be occasionally triggered now that my userland
>>>> code is changing the packet filter a while after the bpf device has
>>>> been opened and an initial packet filter was set (previously, my
>>>> code did not change the filter after it was initially set).
>>>>
>>>> I'm focusing on bpf_setf() since that seems to be the place that
>>>> could be tickling a problem, and I see that bpf_setf() calls
>>>> reset_d(d) to clear the hold buffer. I have manually verified that
>>>> the BPFD lock is held during the call to reset_d(), and the lock is
>>>> held every other place that the buffers are manipulated, so I
>>>> haven't been able to find any place that seems vulnerable to losing
>>>> one of the bpf buffers. Still searching, but any help would be
>>>> appreciated.
>>>
>>> Can you please check this code on -current?
>>> Locking has changed quite significantly some time ago, so there is a
>>> good chance that you can get rid of this panic (or discover a
>>> different one which is really "new") :).
>>
>> I'm not ready to run this app on current, so I have merged revs
>> 229898, 233937, 233938, 233946, 235744, 235745, 235746, 235747, 236231,
>> 236251, 236261, 236262, 236559, and 236806 to my 8.3 checkout to get
>> code that should be virtually identical to current without the
>> timestamp changes.
>>
>> Unfortunately, I have only been able to trigger the panic in my test
>> lab once -- so I'm not sure whether a lack of problems with the updated
>> code will be indicative of likely success in the field where this has
>> been triggered regularly at some sites...
>>
>> Thanks,
>> Guy
>>
>
>
> FWIW, I was able to trigger the panic with the original 8.3 code again
> in my test lab. With these changes resulting from merging the revs
> mentioned above, I have not seen any panics in my test lab setup in two
> days of load testing, and AFAIK, packet capturing seems to be working
> fine.

Of course, the test system panicked with the same problem in
catchpacket() an hour after I wrote this.

(kgdb) where
#0  doadump () at pcpu.h:224
#1  0xffffffff804c8280 in boot (howto=260) at ../../../kern/kern_shutdown.c:441
#2  0xffffffff804c8703 in panic (fmt=0x0) at ../../../kern/kern_shutdown.c:614
#3  0xffffffff8069ffad in trap_fatal (frame=0xffffffff809edbc0, eva=Variable "eva" is not available.
)
    at ../../../amd64/amd64/trap.c:825
#4  0xffffffff806a02e1 in trap_pfault (frame=0xffffff800014a8a0, usermode=0)
    at ../../../amd64/amd64/trap.c:741
#5  0xffffffff806a06bf in trap (frame=0xffffff800014a8a0)
    at ../../../amd64/amd64/trap.c:478
#6  0xffffffff80687cd4 in calltrap () at ../../../amd64/amd64/exception.S:228
#7  0xffffffff8069dc06 in bcopy () at ../../../amd64/amd64/support.S:124
#8  0xffffffff8056f69e in catchpacket (d=0xffffff005aaaf000,
    pkt=0xffffff0001f46200 "", pktlen=522, snaplen=Variable "snaplen" is not available.
) at ../../../net/bpf.c:2240
#9  0xffffffff8056fc66 in bpf_mtap (bp=0xffffff0001be8c80,
    m=0xffffff0001f46200) at ../../../net/bpf.c:2064
#10 0xffffffff80579c15 in ether_input (ifp=0xffffff0001b73800,
    m=0xffffff0001f46200) at ../../../net/if_ethersubr.c:635
#11 0xffffffff802b694a in em_rxeof (rxr=0xffffff0001bca200, count=99, done=0x0)
    at ../../../dev/e1000/if_em.c:4404
#12 0xffffffff802b6db8 in em_handle_que (context=Variable "context" is not available.
)
    at ../../../dev/e1000/if_em.c:1494
#13 0xffffffff80506d85 in taskqueue_run_locked (queue=0xffffff0001be1580)
    at ../../../kern/subr_taskqueue.c:250
---Type <return> to continue, or q <return> to quit---q
Quit
(kgdb) frame 8
#8  0xffffffff8056f69e in catchpacket (d=0xffffff005aaaf000,
    pkt=0xffffff0001f46200 "", pktlen=522, snaplen=Variable "snaplen" is not available.
) at ../../../net/bpf.c:2240
warning: Source file is more recent than executable.

2240		bpf_append_bytes(d, d->bd_sbuf, curlen, &hdr, sizeof(hdr));
(kgdb) print *d
$1 = {bd_next = {le_next = 0xffffff0023fff400, le_prev = 0xffffff0001be8c90},
  bd_sbuf = 0x0, bd_hbuf = 0xffffff8000ffa000 "??~P", bd_fbuf = 0x0,
  bd_slen = 0, bd_hlen = 2068, bd_bufsize = 8388608,
  bd_bif = 0xffffff0001be8c80, bd_rtout = 1, bd_rfilter = 0xffffff0001e6f580,
  bd_wfilter = 0x0, bd_bfilter = 0x0, bd_rcount = 7, bd_dcount = 0,
  bd_promisc = 1 '\001', bd_state = 0 '\0', bd_immediate = 1 '\001',
  bd_writer = 0 '\0', bd_hdrcmplt = 1, bd_direction = 1, bd_feedback = 0,
  bd_async = 0, bd_sig = 23, bd_sigio = 0x0, bd_sel = {si_tdlist = {
      tqh_first = 0x0, tqh_last = 0x0}, si_note = {kl_list = {
        slh_first = 0x0}, kl_lock = 0xffffffff80497920 <knlist_mtx_lock>,
      kl_unlock = 0xffffffff804978f0 <knlist_mtx_unlock>,
      kl_assert_locked = 0xffffffff804945d0 <knlist_mtx_assert_locked>,
      kl_assert_unlocked = 0xffffffff804945e0 <knlist_mtx_assert_unlocked>,
      kl_lockarg = 0xffffff005aaaf0d8}, si_mtx = 0x0}, bd_lock = {
    lock_object = {lo_name = 0xffffff0001a5fce0 "bpf", lo_flags = 16973824,
      lo_data = 0, lo_witness = 0x0}, mtx_lock = 18446742974226712768},
  bd_callout = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0,
        tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0,
    c_lock = 0xffffff005aaaf0d8, c_flags = 0, c_cpu = 0}, bd_label = 0x0,
  bd_fcount = 7, bd_pid = 89517, bd_locked = 0, bd_bufmode = 1, bd_wcount = 0,
  bd_wfcount = 0, bd_wdcount = 0, bd_zcopy = 0, bd_compat32 = 0 '\0'}

Now, I am thinking the malloc() of the sbuf is failing but not sure
how/why -- I thought malloc(size, M_BPF, M_WAITOK) should not fail?
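
On the allocation question: malloc(9) documents that an M_WAITOK request
sleeps rather than returning NULL, and the dump above shows a valid bd_hbuf
next to bd_sbuf = 0x0 and bd_fbuf = 0x0, so a failed allocation does not
obviously fit. As a way to reason about the dumped state, here is a small
userland model in C of the store/hold/free buffer rotation. It is a
simplified sketch, not the kernel code: struct bpf_model, rotate_buffers(),
matches_dump() and rotation_demo() are hypothetical names, and the model
only assumes that a rotation moves the store buffer to hold and the free
buffer to store. It shows one sequence that ends in the same shape as the
dump (a rotation performed while no free buffer is available). How the
kernel could reach that point is not established here.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical model of a bpf descriptor's three buffers. */
struct bpf_model {
	char  *sbuf;	/* store buffer: new packets are appended here */
	char  *hbuf;	/* hold buffer: full, waiting to be read       */
	char  *fbuf;	/* free buffer: spare for the next rotation    */
	size_t slen;	/* bytes used in the store buffer              */
	size_t hlen;	/* bytes used in the hold buffer               */
};

/* Assumed rotation: store becomes hold, free becomes store. */
void
rotate_buffers(struct bpf_model *d)
{
	d->hbuf = d->sbuf;
	d->hlen = d->slen;
	d->sbuf = d->fbuf;
	d->slen = 0;
	d->fbuf = NULL;
}

/* Returns 1 for the shape seen in the dump: hold set, store and free NULL. */
int
matches_dump(const struct bpf_model *d)
{
	return (d->sbuf == NULL && d->fbuf == NULL &&
	    d->hbuf != NULL && d->slen == 0);
}

/* Rotate once with a free buffer available, then once without one. */
int
rotation_demo(void)
{
	static char a[16], b[16];
	struct bpf_model d = { a, NULL, b, 8, 0 };
	int ok_after_first, bad_after_second;

	rotate_buffers(&d);		/* free buffer b becomes the store */
	assert(d.hbuf == a && d.sbuf == b);
	ok_after_first = !matches_dump(&d);

	d.slen = 4;			/* pretend a packet was stored */
	rotate_buffers(&d);		/* no free buffer left: store is NULL */
	bad_after_second = matches_dump(&d);

	return (ok_after_first && bad_after_second);
}
```

If the model is a fair approximation, the interesting question becomes which
code path rotates (or hands back) buffers without a free buffer in place,
for example while a reader still owns the hold buffer.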

Guy

From owner-freebsd-net@FreeBSD.ORG  Thu Oct 18 01:41:06 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 84C7EE24;
 Thu, 18 Oct 2012 01:41:06 +0000 (UTC)
 (envelope-from yongari@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135])
 by mx1.freebsd.org (Postfix) with ESMTP id 534F38FC0A;
 Thu, 18 Oct 2012 01:41:06 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9I1f6Cf052543;
 Thu, 18 Oct 2012 01:41:06 GMT
 (envelope-from yongari@freefall.freebsd.org)
Received: (from yongari@localhost)
 by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9I1f53s052539;
 Thu, 18 Oct 2012 01:41:05 GMT (envelope-from yongari)
Date: Thu, 18 Oct 2012 01:41:05 GMT
Message-Id: <201210180141.q9I1f53s052539@freefall.freebsd.org>
To: nevzorovn@gmail.com, yongari@FreeBSD.org, freebsd-net@FreeBSD.org,
 yongari@FreeBSD.org
From: yongari@FreeBSD.org
Subject: Re: kern/171520: [alc] alc network driver + tso + vlan does not work.
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Oct 2012 01:41:06 -0000

Synopsis: [alc] alc network driver + tso + vlan does not work.

State-Changed-From-To: open->feedback
State-Changed-By: yongari
State-Changed-When: Thu Oct 18 01:40:32 UTC 2012
State-Changed-Why: 
I'm pretty sure TSO over VLAN worked well on my box.
Could you share your exact network configuration and let me know
how I can reproduce it?


Responsible-Changed-From-To: freebsd-net->yongari
Responsible-Changed-By: yongari
Responsible-Changed-When: Thu Oct 18 01:40:32 UTC 2012
Responsible-Changed-Why: 
Grab.

http://www.freebsd.org/cgi/query-pr.cgi?pr=171520

From owner-freebsd-net@FreeBSD.ORG  Thu Oct 18 01:49:54 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 03C3B232
 for <freebsd-net@freebsd.org>; Thu, 18 Oct 2012 01:49:53 +0000 (UTC)
 (envelope-from julian@freebsd.org)
Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16])
 by mx1.freebsd.org (Postfix) with ESMTP id BE2398FC0C
 for <freebsd-net@freebsd.org>; Thu, 18 Oct 2012 01:49:53 +0000 (UTC)
Received: from JRE-MBP-2.local (c-50-143-149-146.hsd1.ca.comcast.net
 [50.143.149.146]) (authenticated bits=0)
 by vps1.elischer.org (8.14.5/8.14.5) with ESMTP id q9I1nkd0057884
 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO);
 Wed, 17 Oct 2012 18:49:47 -0700 (PDT)
 (envelope-from julian@freebsd.org)
Message-ID: <507F603A.4050808@freebsd.org>
Date: Wed, 17 Oct 2012 18:49:46 -0700
From: Julian Elischer <julian@freebsd.org>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:16.0) Gecko/20121010 Thunderbird/16.0.1
MIME-Version: 1.0
To: Mariano Cediel <mariano.cediel@gmail.com>
Subject: Re: one physical interface -> n virtual interfaces
References: <CAB-01r59bep6pt96sYfT=QNV+SRum=1xVESfOU86Ohevd=Zs2A@mail.gmail.com>
In-Reply-To: <CAB-01r59bep6pt96sYfT=QNV+SRum=1xVESfOU86Ohevd=Zs2A@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Oct 2012 01:49:54 -0000

On 10/16/12 1:35 PM, Mariano Cediel wrote:
> How do I create, from one physical interface, n virtual interfaces
> that behave like real ones in every respect, each with its own MAC
> address, so that we can do NAT, etc. on each one individually?
>
> I need one external interface to have 2 public IPs, and I'll do NAT
> on each <interface> individually (with ipfw and divert);
> each of them has its own gateway.
>
> A little help to start researching .....
> Greetings.
>
> (sorry for my poor english)

use netgraph ng_eiface, ng_bridge and ng_ether




From owner-freebsd-net@FreeBSD.ORG  Thu Oct 18 02:05:28 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 156FB590;
 Thu, 18 Oct 2012 02:05:28 +0000 (UTC)
 (envelope-from yongari@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135])
 by mx1.freebsd.org (Postfix) with ESMTP id C529B8FC12;
 Thu, 18 Oct 2012 02:05:27 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9I25RAh053831;
 Thu, 18 Oct 2012 02:05:27 GMT
 (envelope-from yongari@freefall.freebsd.org)
Received: (from yongari@localhost)
 by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9I25QLO053825;
 Thu, 18 Oct 2012 02:05:26 GMT (envelope-from yongari)
Date: Thu, 18 Oct 2012 02:05:26 GMT
Message-Id: <201210180205.q9I25QLO053825@freefall.freebsd.org>
To: rich@enterprisesystems.net, yongari@FreeBSD.org, freebsd-net@FreeBSD.org, 
 yongari@FreeBSD.org
From: yongari@FreeBSD.org
Subject: Re: kern/169399: [re] RealTek RTL8168/8111/8111c network interface
 not working
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Oct 2012 02:05:28 -0000

Synopsis: [re] RealTek RTL8168/8111/8111c network interface not working

State-Changed-From-To: open->closed
State-Changed-By: yongari
State-Changed-When: Thu Oct 18 02:03:08 UTC 2012
State-Changed-Why: 
Support for RTL8168E-VL was added after releasing 7.4-RELEASE.
Update to latest stable/7 or use newer FreeBSD releases.


Responsible-Changed-From-To: freebsd-net->yongari
Responsible-Changed-By: yongari
Responsible-Changed-When: Thu Oct 18 02:03:08 UTC 2012
Responsible-Changed-Why: 
Track.

http://www.freebsd.org/cgi/query-pr.cgi?pr=169399

From owner-freebsd-net@FreeBSD.ORG  Thu Oct 18 02:12:00 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id E65388D5;
 Thu, 18 Oct 2012 02:12:00 +0000 (UTC)
 (envelope-from yongari@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135])
 by mx1.freebsd.org (Postfix) with ESMTP id B6BE88FC0C;
 Thu, 18 Oct 2012 02:12:00 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9I2C0ZW054095;
 Thu, 18 Oct 2012 02:12:00 GMT
 (envelope-from yongari@freefall.freebsd.org)
Received: (from yongari@localhost)
 by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9I2C0O7054091;
 Thu, 18 Oct 2012 02:12:00 GMT (envelope-from yongari)
Date: Thu, 18 Oct 2012 02:12:00 GMT
Message-Id: <201210180212.q9I2C0O7054091@freefall.freebsd.org>
To: Felko1982@web.de, yongari@FreeBSD.org, freebsd-net@FreeBSD.org,
 yongari@FreeBSD.org
From: yongari@FreeBSD.org
Subject: Re: kern/161381: [re] RTL8169SC - re0: PHY write failed
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Oct 2012 02:12:01 -0000

Synopsis: [re] RTL8169SC - re0: PHY write failed

State-Changed-From-To: open->feedback
State-Changed-By: yongari
State-Changed-When: Thu Oct 18 02:11:29 UTC 2012
State-Changed-Why: 
Most errors of this kind come from broken hardware or an unstable
power supply. If your re(4) device is a stand-alone PCI controller,
would you firmly reseat the controller and try again?
Knowing how other operating systems work on this device would be
a good idea too.


Responsible-Changed-From-To: freebsd-net->yongari
Responsible-Changed-By: yongari
Responsible-Changed-When: Thu Oct 18 02:11:29 UTC 2012
Responsible-Changed-Why: 
Grab.

http://www.freebsd.org/cgi/query-pr.cgi?pr=161381

From owner-freebsd-net@FreeBSD.ORG  Thu Oct 18 05:10:11 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 67C9476A;
 Thu, 18 Oct 2012 05:10:11 +0000 (UTC)
 (envelope-from yongari@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135])
 by mx1.freebsd.org (Postfix) with ESMTP id 34EBF8FC17;
 Thu, 18 Oct 2012 05:10:11 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9I5AB4i007998;
 Thu, 18 Oct 2012 05:10:11 GMT
 (envelope-from yongari@freefall.freebsd.org)
Received: (from yongari@localhost)
 by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9I5AA0A007994;
 Thu, 18 Oct 2012 05:10:10 GMT (envelope-from yongari)
Date: Thu, 18 Oct 2012 05:10:10 GMT
Message-Id: <201210180510.q9I5AA0A007994@freefall.freebsd.org>
To: universite@ukr.net, yongari@FreeBSD.org, freebsd-net@FreeBSD.org,
 yongari@FreeBSD.org
From: yongari@FreeBSD.org
Subject: Re: kern/168152: [xl] Periodically,
 the network card xl0 stops working -- xl0: watchdog timeout
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Oct 2012 05:10:11 -0000

Synopsis: [xl] Periodically, the network card xl0 stops working -- xl0: watchdog timeout

State-Changed-From-To: open->feedback
State-Changed-By: yongari
State-Changed-When: Thu Oct 18 05:08:25 UTC 2012
State-Changed-Why: 
http://people.freebsd.org/~yongari/xl/xl.watchdog.diff
Would you give the above patch a spin and let me know how it goes?


Responsible-Changed-From-To: freebsd-net->yongari
Responsible-Changed-By: yongari
Responsible-Changed-When: Thu Oct 18 05:08:25 UTC 2012
Responsible-Changed-Why: 
Grab.

http://www.freebsd.org/cgi/query-pr.cgi?pr=168152

From owner-freebsd-net@FreeBSD.ORG  Thu Oct 18 06:02:16 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 9227A247
 for <freebsd-net@freebsd.org>; Thu, 18 Oct 2012 06:02:16 +0000 (UTC)
 (envelope-from saeedeh.motlagh@gmail.com)
Received: from mail-qa0-f47.google.com (mail-qa0-f47.google.com
 [209.85.216.47])
 by mx1.freebsd.org (Postfix) with ESMTP id C79D38FC0A
 for <freebsd-net@freebsd.org>; Thu, 18 Oct 2012 06:02:15 +0000 (UTC)
Received: by mail-qa0-f47.google.com with SMTP id i29so1164408qaf.13
 for <freebsd-net@freebsd.org>; Wed, 17 Oct 2012 23:02:15 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:from:date:message-id:subject:to
 :cc:content-type;
 bh=xfiqw0/qqWm0Az66P0owQNuVuKKI4ieuT6Tg70uL67c=;
 b=RLFTDBQrRD+HmIXqdzlK0i1JILk1wXg83uDVvpoFbqYZZCr5gxbXrTqf9WA/BV6w/I
 +pLt9VxG7nee7zydTxiz6xiP793ba+6Ziq9Mf0KQ3xzbIU0kwWNQlXD2jB8w+QGrG1/Y
 WzkNrQqixuSceAf4QYhgEYrvGRBtPCfBjGH+bXMVmSJkCZARxiBWXfLBRe5RzJ8Opnv+
 r5xwE+GcyZXJMjBEtpLANI3gy4tvGjFli8Nebpe+Gz1WbzmBcz5GqJ6uzk/setAzEYuz
 Jy9m9oPCaXwrxxdXTUu8SYzYT9w1SkaFzZgJdKB2gnU7Wac8Z4BBrZuoI+7AVsGMcSmy
 QgFA==
Received: by 10.229.172.10 with SMTP id j10mr9252192qcz.97.1350540135160; Wed,
 17 Oct 2012 23:02:15 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.49.105.71 with HTTP; Wed, 17 Oct 2012 23:01:34 -0700 (PDT)
In-Reply-To: <CAARSjE15=zkw0V3hWFgmt0drnAOzB+UZ9TGZo+4Z9UcgNLPG4A@mail.gmail.com>
References: <CAARSjE15=zkw0V3hWFgmt0drnAOzB+UZ9TGZo+4Z9UcgNLPG4A@mail.gmail.com>
From: saeedeh motlagh <saeedeh.motlagh@gmail.com>
Date: Thu, 18 Oct 2012 09:31:34 +0330
Message-ID: <CAN+S=WAsRiEUwODtBCTQd+T03Rc+_8qBJjbRgzr+XRAREg+5Ng@mail.gmail.com>
Subject: Re: TCP_DROP_SYNFIN kernel option side effects?!
To: h bagade <bagadeh@gmail.com>
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Oct 2012 06:02:16 -0000

I know that in RFC 1644 TCP packets have both the SYN and FIN flags set for
some testing purposes, but I am not sure whether this has been used for
anything else.

On Tue, Oct 16, 2012 at 6:57 PM, h bagade <bagadeh@gmail.com> wrote:

> Hi all,
>
> I need to add this option to the kernel in order to defeat Nmap
> OS fingerprinting. My system runs as a web server and is also the
> gateway on the network.
> I want to know whether setting this option has any side effects on other
> parts of the system. Is there any situation in which both the SYN and FIN
> bits are set in TCP packets? Is that a normal situation?
>
> Any help or comments are really appreciated.
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>

From owner-freebsd-net@FreeBSD.ORG  Thu Oct 18 13:20:20 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 7F017E12
 for <net@freebsd.org>; Thu, 18 Oct 2012 13:20:20 +0000 (UTC)
 (envelope-from oppermann@networx.ch)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id D06F58FC0C
 for <net@freebsd.org>; Thu, 18 Oct 2012 13:20:19 +0000 (UTC)
Received: (qmail 13162 invoked from network); 18 Oct 2012 14:59:13 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <oppermann@networx.ch>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <rizzo@iet.unipi.it>; 18 Oct 2012 14:59:13 -0000
Message-ID: <5080020E.1010603@networx.ch>
Date: Thu, 18 Oct 2012 15:20:14 +0200
From: Andre Oppermann <oppermann@networx.ch>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:16.0) Gecko/20121010 Thunderbird/16.0.1
MIME-Version: 1.0
To: Luigi Rizzo <rizzo@iet.unipi.it>
Subject: Re: ixgbe & if_igb RX ring locking
References: <5079A9A1.4070403@FreeBSD.org>
 <20121013182223.GA73341@onelab2.iet.unipi.it>
In-Reply-To: <20121013182223.GA73341@onelab2.iet.unipi.it>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>,
 Jack Vogel <jfvogel@gmail.com>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Oct 2012 13:20:20 -0000

On 13.10.2012 20:22, Luigi Rizzo wrote:
> On Sat, Oct 13, 2012 at 09:49:21PM +0400, Alexander V. Chernikov wrote:
>> Hello list!
>>
>>
>> Packets receiving code for both ixgbe and if_igb looks like the following:
>>
>>
>> ixgbe_msix_que
>>
>> -- ixgbe_rxeof()
>>     {
>>        IXGBE_RX_LOCK(rxr);
>>          while
>>          {
>>             get_packet;
>>
>>             -- ixgbe_rx_input()
>>                {
>>                   ++ IXGBE_RX_UNLOCK(rxr);
>>                   if_input(packet);
>>                   ++ IXGBE_RX_LOCK(rxr);
>>                }
>>
>>          }
>>        IXGBE_RX_UNLOCK(rxr);
>>      }
>>
>> Lines marked with ++ appeared in r209068(igb) and r217593(ixgbe).
>>
>> These lines probably do LORs masking (if any) well.
>> However, such change introduce quite significant performance drop:
>>
>> On my routing setup (nearly the same from previous -Intel 10G thread in
>> -net) adding lock/unlock causes 2.8MPPS decrease to 2.3MPPS which is
>> nearly 20%.
>
> one option could be (same as it is done in the timer
> routine in dummynet) to build a list of all the packets
> that need to be sent to if_input(), and then call
> if_input with the entire list outside the lock.
>
> It would be even easier if we modify the various *_input()
> routines to handle a list of mbufs instead of just one.

Not really. You'd just run into tons of layering complexity.
Somewhere the decomposition and serialization have to be done.

Perhaps the right place is to dequeue a batch of packets from
the HW ring and then have a task/thread send them up the stack
one by one.

-- 
Andre


From owner-freebsd-net@FreeBSD.ORG  Thu Oct 18 13:26:59 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 49E8FFEF
 for <freebsd-net@freebsd.org>; Thu, 18 Oct 2012 13:26:59 +0000 (UTC)
 (envelope-from oppermann@networx.ch)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id CF4D78FC12
 for <freebsd-net@freebsd.org>; Thu, 18 Oct 2012 13:26:58 +0000 (UTC)
Received: (qmail 13207 invoked from network); 18 Oct 2012 15:05:53 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <oppermann@networx.ch>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <bagadeh@gmail.com>; 18 Oct 2012 15:05:53 -0000
Message-ID: <5080039E.9070202@networx.ch>
Date: Thu, 18 Oct 2012 15:26:54 +0200
From: Andre Oppermann <oppermann@networx.ch>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:16.0) Gecko/20121010 Thunderbird/16.0.1
MIME-Version: 1.0
To: h bagade <bagadeh@gmail.com>
Subject: Re: TCP_DROP_SYNFIN kernel option side effects?!
References: <CAARSjE15=zkw0V3hWFgmt0drnAOzB+UZ9TGZo+4Z9UcgNLPG4A@mail.gmail.com>
In-Reply-To: <CAARSjE15=zkw0V3hWFgmt0drnAOzB+UZ9TGZo+4Z9UcgNLPG4A@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Oct 2012 13:26:59 -0000

On 16.10.2012 17:27, h bagade wrote:
> Hi all,
>
> I need to add this option to the kernel in order to defeat Nmap
> OS fingerprinting. My system runs as a web server and is also the
> gateway on the network.
> I want to know whether setting this option has any side effects on other
> parts of the system. Is there any situation in which both the SYN and FIN
> bits are set in TCP packets? Is that a normal situation?

A segment with both SYN and FIN set is not normal. Enabling TCP_DROP_SYNFIN
is not RFC compliant, but it doesn't cause any problems.
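
As a rough illustration of what such a filter tests for, here is a
minimal user-space sketch. The flag values follow the TCP header bit
layout; tcp_is_synfin() is a hypothetical helper written for this
example and is not the kernel's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* TCP header flag bits, as laid out in the TCP header. */
#define TH_FIN 0x01
#define TH_SYN 0x02

/*
 * Hypothetical helper: returns true only when a segment carries both
 * SYN and FIN, regardless of any other flag bits that are set.
 */
static bool
tcp_is_synfin(uint8_t th_flags)
{
	return ((th_flags & (TH_SYN | TH_FIN)) == (TH_SYN | TH_FIN));
}
```

A filter built on this check would drop only segments for which
tcp_is_synfin() returns true; ordinary SYN segments that open a
connection and FIN segments that close one do not match.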

-- 
Andre


From owner-freebsd-net@FreeBSD.ORG  Thu Oct 18 14:09:40 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 0C01A3A7
 for <net@freebsd.org>; Thu, 18 Oct 2012 14:09:40 +0000 (UTC)
 (envelope-from oppermann@networx.ch)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id 6B1758FC0A
 for <net@freebsd.org>; Thu, 18 Oct 2012 14:09:37 +0000 (UTC)
Received: (qmail 13412 invoked from network); 18 Oct 2012 15:48:31 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <oppermann@networx.ch>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <vijju.singh@gmail.com>; 18 Oct 2012 15:48:31 -0000
Message-ID: <50800D9D.1090705@networx.ch>
Date: Thu, 18 Oct 2012 16:09:33 +0200
From: Andre Oppermann <oppermann@networx.ch>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:16.0) Gecko/20121010 Thunderbird/16.0.1
MIME-Version: 1.0
To: Vijay Singh <vijju.singh@gmail.com>
Subject: Re: A small cleanup patch
References: <CALCNsJTWhVaV-2U1J5EtN2-6iyi_CGgCCrBVZ3VO1H0JLUKfvQ@mail.gmail.com>
In-Reply-To: <CALCNsJTWhVaV-2U1J5EtN2-6iyi_CGgCCrBVZ3VO1H0JLUKfvQ@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Oct 2012 14:09:40 -0000

On 05.10.2012 01:21, Vijay Singh wrote:
> Folks, I came up with this while going through the lltable code.

Thank you. I just purged a large number of stray spl* calls from the
net*/* directories. This stuff won't be backported to 9-STABLE,
though.

-- 
Andre

> kong@[/u/vijay/bsd/CODE/cur/sys]# svn diff net/if.c
> Index: net/if.c
> ===================================================================
> --- net/if.c	(revision 241169)
> +++ net/if.c	(working copy)
> @@ -691,12 +691,9 @@
>   if_attachdomain(void *dummy)
>   {
>   	struct ifnet *ifp;
> -	int s;
>
> -	s = splnet();
>   	TAILQ_FOREACH(ifp, &V_ifnet, if_link)
>   		if_attachdomain1(ifp);
> -	splx(s);
>   }
>   SYSINIT(domainifattach, SI_SUB_PROTO_IFATTACHDOMAIN, SI_ORDER_SECOND,
>       if_attachdomain, NULL);
> @@ -705,22 +702,17 @@
>   if_attachdomain1(struct ifnet *ifp)
>   {
>   	struct domain *dp;
> -	int s;
>
> -	s = splnet();
> -
>   	/*
>   	 * Since dp->dom_ifattach calls malloc() with M_WAITOK, we
>   	 * cannot lock ifp->if_afdata initialization, entirely.
>   	 */
>   	if (IF_AFDATA_TRYLOCK(ifp) == 0) {
> -		splx(s);
>   		return;
>   	}
>   	if (ifp->if_afdata_initialized >= domain_init_status) {
>   		IF_AFDATA_UNLOCK(ifp);
> -		splx(s);
> -		printf("if_attachdomain called more than once on %s\n",
> +		log(LOG_WARNING, "if_attachdomain called more than once on %s\n",
>   		    ifp->if_xname);
>   		return;
>   	}
> @@ -734,8 +726,6 @@
>   			ifp->if_afdata[dp->dom_family] =
>   			    (*dp->dom_ifattach)(ifp);
>   	}
> -
> -	splx(s);
>   }
>
>   /*


From owner-freebsd-net@FreeBSD.ORG  Thu Oct 18 17:00:43 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id AE74DA32
 for <net@freebsd.org>; Thu, 18 Oct 2012 17:00:43 +0000 (UTC)
 (envelope-from kostikbel@gmail.com)
Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200])
 by mx1.freebsd.org (Postfix) with ESMTP id 254668FC16
 for <net@freebsd.org>; Thu, 18 Oct 2012 17:00:42 +0000 (UTC)
Received: from skuns.kiev.zoral.com.ua (localhost [127.0.0.1])
 by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id q9IH0pFU081891;
 Thu, 18 Oct 2012 20:00:51 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1])
 by deviant.kiev.zoral.com.ua (8.14.5/8.14.5) with ESMTP id q9IH0dnf087071;
 Thu, 18 Oct 2012 20:00:39 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
Received: (from kostik@localhost)
 by deviant.kiev.zoral.com.ua (8.14.5/8.14.5/Submit) id q9IH0duu087070;
 Thu, 18 Oct 2012 20:00:39 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to
 kostikbel@gmail.com using -f
Date: Thu, 18 Oct 2012 20:00:39 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Andre Oppermann <oppermann@networx.ch>
Subject: Re: A small cleanup patch
Message-ID: <20121018170039.GS35915@deviant.kiev.zoral.com.ua>
References: <CALCNsJTWhVaV-2U1J5EtN2-6iyi_CGgCCrBVZ3VO1H0JLUKfvQ@mail.gmail.com>
 <50800D9D.1090705@networx.ch>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="WQ/nOZqcqYGY8uZi"
Content-Disposition: inline
In-Reply-To: <50800D9D.1090705@networx.ch>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua
X-Virus-Status: Clean
X-Spam-Status: No, score=-4.0 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00
 autolearn=ham version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
 skuns.kiev.zoral.com.ua
Cc: net@freebsd.org, Vijay Singh <vijju.singh@gmail.com>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Oct 2012 17:00:43 -0000


--WQ/nOZqcqYGY8uZi
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Oct 18, 2012 at 04:09:33PM +0200, Andre Oppermann wrote:
> On 05.10.2012 01:21, Vijay Singh wrote:
> > Folks, I came up with this while going through the lltable code.
>
> Thank you. I just purged a larger number of stray spl* from the
> net*/* directories. This stuff won't be backported to 9-STABLE
> though.

Why? What is the value of having the fossils in the actively maintained
stable tree?

--WQ/nOZqcqYGY8uZi
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (FreeBSD)

iEYEARECAAYFAlCANbcACgkQC3+MBN1Mb4iUggCg6D6yJWMjj5xWSZ8XBJpSdgMZ
uIMAnieSq76yERz2C9XACrD1e+aTKJ0g
=yG+l
-----END PGP SIGNATURE-----

--WQ/nOZqcqYGY8uZi--

From owner-freebsd-net@FreeBSD.ORG  Thu Oct 18 18:09:17 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 3C4B06DA;
 Thu, 18 Oct 2012 18:09:17 +0000 (UTC)
 (envelope-from jfvogel@gmail.com)
Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com
 [209.85.212.54])
 by mx1.freebsd.org (Postfix) with ESMTP id C52A18FC0A;
 Thu, 18 Oct 2012 18:09:16 +0000 (UTC)
Received: by mail-vb0-f54.google.com with SMTP id v11so11593029vbm.13
 for <multiple recipients>; Thu, 18 Oct 2012 11:09:16 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=6xBmignxqv8e3WBsSItXa3RxiYoe1WRrJZ/N6CKnPh8=;
 b=K1V3I5qLhH+QphFD5pRElKzR4kE1ApJgsWEi6ap6vVIJtJbdh3AOAm1HrysxL65y9k
 Ya61cXCXaV+8GXg/W+1xRnrxyJeNOvv8st6kyynPZRVBZbXcRwXVnJnptYf14IxjuXbD
 ZiPPDUmmnXqhu4lZOHCCBQQBBRJ9tLYcQv8pGJPUgJ9pJaWk9keuhd3QWpdGRnw2lkMH
 g2R6409yOY9pQmFxjeK7nLmmx3vcUVwYDrFtY8K9BOL7nuesl9nrmrbCC9pMOHDIJHvW
 vAdzGew8OGbuMxTmbOmRtfpdU+SRJGHEoFYJmBmiF684fa87yLBj0QIQQqtoiXdiSQX2
 +RVg==
MIME-Version: 1.0
Received: by 10.52.34.37 with SMTP id w5mr13188544vdi.86.1350583756089; Thu,
 18 Oct 2012 11:09:16 -0700 (PDT)
Received: by 10.58.68.8 with HTTP; Thu, 18 Oct 2012 11:09:16 -0700 (PDT)
In-Reply-To: <5080020E.1010603@networx.ch>
References: <5079A9A1.4070403@FreeBSD.org>
 <20121013182223.GA73341@onelab2.iet.unipi.it>
 <5080020E.1010603@networx.ch>
Date: Thu, 18 Oct 2012 11:09:16 -0700
Message-ID: <CAFOYbcnT0tT0S6Q8Eos9oEDoSSZZZSr8GD5f9oO4bgpo-t6vNg@mail.gmail.com>
Subject: Re: ixgbe & if_igb RX ring locking
From: Jack Vogel <jfvogel@gmail.com>
To: Andre Oppermann <oppermann@networx.ch>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>,
 Luigi Rizzo <rizzo@iet.unipi.it>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Oct 2012 18:09:17 -0000

On Thu, Oct 18, 2012 at 6:20 AM, Andre Oppermann <oppermann@networx.ch> wrote:

> On 13.10.2012 20:22, Luigi Rizzo wrote:
>
>> On Sat, Oct 13, 2012 at 09:49:21PM +0400, Alexander V. Chernikov wrote:
>>
>>> Hello list!
>>>
>>>
>>> Packets receiving code for both ixgbe and if_igb looks like the
>>> following:
>>>
>>>
>>> ixgbe_msix_que
>>>
>>> -- ixgbe_rxeof()
>>>     {
>>>        IXGBE_RX_LOCK(rxr);
>>>          while
>>>          {
>>>             get_packet;
>>>
>>>             -- ixgbe_rx_input()
>>>                {
>>>                   ++ IXGBE_RX_UNLOCK(rxr);
>>>                   if_input(packet);
>>>                   ++ IXGBE_RX_LOCK(rxr);
>>>                }
>>>
>>>          }
>>>        IXGBE_RX_UNLOCK(rxr);
>>>      }
>>>
>>> Lines marked with ++ appeared in r209068(igb) and r217593(ixgbe).
>>>
>>> These lines probably do LORs masking (if any) well.
>>> However, such change introduce quite significant performance drop:
>>>
>>> On my routing setup (nearly the same from previous -Intel 10G thread in
>>> -net) adding lock/unlock causes 2.8MPPS decrease to 2.3MPPS which is
>>> nearly 20%.
>>>
>>
>> one option could be (same as it is done in the timer
>> routine in dummynet) to build a list of all the packets
>> that need to be sent to if_input(), and then call
>> if_input with the entire list outside the lock.
>>
>> It would be even easier if we modify the various *_input()
>> routines to handle a list of mbufs instead of just one.
>>
>
> Not really. You'd just run into tons of layering complexity.
> Somewhere the decomposition and serialization has to be done.
>
> Perhaps the right place is to dequeue a batch of packets from
> the HW ring and then have a task/thread send it up the stack
> one by one.
>

I was thinking about how to code this, perhaps something like what I did
with the refresh routine. In any case, I will experiment with it.

Jack

From owner-freebsd-net@FreeBSD.ORG  Thu Oct 18 18:43:56 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id C1973F9C;
 Thu, 18 Oct 2012 18:43:56 +0000 (UTC)
 (envelope-from luigi@onelab2.iet.unipi.it)
Received: from onelab2.iet.unipi.it (onelab2.iet.unipi.it [131.114.59.238])
 by mx1.freebsd.org (Postfix) with ESMTP id 7D7788FC1B;
 Thu, 18 Oct 2012 18:43:56 +0000 (UTC)
Received: by onelab2.iet.unipi.it (Postfix, from userid 275)
 id 005EF73029; Thu, 18 Oct 2012 21:04:21 +0200 (CEST)
Date: Thu, 18 Oct 2012 21:04:20 +0200
From: Luigi Rizzo <rizzo@iet.unipi.it>
To: Andre Oppermann <oppermann@networx.ch>
Subject: Re: ixgbe & if_igb RX ring locking
Message-ID: <20121018190420.GB98348@onelab2.iet.unipi.it>
References: <5079A9A1.4070403@FreeBSD.org>
 <20121013182223.GA73341@onelab2.iet.unipi.it> <5080020E.1010603@networx.ch>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5080020E.1010603@networx.ch>
User-Agent: Mutt/1.4.2.3i
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>,
 Jack Vogel <jfvogel@gmail.com>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Oct 2012 18:43:56 -0000

On Thu, Oct 18, 2012 at 03:20:14PM +0200, Andre Oppermann wrote:
> On 13.10.2012 20:22, Luigi Rizzo wrote:
> >On Sat, Oct 13, 2012 at 09:49:21PM +0400, Alexander V. Chernikov wrote:
> >>Hello list!
> >>
> >>
> >>Packets receiving code for both ixgbe and if_igb looks like the following:
> >>
> >>
> >>ixgbe_msix_que
> >>
> >>-- ixgbe_rxeof()
> >>    {
> >>       IXGBE_RX_LOCK(rxr);
> >>         while
> >>         {
> >>            get_packet;
> >>
> >>            -- ixgbe_rx_input()
> >>               {
> >>                  ++ IXGBE_RX_UNLOCK(rxr);
> >>                  if_input(packet);
> >>                  ++ IXGBE_RX_LOCK(rxr);
> >>               }
> >>
> >>         }
> >>       IXGBE_RX_UNLOCK(rxr);
> >>     }
> >>
> >>Lines marked with ++ appeared in r209068(igb) and r217593(ixgbe).
> >>
> >>These lines probably do LORs masking (if any) well.
> >>However, such change introduce quite significant performance drop:
> >>
> >>On my routing setup (nearly the same from previous -Intel 10G thread in
> >>-net) adding lock/unlock causes 2.8MPPS decrease to 2.3MPPS which is
> >>nearly 20%.
> >
> >one option could be (same as it is done in the timer
> >routine in dummynet) to build a list of all the packets
> >that need to be sent to if_input(), and then call
> >if_input with the entire list outside the lock.
> >
> >It would be even easier if we modify the various *_input()
> >routines to handle a list of mbufs instead of just one.
> 
> Not really. You'd just run into tons of layering complexity.
> Somewhere the decomposition and serialization has to be done.
> 
> Perhaps the right place is to dequeue a batch of packets from
> the HW ring and then have a task/thread send it up the stack
> one by one.

this is exactly what the dummynet code does -- collect a batch
of packets, release the lock, and then loop over the batch to feed
ip_input/ip_output or other things.

My point was, however, that instead of having to write an explicit
loop in all clients of ether_input(), we could make ether_input()
itself (or ether_input_batch(), does not really matter)
able to handle the batch and in turn call the main function.
The frontend then could apply some smarts to try and group
packets (not too different from TCP Receive Side Coalescing/Large
Receive Offload) within the batch, and this could be done
without locking/unlocking on each packet.

Furthermore, chances are that you can pass batches from one layer
to the next one in this way, something that wouldn't work if your
workflow can only handle one packet at a time.

And finally, the good thing is that implementation can be
incremental and on a case-by-case basis.

The VALE bridge uses this strategy
http://info.iet.unipi.it/~luigi/vale/
and moving batches instead of single packets brings the
forwarding rate from 4 to 17 Mpps.
At high rates, it really pays off.
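
A minimal user-space sketch of the batching pattern described above,
assuming a pthread mutex in place of the driver's RX lock and a linked
list in place of the hardware ring. The names rx_ring, pkt,
rx_drain_batch() and record_input() are hypothetical and exist only for
this illustration; this is not the dummynet or VALE code.

```c
#include <pthread.h>
#include <stddef.h>

#define BATCH_MAX 32

struct pkt {
	struct pkt *next;
	int id;
};

/* Hypothetical RX ring: a locked, singly linked list of pending packets. */
struct rx_ring {
	pthread_mutex_t lock;
	struct pkt *head;
};

/* Example consumer standing in for if_input(): records delivery order. */
static int delivered[BATCH_MAX];
static int ndelivered;

static void
record_input(struct pkt *p)
{
	if (ndelivered < BATCH_MAX)
		delivered[ndelivered++] = p->id;
}

/*
 * Detach up to BATCH_MAX packets while holding the ring lock, then hand
 * them to input() one by one with the lock released. Returns the number
 * of packets delivered.
 */
static int
rx_drain_batch(struct rx_ring *rxr, void (*input)(struct pkt *))
{
	struct pkt *batch = NULL, **tail = &batch, *p;
	int n = 0;

	pthread_mutex_lock(&rxr->lock);
	while (n < BATCH_MAX && (p = rxr->head) != NULL) {
		rxr->head = p->next;
		p->next = NULL;
		*tail = p;		/* append, preserving arrival order */
		tail = &p->next;
		n++;
	}
	pthread_mutex_unlock(&rxr->lock);

	/* One lock round trip per batch; delivery runs unlocked. */
	while ((p = batch) != NULL) {
		batch = p->next;
		p->next = NULL;
		input(p);
	}
	return (n);
}
```

The lock is taken once per batch rather than once per packet, and the
consumer never runs with the ring lock held, which is the property the
per-packet unlock/lock pair was trying to provide.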

cheers
luigi

> -- 
> Andre
> 

From owner-freebsd-net@FreeBSD.ORG  Thu Oct 18 21:00:30 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id D942C351
 for <freebsd-net@freebsd.org>; Thu, 18 Oct 2012 21:00:30 +0000 (UTC)
 (envelope-from rafaelhfaria@cenadigital.com.br)
Received: from mail-pb0-f54.google.com (mail-pb0-f54.google.com
 [209.85.160.54])
 by mx1.freebsd.org (Postfix) with ESMTP id A35A18FC1A
 for <freebsd-net@freebsd.org>; Thu, 18 Oct 2012 21:00:30 +0000 (UTC)
Received: by mail-pb0-f54.google.com with SMTP id rp8so9418181pbb.13
 for <freebsd-net@freebsd.org>; Thu, 18 Oct 2012 14:00:30 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=cenadigital.com.br; s=mail;
 h=mime-version:from:date:message-id:subject:to:content-type;
 bh=29S3XOJqxqmp3+6lnfKAIWouKSzQu5OsydZjCeuhDtU=;
 b=GV+wbq2YwOA59oefF4s1fUdeIGoK5uC20NlrAsLQQVZF9Ax1UdRIToZiYGMvpmQ6pf
 GbGIZWVTLFQcDug2TNbVz0owDeD2By3Do5JXUh3oSW+6aOGd9gbJmjttMiP3/X9Na7bW
 0ARbMjOjpWRDhCi0+RzebRJqWPPRAG8oSZLZc=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20120113;
 h=mime-version:from:date:message-id:subject:to:content-type
 :x-gm-message-state;
 bh=29S3XOJqxqmp3+6lnfKAIWouKSzQu5OsydZjCeuhDtU=;
 b=IHZs3xP5cJ+iL71D0K3aT8abD6bKTkOPMBCZt5EAat73DVVzKWu072nl7sCFLI+sqm
 a23rvNbVQOcyBm/gzZ8r5PjOrvQ+NKwIwNVKFFAZK8Isyco0nmglzek7hGyclPqN+eh5
 advIU7Y523rLoF0zImtstFEFd+JaTy/4+fwicPo/5P6Hjxhm+HQvo3ZlCJdCZFYZPGwo
 cJcHActh/4decbvTd2LBh3lnkeBimphMtCMUL0Sr/Ly0B8HPkjKmZMzttWkicfvtUXwU
 CqxNjrRe2WGtx96SNc+P9szY5xYLBGT+36DK9POVnaToSu4BTU1AjyFGKNvVQCEwM+JA
 92VQ==
Received: by 10.68.189.138 with SMTP id gi10mr69387676pbc.165.1350594029708;
 Thu, 18 Oct 2012 14:00:29 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.66.11.166 with HTTP; Thu, 18 Oct 2012 13:59:49 -0700 (PDT)
From: Rafael Henrique Faria <rafaelhfaria@cenadigital.com.br>
Date: Thu, 18 Oct 2012 17:59:49 -0300
Message-ID: <CAOxoo31Ujzumi+hbZbRgY3EivY6dLwvP5nAZOOptgAEV9iKgzg@mail.gmail.com>
Subject: CARP on vSwitch
To: freebsd-net@freebsd.org
X-Gm-Message-State: ALoCoQmy4hdv0DoZOq3zaNyKSghNoECon+XyCkPpTlveAWM1r7qh/QPNccsmjSBTs7FVMgJQGAD2
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Oct 2012 21:00:31 -0000

Hi, I'm trying to use CARP on two FreeBSD servers in an ESX environment, but
it's not working.

The problem is that every frame sent by CARP comes back to the same host.
This is an old problem:

http://www.mail-archive.com/freebsd-net@freebsd.org/msg30562.html

A patch already exists, but it is 3 years old and has not been committed yet.
Is there any reason for this?
I have always used freebsd-update to keep the servers updated, and I don't
want to compile a kernel just to use CARP.

Does anyone have a suggestion or a fix for this problem?

Thanks in advance.

-- 
Rafael Henrique da Silva Faria

From owner-freebsd-net@FreeBSD.ORG  Thu Oct 18 21:41:17 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id ABF11DFA
 for <net@freebsd.org>; Thu, 18 Oct 2012 21:41:17 +0000 (UTC)
 (envelope-from david@catwhisker.org)
Received: from albert.catwhisker.org (m209-73.dsl.rawbw.com [198.144.209.73])
 by mx1.freebsd.org (Postfix) with ESMTP id 640678FC17
 for <net@freebsd.org>; Thu, 18 Oct 2012 21:41:17 +0000 (UTC)
Received: from albert.catwhisker.org (localhost [127.0.0.1])
 by albert.catwhisker.org (8.14.5/8.14.5) with ESMTP id q9ILfB65018156
 for <net@freebsd.org>; Thu, 18 Oct 2012 14:41:11 -0700 (PDT)
 (envelope-from david@albert.catwhisker.org)
Received: (from david@localhost)
 by albert.catwhisker.org (8.14.5/8.14.5/Submit) id q9ILfBgB018155
 for net@freebsd.org; Thu, 18 Oct 2012 14:41:11 -0700 (PDT)
 (envelope-from david)
Date: Thu, 18 Oct 2012 14:41:10 -0700
From: David Wolfskill <david@catwhisker.org>
To: net@freebsd.org
Subject: Seeing "rn_addmask: mask impossibly already in tree" on console
Message-ID: <20121018214110.GE1817@albert.catwhisker.org>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="gRZ38brEgCoUohoa"
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: net@freebsd.org, David Wolfskill <david@catwhisker.org>
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Oct 2012 21:41:17 -0000


--gRZ38brEgCoUohoa
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Running:

FreeBSD g1-227.catwhisker.org 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #273 241679M: Thu Oct 18 04:52:24 PDT 2012     root@g1-227.catwhisker.org:/usr/obj/usr/src/sys/CANARY  i386

on my laptop, at times when I switch from X to vty0, I see (e.g.):

Oct 18 08:27:07 g1-227 kernel: rn_addmask: mask impossibly already in treer=
n_addmask: mask impossibly already in treern_addmask: mask impossibly alrea=
dy in treern_addmask: mask impossibly already in treern_addmask: mask impos=
sibly already in treern_addmask: mask impossibly already in treern_addmask:=
 mask impossibly already in tree...


I see where the message is issued (sys/net/radix.c:539 @r210122,
last updated 2010-07-15 14:41:59Z).

As this is a laptop, and thus subject to being connected to networks
I do not control, I run a packet filter on it, and the one with
which I happen to be most familiar is ipfw.  Thus that's what I
use.

So it's possible that either ipfw or routing is driving rn_addmask().

However, I'm unclear on:

* What (specifically) is actually causing this.

* Whether or not it's enough of an issue or problem that I should take
  evasive action.  Put differently: what is my exposure here?

* If so, what sort of evasive action is appropriate for me to take.

I suppose I could try a packet-capture, but the lack of timestamps
makes correlating the message-issuance with particular packets a
little more challenging than I'd prefer.

I note, too, that I'm running a very similar ipfw configuration on
the packet-filter machine here at home; while I find the above-
quoted whines in /var/log/console.log on the laptop, I do not find
them mentioned in that file for the packet filter machine.

Clues?

[Please include me in the recipient list, as I'm not subscribed to net@;
I've set Reply-To as a hint.]

Thanks!

Peace,
david
--=20
David H. Wolfskill				david@catwhisker.org
Taliban: Evil men with guns afraid of truth from a 14-year old girl.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.

--gRZ38brEgCoUohoa
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)

iEYEARECAAYFAlCAd3UACgkQmprOCmdXAD1x+wCfTfmvzlGbIzPcoS7pv4HhdvJC
VecAn1FFqYD2W2Zkjb9rjwNxNfqg8wOE
=DPnx
-----END PGP SIGNATURE-----

--gRZ38brEgCoUohoa--

From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 00:05:32 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 51F579DF;
 Fri, 19 Oct 2012 00:05:32 +0000 (UTC)
 (envelope-from kob6558@gmail.com)
Received: from mail-lb0-f182.google.com (mail-lb0-f182.google.com
 [209.85.217.182])
 by mx1.freebsd.org (Postfix) with ESMTP id 8EF8E8FC1B;
 Fri, 19 Oct 2012 00:05:30 +0000 (UTC)
Received: by mail-lb0-f182.google.com with SMTP id b5so7765432lbd.13
 for <multiple recipients>; Thu, 18 Oct 2012 17:05:30 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=IIsw37PqoV2e/719pra0mZrpmVMAiycBPKw0J9ZxOGA=;
 b=GU77uSZwLBpY3wL6i+pEDs54arD/azBdiTd7k0BPm1Sem3Un8qmHkEpi33HMnTw9rX
 1pjF/X4Fm7wKWAiopKmB+TMHWOUl22uyIcvSsgm/QGlc9i4qYh2LbiqFWiZmSbNAGg1d
 F5vg9hnEFWRl8SuYPyi3gqqMCyayjVXk/XRoXmKr4CFJ2QA6LLDNwnLB4QuERS3k79g6
 yFqG+tjAfr5evZqphq5GJsdsQIyegv3VmaxnR5mqdjM8ioA6SALLWgDcSN3IHioaCgcD
 y0PKJ272dGBBysIn2s8Wt5RXVWHOkP8EcUTWwARd1PGKpdKosvFHcYQV3BD/x+YutTJX
 AlAA==
MIME-Version: 1.0
Received: by 10.112.103.106 with SMTP id fv10mr8411346lbb.8.1350605129973;
 Thu, 18 Oct 2012 17:05:29 -0700 (PDT)
Received: by 10.112.4.227 with HTTP; Thu, 18 Oct 2012 17:05:29 -0700 (PDT)
In-Reply-To: <16534.1350461943@tristatelogic.com>
References: <CAJ-Vmonk0xtmqPMFnCZp-YVzmC3-boeu0o9A4DwSeBGYC+5=sg@mail.gmail.com>
 <16534.1350461943@tristatelogic.com>
Date: Thu, 18 Oct 2012 17:05:29 -0700
Message-ID: <CAN6yY1u2cnXZnPuCOwigDRGD9FayDucf78Gw9BX3FMYoSGBZfQ@mail.gmail.com>
Subject: Re: Wireless Networking Bug(s) in 9.1-RC2 (?)
From: Kevin Oberman <kob6558@gmail.com>
To: "Ronald F. Guilmette" <rfg@tristatelogic.com>
Content-Type: text/plain; charset=UTF-8
Cc: freebsd-net@freebsd.org, Adrian Chadd <adrian@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Oct 2012 00:05:32 -0000

On Wed, Oct 17, 2012 at 1:19 AM, Ronald F. Guilmette
<rfg@tristatelogic.com> wrote:
>
> In message <CAJ-Vmonk0xtmqPMFnCZp-YVzmC3-boeu0o9A4DwSeBGYC+5=sg@mail.gmail.com>
> , you wrote:
>
>>for wifi - you need to configure /etc/wpa_supplicant.conf as well,
>>right?
>
> Did that.  Yes.
>
>>You don't need the ssid in the ifconfig line;
>
> OK.  If you say so.  (See my prior e-mail where I wondered aloud if there
> are circumstances where the ssid might have to appear in both places.)
>
>>wpa_supplicant will scan and find your AP.
>>
>>The driver should fall back to non-n and non-g if needs be.
>>
>>As for the config - erm, you have two interfaces on the same L2.
>>That's going to confuse things, right?
>
> Well, I can't speak for the hardware, but it sure as hell does confuse
> *me*. (1/2 :-)
>
>>What's 'netstat -rn' show?
>
>
> Routing tables
>
> Internet:
> Destination        Gateway            Flags    Refs      Use  Netif Expire
> default            192.168.1.1        UGS         0   104122    re0
> 127.0.0.1          link#10            UH          0        0    lo0
> 192.168.1.0/24     link#4             U           0    23515    re0
> 192.168.1.21       link#11            UHS         0        0    lo0
> 192.168.1.23       link#4             UHS         0        0    lo0
>
> Internet6:
> Destination                       Gateway                       Flags      Netif Expire
> ::/96                             ::1                           UGRS        lo0
> ::1                               link#10                       UH          lo0
> ::ffff:0.0.0.0/96                 ::1                           UGRS        lo0
> fe80::/10                         ::1                           UGRS        lo0
> fe80::%re0/64                     link#4                        U           re0
> fe80::224:21ff:fe65:ada0%re0      link#4                        UHS         lo0
> fe80::%lo0/64                     link#10                       U           lo0
> fe80::1%lo0                       link#10                       UHS         lo0
> fe80::%wlan0/64                   link#11                       U         wlan0
> fe80::222:fbff:fe76:6d18%wlan0    link#11                       UHS         lo0
> ff01::%re0/32                     fe80::224:21ff:fe65:ada0%re0  U           re0
> ff01::%lo0/32                     ::1                           U           lo0
> ff01::%wlan0/32                   fe80::222:fbff:fe76:6d18%wlan0 U         wlan0
> ff02::/16                         ::1                           UGRS        lo0
> ff02::%re0/32                     fe80::224:21ff:fe65:ada0%re0  U           re0
> ff02::%lo0/32                     ::1                           U           lo0
> ff02::%wlan0/32                   fe80::222:fbff:fe76:6d18%wlan0 U         wlan0

To use WPA and a static address, you need something like:
ifconfig_wlan0="WPA inet 192.168.1.21/24"
so that was OK.

Now, you seem to have both interfaces on the same /24 with a /24
netmask. This is probably going to result in some poorly defined
behavior. I'm not sure just what you are trying to do, but I suspect
that it is not what you are doing.

If you are trying to allow the system to use wired when it is
connected and wireless when disconnected, this is the wrong way. You
should put both interfaces into a lagg and have a single IP on the
lagg interface. As it is, there is no way to be sure which outgoing
interface will be used when both are connected.

This said, I am not sure how this might cause the interface to fail to
associate. I'm guessing that you are simply not associating and the
scan is falling back to 'B' after failing to find an AP in faster
modes.  The question is "why?". What is the output of "ifconfig wlan0
list aps"?

One thing I see is:
country US authmode WPA1+WPA2/802.11i privacy OFF

For home users I would normally expect WPA-PSK to be used. What
key_mgmt are you specifying? It looks like authentication might be
failing. You might try running the supplicant manually (after stopping
any that is running) and see what you get.

> P.S.  I ain't using IPv6... like not at all.

Unfortunate, but I can't run it at home, either, as Comcast is not yet
offering it in my area. (Nor is Verizon who will be my provider next
month.)

-- 
R. Kevin Oberman, Network Engineer
E-mail: kob6558@gmail.com

From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 04:24:44 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 05D71B1D;
 Fri, 19 Oct 2012 04:24:44 +0000 (UTC)
 (envelope-from rfg@tristatelogic.com)
Received: from outgoing.tristatelogic.com (segfault.tristatelogic.com
 [69.62.255.118])
 by mx1.freebsd.org (Postfix) with ESMTP id B35228FC17;
 Fri, 19 Oct 2012 04:24:43 +0000 (UTC)
Received: from segfault-nmh-helo.tristatelogic.com (localhost [127.0.0.1])
 by segfault.tristatelogic.com (Postfix) with ESMTP id 3B00250821;
 Thu, 18 Oct 2012 21:24:36 -0700 (PDT)
To: Kevin Oberman <kob6558@gmail.com>
Subject: Re: Wireless Networking Bug(s) in 9.1-RC2 (?)
In-Reply-To: <CAN6yY1u2cnXZnPuCOwigDRGD9FayDucf78Gw9BX3FMYoSGBZfQ@mail.gmail.com>
Date: Thu, 18 Oct 2012 21:24:36 -0700
Message-ID: <2529.1350620676@tristatelogic.com>
From: "Ronald F. Guilmette" <rfg@tristatelogic.com>
Cc: freebsd-net@freebsd.org, Adrian Chadd <adrian@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Oct 2012 04:24:44 -0000


In message <CAN6yY1u2cnXZnPuCOwigDRGD9FayDucf78Gw9BX3FMYoSGBZfQ@mail.gmail.com>
, Kevin Oberman <kob6558@gmail.com> wrote:

>To use WPA and a static address, you need something like:
>ifconfig_wlan0="WPA inet 192.168.1.21/24"
>so that was OK.

Yea, actually I did already have the static+WPA working.

>Now, you seem to have both interfaces on the same /24 with a /24
>netmask. This is probably going to result in some poorly defined
>behavior.

:-)

I think that's the polite way of putting it.

>I'm not sure just what you are trying to do...

That's OK. That makes two of us.
(1/2 :-)

>but I suspect that it is not what you are doing.

Actually, I wasn't trying to achieve much of anything, specifically,
when I had _two_ ifconfig_XXX= lines in rc.conf for _both_ my wired and
wireless interfaces.  I was just being lazy, not taking the ifconfig
for the wired out when I started using the wireless.  And then I
looked at it and realized pretty much what you said, which is basically:
Who the hell knows where the packets will go if there are multiple
routes from one machine to someplace else, and if none of them is
more specific than the other.

It's definitely an enigma.  Does this produce Heisenbergian packet flow?

(I was rather hoping that you FreeBSD networking gurus would enlighten
me on this somewhat interesting, even if obscure point.)

>This said, I am not sure how this might cause the interface to fail to
>associate.

I am with you.  I don't think it would or should.

>I'm guessing that you are simply not associating and the
>scan is falling back to 'B' after failing to find an AP in faster
>modes.  The question is "why?".

Yea, it seems kind of odd.

>What is the output of "ifconfig wlan0 list aps"?

Umm... well... I've rebooted since I mailed/posted earlier, and now things
are looking rather different.  In particular, it appears that I have `G'
now on the wireless link _and_ from elsewhere on my network I can successfully
ping _both_ 192.168.1.23 (the wired connection) _and_ also 192.168.1.21
(the wireless connection).  And traceroute says that those are both one
hop away from my other server which is at 192.168.1.2.

==============================================================================
re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
        ether 00:24:21:65:ad:a0
        inet 192.168.1.23 netmask 0xffffff00 broadcast 192.168.1.255
        inet6 fe80::224:21ff:fe65:ada0%re0 prefixlen 64 scopeid 0x4 
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
iwn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 2290
        ether 00:22:fb:76:6d:18
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: IEEE 802.11 Wireless Ethernet autoselect mode 11ng
        status: associated
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128 
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0xa 
        inet 127.0.0.1 netmask 0xff000000 
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 00:22:fb:76:6d:18
        inet 192.168.1.21 netmask 0xffffff00 broadcast 192.168.1.255
        inet6 fe80::222:fbff:fe76:6d18%wlan0 prefixlen 64 scopeid 0xb 
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: IEEE 802.11 Wireless Ethernet OFDM/36Mbps mode 11ng
        status: associated
        ssid ronair2-1 channel 11 (2462 MHz 11g ht/40-) bssid c0:c1:c0:8b:4b:f3
        country US authmode WPA2/802.11i privacy ON deftxkey UNDEF
        AES-CCM 2:128-bit AES-CCM 3:128-bit txpower 15 bmiss 10 scanvalid 450
        bgscan bgscanintvl 300 bgscanidle 250 roam:rssi 7 roam:rate 64
        protmode CTS ampdulimit 64k ampdudensity 8 -amsdutx amsdurx shortgi
        wme roaming MANUAL
==============================================================================

Here's the stuff that you specifically asked for, although I don't know if
it is even relevant anymore:

==============================================================================
% ifconfig wlan0 list aps
SSID/MESH ID    BSSID              CHAN RATE   S:N     INT CAPS
Cisco           58:6d:8f:7e:6c:5d   11   54M -74:-95  100 EP   RSN HTCAP WPS WPA WME
ronair2-1       c0:c1:c0:8b:4b:f3   11   54M -65:-95  100 EP   RSN HTCAP WPS WME
belkin.194      08:86:3b:6f:91:94   11   54M -81:-95  100 EP   HTCAP WPA RSN WME WPS
Cisco           58:6d:8f:7e:6c:5e   36   54M -86:-95  100 EP   RSN HTCAP WPS WPA WME
Fluff           c0:3f:0e:78:3e:f5    2   54M -82:-95  100 EPS  RSN WPA WME HTCAP ATH WPS
linksysLA       00:18:f8:e6:4b:58    5   54M -90:-95  100 EP   RSN HTCAP WPA WME
belkin.194....  08:86:3b:6f:91:96  149   54M -90:-95  100 EP   WPS HTCAP WPA RSN WME
erikadoming...  a0:21:b7:9d:0f:98    5   54M -83:-95  100 EPS  RSN WPA WME HTCAP ATH WPS
dcz_26          00:1b:2f:02:40:de   11   54M -91:-95  100 EPS  WPA
==============================================================================

(My AP is "ronair2-1".  As you can see, I live in a busy neighborhood.)

>One thing I see is:
>country US authmode WPA1+WPA2/802.11i privacy OFF

Huh??  Where are you seeing THAT?

Oh!  I see. I guess it must have been in the ifconfig -a output that I sent
earlier.

Well, as you can see above, that appears to have changed now too.

>For home users I would normally expect WPA-PSK to be used.

Indeed.  And as far as I know, that _is_ what I _am_ using.

>What key_mgmt are you specifying?

Sorry.  I don't understand the question.

Anyway, the bottom line here is that it appears that I no longer have any
bug report to file.  Whatever was causing me to get `B' (or was it `A'?
I forget now) before seems to have sorted itself out on its own.

As I was telling my neighbor just the other day, it never ceases to amaze
me what vast numbers of problems are so often cured by a simple hard reset
(i.e. power cycle).

Anyway, if I ever see the problem arise again, I'll let somebody know.


Regards,
rfg

From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 06:53:22 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id F30B1F3F
 for <freebsd-net@freebsd.org>; Fri, 19 Oct 2012 06:53:21 +0000 (UTC)
 (envelope-from adrian.chadd@gmail.com)
Received: from mail-pa0-f54.google.com (mail-pa0-f54.google.com
 [209.85.220.54])
 by mx1.freebsd.org (Postfix) with ESMTP id C0EF48FC0A
 for <freebsd-net@freebsd.org>; Fri, 19 Oct 2012 06:53:21 +0000 (UTC)
Received: by mail-pa0-f54.google.com with SMTP id bi1so148914pad.13
 for <freebsd-net@freebsd.org>; Thu, 18 Oct 2012 23:53:21 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=uEnKUqssRp4PjGfZ0pIJ93+U5sajYSJCSzW/wTkqC9s=;
 b=cfCFhw0lvTgvx8b4kcaH7qPugDGVsij8DninMpZVl4BCpu3rPMDgEDN7zpKEgYTiqC
 7l/RIKGTAzVK/ByhmaIuPs6gJS+X7uLYj5o3oW/nNIqnfxyaOjgoRHBiIi5Q7RsEVrah
 Kn5sYcW3oDoRHptk7Iv8Tbwo/AWiSO2JhcZS6c6UM9+cjS5jcuSWOETHFB5INoN2if3Q
 iJFn7o8WXS7pHtZ1rEOSt/kv/NENZGrWznikhQgac99w88Xv6cvlZBXb/6nDly/0JUU4
 wCMELqLUP1EorvGvbLvN1KiCUEC5n3yp798xGZ05gYT/B/+SDVNMz47RqckzIHFASSfh
 kCyA==
MIME-Version: 1.0
Received: by 10.68.229.138 with SMTP id sq10mr2509879pbc.126.1350629601239;
 Thu, 18 Oct 2012 23:53:21 -0700 (PDT)
Sender: adrian.chadd@gmail.com
Received: by 10.68.146.233 with HTTP; Thu, 18 Oct 2012 23:53:21 -0700 (PDT)
In-Reply-To: <2529.1350620676@tristatelogic.com>
References: <CAN6yY1u2cnXZnPuCOwigDRGD9FayDucf78Gw9BX3FMYoSGBZfQ@mail.gmail.com>
 <2529.1350620676@tristatelogic.com>
Date: Thu, 18 Oct 2012 23:53:21 -0700
X-Google-Sender-Auth: tJqo2dGfIP6XUI-VIWyWmZotzH0
Message-ID: <CAJ-Vmo=T+VfXzMvgy0iKjG_cVeixcNztDtnOcYo-3Ch+ZTK4TA@mail.gmail.com>
Subject: Re: Wireless Networking Bug(s) in 9.1-RC2 (?)
From: Adrian Chadd <adrian@freebsd.org>
To: "Ronald F. Guilmette" <rfg@tristatelogic.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: freebsd-net@freebsd.org, Kevin Oberman <kob6558@gmail.com>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Oct 2012 06:53:22 -0000

The obscure answer has to do with what the L2 adjacency stuff is doing.

Because it adds that default route out a specific interface, it will
send ARP requests out that interface. Even if the other interface goes
down, it'll still throw them out that interface.

It's just a side effect of how the L2 adjacency stuff works.
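
As a rough illustration of why the traffic sticks to one interface, here
is a simplified sketch in C. It is not the kernel's radix or lltable
code: route_add(), route_lookup() and the table layout are made up for
this example. It also assumes the table keeps a single entry per prefix,
which is what the netstat output earlier in the thread suggests
(192.168.1.0/24 shows up on re0 only).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ROUTES 8
/* Build a host-order IPv4 address from four octets. */
#define IP(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical route entry; not the kernel's struct rtentry. */
struct route {
    uint32_t    prefix;  /* network address, host byte order */
    int         plen;    /* prefix length in bits, 0..32 */
    const char *ifname;  /* outgoing interface */
};

static struct route table[MAX_ROUTES];
static int nroutes;

static uint32_t mask_of(int plen)
{
    return plen == 0 ? 0 : 0xffffffffu << (32 - plen);
}

/*
 * Add a route.  Returns 0 on success, -1 if the same prefix is already
 * present or the table is full.  Rejecting the duplicate is the assumed
 * behaviour that leaves only the first interface with the /24.
 */
static int route_add(uint32_t prefix, int plen, const char *ifname)
{
    int i;

    assert(plen >= 0 && plen <= 32);
    prefix &= mask_of(plen);
    for (i = 0; i < nroutes; i++)
        if (table[i].prefix == prefix && table[i].plen == plen)
            return -1;
    if (nroutes == MAX_ROUTES)
        return -1;
    table[nroutes].prefix = prefix;
    table[nroutes].plen = plen;
    table[nroutes].ifname = ifname;
    nroutes++;
    return 0;
}

/* Longest-prefix match; returns the interface name, or NULL if none. */
static const char *route_lookup(uint32_t dst)
{
    const char *best = NULL;
    int best_len = -1, i;

    for (i = 0; i < nroutes; i++)
        if ((dst & mask_of(table[i].plen)) == table[i].prefix &&
            table[i].plen > best_len) {
            best = table[i].ifname;
            best_len = table[i].plen;
        }
    return best;
}
```

In this model, adding 192.168.1.0/24 for re0 and then for wlan0 leaves
only the re0 entry, so a lookup for 192.168.1.2 returns re0 no matter
what state wlan0 is in.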



Adrian

From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 07:53:07 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 73401864;
 Fri, 19 Oct 2012 07:53:07 +0000 (UTC)
 (envelope-from fabien.thomas@netasq.com)
Received: from work.netasq.com (gwlille.netasq.com [91.212.116.1])
 by mx1.freebsd.org (Postfix) with ESMTP id E61AA8FC14;
 Fri, 19 Oct 2012 07:53:06 +0000 (UTC)
Received: from [10.2.1.1] (unknown [10.2.1.1])
 by work.netasq.com (Postfix) with ESMTPSA id 54D6E2705764;
 Fri, 19 Oct 2012 09:53:05 +0200 (CEST)
Subject: Re: ixgbe & if_igb RX ring locking
Mime-Version: 1.0 (Apple Message framework v1283)
From: Fabien Thomas <fabien.thomas@netasq.com>
In-Reply-To: <CAFOYbcnT0tT0S6Q8Eos9oEDoSSZZZSr8GD5f9oO4bgpo-t6vNg@mail.gmail.com>
Date: Fri, 19 Oct 2012 09:53:07 +0200
Message-Id: <390AF360-AEC3-495E-881A-1ACCFEF42815@netasq.com>
References: <5079A9A1.4070403@FreeBSD.org>
 <20121013182223.GA73341@onelab2.iet.unipi.it> <5080020E.1010603@networx.ch>
 <CAFOYbcnT0tT0S6Q8Eos9oEDoSSZZZSr8GD5f9oO4bgpo-t6vNg@mail.gmail.com>
To: Jack Vogel <jfvogel@gmail.com>
X-Mailer: Apple Mail (2.1283)
Content-Type: text/plain;
	charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: "Alexander V. Chernikov" <melifaro@freebsd.org>,
 Luigi Rizzo <rizzo@iet.unipi.it>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Oct 2012 07:53:07 -0000


Le 18 oct. 2012 =E0 20:09, Jack Vogel a =E9crit :

> On Thu, Oct 18, 2012 at 6:20 AM, Andre Oppermann =
<oppermann@networx.ch>wrote:
>=20
>> On 13.10.2012 20:22, Luigi Rizzo wrote:
>>=20
>>> On Sat, Oct 13, 2012 at 09:49:21PM +0400, Alexander V. Chernikov =
wrote:
>>>=20
>>>> Hello list!
>>>>=20
>>>>=20
>>>> The packet-receiving code for both ixgbe and if_igb looks like the
>>>> following:
>>>>=20
>>>>=20
>>>> ixgbe_msix_que
>>>>=20
>>>> -- ixgbe_rxeof()
>>>>    {
>>>>       IXGBE_RX_LOCK(rxr);
>>>>         while
>>>>         {
>>>>            get_packet;
>>>>=20
>>>>            -- ixgbe_rx_input()
>>>>               {
>>>>                  ++ IXGBE_RX_UNLOCK(rxr);
>>>>                  if_input(packet);
>>>>                  ++ IXGBE_RX_LOCK(rxr);
>>>>               }
>>>>=20
>>>>         }
>>>>       IXGBE_RX_UNLOCK(rxr);
>>>>     }
>>>>=20
>>>> Lines marked with ++ appeared in r209068(igb) and r217593(ixgbe).
>>>>=20
>>>> These lines probably mask LORs (if any) well. However, such a
>>>> change introduces quite a significant performance drop:
>>>>=20
>>>> On my routing setup (nearly the same as in the previous Intel 10G
>>>> thread on -net), adding the lock/unlock pair drops throughput from
>>>> 2.8 MPPS to 2.3 MPPS, which is nearly 20%.
>>>>=20
>>>=20
>>> one option could be (same as it is done in the timer
>>> routine in dummynet) to build a list of all the packets
>>> that need to be sent to if_input(), and then call
>>> if_input with the entire list outside the lock.
>>>=20
>>> It would be even easier if we modify the various *_input()
>>> routines to handle a list of mbufs instead of just one.
>>>=20
>>=20
>> Not really. You'd just run into tons of layering complexity.
>> Somewhere the decomposition and serialization has to be done.
>>=20
>> Perhaps the right place is to dequeue a batch of packets from
>> the HW ring and then have a task/thread send it up the stack
>> one by one.
>>=20
>=20
> I was thinking about how to code this, something like what I did with
> the refresh routine, in any case I will experiment with it.

This modified version for mq polling creates a list of packets that are
injected later (mc is the list).
=
http://www.gitorious.org/~fabient/freebsd/fabient-freebsd/blobs/work/polln=
g_mq_stable_8/sys/dev/ixgbe/ixgbe.c#line4615
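
For readers who want the shape of the idea without opening the driver,
here is a minimal userland sketch in C of the batching approach
discussed above: pull packets off the ring under a single lock hold,
then hand the private list to the stack with the lock dropped. It is
not the code at the URL above and not the ixgbe driver. struct pkt,
struct rx_ring, rx_batch() and stack_input() are hypothetical
stand-ins, and a pthread mutex plays the role of IXGBE_RX_LOCK.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf; not the kernel type. */
struct pkt {
    struct pkt *next;
    int         id;
};

/* Hypothetical stand-in for the driver's RX ring. */
struct rx_ring {
    pthread_mutex_t lock;  /* plays the role of IXGBE_RX_LOCK */
    struct pkt     *head;  /* packets completed by the "hardware" */
};

static int delivered;      /* packets handed to the stack so far */
static int last_id;        /* id of the most recently delivered one */

/* Stand-in for if_input(); called with the ring lock NOT held. */
static void stack_input(struct pkt *p)
{
    delivered++;
    last_id = p->id;
}

/*
 * Dequeue up to `budget` packets under one lock hold, keeping their
 * order, then deliver them one by one after the lock is released.
 * Returns the number of packets delivered.
 */
static int rx_batch(struct rx_ring *r, int budget)
{
    struct pkt *list = NULL, **tail = &list, *p;
    int n = 0;

    assert(budget > 0);

    pthread_mutex_lock(&r->lock);
    while (n < budget && (p = r->head) != NULL) {
        r->head = p->next;   /* unlink from the ring */
        p->next = NULL;
        *tail = p;           /* append to the private list */
        tail = &p->next;
        n++;
    }
    pthread_mutex_unlock(&r->lock);

    /* The lock is dropped once per batch instead of once per packet. */
    while ((p = list) != NULL) {
        list = p->next;
        p->next = NULL;
        stack_input(p);
    }
    return n;
}
```

The trade-off in this sketch is that the lock is taken once per batch
rather than dropped and retaken around every if_input() call, at the
cost of holding packets on a private list for the length of the batch.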


>=20
> Jack
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"


From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 08:01:57 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 4CA7AA5B
 for <freebsd-net@freebsd.org>; Fri, 19 Oct 2012 08:01:57 +0000 (UTC)
 (envelope-from sunkeyong@gmail.com)
Received: from mail-qc0-f182.google.com (mail-qc0-f182.google.com
 [209.85.216.182])
 by mx1.freebsd.org (Postfix) with ESMTP id E43258FC14
 for <freebsd-net@freebsd.org>; Fri, 19 Oct 2012 08:01:56 +0000 (UTC)
Received: by mail-qc0-f182.google.com with SMTP id l39so152036qcs.13
 for <freebsd-net@freebsd.org>; Fri, 19 Oct 2012 01:01:56 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:date:message-id:subject:from:to:content-type;
 bh=T6awshakab+GOnOXmQ4FgrLRfJkjuzE5OBRl0WZUy4o=;
 b=02s/f2icWbXPCIUS7qcEBmbeL/hb8ckSS27nBIFN8R32TQscgT3OSlEo9uhX0CDV4W
 rYFYineIycpZClSDqtYKcedSK99KoVMLdgRp2QuQYqfp1ZB+QxRN3NyVIypEQqYBdrWR
 zXTG7ob8h4snVenMSjS5n5CAbYokQERlzSWoUImZWNA77oj6cT1wc+oS8lOAaLvi8K5v
 vO3rnonvwYBf7pwJaBZpkb+HublxQxPBsmBjSo1RUOO1kFJndv1iSZZrPAhulRqUeukX
 5VE9+5ZxjQITXATCX0e69bTdk8syptPMRvUgCHdPO15TJvjodCdIEXKUwjVqRl0qjBlp
 GLAA==
MIME-Version: 1.0
Received: by 10.224.213.10 with SMTP id gu10mr384338qab.10.1350633716181; Fri,
 19 Oct 2012 01:01:56 -0700 (PDT)
Received: by 10.49.39.167 with HTTP; Fri, 19 Oct 2012 01:01:56 -0700 (PDT)
Date: Fri, 19 Oct 2012 16:01:56 +0800
Message-ID: <CAGAfov0-k8UC5Cy2m3GAnorr4OYqpMQTKqEWfB0TEkPfp5urQA@mail.gmail.com>
Subject: kern/94020: [tcp] tcp_timer_2msl_tw NULL pointer dereference panic
From: Sun Keyong <sunkeyong@gmail.com>
To: freebsd-net@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Oct 2012 08:01:57 -0000

Hi,

I recently hit a NULL pointer dereference panic in tcp_timer_2msl_tw.
I found PR kern/94020 for it, but its status is closed. Could anyone
point me to the fix, or tell me where I can find it?

Thanks a lot
KY

From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 09:01:31 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 53175AE1
 for <freebsd-net@freebsd.org>; Fri, 19 Oct 2012 09:01:31 +0000 (UTC)
 (envelope-from simond@irrelevant.org)
Received: from mail-lb0-f182.google.com (mail-lb0-f182.google.com
 [209.85.217.182])
 by mx1.freebsd.org (Postfix) with ESMTP id BBC988FC14
 for <freebsd-net@freebsd.org>; Fri, 19 Oct 2012 09:01:29 +0000 (UTC)
Received: by mail-lb0-f182.google.com with SMTP id b5so243180lbd.13
 for <freebsd-net@freebsd.org>; Fri, 19 Oct 2012 02:01:29 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=irrelevant.org; s=irrelevant;
 h=mime-version:x-originating-ip:in-reply-to:references:date
 :message-id:subject:from:to:cc:content-type;
 bh=xJduYT4hlOPmQbFunLrA0S42mDzxMrivIORS1PXqfZg=;
 b=RWRXDiJpMHk5dSpNPzXkDBTzMFxTwNToTaXgyCOW4ke//ZWzhECLYsRsoAEb++IvVt
 JHz6JRt7ncbg00vIgLr1Q5x+JVT7DekCHuZJv0rXgUiVtI+AVFjK39G5F9SlQuH/5q69
 70Mbscd6RHj52TLpQJJ3dEoJ8iLsVJjRB0qm4=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20120113;
 h=mime-version:x-originating-ip:in-reply-to:references:date
 :message-id:subject:from:to:cc:content-type:x-gm-message-state;
 bh=xJduYT4hlOPmQbFunLrA0S42mDzxMrivIORS1PXqfZg=;
 b=C1F2yonZCe93v8HHdN5JEC4BdEKRGRlwyoEPXH8wRqIMhTwUsc/y7CfMA2VdDwzyco
 QKPdQ5kEh/2nHrT0VpMhZ538rlAAnsfq1gCaB+Ta4OPfBfsudFYKCyNSjB2Bgyo4Bigw
 w0dCxgjaiT9wgUGb0SzJOnk1qsUNMa4BWwQmcg5TBYERm7fp8i3j3o0FX74Utyf7zuoR
 HwKE+n4JmPLtX9xoitSo8YxkqCzRUJxLzFwIDpn3N8mv8PpQmXd4erZiVZtSovkgRp7F
 cBQk945WunFbkDLcTtUpvxLuNBdL/VUOx9f1ruKuJHaLnsMEV6xqxBioGs2uX7Pi4jWA
 Ob/g==
MIME-Version: 1.0
Received: by 10.152.103.38 with SMTP id ft6mr528732lab.40.1350637288567; Fri,
 19 Oct 2012 02:01:28 -0700 (PDT)
Received: by 10.114.63.83 with HTTP; Fri, 19 Oct 2012 02:01:28 -0700 (PDT)
X-Originating-IP: [94.31.26.5]
In-Reply-To: <CAOxoo31Ujzumi+hbZbRgY3EivY6dLwvP5nAZOOptgAEV9iKgzg@mail.gmail.com>
References: <CAOxoo31Ujzumi+hbZbRgY3EivY6dLwvP5nAZOOptgAEV9iKgzg@mail.gmail.com>
Date: Fri, 19 Oct 2012 10:01:28 +0100
Message-ID: <CAPyG9gP59K5i3SDTA45JLDxJWsPohVeyMKipNeHkQZ+XYN2aUA@mail.gmail.com>
Subject: Re: CARP on vSwitch
From: Simon Dick <simond@irrelevant.org>
To: Rafael Henrique Faria <rafaelhfaria@cenadigital.com.br>
Content-Type: text/plain; charset=UTF-8
X-Gm-Message-State: ALoCoQk68fXbUovceY4leu3RUOR5xLyPMu3yymydjHJwGORMxbBq33qAMmZ14jeU20Ve30q585tW
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Oct 2012 09:01:31 -0000

On 18 October 2012 21:59, Rafael Henrique Faria
<rafaelhfaria@cenadigital.com.br> wrote:
> Hi, I'm trying to use CARP on two FreeBSD servers in an ESX environment, but
> it's not working.
>
> The problem is that every frame sent from CARP gets back to the same host.
> This is an old problem:
>
> http://www.mail-archive.com/freebsd-net@freebsd.org/msg30562.html
>
> There is already a patch, but it's 3 years old and not yet committed. Is
> there any reason for this?
> I have always used freebsd-update to keep the servers updated, and I don't
> want to compile a kernel just to use CARP.
>
> Does anyone have a suggestion or a fix for this problem?

I found the following page useful when getting CARP working between
ESXi servers:
http://doc.pfsense.org/index.php/CARP_Configuration_Troubleshooting#ESX_VDS_Promisc_Workaround

From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 09:15:06 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 9D1BF290
 for <net@freebsd.org>; Fri, 19 Oct 2012 09:15:06 +0000 (UTC)
 (envelope-from oppermann@networx.ch)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id F0C258FC0C
 for <net@freebsd.org>; Fri, 19 Oct 2012 09:15:05 +0000 (UTC)
Received: (qmail 29979 invoked from network); 19 Oct 2012 10:53:51 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <oppermann@networx.ch>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <kostikbel@gmail.com>; 19 Oct 2012 10:53:51 -0000
Message-ID: <50811A14.5080903@networx.ch>
Date: Fri, 19 Oct 2012 11:15:00 +0200
From: Andre Oppermann <oppermann@networx.ch>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:16.0) Gecko/20121010 Thunderbird/16.0.1
MIME-Version: 1.0
To: Konstantin Belousov <kostikbel@gmail.com>
Subject: Re: A small cleanup patch
References: <CALCNsJTWhVaV-2U1J5EtN2-6iyi_CGgCCrBVZ3VO1H0JLUKfvQ@mail.gmail.com>
 <50800D9D.1090705@networx.ch>
 <20121018170039.GS35915@deviant.kiev.zoral.com.ua>
In-Reply-To: <20121018170039.GS35915@deviant.kiev.zoral.com.ua>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: net@freebsd.org, Vijay Singh <vijju.singh@gmail.com>
X-List-Received-Date: Fri, 19 Oct 2012 09:15:06 -0000

On 18.10.2012 19:00, Konstantin Belousov wrote:
> On Thu, Oct 18, 2012 at 04:09:33PM +0200, Andre Oppermann wrote:
>> On 05.10.2012 01:21, Vijay Singh wrote:
>>> Folks, I came up with this while going through the lltable code.
>>
>> Thank you. I just purged a larger number of stray spl* from the
>> net*/* directories. This stuff won't be backported to 9-STABLE
>> though.
>
> Why ? What is the value of having the fossils in the actively maintained
> stable tree ?

To avoid churn within a stable release track.

-- 
Andre


From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 11:25:39 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53])
 by hub.freebsd.org (Postfix) with ESMTP id 633F0D03;
 Fri, 19 Oct 2012 11:25:39 +0000 (UTC) (envelope-from ae@FreeBSD.org)
Received: from [127.0.0.1] (hub.freebsd.org [8.8.178.136])
 by mx2.freebsd.org (Postfix) with ESMTP id 19E8D3B53B7;
 Fri, 19 Oct 2012 11:25:36 +0000 (UTC)
Message-ID: <508138A4.5030901@FreeBSD.org>
Date: Fri, 19 Oct 2012 15:25:24 +0400
From: "Andrey V. Elsukov" <ae@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:15.0) Gecko/20121010 Thunderbird/15.0.1
MIME-Version: 1.0
To: net@freebsd.org
Subject: [RFC] Enabling IPFIREWALL_FORWARD in run-time
X-Enigmail-Version: 1.4.3
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature";
 boundary="------------enigCA59115641B47F6217D4A48C"
Cc: ipfw@freebsd.org
X-List-Received-Date: Fri, 19 Oct 2012 11:25:39 -0000

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enigCA59115641B47F6217D4A48C
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Hi All,

Many years ago I proposed this feature, but at that time several people
were against it because, as they said, it could affect performance. Now
we have high-speed network adapters, an SMP kernel and network stack,
and several locks acquired in the path of each packet, and I am able to
test this in the lab.

So I prepared a patch that removes the IPFIREWALL_FORWARD option from
the kernel and makes this functionality always built in, but it is
turned off by default and can be enabled via the sysctl(8) variable
net.pfil.forward=1.

	http://people.freebsd.org/~ae/pfil_forward.diff
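
A minimal user-space sketch of the run-time switch described above. It is
not the code in pfil_forward.diff: forward_enabled, struct packet and
select_nexthop() are hypothetical names standing in for the
net.pfil.forward knob and the forwarding path.

```c
#include <stddef.h>

/* Hypothetical stand-in for a run-time knob such as net.pfil.forward;
 * 0 (off) by default, as in the proposal. */
static int forward_enabled = 0;

/* Hypothetical packet: its own destination plus an optional next-hop
 * override that a firewall "fwd" rule may have attached. */
struct packet {
	const char *dst;
	const char *fwd_nexthop;	/* NULL when no rule matched */
};

/* Pick the next hop. The override is honoured only when the knob is
 * on, so the disabled case is one integer test per packet here. */
static const char *
select_nexthop(const struct packet *p)
{
	if (forward_enabled && p->fwd_nexthop != NULL)
		return (p->fwd_nexthop);
	return (p->dst);
}
```

With the knob at 0, select_nexthop() returns the packet's own
destination; once it is set to 1, a packet carrying an override gets the
override instead.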

We have also done some tests with an Ixia traffic generator connected
via a 10G network adapter. The tests show no visible difference and no
visible performance degradation.

Any objections?

-- 
WBR, Andrey V. Elsukov


--------------enigCA59115641B47F6217D4A48C
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)

iQEcBAEBAgAGBQJQgTisAAoJEAHF6gQQyKF6+UwH/2xemnR6Si2AtLcRJrB0HpXa
Kr8r2BCyTulAdBsYBznduCj4cvhpiVrXNhqIf9y1mrY4LMz0Ci98OClOTaom82t/
/1msCig4nt61ZV5X21aQ19xzWUqu/Njx1gGz63v2dBKAyhngdJ3EjGa5sU1L2RU2
wJnJ4/iSmq1IT9Y6x0iFXG+1LZTs/Kg+/9j5G8qnTJDRP0sIRwopG4Imd5MdHOLM
KrnpCm2HMxvxq6xls4phaBy20p/Yy5LDl7iDgJLyK7Ro8TA05me6zVBzz9hnuJjJ
zN65HAMlhZsfeXb5VxRfKh11QcS8jdYhHATUSYuHIlGibdAa4Pj+hZlWzVKTS1E=
=9ra7
-----END PGP SIGNATURE-----

--------------enigCA59115641B47F6217D4A48C--

From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 11:47:11 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id EC093305;
 Fri, 19 Oct 2012 11:47:10 +0000 (UTC)
 (envelope-from zam4ever@gmail.com)
Received: from mail-qa0-f54.google.com (mail-qa0-f54.google.com
 [209.85.216.54])
 by mx1.freebsd.org (Postfix) with ESMTP id 655C58FC08;
 Fri, 19 Oct 2012 11:47:10 +0000 (UTC)
Received: by mail-qa0-f54.google.com with SMTP id p27so67209qat.13
 for <multiple recipients>; Fri, 19 Oct 2012 04:47:03 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=KCt2rbM2J0//nzGbRbVss9xuEKCsfdhls/n/5ye2RJE=;
 b=ysfoB5JETcdWj7wpywQ8gZJG837a3N1HcUGudlE1YaInSeL3tXjMtDS5pH2m0M1zVA
 OBAxH8ouBtesyLr2Tu2eDAJbktBrZSLJzAH7QJr/R/oRQy8UG5/e481saxOTKiPvqpWU
 i4eTXO7MOeakEo705nqHo4eoAFp5Lv2J6v/5cmE19aNHB8rStwVetSC1KtrCkw2gQoWR
 Xq+5KY2+C6JEfLpkkSFdEgNtzDwQlsM+mOlEY8Utwb8pCAcCv6USKoTbEpNW+hUxMKYF
 D+UdnAAyctuXHNRZiE7gFKmglLRhbAYkImso16z2hIg1EyeHk14aVxEaRdPs1NqQilaD
 24LA==
MIME-Version: 1.0
Received: by 10.224.168.136 with SMTP id u8mr602590qay.17.1350647223650; Fri,
 19 Oct 2012 04:47:03 -0700 (PDT)
Received: by 10.49.117.134 with HTTP; Fri, 19 Oct 2012 04:47:03 -0700 (PDT)
In-Reply-To: <508138A4.5030901@FreeBSD.org>
References: <508138A4.5030901@FreeBSD.org>
Date: Fri, 19 Oct 2012 19:47:03 +0800
Message-ID: <CAF0dOhHFVQ6tzMuT3j8q_A9KHpi9_PzCrmAezpvDqkSvWqTsPA@mail.gmail.com>
Subject: Re: [RFC] Enabling IPFIREWALL_FORWARD in run-time
From: Zamri Besar <zam4ever@gmail.com>
To: "Andrey V. Elsukov" <ae@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: ipfw@freebsd.org, net@freebsd.org
X-List-Received-Date: Fri, 19 Oct 2012 11:47:11 -0000

On Oct 19, 2012 7:25 PM, "Andrey V. Elsukov" <ae@freebsd.org> wrote:
>
> Hi All,
>
> Many years ago i have already proposed this feature, but at that time
> several people were against, because as they said, it could affect
> performance. Now, when we have high speed network adapters, SMP kernel
> and network stack, several locks acquired in the path of each packet,
> and i have an ability to test this in the lab.
>
> So, i prepared the patch, that removes IPFIREWALL_FORWARD option from
> the kernel and makes this functionality always build-in, but it is
> turned off by default and can be enabled via the sysctl(8) variable
> net.pfil.forward=1.
>
>         http://people.freebsd.org/~ae/pfil_forward.diff
>
> Also we have done some tests with the ixia traffic generator connected
> via 10G network adapter. Tests have show that there is no visible
> difference, and there is no visible performance degradation.
>
> Any objections?
>
> --
> WBR, Andrey V. Elsukov
>

This is what I wanted many years ago too... ;)

I vote for "yes"

From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 11:47:11 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 675BF307
 for <net@freebsd.org>; Fri, 19 Oct 2012 11:47:11 +0000 (UTC)
 (envelope-from kostikbel@gmail.com)
Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200])
 by mx1.freebsd.org (Postfix) with ESMTP id D34C48FC0A
 for <net@freebsd.org>; Fri, 19 Oct 2012 11:47:10 +0000 (UTC)
Received: from skuns.kiev.zoral.com.ua (localhost [127.0.0.1])
 by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id q9JBlGrI018455;
 Fri, 19 Oct 2012 14:47:16 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1])
 by deviant.kiev.zoral.com.ua (8.14.5/8.14.5) with ESMTP id q9JBl42Q093798;
 Fri, 19 Oct 2012 14:47:04 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
Received: (from kostik@localhost)
 by deviant.kiev.zoral.com.ua (8.14.5/8.14.5/Submit) id q9JBl4Di093797;
 Fri, 19 Oct 2012 14:47:04 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to
 kostikbel@gmail.com using -f
Date: Fri, 19 Oct 2012 14:47:04 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Andre Oppermann <oppermann@networx.ch>
Subject: Re: A small cleanup patch
Message-ID: <20121019114704.GY35915@deviant.kiev.zoral.com.ua>
References: <CALCNsJTWhVaV-2U1J5EtN2-6iyi_CGgCCrBVZ3VO1H0JLUKfvQ@mail.gmail.com>
 <50800D9D.1090705@networx.ch>
 <20121018170039.GS35915@deviant.kiev.zoral.com.ua>
 <50811A14.5080903@networx.ch>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="j7Mt6kSMu9nlWjvx"
Content-Disposition: inline
In-Reply-To: <50811A14.5080903@networx.ch>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua
X-Virus-Status: Clean
X-Spam-Status: No, score=-4.0 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00
 autolearn=ham version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
 skuns.kiev.zoral.com.ua
Cc: net@freebsd.org, Vijay Singh <vijju.singh@gmail.com>
X-List-Received-Date: Fri, 19 Oct 2012 11:47:11 -0000


--j7Mt6kSMu9nlWjvx
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, Oct 19, 2012 at 11:15:00AM +0200, Andre Oppermann wrote:
> On 18.10.2012 19:00, Konstantin Belousov wrote:
> > On Thu, Oct 18, 2012 at 04:09:33PM +0200, Andre Oppermann wrote:
> >> On 05.10.2012 01:21, Vijay Singh wrote:
> >>> Folks, I came up with this while going through the lltable code.
> >>
> >> Thank you. I just purged a larger number of stray spl* from the
> >> net*/* directories. This stuff won't be backported to 9-STABLE
> >> though.
> >
> > Why ? What is the value of having the fossils in the actively maintained
> > stable tree ?
>
> To avoid churn within a stable release track.

IMO, this is the wrong argument. The artificial difference, esp. one due
to no-ops and garbage, makes both code reading and MFC harder.

--j7Mt6kSMu9nlWjvx
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (FreeBSD)

iEYEARECAAYFAlCBPbcACgkQC3+MBN1Mb4g2bwCgiSCaKpxMvb8SDGt/seLlFpEd
t0sAoLbvwk45lSW8OxbVSy5bL+05wMKU
=jAmM
-----END PGP SIGNATURE-----

--j7Mt6kSMu9nlWjvx--

From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 12:02:53 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id A4292A89
 for <net@freebsd.org>; Fri, 19 Oct 2012 12:02:53 +0000 (UTC)
 (envelope-from oppermann@networx.ch)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id E969E8FC0C
 for <net@freebsd.org>; Fri, 19 Oct 2012 12:02:52 +0000 (UTC)
Received: (qmail 35284 invoked from network); 19 Oct 2012 13:41:36 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <oppermann@networx.ch>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <ae@FreeBSD.org>; 19 Oct 2012 13:41:36 -0000
Message-ID: <50814166.1000602@networx.ch>
Date: Fri, 19 Oct 2012 14:02:46 +0200
From: Andre Oppermann <oppermann@networx.ch>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:16.0) Gecko/20121010 Thunderbird/16.0.1
MIME-Version: 1.0
To: "Andrey V. Elsukov" <ae@FreeBSD.org>
Subject: Re: [RFC] Enabling IPFIREWALL_FORWARD in run-time
References: <508138A4.5030901@FreeBSD.org>
In-Reply-To: <508138A4.5030901@FreeBSD.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: ipfw@freebsd.org, net@freebsd.org
X-List-Received-Date: Fri, 19 Oct 2012 12:02:53 -0000

On 19.10.2012 13:25, Andrey V. Elsukov wrote:
> Hi All,
>
> Many years ago i have already proposed this feature, but at that time
> several people were against, because as they said, it could affect
> performance. Now, when we have high speed network adapters, SMP kernel
> and network stack, several locks acquired in the path of each packet,
> and i have an ability to test this in the lab.
>
> So, i prepared the patch, that removes IPFIREWALL_FORWARD option from
> the kernel and makes this functionality always build-in, but it is
> turned off by default and can be enabled via the sysctl(8) variable
> net.pfil.forward=1.
>
> 	http://people.freebsd.org/~ae/pfil_forward.diff
>
> Also we have done some tests with the ixia traffic generator connected
> via 10G network adapter. Tests have show that there is no visible
> difference, and there is no visible performance degradation.
>
> Any objections?

No objection as such.  However, I don't entirely agree with the
naming of pfil_forward.  The functionality is specific to IPFW
and TCP: it does transparent interjected termination of TCP
connections on the local host while keeping the original IP
addresses and port numbers visible in netstat output.

So it's a feature of IPFW/IP, and its sysctl name and .h files
should live there instead of under pfil.

-- 
Andre


From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 12:18:56 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53])
 by hub.freebsd.org (Postfix) with ESMTP id 64A37E39;
 Fri, 19 Oct 2012 12:18:56 +0000 (UTC) (envelope-from ae@FreeBSD.org)
Received: from [127.0.0.1] (hub.freebsd.org [8.8.178.136])
 by mx2.freebsd.org (Postfix) with ESMTP id 232A43B4F7F;
 Fri, 19 Oct 2012 12:18:54 +0000 (UTC)
Message-ID: <50814523.2070002@FreeBSD.org>
Date: Fri, 19 Oct 2012 16:18:43 +0400
From: "Andrey V. Elsukov" <ae@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:15.0) Gecko/20121010 Thunderbird/15.0.1
MIME-Version: 1.0
To: Andre Oppermann <oppermann@networx.ch>
Subject: Re: [RFC] Enabling IPFIREWALL_FORWARD in run-time
References: <508138A4.5030901@FreeBSD.org> <50814166.1000602@networx.ch>
In-Reply-To: <50814166.1000602@networx.ch>
X-Enigmail-Version: 1.4.3
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature";
 boundary="------------enigC2F9C7A14662BA4A777BD6AB"
Cc: ipfw@freebsd.org, net@freebsd.org
X-List-Received-Date: Fri, 19 Oct 2012 12:18:56 -0000

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enigC2F9C7A14662BA4A777BD6AB
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

On 19.10.2012 16:02, Andre Oppermann wrote:
>> http://people.freebsd.org/~ae/pfil_forward.diff
>>
>> Also we have done some tests with the ixia traffic generator connected
>> via 10G network adapter. Tests have show that there is no visible
>> difference, and there is no visible performance degradation.
>>
>> Any objections?
>
> No objection as such.  However I don't entirely agree with the
> naming of pfil_forward.  The functionality is specific to IPFW
> and TCP, it's doing transparent interjected termination of tcp
> connections on the local host while keeping the original IP
> addresses and port numbers visible in netstat output.
>
> So it's a feature of IPFW/IP and should be fitted in there for
> sysctl name and .h files instead of pfil.

Actually, it can be used by more than just ipfw. We already have the
net.inet.ip.forwarding and net.inet6.ip6.forwarding variables, and
placing it under net.inet.ip.fw is undesirable because we can have a
kernel without ipfw. So I chose pfil, because it cannot work without
pfil.

-- 
WBR, Andrey V. Elsukov


--------------enigC2F9C7A14662BA4A777BD6AB
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)

iQEcBAEBAgAGBQJQgUUqAAoJEAHF6gQQyKF6pyMIAILQkM9tSI6KL5bdG7qotu/Q
ulM49kdqP6eHNGt2FMCy634r6uM7HNPK0oY3cZq9acxbUFXf/es8PViz/ELCFmcL
V1BUAoDj2J6PBx4n1oGNf+efV9J/s/7YHLj93RH1hgFWVOIOoPdzlyhm/bIs5Dz2
HQ7Nw92GfMCIFREEcZZ55H5ai9xUJoP4BOYDrJ/za9I/XpxTTzqoGUrEJFJUKJP9
ASArYTggA5UrESKTMg/WV2/pIlmwkfEtgAjzAkjweeUi4N3T6QRjY8w8lbz7aZn0
GOq60Ia6cmmrwDZkmTmJ9NJGNKQ7yRlheprcLh9pmoWPEKpgZedcYeDcTLkrprk=
=fWAC
-----END PGP SIGNATURE-----

--------------enigC2F9C7A14662BA4A777BD6AB--

From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 13:56:54 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id CDBD5FC0;
 Fri, 19 Oct 2012 13:56:54 +0000 (UTC)
 (envelope-from smithi@nimnet.asn.au)
Received: from sola.nimnet.asn.au (paqi.nimnet.asn.au [115.70.110.159])
 by mx1.freebsd.org (Postfix) with ESMTP id 09EE58FC08;
 Fri, 19 Oct 2012 13:56:53 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by sola.nimnet.asn.au (8.14.2/8.14.2) with ESMTP id q9JDujNr096880;
 Sat, 20 Oct 2012 00:56:45 +1100 (EST)
 (envelope-from smithi@nimnet.asn.au)
Date: Sat, 20 Oct 2012 00:56:45 +1100 (EST)
From: Ian Smith <smithi@nimnet.asn.au>
To: "Andrey V. Elsukov" <ae@freebsd.org>
Subject: Re: [RFC] Enabling IPFIREWALL_FORWARD in run-time
In-Reply-To: <508138A4.5030901@FreeBSD.org>
Message-ID: <20121020002249.X88114@sola.nimnet.asn.au>
References: <508138A4.5030901@FreeBSD.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Cc: ipfw@freebsd.org, net@freebsd.org
X-List-Received-Date: Fri, 19 Oct 2012 13:56:54 -0000

On Fri, 19 Oct 2012 15:25:24 +0400, Andrey V. Elsukov wrote:
 > Hi All,
 > 
 > Many years ago i have already proposed this feature, but at that time
 > several people were against, because as they said, it could affect
 > performance. Now, when we have high speed network adapters, SMP kernel
 > and network stack, several locks acquired in the path of each packet,
 > and i have an ability to test this in the lab.
 > 
 > So, i prepared the patch, that removes IPFIREWALL_FORWARD option from
 > the kernel and makes this functionality always build-in, but it is
 > turned off by default and can be enabled via the sysctl(8) variable
 > net.pfil.forward=1.
 > 
 > 	http://people.freebsd.org/~ae/pfil_forward.diff
 > 
 > Also we have done some tests with the ixia traffic generator connected
 > via 10G network adapter. Tests have show that there is no visible
 > difference, and there is no visible performance degradation.
 > 
 > Any objections?

Looks great.  I'll no longer have to tell people on questions@ that 
using ipfw fwd is the only reason left not to just load the module.

Taking the code on trust, only to comment on the documentation:

ipfw.8:
======= 
 To enable
 .Cm fwd
-a custom kernel needs to be compiled with the option
-.Cd "options IPFIREWALL_FORWARD" .
+the
+.Xr sysctl 8
+variable
+.Va net.pfil.forward
+should be set to 1.

NOTES:
=======
-# IPFIREWALL_FORWARD enables changing of the packet destination either
-# to do some sort of policy routing or transparent proxying.  Used by
-# ``ipfw forward''. All  redirections apply to locally generated
-# packets too.  Because of this great care is required when
-# crafting the ruleset.

ipfw(8) could perhaps incorporate that description (and warning) from 
NOTES in the entry under SYSCTLS where net.pfil.forward (or whatever :) 
would be expected to be described, apart from sysctl -d ?

cheers, Ian

 > WBR, Andrey V. Elsukov

From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 14:05:50 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 40CDF3ED
 for <net@freebsd.org>; Fri, 19 Oct 2012 14:05:50 +0000 (UTC)
 (envelope-from oppermann@networx.ch)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id 994FB8FC14
 for <net@freebsd.org>; Fri, 19 Oct 2012 14:05:48 +0000 (UTC)
Received: (qmail 35725 invoked from network); 19 Oct 2012 15:44:32 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <oppermann@networx.ch>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <ae@FreeBSD.org>; 19 Oct 2012 15:44:32 -0000
Message-ID: <50815E36.6010703@networx.ch>
Date: Fri, 19 Oct 2012 16:05:42 +0200
From: Andre Oppermann <oppermann@networx.ch>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:16.0) Gecko/20121010 Thunderbird/16.0.1
MIME-Version: 1.0
To: "Andrey V. Elsukov" <ae@FreeBSD.org>
Subject: Re: [RFC] Enabling IPFIREWALL_FORWARD in run-time
References: <508138A4.5030901@FreeBSD.org> <50814166.1000602@networx.ch>
 <50814523.2070002@FreeBSD.org>
In-Reply-To: <50814523.2070002@FreeBSD.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: ipfw@freebsd.org, net@freebsd.org
X-List-Received-Date: Fri, 19 Oct 2012 14:05:50 -0000

On 19.10.2012 14:18, Andrey V. Elsukov wrote:
> On 19.10.2012 16:02, Andre Oppermann wrote:
>>> http://people.freebsd.org/~ae/pfil_forward.diff
>>>
>>> Also we have done some tests with the ixia traffic generator connected
>>> via 10G network adapter. Tests have show that there is no visible
>>> difference, and there is no visible performance degradation.
>>>
>>> Any objections?
>>
>> No objection as such.  However I don't entirely agree with the
>> naming of pfil_forward.  The functionality is specific to IPFW
>> and TCP, it's doing transparent interjected termination of tcp
>> connections on the local host while keeping the original IP
>> addresses and port numbers visible in netstat output.
>>
>> So it's a feature of IPFW/IP and should be fitted in there for
>> sysctl name and .h files instead of pfil.
>
> Actually it can be used not only by ipfw. We already have
> net.inet.ip.forwarding and net.inet6.ip6.forwarding variables, and
> placing it into net.inet.ip.fw is undesirable, because we can have
> kernel without ipfw. So, i decided to choose pfil, because it could not
> work without pfil.

Again, it's not a property of pfil.  It's a property of IP and it
should live there from a configuration point of view. Firewalls other
than ipfw don't make use of it.

You could rename it to transparent connection proxy or some such.

-- 
Andre


From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 15:17:44 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53])
 by hub.freebsd.org (Postfix) with ESMTP id C7959854;
 Fri, 19 Oct 2012 15:17:44 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from dhcp170-36-red.yandex.net (freefall.freebsd.org [8.8.178.135])
 by mx2.freebsd.org (Postfix) with ESMTP id E46CB3B6660;
 Fri, 19 Oct 2012 15:17:41 +0000 (UTC)
Message-ID: <50816ECE.4020002@FreeBSD.org>
Date: Fri, 19 Oct 2012 19:16:30 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:13.0) Gecko/20120627 Thunderbird/13.0.1
MIME-Version: 1.0
To: Andre Oppermann <oppermann@networx.ch>
Subject: Re: [RFC] Enabling IPFIREWALL_FORWARD in run-time
References: <508138A4.5030901@FreeBSD.org> <50814166.1000602@networx.ch>
 <50814523.2070002@FreeBSD.org> <50815E36.6010703@networx.ch>
In-Reply-To: <50815E36.6010703@networx.ch>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "Andrey V. Elsukov" <ae@FreeBSD.org>, ipfw@freebsd.org, net@freebsd.org
X-List-Received-Date: Fri, 19 Oct 2012 15:17:44 -0000

On 19.10.2012 18:05, Andre Oppermann wrote:
> On 19.10.2012 14:18, Andrey V. Elsukov wrote:
>> On 19.10.2012 16:02, Andre Oppermann wrote:
>>>> http://people.freebsd.org/~ae/pfil_forward.diff
>>>>
>>>> Also we have done some tests with the ixia traffic generator connected
>>>> via 10G network adapter. Tests have show that there is no visible
>>>> difference, and there is no visible performance degradation.
>>>>
>>>> Any objections?
>>>
>>> No objection as such.  However I don't entirely agree with the
>>> naming of pfil_forward.  The functionality is specific to IPFW
>>> and TCP, it's doing transparent interjected termination of tcp
>>> connections on the local host while keeping the original IP
>>> addresses and port numbers visible in netstat output.
>>>
>>> So it's a feature of IPFW/IP and should be fitted in there for
>>> sysctl name and .h files instead of pfil.
>>
>> Actually it can be used not only by ipfw. We already have
>> net.inet.ip.forwarding and net.inet6.ip6.forwarding variables, and
>> placing it into net.inet.ip.fw is undesirable, because we can have
>> kernel without ipfw. So, i decided to choose pfil, because it could not
>> work without pfil.
>
> Again, it's not a property of pfil.  It's a property of IP and it
Not exactly. It is currently supported in both IPv4 and IPv6.
> should live there from a configuration point of view. Other firewalls
> than ipfw don't make use of it.
>
> You could rename it to transparent connection proxy or some such.
fwd is widely used for policy-based routing, so it is not just an
upper-layer TCP feature.
>



-- 
WBR, Alexander


From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 15:24:15 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53])
 by hub.freebsd.org (Postfix) with ESMTP id 82BA4B13;
 Fri, 19 Oct 2012 15:24:15 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from dhcp170-36-red.yandex.net (freefall.freebsd.org [8.8.178.135])
 by mx2.freebsd.org (Postfix) with ESMTP id 2BB123B4F81;
 Fri, 19 Oct 2012 15:24:13 +0000 (UTC)
Message-ID: <50817057.3090200@FreeBSD.org>
Date: Fri, 19 Oct 2012 19:23:03 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:13.0) Gecko/20120627 Thunderbird/13.0.1
MIME-Version: 1.0
To: John Baldwin <jhb@freebsd.org>
Subject: Re: ixgbe & if_igb RX ring locking
References: <5079A9A1.4070403@FreeBSD.org> <507C1960.6050500@FreeBSD.org>
 <201210150904.27567.jhb@freebsd.org> <201210171006.51214.jhb@freebsd.org>
In-Reply-To: <201210171006.51214.jhb@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, Luigi Rizzo <rizzo@iet.unipi.it>,
 Jack Vogel <jfvogel@gmail.com>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Oct 2012 15:24:15 -0000

On 17.10.2012 18:06, John Baldwin wrote:
> On Monday, October 15, 2012 9:04:27 am John Baldwin wrote:
>> On Monday, October 15, 2012 10:10:40 am Alexander V. Chernikov wrote:
>>> On 13.10.2012 23:24, Jack Vogel wrote:
>>>> On Sat, Oct 13, 2012 at 11:22 AM, Luigi Rizzo<rizzo@iet.unipi.it>  wrote:
>>>
>>>>>
>>>>> one option could be (same as it is done in the timer
>>>>> routine in dummynet) to build a list of all the packets
>>>>> that need to be sent to if_input(), and then call
>>>>> if_input with the entire list outside the lock.
>>>>>
>>>>> It would be even easier if we modify the various *_input()
>>>>> routines to handle a list of mbufs instead of just one.
>>>
>>> Bulk processing is generally a good idea we probably should implement.
>>> Probably starting from driver queue ending with marked mbufs
>>> (OURS/forward/legacy processing (appletalk and similar))?
>>>
>>> This can minimize an impact for all
>>> locks on RX side:
>>> L2
>>> * rx PFIL hook
>>> L3 (both IPv4 and IPv6)
>>> * global IF_ADDR_RLOCK (currently commented out)
>>> * Per-interface ADDR_RLOCK
>>> * PFIL hook
>>>
>>>   From the first glance, there can be problems with:
>>> * Increased latency (we should have some kind of rx_process_limit), but
>>> still
>>> * reader locks being acquired for much longer amount of time
>>>
>>>>>
>>>>> cheers
>>>>> luigi
>>>>>
>>>>> Very interesting idea Luigi, will have to get that some thought.
>>>>
>>>> Jack
>>>
>>> Returning to original post topic:
>>>
>>> Given
>>> 1) we are currently binding ixgbe ithreads to CPU cores
>>> 2) RX queue lock is used by (indirectly) in only 2 places:
>>> a) ISR routine (msix or legacy irq)
>>> b) taskqueue routine which is scheduled if some packets remains in RX
>>> queue and rx_process_limit ended OR we need something to TX
>>>
>>> 3) in practice taskqueue routine is a nightmare for many people since
>>> there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after
>>> some traffic burst happens: once it is called it starts to schedule
>>> itself more and more replacing original ISR routine. Additionally,
>>> increasing rx_process_limit does not help since taskqueue is called with
>>> the same limit. Finally, currently netisr taskq threads are not bound to
>>> any CPU which makes the process even more uncontrollable.
>>
>> I think part of the problem here is that the taskqueue in ixgbe(4) is
>> bogusly rescheduled for TX handling.  Instead, ixgbe_msix_que() should
>> just start transmitting packets directly.
>>
>> I fixed this in igb(4) here:
>>
>> http://svnweb.freebsd.org/base?view=revision&revision=233708
>>
>> You can try this for ixgbe(4).  It also comments out a spurious taskqueue
>> reschedule from the watchdog handler that might also lower the taskqueue
>> usage.  You can try changing that #if 0 to an #if 1 to test just the txeof
>> changes:
>
> Is anyone able to test this btw to see if it improves things on ixgbe at all?
> (I don't have any ixgbe hardware.)
Yes. I'll try to do this next week (since the ixgbe driver from at least 9-S
fails to detect a twinax cable which works in 8-S).
>


-- 
WBR, Alexander
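
The batching idea quoted above (collect packets while holding the RX lock,
then call if_input() on the whole list outside it, bounded by something like
rx_process_limit) can be sketched in userspace roughly as follows. This is
only an illustration under stated assumptions: struct pkt, struct rx_ring,
rx_ring_put(), rx_ring_drain() and sum_input() are hypothetical names, a
pthread mutex stands in for the driver's RX lock, and none of this is taken
from ixgbe(4) or igb(4).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf: only the list linkage matters here. */
struct pkt {
	struct pkt *next;
	int id;
};

/* Hypothetical RX ring: a locked singly linked queue of received packets. */
struct rx_ring {
	pthread_mutex_t lock;
	struct pkt *head;
	struct pkt *tail;
};

static void
rx_ring_init(struct rx_ring *r)
{
	pthread_mutex_init(&r->lock, NULL);
	r->head = r->tail = NULL;
}

/* Producer side: append one packet under the ring lock. */
static void
rx_ring_put(struct rx_ring *r, struct pkt *p)
{
	p->next = NULL;
	pthread_mutex_lock(&r->lock);
	if (r->tail != NULL)
		r->tail->next = p;
	else
		r->head = p;
	r->tail = p;
	pthread_mutex_unlock(&r->lock);
}

/* Example consumer standing in for if_input(): adds packet ids into *arg. */
static void
sum_input(struct pkt *p, void *arg)
{
	*(int *)arg += p->id;
}

/*
 * Detach up to `limit` packets while holding the lock, then hand them to
 * `input` one by one after the lock has been dropped. Returns the number
 * of packets delivered; anything beyond `limit` stays queued.
 */
static int
rx_ring_drain(struct rx_ring *r, int limit,
    void (*input)(struct pkt *, void *), void *arg)
{
	struct pkt *batch = NULL, **tailp = &batch, *p;
	int n = 0;

	assert(limit > 0);

	pthread_mutex_lock(&r->lock);
	while (n < limit && (p = r->head) != NULL) {
		r->head = p->next;
		if (r->head == NULL)
			r->tail = NULL;
		p->next = NULL;
		*tailp = p;		/* append to the private batch list */
		tailp = &p->next;
		n++;
	}
	pthread_mutex_unlock(&r->lock);

	/* The consumer runs without the ring lock held. */
	while ((p = batch) != NULL) {
		batch = p->next;
		p->next = NULL;
		input(p, arg);
	}
	return (n);
}
```

A caller would queue packets with rx_ring_put() and then call
rx_ring_drain(&ring, limit, sum_input, &total) repeatedly until it returns 0.
The limit bounds how long one pass keeps the consumer busy, which is the
latency concern raised in the quoted text.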



From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 15:48:03 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 9ADE129E;
 Fri, 19 Oct 2012 15:48:03 +0000 (UTC)
 (envelope-from jfvogel@gmail.com)
Received: from mail-vc0-f182.google.com (mail-vc0-f182.google.com
 [209.85.220.182])
 by mx1.freebsd.org (Postfix) with ESMTP id E74748FC12;
 Fri, 19 Oct 2012 15:48:02 +0000 (UTC)
Received: by mail-vc0-f182.google.com with SMTP id fw7so820958vcb.13
 for <multiple recipients>; Fri, 19 Oct 2012 08:48:02 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=LyQpCVIq+rlEOBpqfsniQevQqr2BzqLmlbe5ZKcEjGI=;
 b=XqPrwuIvrS2KLuRGcw8wwZnWDT4JsuYhRRsytdSrxFRmyh7v4Hji0Q7neaWOSeqJX/
 DtAl54DqTKVLqbjQ7QXeC1MHT6L+ic6ors2/TXpJGtS2kjHT2VCoee9VvittH3vPIU1z
 P3hFxgO9+5m3XEVHjMC1m5WYbaNj/RN9g3HSNeMSXVKTdt6epgf5aB1CZwcXGJaRshYf
 6GKgUmSElKrLtyBvJJkT4PRSUAO60A58TZC54BJpchK5XGIK4BNx7vhWJVf0PchM0rsB
 8v/lN+wTHmqtUoxzz7rbdj6Jewq6h0Tdu0a7khpd3tcd5/3BlL0fvOtB71Xevbu6GBzI
 UJ4Q==
MIME-Version: 1.0
Received: by 10.52.75.70 with SMTP id a6mr1762200vdw.5.1350661681798; Fri, 19
 Oct 2012 08:48:01 -0700 (PDT)
Received: by 10.58.68.8 with HTTP; Fri, 19 Oct 2012 08:48:01 -0700 (PDT)
In-Reply-To: <50817057.3090200@FreeBSD.org>
References: <5079A9A1.4070403@FreeBSD.org> <507C1960.6050500@FreeBSD.org>
 <201210150904.27567.jhb@freebsd.org>
 <201210171006.51214.jhb@freebsd.org> <50817057.3090200@FreeBSD.org>
Date: Fri, 19 Oct 2012 08:48:01 -0700
Message-ID: <CAFOYbc=vCjATR3gqgDqHVrQpvMO7w+8NMKcZ0UJ9MjuuENS-dQ@mail.gmail.com>
Subject: Re: ixgbe & if_igb RX ring locking
From: Jack Vogel <jfvogel@gmail.com>
To: "Alexander V. Chernikov" <melifaro@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: freebsd-net@freebsd.org, Luigi Rizzo <rizzo@iet.unipi.it>,
 John Baldwin <jhb@freebsd.org>, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Oct 2012 15:48:03 -0000

On Fri, Oct 19, 2012 at 8:23 AM, Alexander V. Chernikov <
melifaro@freebsd.org> wrote:

> On 17.10.2012 18:06, John Baldwin wrote:
>
>> On Monday, October 15, 2012 9:04:27 am John Baldwin wrote:
>>
>>> On Monday, October 15, 2012 10:10:40 am Alexander V. Chernikov wrote:
>>>
>>>> On 13.10.2012 23:24, Jack Vogel wrote:
>>>>
>>>>> On Sat, Oct 13, 2012 at 11:22 AM, Luigi Rizzo<rizzo@iet.unipi.it>
>>>>>  wrote:
>>>>>
>>>>
>>>>
>>>>>> one option could be (same as it is done in the timer
>>>>>> routine in dummynet) to build a list of all the packets
>>>>>> that need to be sent to if_input(), and then call
>>>>>> if_input with the entire list outside the lock.
>>>>>>
>>>>>> It would be even easier if we modify the various *_input()
>>>>>> routines to handle a list of mbufs instead of just one.
>>>>>>
>>>>>
>>>> Bulk processing is generally a good idea we probably should implement.
>>>> Probably starting from driver queue ending with marked mbufs
>>>> (OURS/forward/legacy processing (appletalk and similar))?
>>>>
>>>> This can minimize an impact for all
>>>> locks on RX side:
>>>> L2
>>>> * rx PFIL hook
>>>> L3 (both IPv4 and IPv6)
>>>> * global IF_ADDR_RLOCK (currently commented out)
>>>> * Per-interface ADDR_RLOCK
>>>> * PFIL hook
>>>>
>>>>   From the first glance, there can be problems with:
>>>> * Increased latency (we should have some kind of rx_process_limit), but
>>>> still
>>>> * reader locks being acquired for much longer amount of time
>>>>
>>>>
>>>>>> cheers
>>>>>> luigi
>>>>>>
>>>>>> Very interesting idea Luigi, will have to get that some thought.
>>>>>>
>>>>>
>>>>> Jack
>>>>>
>>>>
>>>> Returning to original post topic:
>>>>
>>>> Given
>>>> 1) we are currently binding ixgbe ithreads to CPU cores
>>>> 2) RX queue lock is used by (indirectly) in only 2 places:
>>>> a) ISR routine (msix or legacy irq)
>>>> b) taskqueue routine which is scheduled if some packets remains in RX
>>>> queue and rx_process_limit ended OR we need something to TX
>>>>
>>>> 3) in practice taskqueue routine is a nightmare for many people since
>>>> there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after
>>>> some traffic burst happens: once it is called it starts to schedule
>>>> itself more and more replacing original ISR routine. Additionally,
>>>> increasing rx_process_limit does not help since taskqueue is called with
>>>> the same limit. Finally, currently netisr taskq threads are not bound to
>>>> any CPU which makes the process even more uncontrollable.
>>>>
>>>
>>> I think part of the problem here is that the taskqueue in ixgbe(4) is
>>> bogusly rescheduled for TX handling.  Instead, ixgbe_msix_que() should
>>> just start transmitting packets directly.
>>>
>>> I fixed this in igb(4) here:
>>>
>>> http://svnweb.freebsd.org/base?view=revision&revision=233708
>>>
>>> You can try this for ixgbe(4).  It also comments out a spurious taskqueue
>>> reschedule from the watchdog handler that might also lower the taskqueue
>>> usage.  You can try changing that #if 0 to an #if 1 to test just the
>>> txeof
>>> changes:
>>>
>>
>> Is anyone able to test this btw to see if it improves things on ixgbe at
>> all?
>> (I don't have any ixgbe hardware.)
>>
> Yes. I'll try to do this next week (since the ixgbe driver from at least 9-S
> fails to detect a twinax cable which works in 8-S).
>
>>
>>
>
If you have a major problem like this, you might want to put it in a bug
report, or at least in an email with that specific topic, rather than bury
it in a parenthetical remark in an unrelated thread :(
This is the first I've heard of this. Did you check the code on HEAD to see
if it also has the issue?

Jack

From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 20:06:30 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 160597AE;
 Fri, 19 Oct 2012 20:06:30 +0000 (UTC)
 (envelope-from julian@freebsd.org)
Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16])
 by mx1.freebsd.org (Postfix) with ESMTP id D70C88FC08;
 Fri, 19 Oct 2012 20:06:29 +0000 (UTC)
Received: from JRE-MBP-2.local (c-50-143-149-146.hsd1.ca.comcast.net
 [50.143.149.146]) (authenticated bits=0)
 by vps1.elischer.org (8.14.5/8.14.5) with ESMTP id q9JK6L8K068127
 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO);
 Fri, 19 Oct 2012 13:06:23 -0700 (PDT)
 (envelope-from julian@freebsd.org)
Message-ID: <5081B2BD.3090103@freebsd.org>
Date: Fri, 19 Oct 2012 13:06:21 -0700
From: Julian Elischer <julian@freebsd.org>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6;
 rv:16.0) Gecko/20121010 Thunderbird/16.0.1
MIME-Version: 1.0
To: "Andrey V. Elsukov" <ae@freebsd.org>
Subject: Re: [RFC] Enabling IPFIREWALL_FORWARD in run-time
References: <508138A4.5030901@FreeBSD.org>
In-Reply-To: <508138A4.5030901@FreeBSD.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: ipfw@freebsd.org, net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Oct 2012 20:06:30 -0000

On 10/19/12 4:25 AM, Andrey V. Elsukov wrote:
> Hi All,
>
> Many years ago I proposed this feature, but at that time several
> people were against it because, as they said, it could affect
> performance. Now we have high-speed network adapters, an SMP kernel
> and network stack, and several locks acquired in the path of each
> packet, and I am able to test this in the lab.
>
> So I prepared a patch that removes the IPFIREWALL_FORWARD option from
> the kernel and makes this functionality always built in, but it is
> turned off by default and can be enabled via the sysctl(8) variable
> net.pfil.forward=1.
>
> 	http://people.freebsd.org/~ae/pfil_forward.diff
>
> Also, we have done some tests with an Ixia traffic generator connected
> via a 10G network adapter. The tests have shown no visible
> difference and no visible performance degradation.
>
> Any objections?
>
No objection from me.
It was always my intention to "some day" either make it standard, or
at least default it to 'on'.

It looks to me as if a couple of your 'goto's might just be changed to {}
blocks.
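
To make the goto remark concrete, here is a generic sketch of the kind of
rewrite meant. It is not taken from pfil_forward.diff: struct pkt_hdr, its
fields and both function names are hypothetical and exist only to show a
forward goto being folded into a braced block with the same behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet metadata; not a real kernel structure. */
struct pkt_hdr {
	int has_fwd_tag;	/* nonzero if a forwarding tag is attached */
	int next_hop;		/* next hop carried by that tag */
};

/* goto style: skip ahead to a common exit when forwarding does not apply. */
static int
next_hop_goto(const struct pkt_hdr *h, int forwarding_enabled)
{
	int hop = -1;

	assert(h != NULL);
	if (!forwarding_enabled)
		goto done;
	if (!h->has_fwd_tag)
		goto done;
	hop = h->next_hop;
done:
	return (hop);
}

/* Same logic with the skipped code wrapped in a braced block instead. */
static int
next_hop_block(const struct pkt_hdr *h, int forwarding_enabled)
{
	int hop = -1;

	assert(h != NULL);
	if (forwarding_enabled && h->has_fwd_tag) {
		hop = h->next_hop;
	}
	return (hop);
}
```

Both functions return -1 unless forwarding is enabled and a tag is present.
Whether the real patch can be restructured this way depends on what sits
between each goto and its label.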


From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 20:34:07 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 459BB1F7
 for <freebsd-net@freebsd.org>; Fri, 19 Oct 2012 20:34:07 +0000 (UTC)
 (envelope-from guy.helmer@gmail.com)
Received: from mail-ie0-f182.google.com (mail-ie0-f182.google.com
 [209.85.223.182])
 by mx1.freebsd.org (Postfix) with ESMTP id 033188FC12
 for <freebsd-net@freebsd.org>; Fri, 19 Oct 2012 20:34:06 +0000 (UTC)
Received: by mail-ie0-f182.google.com with SMTP id k10so1693598iea.13
 for <freebsd-net@freebsd.org>; Fri, 19 Oct 2012 13:34:06 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=content-type:mime-version:subject:from:in-reply-to:date:cc
 :content-transfer-encoding:message-id:references:to:x-mailer;
 bh=8+2bZHNRYT0d1IjMjvWzeYnkzvwIGSp49vRgjPE9v+4=;
 b=znXaoCA3h/DT7SqCt36WWTqrc90Aj4QumJiFRnMxK8EgZE/QL8R2A3UDTOIysP6mtm
 oT2jes56PFF+5Us/+xXtZbMjnKmat0pPhI5A86vIqKLtvSVoh7sW9mDqvpoit1eK5XVo
 AnWoeqgomzIOlHstiqnfi7w/4niMTY4rPYW7IZWKOu8QE2jumDiaF9OW1xa5gLPytp2S
 imLVu9HR3P8ukt2dTj9+hst+kmuQup1NoeaKh86uFYNI7viDRkzHVCceO8fQWovwfI8S
 tofHVS8zIoTjl7gpxt6tOJioTda2SRK60XeB/YGt4DoqBdPCQMuUmEPP5VCsxvTFxbjg
 iWxQ==
Received: by 10.50.57.200 with SMTP id k8mr9935857igq.29.1350678846395;
 Fri, 19 Oct 2012 13:34:06 -0700 (PDT)
Received: from [192.168.221.107] ([216.81.189.9])
 by mx.google.com with ESMTPS id 7sm1235690igh.0.2012.10.19.13.34.04
 (version=TLSv1/SSLv3 cipher=OTHER);
 Fri, 19 Oct 2012 13:34:05 -0700 (PDT)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\))
Subject: Re: CARP on vSwitch
From: Guy Helmer <guy.helmer@gmail.com>
In-Reply-To: <CAOxoo31Ujzumi+hbZbRgY3EivY6dLwvP5nAZOOptgAEV9iKgzg@mail.gmail.com>
Date: Fri, 19 Oct 2012 15:34:00 -0500
Content-Transfer-Encoding: quoted-printable
Message-Id: <DEC38DE7-F5D6-4F85-8F45-C4332BB69A50@gmail.com>
References: <CAOxoo31Ujzumi+hbZbRgY3EivY6dLwvP5nAZOOptgAEV9iKgzg@mail.gmail.com>
To: Rafael Henrique Faria <rafaelhfaria@cenadigital.com.br>
X-Mailer: Apple Mail (2.1499)
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Oct 2012 20:34:07 -0000


On Oct 18, 2012, at 3:59 PM, Rafael Henrique Faria <rafaelhfaria@cenadigital.com.br> wrote:

> Hi, I'm trying to use CARP on two FreeBSD servers in an ESX environment, but
> it's not working.
>
> The problem is that every frame sent from CARP gets back to the same host.
> This is an old problem:
>
> http://www.mail-archive.com/freebsd-net@freebsd.org/msg30562.html
>
> There is already a patch, but it's 3 years old and not yet committed. Is
> there any reason for this?
> I have always used freebsd-update to keep the servers updated, and don't want
> to compile a kernel just to use CARP.
>
> Does anyone have a suggestion or correction for this problem?
>
> Thanks in advance.

I have been using this ipfw rule pair to filter the CARP packets to work
around this problem:

# Allow CARP advertisements out from me and in from anyone but me
${fwcmd} add allow carp from me to any out
${fwcmd} add deny carp from me to any in

Guy


From owner-freebsd-net@FreeBSD.ORG  Fri Oct 19 22:00:35 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 50BAC3E7
 for <freebsd-net@freebsd.org>; Fri, 19 Oct 2012 22:00:35 +0000 (UTC)
 (envelope-from steven@pyro.eu.org)
Received: from falkenstein-2.sn.de.cluster.ok24.net
 (falkenstein-2.sn.de.cluster.ok24.net [IPv6:2002:4e2f:2f89:2::1])
 by mx1.freebsd.org (Postfix) with ESMTP id C553D8FC0C
 for <freebsd-net@freebsd.org>; Fri, 19 Oct 2012 22:00:34 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=simple/simple; d=pyro.eu.org;
 s=10.2012; 
 h=Content-Transfer-Encoding:Content-Type:In-Reply-To:References:Subject:CC:To:MIME-Version:From:Date:Message-ID;
 bh=2gU+Pj28jGmHVfDVRaUDRf9Dgc7Zf6wtlvYt5VN5CGM=; 
 b=L2f6e5N2wObvI0z5SVzE7sbmkz1fGpTWsZuaHtatfYe7LXMVkqTyNM+bBNIJlI4ViK4Tp1xJHeX7I3Noel0MIwBZIpGTHCy7RarX8uZEKrVCKFODwuuQ0GWGFCwj+udrrcNAhKpC55lFmIWrsxzADW95dIvXacPtoA66J2sV5qk=;
X-Spam-Status: No, score=-1.1 required=2.0 tests=ALL_TRUSTED, BAYES_00,
 DKIM_ADSP_DISCARD, TVD_RCVD_IP
Received: from 188-220-33-66.zone11.bethere.co.uk ([188.220.33.66]
 helo=guisborough-1.rcc.uk.cluster.ok24.net)
 by falkenstein-2.sn.de.cluster.ok24.net with esmtp (Exim 4.72)
 (envelope-from <steven@pyro.eu.org>)
 id 1TPKcF-0000kq-HH; Fri, 19 Oct 2012 23:00:31 +0100
X-Spam-Status: No, score=-4.0 required=2.0 tests=ALL_TRUSTED, AWL, BAYES_00,
 DKIM_POLICY_SIGNALL
Received: from [192.168.0.110] (helo=[192.168.0.9])
 by guisborough-1.rcc.uk.cluster.ok24.net with esmtpsa
 (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.69)
 (envelope-from <steven@pyro.eu.org>)
 id 1TPKc5-0002nL-PT; Fri, 19 Oct 2012 23:00:27 +0100
Message-ID: <5081CD71.2050709@pyro.eu.org>
Date: Fri, 19 Oct 2012 23:00:17 +0100
From: Steven Chamberlain <steven@pyro.eu.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
 rv:10.0.7) Gecko/20120922 Icedove/10.0.7
MIME-Version: 1.0
To: freebsd-net@freebsd.org
Subject: Debian Bug#690986: CVE-2012-5363 CVE-2012-5365
References: <20121019193436.5031.87058.reportbug@pisco.westfalen.local>
In-Reply-To: <20121019193436.5031.87058.reportbug@pisco.westfalen.local>
X-Enigmail-Version: 1.4.1
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: Moritz Muehlenhoff <jmm@debian.org>, 690986@bugs.debian.org,
 690986-forwarded@bugs.debian.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Oct 2012 22:00:35 -0000

Hi,

On 19/10/12 20:34, Moritz Muehlenhoff wrote:
> Two security issues were found in the kfreebsd network stack:
> http://www.openwall.com/lists/oss-security/2012/10/10/8

> Issue #1 was assigned CVE-2012-5363
> Issue #2 was assigned CVE-2012-5365

Thank you for mentioning it.

Issue #2 seems similar to CVE-2011-2393, which I assumed was only
relevant where we'd set net.inet6.ip6.accept_rtadv=1, which isn't the
upstream FreeBSD default.  Issue #1 however might affect any FreeBSD
system acting as an IPv6 router.
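
A minimal sketch of that reasoning, for anyone wanting to triage a box
quickly.  The helper name `exposure` is made up for this example, and
treating net.inet6.ip6.forwarding=1 as "acting as an IPv6 router" is an
assumption; per-interface settings may also matter.

```shell
#!/bin/sh
# exposure: hypothetical helper, not part of any FreeBSD tool.
#   $1 = value of net.inet6.ip6.accept_rtadv
#   $2 = value of net.inet6.ip6.forwarding (assumed to mean "IPv6 router")
exposure() {
    [ "$1" = "1" ] && echo "accepts RAs: Issue #2 (CVE-2012-5365) may be relevant"
    [ "$2" = "1" ] && echo "forwards IPv6: Issue #1 (CVE-2012-5363) may be relevant"
    return 0
}

# On a FreeBSD box one might call it as:
#   exposure "$(sysctl -n net.inet6.ip6.accept_rtadv)" \
#            "$(sysctl -n net.inet6.ip6.forwarding)"
exposure 0 1
```

This only restates the two conditions above; it does not confirm
whether either issue is actually exploitable.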

If this can actually be confirmed, then the worst case I can imagine is
a FreeBSD box acting as an IPv6 router for multiple interfaces, perhaps
serving different users: any one of them might flood their local link
with Neighbour Solicitations and create a DoS affecting the other
interfaces.


I found some code committed to OpenBSD (in 2008, uh-oh), supposedly from
KAME (but I can't find it in their repository?), implementing
per-interface and global limits on the number of prefixes/routes
accepted via RA.  I imagine that's the best way to avoid some or all of
these issues.

> http://www.openbsd.org/cgi-bin/cvsweb/src/sys/netinet6/in6_proto.c?sortby=date#rev1.56
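
To illustrate the idea only (this is a toy model, not the OpenBSD or
NetBSD code): a prefix learned via RA is accepted only while both a
per-interface counter and a global counter are under their caps.  The
names MAX_PER_IF, MAX_GLOBAL and accept_prefix, and the cap values, are
made up for this sketch.

```shell
#!/bin/sh
# Toy model of per-interface and global caps on RA-learned prefixes.
# MAX_PER_IF and MAX_GLOBAL are made-up values, not real defaults.
MAX_PER_IF=2
MAX_GLOBAL=3
global_count=0

# accept_prefix IFNAME: sets $verdict to "accept" or "drop" and updates
# the counters.  IFNAME must consist of letters, digits and underscores
# because it is used to build a variable name.
accept_prefix() {
    eval "cur=\${count_$1:-0}"
    if [ "$cur" -ge "$MAX_PER_IF" ] || [ "$global_count" -ge "$MAX_GLOBAL" ]; then
        verdict=drop
    else
        eval "count_$1=$((cur + 1))"
        global_count=$((global_count + 1))
        verdict=accept
    fi
}

# Simulate a burst of RAs arriving on two interfaces.
for ifn in em0 em0 em0 em1 em1; do
    accept_prefix "$ifn"
    echo "$ifn: $verdict"
done
```

With a cap on each interface, a flood on one link cannot use up the
whole global budget on its own unless the per-interface cap is set at or
above the global one.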

Just recently it seems this was also committed to NetBSD HEAD:  "4 new
sysctls to avoid ipv6 DoS attacks from OpenBSD".  I don't know of an
easier way to link to a whole CVS commit, but here are (hopefully all)
the changes to individual files:

> http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/ip6_input.c.diff?r1=1.138&r2=1.139&sortby=date&only_with_tag=MAIN
> http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/ip6_var.h.diff?r1=1.58&r2=1.59&sortby=date&only_with_tag=MAIN
> http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/nd6.c.diff?r1=1.142&r2=1.143&sortby=date&only_with_tag=MAIN
> http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/nd6.h.diff?r1=1.56&r2=1.57&sortby=date&only_with_tag=MAIN
> http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/icmp6.c.diff?r1=1.160&r2=1.161&sortby=date&only_with_tag=MAIN
> http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/in6.c.diff?r1=1.160&r2=1.161&sortby=date&only_with_tag=MAIN
> http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/in6_proto.c.diff?r1=1.96&r2=1.97&sortby=date&only_with_tag=MAIN
> http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/in6_var.h.diff?r1=1.64&r2=1.65&sortby=date&only_with_tag=MAIN
> http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/nd6_rtr.c.diff?r1=1.82&r2=1.83&sortby=date&only_with_tag=MAIN

Regards,
-- 
Steven Chamberlain
steven@pyro.eu.org

From owner-freebsd-net@FreeBSD.ORG  Sat Oct 20 13:06:38 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 478A5B32;
 Sat, 20 Oct 2012 13:06:38 +0000 (UTC)
 (envelope-from nevzorovn@gmail.com)
Received: from mail-la0-f54.google.com (mail-la0-f54.google.com
 [209.85.215.54])
 by mx1.freebsd.org (Postfix) with ESMTP id 4B6458FC12;
 Sat, 20 Oct 2012 13:06:37 +0000 (UTC)
Received: by mail-la0-f54.google.com with SMTP id e12so1038047lag.13
 for <multiple recipients>; Sat, 20 Oct 2012 06:06:30 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=4QX/jLEYeyNPTrmdWEKKeKOTt5XMQNl6ICXaaC1Zyh4=;
 b=cTM3MiwVt6UNtMgzN/8Eg+3oRoDzy+hZGtR58tJMKuUNEImFjBBnOeReuDGLcZOO7K
 obWspFKnoQ8HdmHw1wfVB0yZRDaF1qlCHBlJ2Zo2S3D8KZTm27DQaecxgnZm2/rbBjYP
 pIf6OYXQqAyt4yFykUyXUoUDquCZ9JvlhaL+us8b92UKIuHR9lJX9TqGJkYmAiNJKkJa
 QaYOlAL7jDRDC8zb6+3g+2RI2pRye9VK81dOlXNiXf2jY65mTYe0QBlxWl4jIzdC2mV4
 v9uBaDpjZRpMpjh/x9HeU7vHnq5QdtHLc3imjdu1pb0fIFYiHW5EWvcsA/PcBxMdfXCF
 UthA==
MIME-Version: 1.0
Received: by 10.112.14.107 with SMTP id o11mr1655206lbc.98.1350738390001; Sat,
 20 Oct 2012 06:06:30 -0700 (PDT)
Received: by 10.112.42.70 with HTTP; Sat, 20 Oct 2012 06:06:29 -0700 (PDT)
In-Reply-To: <201210180141.q9I1f53s052539@freefall.freebsd.org>
References: <201210180141.q9I1f53s052539@freefall.freebsd.org>
Date: Sat, 20 Oct 2012 19:06:29 +0600
Message-ID: <CAHtHi9kphRZnX53mKeWzG3HcJoq8-E1Cb1JJTQsrybqLwPGk1g@mail.gmail.com>
Subject: Re: kern/171520: [alc] alc network driver + tso + vlan does not work.
From: Nikolay Nevzorov <nevzorovn@gmail.com>
To: yongari@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 20 Oct 2012 13:06:38 -0000

On my netbook, TSO over VLAN doesn't work on either the GENERIC kernel or
my custom kernel, in any network configuration.

#ifconfig
alc0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=c3098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MCAST,WOL_MAGIC,VLAN_HWTSO,LINKSTATE>
        ether 88:ae:1d:61:29:d2
        inet6 fe80::8aae:1dff:fe61:29d2%alc0 prefixlen 64 scopeid 0x1
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
ath0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 2290
        ether c4:46:19:3b:0d:cf
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: IEEE 802.11 Wireless Ethernet autoselect (autoselect)
        status: no carrier
ipfw0: flags=8801<UP,SIMPLEX,MULTICAST> metric 0 mtu 65536
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=3<RXCSUM,TXCSUM>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x9
        inet 127.0.0.1 netmask 0xff000000
        inet 172.31.1.1 netmask 0xffffffff
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
vlan2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 88:ae:1d:61:29:d2
        inet 192.168.255.254 netmask 0xffffff00 broadcast 192.168.255.255
        inet6 fe80::8aae:1dff:fe61:29d2%vlan2 prefixlen 64 scopeid 0xa
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
        vlan: 2 parent interface: alc0
vlan3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 88:ae:1d:61:29:d2
        inet 10.196.179.142 netmask 0xffffff00 broadcast 10.196.179.255
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
        vlan: 3 parent interface: alc0
vlan4: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 88:ae:1d:61:29:d2
        inet 192.168.84.254 netmask 0xffffff00 broadcast 192.168.84.255
        inet6 fe80::8aae:1dff:fe61:29d2%vlan4 prefixlen 64 scopeid 0xc
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
        vlan: 4 parent interface: alc0
ng0: flags=88d1<UP,POINTOPOINT,RUNNING,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1400
        inet 145.255.22.221 --> 79.140.16.89 netmask 0xffffffff
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>


#dmesg
Copyright (c) 1992-2012 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 9.0-RELEASE #1 r237140: Sun Jun 17 12:20:32 YEKT 2012
    niko@louna:/usr/obj/usr/src/sys/LOUNA amd64
CPU: Intel(R) Atom(TM) CPU N450   @ 1.66GHz (1662.63-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x106ca  Family = 6  Model = 1c  Stepping = 10
  Features=0xbfe9fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x40e39d<SSE3,DTES64,MON,DS_CPL,EST,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant, performance statistics
real memory  = 1073741824 (1024 MB)
avail memory = 1007714304 (961 MB)
Event timer "LAPIC" quality 400
ACPI APIC Table: <ACRSYS ACRPRDCT>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 1 core(s) x 2 HTT threads
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP/HT): APIC ID:  1
ioapic0: Changing APIC ID to 4
ioapic0 <Version 2.0> irqs 0-23 on motherboard
kbd1 at kbdmux0
smbios0: <System Management BIOS> at iomem 0xfe120-0xfe13e on motherboard
smbios0: Version: 2.6, BCD Revision: 2.6
acpi0: <ACRSYS ACRPRDCT> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
acpi_ec0: <Embedded Controller: GPE 0x19> port 0x62,0x66 on acpi0
acpi_button0: <Power Button> on acpi0
acpi_button1: <Sleep Button> on acpi0
acpi_lid0: <Control Method Lid Switch> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
vgapci0: <VGA-compatible display> port 0x60c0-0x60c7 mem 0x58180000-0x581fffff,0x40000000-0x4fffffff,0x58000000-0x580fffff irq 16 at device 2.0 on pci0
agp0: <Intel Pineview (M) SVGA controller> on vgapci0
agp0: aperture size is 256M, detected 8188k stolen memory
vgapci1: <VGA-compatible display> mem 0x58100000-0x5817ffff at device 2.1 on pci0
hdac0: <Intel 82801G High Definition Audio Controller> mem 0x58200000-0x58203fff irq 16 at device 27.0 on pci0
pcib1: <ACPI PCI-PCI bridge> at device 28.0 on pci0
pci1: <ACPI PCI bus> on pcib1
alc0: <Atheros AR8132 PCIe Fast Ethernet> port 0x5000-0x507f mem 0x57000000-0x5703ffff irq 16 at device 0.0 on pci1
alc0: 15872 Tx FIFO, 15360 Rx FIFO
alc0: Using 1 MSI message(s).
miibus0: <MII bus> on alc0
atphy0: <Atheros F1 10/100/1000 PHY> PHY 0 on miibus0
atphy0:  none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
alc0: Ethernet address: 88:ae:1d:61:29:d2
pcib2: <ACPI PCI-PCI bridge> at device 28.1 on pci0
pci2: <ACPI PCI bus> on pcib2
ath0: <Atheros 9285> mem 0x56000000-0x5600ffff irq 17 at device 0.0 on pci2
ath0: [HT] enabling HT modes
ath0: [HT] 1 RX streams; 1 TX streams
ath0: AR9285 mac 192.2 RF5133 phy 14.0
uhci0: <Intel 82801G (ICH7) USB controller USB-A> port 0x6080-0x609f irq 16 at device 29.0 on pci0
uhci0: LegSup = 0x2f00
usbus0: <Intel 82801G (ICH7) USB controller USB-A> on uhci0
uhci1: <Intel 82801G (ICH7) USB controller USB-B> port 0x6060-0x607f irq 17 at device 29.1 on pci0
uhci1: LegSup = 0x2f00
usbus1: <Intel 82801G (ICH7) USB controller USB-B> on uhci1
uhci2: <Intel 82801G (ICH7) USB controller USB-C> port 0x6040-0x605f irq 18 at device 29.2 on pci0
uhci2: LegSup = 0x2f00
usbus2: <Intel 82801G (ICH7) USB controller USB-C> on uhci2
uhci3: <Intel 82801G (ICH7) USB controller USB-D> port 0x6020-0x603f irq 19 at device 29.3 on pci0
uhci3: LegSup = 0x2f00
usbus3: <Intel 82801G (ICH7) USB controller USB-D> on uhci3
ehci0: <Intel 82801GB/R (ICH7) USB 2.0 controller> mem 0x58204400-0x582047ff irq 16 at device 29.7 on pci0
usbus4: EHCI version 1.0
usbus4: <Intel 82801GB/R (ICH7) USB 2.0 controller> on ehci0
pcib3: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci5: <ACPI PCI bus> on pcib3
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
ahci0: <Intel ICH7 AHCI SATA controller> port 0x60b8-0x60bf,0x60cc-0x60cf,0x60b0-0x60b7,0x60c8-0x60cb,0x60a0-0x60af mem 0x58204000-0x582043ff irq 17 at device 31.2 on pci0
ahci0: AHCI v1.10 with 4 3Gbps ports, Port Multiplier supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
acpi_tz0: <Thermal Zone> on acpi0
battery0: <ACPI Control Method Battery> on acpi0
acpi_acad0: <AC Adapter> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x77 on acpi0
atrtc0: Warning: Couldn't map I/O.
Event timer "RTC" frequency 32768 Hz quality 0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 0,8 on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 450
Event timer "HPET1" frequency 14318180 Hz quality 440
Event timer "HPET2" frequency 14318180 Hz quality 440
attimer0: <AT timer> port 0x40-0x43,0x50-0x53 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse, device ID 3
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
coretemp0: <CPU On-Die Thermal Sensors> on cpu0
est0: <Enhanced SpeedStep Frequency Control> on cpu0
p4tcc0: <CPU Frequency Thermal Control> on cpu0
coretemp1: <CPU On-Die Thermal Sensors> on cpu1
est1: <Enhanced SpeedStep Frequency Control> on cpu1
p4tcc1: <CPU Frequency Thermal Control> on cpu1
Timecounters tick every 1.000 msec
ipfw2 (+ipv6) initialized, divert loadable, nat loadable, rule-based forwarding enabled, default to accept, logging disabled
DUMMYNET 0 with IPv6 initialized (100409)
load_dn_sched dn_sched RR loaded
load_dn_sched dn_sched WF2Q+ loaded
load_dn_sched dn_sched FIFO loaded
load_dn_sched dn_sched PRIO loaded
load_dn_sched dn_sched QFQ loaded
hdac0: HDA Codec #0: Realtek ALC272
pcm0: <HDA Realtek ALC272 PCM #0 Analog> at cad 0 nid 1 on hdac0
pcm1: <HDA Realtek ALC272 PCM #1 Analog> at cad 0 nid 1 on hdac0
usbus0: 12Mbps Full Speed USB v1.0
usbus1: 12Mbps Full Speed USB v1.0
usbus2: 12Mbps Full Speed USB v1.0
usbus3: 12Mbps Full Speed USB v1.0
usbus4: 480Mbps High Speed USB v2.0
ugen0.1: <Intel> at usbus0
uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
ugen1.1: <Intel> at usbus1
uhub1: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
ugen2.1: <Intel> at usbus2
uhub2: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus2
ugen3.1: <Intel> at usbus3
uhub3: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus3
ugen4.1: <Intel> at usbus4
uhub4: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus4
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <WDC WD1600BEVT-22A23T0 01.01A01> ATA-8 SATA 2.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 152627MB (312581808 512 byte sectors: 16H 63S/T 16383C)
SMP: AP CPU #1 Launched!
Root mount waiting for: usbus4 usbus3 usbus2 usbus1 usbus0
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
uhub3: 2 ports with 2 removable, self powered
Root mount waiting for: usbus4
Root mount waiting for: usbus4
Root mount waiting for: usbus4
uhub4: 8 ports with 8 removable, self powered
ugen4.2: <Suyin> at usbus4
Trying to mount root from ufs:/dev/ada0p2 [rw]...


louna# cat /etc/rc.conf
hostname="louna"
sshd_enable="YES"
ntpd_enable="YES"
powerd_enable="YES"


cloned_interfaces="vlan2 vlan3 vlan4"

ifconfig_alc0="up -tso"
ifconfig_vlan3="vlan 3 vlandev alc0 DHCP"
ifconfig_vlan4="vlan 4 vlandev alc0 192.168.84.254/24"
ifconfig_vlan2="vlan 2 vlandev alc0 192.168.255.254/24"
ifconfig_lo0_alias0="inet 172.31.1.1/32"


mpd_enable="YES"
gateway_enable="YES"
devfs_set_rulesets="/usr/local/etc/unbound/dev=unbound_ruleset"
unbound_enable="YES"
dhcpd_enable="YES"
samba_enable="YES"



kernel config:
louna# cat LOUNA

cpu             HAMMER
ident           LOUNA
options         SCHED_ULE               # ULE scheduler
options         PREEMPTION              # Enable kernel thread preemption
options         INET                    # InterNETworking
options         INET6                   # IPv6 communications protocols
options         SCTP                    # Stream Control Transmission Protocol
options         FFS                     # Berkeley Fast Filesystem
options         SOFTUPDATES             # Enable FFS soft updates support
options         UFS_ACL                 # Support for access control lists
options         UFS_DIRHASH             # Improve performance on big directories
options         UFS_GJOURNAL            # Enable gjournal-based UFS journaling
options         NFSCL                   # New Network Filesystem Client
options         NFSD                    # New Network Filesystem Server
options         NFSLOCKD                # Network Lock Manager
options         NFS_ROOT                # NFS usable as /, requires NFSCL
options         MSDOSFS                 # MSDOS Filesystem
options         PROCFS                  # Process filesystem (requires PSEUDOFS)
options         PSEUDOFS                # Pseudo-filesystem framework
options         GEOM_PART_GPT           # GUID Partition Tables.
options         GEOM_LABEL              # Provides labelization
options         SCSI_DELAY=5000         # Delay (in ms) before probing SCSI
options         KTRACE                  # ktrace(1) support
options         STACK                   # stack(9) support
options         SYSVSHM                 # SYSV-style shared memory
options         SYSVMSG                 # SYSV-style message queues
options         SYSVSEM                 # SYSV-style semaphores
options         _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
options         PRINTF_BUFR_SIZE=128    # Prevent printf output being interspersed.
options         KBD_INSTALL_CDEV        # install a CDEV entry in /dev
options         HWPMC_HOOKS             # Necessary kernel hooks for hwpmc(4)
options         AUDIT                   # Security event auditing
options         MAC                     # TrustedBSD MAC Framework
options         INCLUDE_CONFIG_FILE     # Include this file in kernel
options         KDB                     # Kernel debugger related code
options         KDB_TRACE               # Print a stack trace for a panic
options         SMP                     # Symmetric MultiProcessor Kernel
device          cpufreq
device          acpi
device          pci
device          ahci            # AHCI-compatible SATA controllers
device          scbus           # SCSI bus (required for ATA/SCSI)
device          da              # Direct Access (disks)
device          atkbdc          # AT keyboard controller
device          atkbd           # AT keyboard
device          psm             # PS/2 mouse
device          kbdmux          # keyboard multiplexer
device          vga             # VGA video card driver
device          sc
options         SC_PIXEL_MODE   # add support for the raster text mode
device          agp             # support several AGP chipsets
device          uart            # Generic UART driver
device          miibus          # MII bus support
device          alc             # Atheros AR8131/AR8132 Ethernet
device          wlan            # 802.11 support
options         IEEE80211_DEBUG # enable debug msgs
options         IEEE80211_AMPDU_AGE # age frames in AMPDU reorder q's
options         IEEE80211_SUPPORT_MESH  # enable 802.11s draft support
device          wlan_wep        # 802.11 WEP support
device          wlan_ccmp       # 802.11 CCMP support
device          wlan_tkip       # 802.11 TKIP support
device          wlan_amrr       # AMRR transmit rate control algorithm
device          ath             # Atheros NIC's
device          ath_pci         # Atheros pci/cardbus glue
device          ath_hal         # pci/cardbus chip support
options         AH_SUPPORT_AR5416       # enable AR5416 tx/rx descriptors
device          ath_rate_sample # SampleRate tx rate control for ath
options ATH_ENABLE_11N
options ATH_DEBUG
options ATH_DIAGAPI
options IEEE80211_DEBUG
device          loop            # Network loopback
device          random          # Entropy device
device          ether           # Ethernet support
device          vlan            # 802.1Q VLAN support
device          tun             # Packet tunnel.
device          pty             # BSD-style compatibility pseudo ttys
device          md              # Memory "disks"
device          gif             # IPv6 and IPv4 tunneling
device          faith           # IPv6-to-IPv4 relaying (translation)
device          firmware        # firmware assist module
device          bpf             # Berkeley packet filter
options         USB_DEBUG       # enable debug msgs
device          uhci            # UHCI PCI->USB interface
device          ohci            # OHCI PCI->USB interface
device          ehci            # EHCI PCI->USB interface (USB 2.0)
device          xhci            # XHCI PCI->USB interface (USB 3.0)
device          usb             # USB Bus (required)
device          uhid            # "Human Interface Devices"
device          ukbd            # Keyboard
device          umass           # Disks/Mass storage - Requires scbus and da
device          u3g             # USB-based 3G modems (Option, Huawei, Sierra)
device          uplcom          # Prolific PL-2303 serial adapters
device          uslcom          # SI Labs CP2101/CP2102 serial adapters
device          sound           # Generic sound driver (required)
device          snd_hda         # Intel High Definition Audio
device          snd_ich         # Intel, NVidia and other ICH AC'97 Audio

options         IPFIREWALL
options         IPFIREWALL_DEFAULT_TO_ACCEPT
options         IPFIREWALL_VERBOSE
options         IPFIREWALL_FORWARD

options         DEVICE_POLLING
options         DUMMYNET
options         HZ=1000
options         LIBALIAS

options         NETGRAPH
options         NETGRAPH_IPFW
options         NETGRAPH_PPP
options         NETGRAPH_PPTPGRE
options         NETGRAPH_KSOCKET
options         NETGRAPH_IFACE
options         NETGRAPH_TCPMSS
options         NETGRAPH_CAR
options         NETGRAPH_NAT
options         NETGRAPH_SOCKET
options         NETGRAPH_TEE


device          smbios
device          coretemp
device          cpuctl
louna#


2012/10/18 <yongari@freebsd.org>

> Synopsis: [alc] alc network driver + tso + vlan does not work.
>
> State-Changed-From-To: open->feedback
> State-Changed-By: yongari
> State-Changed-When: Thu Oct 18 01:40:32 UTC 2012
> State-Changed-Why:
> I'm pretty sure TSO over VLAN worked well on my box.
> Could you share your exact network configuration and let me know
> how I can reproduce it?
>
>
> Responsible-Changed-From-To: freebsd-net->yongari
> Responsible-Changed-By: yongari
> Responsible-Changed-When: Thu Oct 18 01:40:32 UTC 2012
> Responsible-Changed-Why:
> Grab.
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=171520
>

From owner-freebsd-net@FreeBSD.ORG  Sat Oct 20 13:36:37 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 0D84D37F;
 Sat, 20 Oct 2012 13:36:37 +0000 (UTC)
 (envelope-from universite@ukr.net)
Received: from ffe10.ukr.net (ffe10.ukr.net [195.214.192.60])
 by mx1.freebsd.org (Postfix) with ESMTP id A56708FC0C;
 Sat, 20 Oct 2012 13:36:36 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=ukr.net;
 s=ffe; 
 h=Date:Message-Id:From:To:References:In-Reply-To:Subject:Cc:Content-Type:Content-Transfer-Encoding:MIME-Version;
 bh=v2rhJHGvJQav7sHTNTPV8saJ0U2EfHXkkOqvwplCpL4=; 
 b=QKn2mPFuC+oMSKKDDEfqcadHNTFwhzEEAH6YagqcbHz6UMByjuhrMGQo+zBKN/VuH/0K87aaEoaYnDe5xZqsWZoHCPOFeiqBJI9lU1eCL/0zTHeLdxccNk11+xytl+vtXuiYO6sEpvlReyJvry+qdbBnu3QtvdT8UESRPxbBTUA=;
Received: from mail by ffe10.ukr.net with local ID 1TPYvN-000KJX-8A
 ; Sat, 20 Oct 2012 16:17:09 +0300
MIME-Version: 1.0
Content-Disposition: inline
Content-Transfer-Encoding: binary
Content-Type: text/plain; charset="windows-1251"
Subject: Re[2]: kern/171520: [alc] alc network driver + tso + vlan does not
 work.
In-Reply-To: <CAHtHi9kphRZnX53mKeWzG3HcJoq8-E1Cb1JJTQsrybqLwPGk1g@mail.gmail.com>
References: <201210180141.q9I1f53s052539@freefall.freebsd.org>
 <CAHtHi9kphRZnX53mKeWzG3HcJoq8-E1Cb1JJTQsrybqLwPGk1g@mail.gmail.com>
To: "Nikolay Nevzorov" <nevzorovn@gmail.com>
From: "Vladislav Prodan" <universite@ukr.net>
X-Mailer: freemail.ukr.net 4.0
Message-Id: <76980.1350739029.4178195639830380544@ffe10.ukr.net>
X-Browser: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:15.0) Gecko/20100101 Firefox/15.0.1
Date: Sat, 20 Oct 2012 16:17:09 +0300
Cc: freebsd-net@freebsd.org, yongari@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 20 Oct 2012 13:36:37 -0000


The alc driver works for me with these options:

ifconfig_alc0="inet 10.1.0.18/24 -tso media 100baseTX mediaopt full-duplex up"
vlans_alc0="255"
ifconfig_alc0_255="inet ZZZ.YYY.XXX.247/24"
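
For comparison, a sketch of how the reporter's three VLANs might look in
the same vlans_<interface> style.  This is untested here; the addresses
are simply copied from the rc.conf in the report, and whether it changes
the TSO behaviour is not known.

```shell
# /etc/rc.conf fragment (sketch): the report's VLANs in vlans_<interface> form.
# With this form the child interfaces are named alc0.2, alc0.3 and alc0.4.
ifconfig_alc0="up -tso"
vlans_alc0="2 3 4"
ifconfig_alc0_2="inet 192.168.255.254/24"
ifconfig_alc0_3="DHCP"
ifconfig_alc0_4="inet 192.168.84.254/24"
```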



  --- Original message ---
 From: "Nikolay Nevzorov" <nevzorovn@gmail.com>
 To: yongari@freebsd.org
 Date: 20 October 2012, 16:07:42
 Subject: Re: kern/171520: [alc] alc network driver + tso + vlan does not work.
 
 


> On my netbook TSO over VLAN doesn't on generic and my kernel in any network
> config.
> 
> #ifconfig
> alc0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> 

-- 
Vladislav V. Prodan            
System & Network Administrator 
http://support.od.ua           
+380 67 4584408, +380 99 4060508
VVP88-RIPE


From owner-freebsd-net@FreeBSD.ORG  Sat Oct 20 20:54:45 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 2235BCD8;
 Sat, 20 Oct 2012 20:54:45 +0000 (UTC)
 (envelope-from egrosbein@rdtc.ru)
Received: from eg.sd.rdtc.ru (eg.sd.rdtc.ru [IPv6:2a03:3100:c:13::5])
 by mx1.freebsd.org (Postfix) with ESMTP id 6D4BC8FC08;
 Sat, 20 Oct 2012 20:54:43 +0000 (UTC)
Received: from eg.sd.rdtc.ru (localhost [127.0.0.1])
 by eg.sd.rdtc.ru (8.14.5/8.14.5) with ESMTP id q9KKseNZ028325;
 Sun, 21 Oct 2012 03:54:40 +0700 (NOVT)
 (envelope-from egrosbein@rdtc.ru)
Message-ID: <50830F8B.4030204@rdtc.ru>
Date: Sun, 21 Oct 2012 03:54:35 +0700
From: Eugene Grosbein <egrosbein@rdtc.ru>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; ru-RU;
 rv:1.9.2.13) Gecko/20110112 Thunderbird/3.1.7
MIME-Version: 1.0
To: Nikolay Nevzorov <nevzorovn@gmail.com>
Subject: Re: kern/171520: [alc] alc network driver + tso + vlan does not work.
References: <201210180141.q9I1f53s052539@freefall.freebsd.org>
 <CAHtHi9kphRZnX53mKeWzG3HcJoq8-E1Cb1JJTQsrybqLwPGk1g@mail.gmail.com>
In-Reply-To: <CAHtHi9kphRZnX53mKeWzG3HcJoq8-E1Cb1JJTQsrybqLwPGk1g@mail.gmail.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, yongari@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 20 Oct 2012 20:54:45 -0000

20.10.2012 20:06, Nikolay Nevzorov wrote:

>> Synopsis: [alc] alc network driver + tso + vlan does not work.

It seems you use libalias-based NAT, don't you?

man ipfw says in the BUGS section:

     Due to the architecture of libalias(3), ipfw nat is not compatible with
     the TCP segmentation offloading (TSO).  Thus, to reliably nat your net-
     work traffic, please disable TSO on your NICs using ifconfig(8).
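
A quick way to see which TSO-related capabilities are still enabled is
to look at the options= line that ifconfig prints.  The helper below is
only a sketch; the name `tso_caps` is made up for this example, while
TSO4, TSO6 and VLAN_HWTSO are the capability names ifconfig uses.

```shell
#!/bin/sh
# tso_caps: hypothetical helper.  Takes one "options=...<A,B,C>" line as
# $1 and prints the TSO-related capability names found in it, one per
# line.  Steps: keep the text between < and >, split on commas, then
# keep only the TSO capabilities.
tso_caps() {
    printf '%s\n' "$1" |
        sed -e 's/^[^<]*<//' -e 's/>.*$//' |
        tr ',' '\n' |
        grep -E '^(TSO4|TSO6|VLAN_HWTSO)$'
}

# Example using the alc0 options line from the report:
tso_caps 'options=c3098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MCAST,WOL_MAGIC,VLAN_HWTSO,LINKSTATE>'
```

For that line it reports VLAN_HWTSO, so TSO on VLANs is still enabled on
alc0 even though plain TSO is off.  ifconfig(8) has a separate
-vlanhwtso flag for this (e.g. "ifconfig alc0 -tso -vlanhwtso");
whether that matters for this PR is untested.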

Eugene Grosbein