From owner-freebsd-net@FreeBSD.ORG Sun Oct 14 13:49:49 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7B2F53A8; Sun, 14 Oct 2012 13:49:49 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-pa0-f54.google.com (mail-pa0-f54.google.com [209.85.220.54]) by mx1.freebsd.org (Postfix) with ESMTP id 417FC8FC14; Sun, 14 Oct 2012 13:49:49 +0000 (UTC) Received: by mail-pa0-f54.google.com with SMTP id bi1so4351322pad.13 for ; Sun, 14 Oct 2012 06:49:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=Zq3rSCSY4EAKrnS4QSnj1pJyFlKnPSKdAJfdwX+pjis=; b=FuLz4ioUpk3zSV5mE3/DsZDRnZLFuc+JnoJbDRvG4z87EJqIYLZGz2FvOJohRG5DAC tCA01MNsEPf9dulbvZZnGLI3dum1MLEdoYKreFs6iXL3ZOUgNWrUAlXmVWUCL9ist+Cn W5Izl5xusH7VELw+H4cFoe+rYP4vq1DNNtq8konEtJokwI66xZZO5cn31MKRdzNPBRb1 IhGwveXjqHxdL9prZxM5IgzEXFjF/EndW4PutqI9zX5yRhNvCan9SiShRjuJd/wdurgS pFhztU4uolfH/vq+2M4DD/Bcq/k2Ovch0DAX/RlmcEZpig4CnCuT0smwuB70AyiNiCLC LR2w== MIME-Version: 1.0 Received: by 10.68.218.226 with SMTP id pj2mr29566933pbc.33.1350222589004; Sun, 14 Oct 2012 06:49:49 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.68.146.233 with HTTP; Sun, 14 Oct 2012 06:49:48 -0700 (PDT) In-Reply-To: <2b582820-0095-4dbe-b929-ba5eb9d4e0ee@email.android.com> References: <20121009154128.GU34622@FreeBSD.org> <20121012124640.GW89655@FreeBSD.org> <20121012124709.GX89655@FreeBSD.org> <20121012212151.GB89655@glebius.int.ru> <2b582820-0095-4dbe-b929-ba5eb9d4e0ee@email.android.com> Date: Sun, 14 Oct 2012 06:49:48 -0700 X-Google-Sender-Auth: GZ6TXz2zLdDLiraPr0soyP0cCW8 Message-ID: Subject: Re: [CFT/Review] net byte order for AF_INET From: Adrian Chadd To: Aleksandr Rybalko Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 
quoted-printable Cc: net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2012 13:49:49 -0000 .. sounds like the beginning of a wiki page to me, describing the mini project, the latest status and the latest patch. :) Adrian On 13 October 2012 11:32, Aleksandr Rybalko wrote: > Gleb Smirnoff wrote: > >>On Fri, Oct 12, 2012 at 05:06:29PM -0400, Adrian Chadd wrote: >>A> On 12 October 2012 08:47, Gleb Smirnoff wrote: >>A> > On Fri, Oct 12, 2012 at 04:46:40PM +0400, Gleb Smirnoff wrote: >>A> > T> Latest version of patch for further review and testing >>A> > T> Changelog: >>A> > T> - Fixed TCP checksums >>A> > T> - Added comment about raw sockets byte ordering. >>A> > T> - More explicit htons(0), when assigning ip_off field. >>A> >>A> I've just eyeballed the patch again: >>A> >>A> * You've patched SCTP and IGMP - have you done any SCTP and IGMP >>testing at all? >>A> * This kind of stuff almost begs for some kind of automated test >>suite >>A> for testing IPv4, IPv6, TCP/UDP/ICMP, IGMP, SCTP, all the tunneling >>A> stuff - is there anything out there like this? I know of the IPv6 >>test >>A> suites that exist; what about being able to regression test the >>other >>A> stuff? >> >>Not tested yet: >> >>SCTP >>IGMP >>IPSEC >>siftr(4) >>mrouting >>pfsync, pf_route() >>stf(4) >>ng_ipfw(4) > > No, ng_ipfw tested :-) > >> >>Tested: >> >>TCP/UDP/ICMP >>ip_fragment/ip_reass >>raw socket >>gre(4) as if_gre and as ng_pptpgre >>gif(4) >>pf(4) >>ipfw(4) >>divert(4) >> >>A> Also whilst I'm nitpicking - do you think there's any performance >>A> issues that may creep up?
Remember that "performance issues" to me >>A> don't necessarily mean "on a current generation intel", but mean >>"all >>A> those cache starved ARM/MIPS/PPC/Atom boards out there that aren't >>A> natively in network byte order." Making everything use network byte >>A> order throughout the stack is nice for read-only packet work and >>nice >>A> for cache-happy i386s, but what about the rest of the world? >> >>Well, there may be unmeasurable impact. Just a few instructions per >>packet. Some functions may be optimized to store converted length in >>local variable and perform one or two ntohs() operations less. But >>better as a separate change. We've got much more fat optimization >>targets in stack than this. >> >>A> (Don't get me wrong, I think this tidy-up is very nice and maybe >>quite >>A> needed, I just wonder what other unknown magic is hiding behind the >>A> existing code..) >> >>There is so much magic here, and I want to just wipe it away instead >>of learning it to depths. The motivation to finally start this work and >>get it done is several panics due to packet in wrong byte order, which >>I >>am failing to parse and model out which codepath could lead to them. >>Thus >>I decided to fix that in principle. 
> > > WBW > ------ > Aleksandr Rybalko > > From owner-freebsd-net@FreeBSD.ORG Sun Oct 14 13:55:47 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 18CC75CF; Sun, 14 Oct 2012 13:55:47 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-pa0-f54.google.com (mail-pa0-f54.google.com [209.85.220.54]) by mx1.freebsd.org (Postfix) with ESMTP id D5FE38FC08; Sun, 14 Oct 2012 13:55:46 +0000 (UTC) Received: by mail-pa0-f54.google.com with SMTP id bi1so4353414pad.13 for ; Sun, 14 Oct 2012 06:55:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=42R/G8v5yC7536why14Pqza21s9/btAOdnVaH+UYAjo=; b=Jo+OggeNwMVvHcRTzME03lg9wceGYfgrhxQdglQ9y2aOEvVukLYpJHcEbegh/lKEex lm3hlFiua2RyKkPrgAtQsh6i6I39Q+sRCwPqSk9ff6MDgYlHJ55CQGV0a34R+FPhgGOg JiEWLQ3j6OohxS7jHdMXwowIBbFSjDZqVWTaA/WFqCpz8XmxMgXrretZ8UH5ByBpy322 d3MlraZPlRQz7mPyRJlhmFDKCabJRHbNhq0Y2yRb1LvhH69k9725wSNszhmwt++8kTYG omTmIr9O6ppd/rL3DkFH2IYWbTKxQoHb+LLBX4WW6ZR1VaVq6FMoPfgDKfNM5qIj++pi P1UA== MIME-Version: 1.0 Received: by 10.68.218.226 with SMTP id pj2mr29600629pbc.33.1350222946683; Sun, 14 Oct 2012 06:55:46 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.68.146.233 with HTTP; Sun, 14 Oct 2012 06:55:46 -0700 (PDT) In-Reply-To: References: <5079A9A1.4070403@FreeBSD.org> <20121013182223.GA73341@onelab2.iet.unipi.it> Date: Sun, 14 Oct 2012 06:55:46 -0700 X-Google-Sender-Auth: g4l8vmBdocUwV5o4qnVjz_8hsaE Message-ID: Subject: Re: ixgbe & if_igb RX ring locking From: Adrian Chadd To: Jack Vogel Content-Type: text/plain; charset=ISO-8859-1 Cc: "Alexander V. 
Chernikov" , Luigi Rizzo , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2012 13:55:47 -0000 God, yes please. Please please please please. Adrian From owner-freebsd-net@FreeBSD.ORG Sun Oct 14 16:44:38 2012 Return-Path: Delivered-To: freebsd-net@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 52D89D8B; Sun, 14 Oct 2012 16:44:38 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135]) by mx1.freebsd.org (Postfix) with ESMTP id 234B78FC0A; Sun, 14 Oct 2012 16:44:38 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9EGicMe002255; Sun, 14 Oct 2012 16:44:38 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9EGic0D002251; Sun, 14 Oct 2012 16:44:38 GMT (envelope-from linimon) Date: Sun, 14 Oct 2012 16:44:38 GMT Message-Id: <201210141644.q9EGic0D002251@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-net@FreeBSD.org From: linimon@FreeBSD.org Subject: Re: kern/172683: [ip6] Duplicate IPv6 Link Local Addresses X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2012 16:44:38 -0000 Old Synopsis: Duplicate IPv6 Link Local Addresses New Synopsis: [ip6] Duplicate IPv6 Link Local Addresses Responsible-Changed-From-To: freebsd-bugs->freebsd-net Responsible-Changed-By: linimon Responsible-Changed-When: Sun Oct 14 16:44:21 UTC 2012 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=172683 From owner-freebsd-net@FreeBSD.ORG Sun Oct 14 23:30:51 2012 Return-Path: Delivered-To: freebsd-net@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 4360EEA6; Sun, 14 Oct 2012 23:30:51 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135]) by mx1.freebsd.org (Postfix) with ESMTP id 1395D8FC08; Sun, 14 Oct 2012 23:30:51 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9ENUodX037025; Sun, 14 Oct 2012 23:30:50 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9ENUobR037021; Sun, 14 Oct 2012 23:30:50 GMT (envelope-from linimon) Date: Sun, 14 Oct 2012 23:30:50 GMT Message-Id: <201210142330.q9ENUobR037021@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-net@FreeBSD.org From: linimon@FreeBSD.org Subject: Re: kern/171838: [oce] [patch] Possible lock reversal and duplicate locks as reported by Witness X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2012 23:30:51 -0000 Old Synopsis: Possible lock reversal and duplicate locks as reported by Witness New Synopsis: [oce] [patch] Possible lock reversal and duplicate locks as reported by Witness Responsible-Changed-From-To: freebsd-bugs->freebsd-net Responsible-Changed-By: linimon Responsible-Changed-When: Sun Oct 14 23:30:23 UTC 2012 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=171838 From owner-freebsd-net@FreeBSD.ORG Sun Oct 14 23:50:01 2012 Return-Path: Delivered-To: freebsd-net@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 146C646D for ; Sun, 14 Oct 2012 23:50:01 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135]) by mx1.freebsd.org (Postfix) with ESMTP id F110E8FC08 for ; Sun, 14 Oct 2012 23:50:00 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9ENo0sg037688 for ; Sun, 14 Oct 2012 23:50:00 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9ENo0ZT037687; Sun, 14 Oct 2012 23:50:00 GMT (envelope-from gnats) Date: Sun, 14 Oct 2012 23:50:00 GMT Message-Id: <201210142350.q9ENo0ZT037687@freefall.freebsd.org> To: freebsd-net@FreeBSD.org Cc: From: Doug Hardie Subject: Re: kern/172683: [ip6] Duplicate IPv6 Link Local Addresses X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Doug Hardie List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Oct 2012 23:50:01 -0000 The following reply was made to PR kern/172683; it has been noted by GNATS. From: Doug Hardie To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/172683: [ip6] Duplicate IPv6 Link Local Addresses Date: Sun, 14 Oct 2012 16:42:50 -0700 Here is some more interesting information on the issue. RFC 4862 states that if the link-local address is MAC based then it should bring down the link and log the duplicate address error. Kurt Jaeger pointed out in a private email that there is a sysctl that is supposed to control this behavior: net.inet6.ip6.dad_count.
The value 1 is to permit the interface to continue to operate and the value 2 is to stop operation. Sure enough net.inet6.ip6.dad_count = 1 as the default value. I changed it to 2 and ran the tests again. There was no change in the performance. The interface remained in use and nothing was logged in /var/log/messages. Unfortunately I no longer have a Vista machine to test with since it generates non-MAC related link-local addresses. XP and Win 7 both use MAC based addresses. Using Vista talking to FreeBSD 7.2, the Neighbor Advertisement was returned by FreeBSD and Vista chose another link-local address. From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 01:00:19 2012 Return-Path: Delivered-To: freebsd-net@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id EF714813; Mon, 15 Oct 2012 01:00:19 +0000 (UTC) (envelope-from emaste@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135]) by mx1.freebsd.org (Postfix) with ESMTP id C04228FC0A; Mon, 15 Oct 2012 01:00:19 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9F10JMd048813; Mon, 15 Oct 2012 01:00:19 GMT (envelope-from emaste@freefall.freebsd.org) Received: (from emaste@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9F10J0U048808; Mon, 15 Oct 2012 01:00:19 GMT (envelope-from emaste) Date: Mon, 15 Oct 2012 01:00:19 GMT Message-Id: <201210150100.q9F10J0U048808@freefall.freebsd.org> To: gigabyte.tmn@gmail.com, emaste@FreeBSD.org, freebsd-net@FreeBSD.org From: emaste@FreeBSD.org Subject: Re: kern/140634: [vlan] destroying if_lagg interface with if_vlan members causing 100% usage by ifconfig X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15
Oct 2012 01:00:20 -0000 Synopsis: [vlan] destroying if_lagg interface with if_vlan members causing 100% usage by ifconfig State-Changed-From-To: open->feedback State-Changed-By: emaste State-Changed-When: Mon Oct 15 00:59:14 UTC 2012 State-Changed-Why: A quick browse of the source suggests this should be fixed as of 8.0, and I can confirm that it doesn't happen on 10-CURRENT. If you're able to test on a more recent version please confirm. http://www.freebsd.org/cgi/query-pr.cgi?pr=140634 From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 02:45:21 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id CBCABF55 for ; Mon, 15 Oct 2012 02:45:21 +0000 (UTC) (envelope-from lstewart@freebsd.org) Received: from lauren.room52.net (lauren.room52.net [210.50.193.198]) by mx1.freebsd.org (Postfix) with ESMTP id 8901E8FC16 for ; Mon, 15 Oct 2012 02:45:21 +0000 (UTC) Received: from lstewart.caia.swin.edu.au (lstewart.caia.swin.edu.au [136.186.229.95]) by lauren.room52.net (Postfix) with ESMTPSA id CE73D7E820; Mon, 15 Oct 2012 13:45:12 +1100 (EST) Message-ID: <507B78B8.2000707@freebsd.org> Date: Mon, 15 Oct 2012 13:45:12 +1100 From: Lawrence Stewart User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:14.0) Gecko/20120814 Thunderbird/14.0 MIME-Version: 1.0 To: "Eggert, Lars" Subject: Re: FreeBSD & bufferbloat? 
References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=0.0 required=5.0 tests=UNPARSEABLE_RELAY autolearn=unavailable version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on lauren.room52.net Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2012 02:45:21 -0000 On 10/12/12 03:25, Eggert, Lars wrote: > Hi, > > is anyone in BSD-land working on de-bufferbloating the kernel, similar to what the Linux folks are currently doing? I'll be committing the CAIA Delay-Gradient (CDG) TCP congestion control algorithm shortly. It's still experimental, but it has some useful characteristics in terms of keeping buffer utilisation minimal whilst achieving acceptable goodput even in the face of competition from loss-based algorithms like NewReno. I've included a few relevant links at the end for anyone who wants to know more. On the larger topic of de-bufferbloating the kernel, I'm not aware of anyone who is systematically identifying buffer points and "fixing" them if they are found to suffer from bloat problems. 
Cheers, Lawrence http://caia.swin.edu.au/cv/dahayes/content/networking2011-cdg-preprint.pdf www.ietf.org/proceedings/84/slides/slides-84-iccrg-2 http://caia.swin.edu.au/urp/newtcp/tools.html From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 06:52:56 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C25D64AE; Mon, 15 Oct 2012 06:52:56 +0000 (UTC) (envelope-from christian@errxtx.net) Received: from stakka.errxtx.net (stakka.errxtx.net [94.23.249.66]) by mx1.freebsd.org (Postfix) with ESMTP id 83AEF8FC08; Mon, 15 Oct 2012 06:52:56 +0000 (UTC) Received: from ip-109-84-0-66.web.vodafone.de ([109.84.0.66] helo=[10.70.99.66]) by stakka.errxtx.net with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.72) (envelope-from ) id 1TNe8O-0000sg-SN; Mon, 15 Oct 2012 08:26:46 +0200 References: <201210121213.11152.jhb@freebsd.org> Mime-Version: 1.0 (1.0) In-Reply-To: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Message-Id: X-Mailer: iPhone Mail (10A403) From: Christian Meutes Subject: Re: Dropping TCP options from retransmitted SYNs considered harmful Date: Mon, 15 Oct 2012 08:26:36 +0200 To: Jason Wolfe Cc: John Baldwin , "net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2012 06:52:56 -0000 I find the "hack" more than just strange. Because of other OSes' bugs FreeBSD breaks its own stack. Don't want to know how many connections suffered from this.
(Sorry for top-posting) -- Christian On 14.10.2012, at 00:19, Jason Wolfe wrote: > On Fri, Oct 12, 2012 at 9:13 AM, John Baldwin wrote: >> Back in 2001 FreeBSD added a hack to strip TCP options from retransmitted SYNs >> starting with the 3rd SYN in this block in tcp_timer.c: >> >> /* >> * Disable rfc1323 if we haven't got any response to >> * our third SYN to work-around some broken terminal servers >> * (most of which have hopefully been retired) that have bad VJ >> * header compression code which trashes TCP segments containing >> * unknown-to-them TCP options. >> */ >> if ((tp->t_state == TCPS_SYN_SENT) && (tp->t_rxtshift == 3)) >> tp->t_flags &= ~(TF_REQ_SCALE|TF_REQ_TSTMP); >> >> There is even a PR for the original bug report: kern/1689 >> >> [..snip..] >> >> The original motivation of this change is to work around broken terminal >> servers that were old when this change was added in 2001. Over 10 years later >> I think we should at least have an option to turn this work-around off, and >> possibly disable it by default. >> >> Thoughts? >> >> -- >> John Baldwin > > Not that it alone merits keeping the code in, but there are some cases > where this comes in handy. I ran into an issue with heavily > trafficked Linux <-> FBSD boxes here - > http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031881.html. > > Linux would deny the connection because in FBSD the in and outbound > timestamp randomization isn't sync'd to the same base, so when FBSD > would hit a 2MSL connection Linux would simply ignore the SYN. After > the 3rd SYN FBSD would drop support, and Linux would finally honor the > request. I doubt this is too widespread, but it would probably break > things for a few folks.
> > Jason > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 07:51:56 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B941BD6D for ; Mon, 15 Oct 2012 07:51:56 +0000 (UTC) (envelope-from eugene@imedia.ru) Received: from mx2.imedia.ru (mx2.imedia.ru [91.230.26.134]) by mx1.freebsd.org (Postfix) with ESMTP id 06E9A8FC16 for ; Mon, 15 Oct 2012 07:51:55 +0000 (UTC) X-All-Recipients: X-DKIM: OpenDKIM Filter v2.5.0 mx2.imedia.ru q9F7ploJ043338 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=imedia.ru; s=common; t=1350287507; bh=WjlcbDHoOCY7wEXrQ8EVumx2PW809d7e+LddX5z1fek=; h=From:Reply-To:To:Subject:Date; b=vP9BwBS3sMq6EF2Js8AksTpLb62mnwjJO30eqm6e6iaimpnGibsHeQB2Pm7cptCY0 nNDMuc6HkVLAshKsqIuFR0rGQeQNUnqK7wWegcHtXM5lS9aMeLbniPoUSg4yl2fWka XDntNCP14IGhBtldLhH1PwFr8pvnLykWbZ2bsBb8= Received: from badger.imedia.ru (root@badger.imedia.ru [10.167.1.243]) by mx2.imedia.ru (8.14.3/8.14.3/TWINS7_LDAP) with ESMTP id q9F7ploJ043338 for ; Mon, 15 Oct 2012 11:51:47 +0400 (MSK) (envelope-from eugene@imedia.ru) Received: from badger.imedia.ru (eugene@localhost [127.0.0.1]) by badger.imedia.ru (8.14.5/8.14.4) with ESMTP id q9F7plln012346 for ; Mon, 15 Oct 2012 11:51:47 +0400 (MSK) (envelope-from eugene@imedia.ru) Received: from localhost (localhost [[UNIX: localhost]]) by badger.imedia.ru (8.14.5/8.14.4/Submit) id q9F7plLh012345 for net@freebsd.org; Mon, 15 Oct 2012 11:51:47 +0400 (MSK) (envelope-from eugene@imedia.ru) X-Authentication-Warning: badger.imedia.ru: eugene set sender to eugene@imedia.ru using -f From: Eugene Mitrofanov Organization: Sanoma Independent Media To: net@freebsd.org Subject: dev.bce.3.mbuf_alloc_failed_count increases permanently Date: Mon,
15 Oct 2012 11:51:47 +0400 User-Agent: KMail/1.9.10 X-Origin: badger.imedia.ru MIME-Version: 1.0 Content-Disposition: inline Message-Id: <201210151151.47161.eugene@imedia.ru> X-Length: 2179 X-UID: 5642 Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.0.1 (mx2.imedia.ru [10.167.0.252]); Mon, 15 Oct 2012 11:51:47 +0400 (MSK) X-Virus-Scanned: clamav-milter 0.97.4-exp at lynx.imedia.ru X-Virus-Status: Clean X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Eugene Mitrofanov List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2012 07:51:56 -0000

Hello list!

I have FreeBSD 8.2-p3 and observe strange behaviour:

sysctl -a | g bce.3|g -vE '(%|stat)'; echo; sleep 10; sysctl -a | g bce.3| g -vE '(%|stat)'; echo; netstat -m

dev.bce.3.l2fhdr_error_count: 0
dev.bce.3.mbuf_alloc_failed_count: 2098854
dev.bce.3.mbuf_frag_count: 2655285
dev.bce.3.dma_map_addr_rx_failed_count: 0
dev.bce.3.dma_map_addr_tx_failed_count: 57
dev.bce.3.unexpected_attention_count: 0
dev.bce.3.com_no_buffers: 0

dev.bce.3.l2fhdr_error_count: 0
dev.bce.3.mbuf_alloc_failed_count: 2098856
dev.bce.3.mbuf_frag_count: 2655288
dev.bce.3.dma_map_addr_rx_failed_count: 0
dev.bce.3.dma_map_addr_tx_failed_count: 57
dev.bce.3.unexpected_attention_count: 0
dev.bce.3.com_no_buffers: 0

3022/18143/21165 mbufs in use (current/cache/total)
2039/9179/11218/65536 mbuf clusters in use (current/cache/total/max)
1678/3731 mbuf+clusters out of packet secondary zone in use (current/cache)
0/1672/1672/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/1763/1763/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
4833K/45448K/50282K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
59058137 requests for I/O initiated by sendfile
0 calls to protocol drain routines

Any suggestions? Could you advise me what the reason for this is?

--
EVM7-RIPE

From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 10:11:55 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5BF89E34 for ; Mon, 15 Oct 2012 10:11:55 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2]) by mx1.freebsd.org (Postfix) with ESMTP id CD6028FC0C for ; Mon, 15 Oct 2012 10:11:54 +0000 (UTC) Received: from v6.mpls.in ([2a02:978:2::5] helo=ws.su29.net) by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.76 (FreeBSD)) (envelope-from ) id 1TNhhe-000Pex-Dl; Mon, 15 Oct 2012 14:15:18 +0400 Message-ID: <507C1960.6050500@FreeBSD.org> Date: Mon, 15 Oct 2012 18:10:40 +0400 From: "Alexander V.
Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20120121 Thunderbird/9.0 MIME-Version: 1.0 To: Jack Vogel Subject: Re: ixgbe & if_igb RX ring locking References: <5079A9A1.4070403@FreeBSD.org> <20121013182223.GA73341@onelab2.iet.unipi.it> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Luigi Rizzo , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2012 10:11:55 -0000 On 13.10.2012 23:24, Jack Vogel wrote: > On Sat, Oct 13, 2012 at 11:22 AM, Luigi Rizzo wrote: >> >> one option could be (same as it is done in the timer >> routine in dummynet) to build a list of all the packets >> that need to be sent to if_input(), and then call >> if_input with the entire list outside the lock. >> >> It would be even easier if we modify the various *_input() >> routines to handle a list of mbufs instead of just one. Bulk processing is generally a good idea we probably should implement. Probably starting from driver queue ending with marked mbufs (OURS/forward/legacy processing (appletalk and similar))? This can minimize an impact for all locks on RX side: L2 * rx PFIL hook L3 (both IPv4 and IPv6) * global IF_ADDR_RLOCK (currently commented out) * Per-interface ADDR_RLOCK * PFIL hook From the first glance, there can be problems with: * Increased latency (we should have some kind of rx_process_limit), but still * reader locks being acquired for much longer amount of time >> >> cheers >> luigi >> >> Very interesting idea Luigi, will have to get that some thought. 
> > Jack Returning to original post topic: Given 1) we are currently binding ixgbe ithreads to CPU cores 2) RX queue lock is used by (indirectly) in only 2 places: a) ISR routine (msix or legacy irq) b) taskqueue routine which is scheduled if some packets remains in RX queue and rx_process_limit ended OR we need something to TX 3) in practice taskqueue routine is a nightmare for many people since there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after some traffic burst happens: once it is called it starts to schedule itself more and more replacing original ISR routine. Additionally, increasing rx_process_limit does not help since taskqueue is called with the same limit. Finally, currently netisr taskq threads are not bound to any CPU which makes the process even more uncontrollable. Maybe we can rethink taskqueue usage for RX processing? I mean, taskq is called if host fails to process packets in ring fast enough, which can happen when: * traffic burst happens on some (or all) queue * traffic ratio is too high. In former case we have ring buffer size which can be tuned by administrator to fairly big value. For latter case: If all system CPUs are used for RX processing moving some uncontrolled percent of load to random CPU definitely does no good (especially given that ixgbe has AIM and RX indirection table for that purposes which can give much more predictable results) It does even more evil in case of special setups like rx_queues=CPU_COUNT-1 and the last CPU is used by all other processes including control plane one (routing software, various keepalives). If system has more CPUs (24 vs 16 queues, for example) there is standard way for distributing load: netisr and deferred processing. Netisr threads are already CPU-bound, and, more important, splitting packets to different threads can be done by performing some (say, L3+L4) hash computation which will not lead to out-of-order packet processing. 
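To illustrate the hash-based splitting described above, here is a minimal userland sketch. It assumes a hypothetical flow_key structure and uses FNV-1a purely as a stand-in hash; none of this is netisr code, and the names flow_key, flow_hash() and flow_to_thread() are invented for the example. The point is only that every packet of one flow maps to the same worker, so packets within a flow cannot be reordered relative to each other.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key: the L3+L4 fields that identify one connection. */
struct flow_key {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
};

/* Mix one 32-bit word into an FNV-1a hash, one byte at a time. */
static uint32_t
fnv1a_u32(uint32_t hash, uint32_t word)
{
	for (int i = 0; i < 4; i++) {
		hash ^= (word >> (8 * i)) & 0xff;
		hash *= 16777619u;		/* FNV-1a 32-bit prime */
	}
	return (hash);
}

/* Hash over addresses and ports; the same inputs always give the same value. */
static uint32_t
flow_hash(const struct flow_key *k)
{
	uint32_t h = 2166136261u;		/* FNV-1a 32-bit offset basis */

	h = fnv1a_u32(h, k->src_ip);
	h = fnv1a_u32(h, k->dst_ip);
	h = fnv1a_u32(h, ((uint32_t)k->src_port << 16) | k->dst_port);
	return (h);
}

/* Pick the worker thread for a flow; nthreads must be non-zero. */
static unsigned
flow_to_thread(const struct flow_key *k, unsigned nthreads)
{
	assert(nthreads > 0);
	return (flow_hash(k) % nthreads);
}
```

Any deterministic hash over the same fields gives the ordering property; the modulo is just one simple way to choose a worker and says nothing about how evenly real traffic would spread.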
> >> So my questions are: >>> >>> Can any real LORs happen in some complex setup? (I can't imagine any). >>> If so: maybe we can somehow avoid/workaround such cases? (and consider >>> removing those locks). >>> >>> >>> >>> -- >>> WBR, Alexander >>> >>> _______________________________________________ >>> freebsd-net@freebsd.org mailing list >>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 11:06:13 2012 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 67EE7511 for ; Mon, 15 Oct 2012 11:06:13 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135]) by mx1.freebsd.org (Postfix) with ESMTP id 35DED8FC2E for ; Mon, 15 Oct 2012 11:06:13 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9FB6D5f011550 for ; Mon, 15 Oct 2012 11:06:13 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9FB6DS9011549 for freebsd-net@FreeBSD.org; Mon, 15 Oct 2012 11:06:13 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 15 Oct 2012 11:06:13 GMT Message-Id: <201210151106.q9FB6DS9011549@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-net@FreeBSD.org Subject: Current problem reports assigned to freebsd-net@FreeBSD.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list 
List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2012 11:06:13 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 13:04:30 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id F2C2717E; Mon, 15 Oct 2012 13:04:29 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id B24B08FC17; Mon, 15 Oct 2012 13:04:29 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id F0DF7B984; Mon, 15 Oct 2012 09:04:28 -0400 (EDT) From: John Baldwin To: freebsd-net@freebsd.org Subject: Re: ixgbe & if_igb RX ring locking Date: Mon, 15 Oct 2012 09:04:27 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p20; KDE/4.5.5; amd64; ; ) References: <5079A9A1.4070403@FreeBSD.org> <507C1960.6050500@FreeBSD.org> In-Reply-To: <507C1960.6050500@FreeBSD.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201210150904.27567.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 15 Oct 2012 09:04:29 -0400 (EDT) Cc: Luigi Rizzo , "Alexander V. Chernikov" , Jack Vogel , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2012 13:04:30 -0000 On Monday, October 15, 2012 10:10:40 am Alexander V. 
Chernikov wrote: > On 13.10.2012 23:24, Jack Vogel wrote: > > On Sat, Oct 13, 2012 at 11:22 AM, Luigi Rizzo wrote: > > >> > >> one option could be (same as it is done in the timer > >> routine in dummynet) to build a list of all the packets > >> that need to be sent to if_input(), and then call > >> if_input with the entire list outside the lock. > >> > >> It would be even easier if we modify the various *_input() > >> routines to handle a list of mbufs instead of just one. > > Bulk processing is generally a good idea we probably should implement. > Probably starting from driver queue ending with marked mbufs > (OURS/forward/legacy processing (appletalk and similar))? > > This can minimize an impact for all > locks on RX side: > L2 > * rx PFIL hook > L3 (both IPv4 and IPv6) > * global IF_ADDR_RLOCK (currently commented out) > * Per-interface ADDR_RLOCK > * PFIL hook > > From the first glance, there can be problems with: > * Increased latency (we should have some kind of rx_process_limit), but > still > * reader locks being acquired for much longer amount of time > > >> > >> cheers > >> luigi > >> > >> Very interesting idea Luigi, will have to get that some thought. > > > > Jack > > Returning to original post topic: > > Given > 1) we are currently binding ixgbe ithreads to CPU cores > 2) RX queue lock is used by (indirectly) in only 2 places: > a) ISR routine (msix or legacy irq) > b) taskqueue routine which is scheduled if some packets remains in RX > queue and rx_process_limit ended OR we need something to TX > > 3) in practice taskqueue routine is a nightmare for many people since > there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after > some traffic burst happens: once it is called it starts to schedule > itself more and more replacing original ISR routine. Additionally, > increasing rx_process_limit does not help since taskqueue is called with > the same limit. 
Finally, currently netisr taskq threads are not bound to > any CPU which makes the process even more uncontrollable. I think part of the problem here is that the taskqueue in ixgbe(4) is bogusly rescheduled for TX handling. Instead, ixgbe_msix_que() should just start transmitting packets directly. I fixed this in igb(4) here: http://svnweb.freebsd.org/base?view=revision&revision=233708 You can try this for ixgbe(4). It also comments out a spurious taskqueue reschedule from the watchdog handler that might also lower the taskqueue usage. You can try changing that #if 0 to an #if 1 to test just the txeof changes: Index: ixgbe.c =================================================================== --- ixgbe.c (revision 241579) +++ ixgbe.c (working copy) @@ -149,7 +149,7 @@ static void ixgbe_enable_intr(struct adapter *); static void ixgbe_disable_intr(struct adapter *); static void ixgbe_update_stats_counters(struct adapter *); -static bool ixgbe_txeof(struct tx_ring *); +static void ixgbe_txeof(struct tx_ring *); static bool ixgbe_rxeof(struct ix_queue *, int); static void ixgbe_rx_checksum(u32, struct mbuf *, u32); static void ixgbe_set_promisc(struct adapter *); @@ -1439,8 +1439,9 @@ struct adapter *adapter = que->adapter; struct ixgbe_hw *hw = &adapter->hw; struct tx_ring *txr = adapter->tx_rings; - bool more_tx, more_rx; - u32 reg_eicr, loop = MAX_LOOP; + struct ifnet *ifp = adapter->ifp; + bool more_rx; + u32 reg_eicr; reg_eicr = IXGBE_READ_REG(hw, IXGBE_EICR); @@ -1454,14 +1455,16 @@ more_rx = ixgbe_rxeof(que, adapter->rx_process_limit); IXGBE_TX_LOCK(txr); - do { - more_tx = ixgbe_txeof(txr); - } while (loop-- && more_tx); + ixgbe_txeof(txr); +#if __FreeBSD_version >= 800000 + if (!drbr_empty(ifp, txr->br)) + ixgbe_mq_start_locked(ifp, txr, NULL); +#else + if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) + ixgbe_start_locked(txr, ifp); +#endif IXGBE_TX_UNLOCK(txr); - if (more_rx || more_tx) - taskqueue_enqueue(que->tq, &que->que_task); - /* Check for fan failure */ if 
((hw->phy.media_type == ixgbe_media_type_copper) && (reg_eicr & IXGBE_EICR_GPI_SDP1)) { @@ -1474,7 +1477,10 @@ if (reg_eicr & IXGBE_EICR_LSC) taskqueue_enqueue(adapter->tq, &adapter->link_task); - ixgbe_enable_intr(adapter); + if (more_rx) + taskqueue_enqueue(que->tq, &que->que_task); + else + ixgbe_enable_intr(adapter); return; } @@ -1491,7 +1497,8 @@ struct adapter *adapter = que->adapter; struct tx_ring *txr = que->txr; struct rx_ring *rxr = que->rxr; - bool more_tx, more_rx; + struct ifnet *ifp = adapter->ifp; + bool more_rx; u32 newitr = 0; ixgbe_disable_queue(adapter, que->msix); @@ -1500,18 +1507,14 @@ more_rx = ixgbe_rxeof(que, adapter->rx_process_limit); IXGBE_TX_LOCK(txr); - more_tx = ixgbe_txeof(txr); - /* - ** Make certain that if the stack - ** has anything queued the task gets - ** scheduled to handle it. - */ -#if __FreeBSD_version < 800000 - if (!IFQ_DRV_IS_EMPTY(&adapter->ifp->if_snd)) + ixgbe_txeof(txr); +#if __FreeBSD_version >= 800000 + if (!drbr_empty(ifp, txr->br)) + ixgbe_mq_start_locked(ifp, txr, NULL); #else - if (!drbr_empty(adapter->ifp, txr->br)) + if (!IFQ_DRV_IS_EMPTY(&ifp->if_snd)) + ixgbe_start_locked(txr, ifp); #endif - more_tx = 1; IXGBE_TX_UNLOCK(txr); /* Do AIM now? */ @@ -1565,7 +1568,7 @@ rxr->packets = 0; no_calc: - if (more_tx || more_rx) + if (more_rx) taskqueue_enqueue(que->tq, &que->que_task); else /* Reenable this interrupt */ ixgbe_enable_queue(adapter, que->msix); @@ -2049,8 +2052,10 @@ ++hung; if (txr->queue_status & IXGBE_QUEUE_DEPLETED) ++busy; +#if 0 if ((txr->queue_status & IXGBE_QUEUE_IDLE) == 0) taskqueue_enqueue(que->tq, &que->que_task); +#endif } /* Only truely watchdog if all queues show hung */ if (hung == adapter->num_queues) @@ -3548,7 +3556,7 @@ * tx_buffer is put back on the free queue. 
* **********************************************************************/ -static bool +static void ixgbe_txeof(struct tx_ring *txr) { struct adapter *adapter = txr->adapter; @@ -3597,13 +3605,13 @@ IXGBE_CORE_UNLOCK(adapter); IXGBE_TX_LOCK(txr); } - return FALSE; + return; } #endif /* DEV_NETMAP */ if (txr->tx_avail == adapter->num_tx_desc) { txr->queue_status = IXGBE_QUEUE_IDLE; - return FALSE; + return; } processed = 0; @@ -3613,7 +3621,7 @@ tx_desc = (struct ixgbe_legacy_tx_desc *)&txr->tx_base[first]; last = tx_buffer->eop_index; if (last == -1) - return FALSE; + return; eop_desc = (struct ixgbe_legacy_tx_desc *)&txr->tx_base[last]; /* @@ -3693,12 +3701,8 @@ if (txr->tx_avail > IXGBE_TX_CLEANUP_THRESHOLD) txr->queue_status &= ~IXGBE_QUEUE_DEPLETED; - if (txr->tx_avail == adapter->num_tx_desc) { + if (txr->tx_avail == adapter->num_tx_desc) txr->queue_status = IXGBE_QUEUE_IDLE; - return (FALSE); - } - - return TRUE; } /********************************************************************* -- John Baldwin From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 15:17:27 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6564781B; Mon, 15 Oct 2012 15:17:27 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 364EE8FC0A; Mon, 15 Oct 2012 15:17:27 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 917D4B911; Mon, 15 Oct 2012 11:17:26 -0400 (EDT) From: John Baldwin To: freebsd-net@freebsd.org Subject: Re: Dropping TCP options from retransmitted SYNs considered harmful Date: Mon, 15 Oct 2012 09:08:36 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p20; KDE/4.5.5; amd64; ; ) References: <201210121213.11152.jhb@freebsd.org> In-Reply-To: 
<201210121213.11152.jhb@freebsd.org>
MIME-Version: 1.0
Content-Type: Text/Plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201210150908.36498.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 15 Oct 2012 11:17:26 -0400 (EDT)
Cc: net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Mon, 15 Oct 2012 15:17:27 -0000

On Friday, October 12, 2012 12:13:11 pm John Baldwin wrote:
> Back in 2001 FreeBSD added a hack to strip TCP options from retransmitted SYNs
> starting with the 3rd SYN in this block in tcp_timer.c:
>
> 	/*
> 	 * Disable rfc1323 if we haven't got any response to
> 	 * our third SYN to work-around some broken terminal servers
> 	 * (most of which have hopefully been retired) that have bad VJ
> 	 * header compression code which trashes TCP segments containing
> 	 * unknown-to-them TCP options.
> 	 */
> 	if ((tp->t_state == TCPS_SYN_SENT) && (tp->t_rxtshift == 3))
> 		tp->t_flags &= ~(TF_REQ_SCALE|TF_REQ_TSTMP);
>
> There is even a PR for the original bug report: kern/1689
>
> However, there is an unintended consequence of this change that can be
> disastrous. Specifically, suppose you have a FreeBSD client connecting to a
> server, and that the SYNs are arriving at the server successfully, but the
> first few return SYN/ACKs are dropped. Eventually a SYN/ACK makes it through
> and the connection is established.
>
> The server (based on the first SYN it saw) believes it has negotiated window
> scaling with the client. The client, however, has broken what it promised in
> that first SYN and believes it is not using any window scaling at all. This
> causes two forms of breakage:
>
> 1) When the server advertises a scaled window (e.g. '8' for a 64k window
>    scaled at 13), the client thinks it is an unscaled window ('8') and
>    sends data to the server very slowly.
>
> 2) When the client advertises an unscaled window (e.g. '65535' for a 64k
>    window), the server thinks it has a huge window (65535 << 13 == 511MB)
>    to send into.
>
> I'm not sure that 2) is a problem per se, but I have definitely seen instances
> of 1) (and examined the 'struct tcpcb' in kgdb on both the server and client
> end of the connections to verify they disagreed on the scaling).
>
> The original motivation of this change is to work around broken terminal
> servers that were old when this change was added in 2001. Over 10 years later
> I think we should at least have an option to turn this work-around off, and
> possibly disable it by default.
>
> Thoughts?

How about this:

Index: tcp_timer.c
===================================================================
--- tcp_timer.c	(revision 241579)
+++ tcp_timer.c	(working copy)
@@ -118,6 +118,11 @@ SYSCTL_INT(_net_inet_tcp, OID_AUTO, keepcnt, CTLFL
 /* max idle probes */
 int	tcp_maxpersistidle;
 
+static int	tcp_rexmit_drop_options = 0;
+SYSCTL_INT(_net_inet_tcp, OID_AUTO, rexmit_drop_options, CTLFLAG_RW,
+    &tcp_rexmit_drop_options, 0,
+    "Drop TCP options from 3rd and later retransmitted SYN");
+
 static int	per_cpu_timers = 0;
 SYSCTL_INT(_net_inet_tcp, OID_AUTO, per_cpu_timers, CTLFLAG_RW,
     &per_cpu_timers , 0, "run tcp timers on all cpus");
@@ -578,7 +583,8 @@ tcp_timer_rexmt(void * xtp)
 	 * header compression code which trashes TCP segments containing
 	 * unknown-to-them TCP options.
 	 */
-	if ((tp->t_state == TCPS_SYN_SENT) && (tp->t_rxtshift == 3))
+	if (tcp_rexmit_drop_options && (tp->t_state == TCPS_SYN_SENT) &&
+	    (tp->t_rxtshift == 3))
 		tp->t_flags &= ~(TF_REQ_SCALE|TF_REQ_TSTMP);
 	/*
 	 * If we backed off this far, our srtt estimate is probably bogus.

Any other suggestions on the sysctl name?
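
As a worked check of the two failure modes described in the quoted message, the following C sketch computes the window each side believes is in effect when one host applies a shift of 13 and the other applies none. It is an illustration only: the helper name effective_window() is hypothetical and is not part of the FreeBSD TCP stack, and the upper bound of 14 on the shift count is the limit given in RFC 1323.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a FreeBSD kernel function): the window a host
 * believes its peer advertised, given the 16-bit window field from the TCP
 * header and the shift count that host thinks was negotiated.
 */
static uint64_t
effective_window(uint16_t field, unsigned int shift)
{
	assert(shift <= 14);	/* RFC 1323 caps the shift count at 14 */
	return ((uint64_t)field << shift);
}
```

With the numbers from the message, effective_window(8, 13) is 65536 bytes, which is what the server means, while a client that applies no scaling computes effective_window(8, 0), i.e. 8 bytes. In the other direction, effective_window(65535, 13) is 536862720 bytes, just under 512 MB, for a window the client intended to be 65535 bytes.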
-- John Baldwin From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 16:29:28 2012 Return-Path: Delivered-To: net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8262FC8E; Mon, 15 Oct 2012 16:29:28 +0000 (UTC) (envelope-from glebius@FreeBSD.org) Received: from cell.glebius.int.ru (glebius.int.ru [81.19.64.117]) by mx1.freebsd.org (Postfix) with ESMTP id ED7628FC14; Mon, 15 Oct 2012 16:29:27 +0000 (UTC) Received: from cell.glebius.int.ru (localhost [127.0.0.1]) by cell.glebius.int.ru (8.14.5/8.14.5) with ESMTP id q9FGTQx0020725; Mon, 15 Oct 2012 20:29:26 +0400 (MSK) (envelope-from glebius@FreeBSD.org) Received: (from glebius@localhost) by cell.glebius.int.ru (8.14.5/8.14.5/Submit) id q9FGTQCf020724; Mon, 15 Oct 2012 20:29:26 +0400 (MSK) (envelope-from glebius@FreeBSD.org) X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to glebius@FreeBSD.org using -f Date: Mon, 15 Oct 2012 20:29:26 +0400 From: Gleb Smirnoff To: "Alexander V. Chernikov" Subject: Re: ixgbe & if_igb RX ring locking Message-ID: <20121015162926.GV89655@FreeBSD.org> References: <5079A9A1.4070403@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline In-Reply-To: <5079A9A1.4070403@FreeBSD.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Jack Vogel , net@FreeBSD.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2012 16:29:28 -0000 On Sat, Oct 13, 2012 at 09:49:21PM +0400, Alexander V. 
Chernikov wrote:
A> Packets receiving code for both ixgbe and if_igb looks like the following:
A> ixgbe_msix_que
A>
A> -- ixgbe_rxeof()
A> {
A>     IXGBE_RX_LOCK(rxr);
A>     while
A>     {
A>         get_packet;
A>
A>         -- ixgbe_rx_input()
A>         {
A>             ++ IXGBE_RX_UNLOCK(rxr);
A>             if_input(packet);
A>             ++ IXGBE_RX_LOCK(rxr);
A>         }
A>
A>     }
A>     IXGBE_RX_UNLOCK(rxr);
A> }
A>
A> Lines marked with ++ appeared in r209068(igb) and r217593(ixgbe).
A>
A> These lines probably do LORs masking (if any) well.
A> However, such change introduce quite significant performance drop:
A>
A> On my routing setup (nearly the same from previous -Intel 10G thread in
A> -net) adding lock/unlock causes 2.8MPPS decrease to 2.3MPPS which is
A> nearly 20%.
A>
A> So my questions are:
A>
A> Can any real LORs happen in some complex setup? (I can't imagine any).
A> If so: maybe we can somehow avoid/workaround such cases? (and consider
A> removing those locks).

To me this unlock/lock looks like a legacy from the times when the driver had a single mutex for both the TX and RX parts. Removing this re-locking in foo_rxeof() was one of the aims of separate TX/RX locking.

Indeed, looking through the history shows that once the driver had split its locking into separate RX and TX parts, this unlock/lock was removed. However, it was later added back:

http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_igb.c?revision=209068&view=markup

without any comment on the reason it was added back.

--
Totus tuus, Glebius.
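
For comparison with the per-packet unlock/relock shown in the quoted pseudo-code, the batching approach Luigi Rizzo suggested earlier in this thread (collect packets under the RX lock, then call the input routine after the lock is dropped) can be sketched in userland C with pthreads. This is only an illustration under stated assumptions: struct pkt, struct rx_ring, ring_put(), ring_drain() and count_input() are hypothetical names, not the ixgbe(4) or igb(4) code, and a pthread mutex stands in for the driver's RX mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for a received packet and an RX ring. */
struct pkt {
	struct pkt	*next;
	int		 id;
};

struct rx_ring {
	pthread_mutex_t	 lock;
	struct pkt	*head;	/* oldest completed packet */
	struct pkt	*tail;	/* newest completed packet */
};

/* Example consumer: sums packet ids so the delivery can be observed. */
static int delivered_sum;

static void
count_input(struct pkt *p)
{
	delivered_sum += p->id;
}

/* Append one packet to the ring under the lock. */
static void
ring_put(struct rx_ring *r, struct pkt *p)
{
	p->next = NULL;
	pthread_mutex_lock(&r->lock);
	if (r->tail != NULL)
		r->tail->next = p;
	else
		r->head = p;
	r->tail = p;
	pthread_mutex_unlock(&r->lock);
}

/*
 * Detach up to 'limit' packets while holding the lock, then hand each one
 * to 'input' after the lock has been released. Returns the number of
 * packets delivered. The lock is taken once per batch, not once per packet.
 */
static int
ring_drain(struct rx_ring *r, int limit, void (*input)(struct pkt *))
{
	struct pkt *batch = NULL, **tailp = &batch, *p;
	int n = 0;

	assert(limit > 0);

	pthread_mutex_lock(&r->lock);
	while (n < limit && (p = r->head) != NULL) {
		r->head = p->next;
		if (r->head == NULL)
			r->tail = NULL;
		p->next = NULL;
		*tailp = p;		/* keep arrival order in the batch */
		tailp = &p->next;
		n++;
	}
	pthread_mutex_unlock(&r->lock);

	/* The input routine runs with the ring lock not held. */
	while ((p = batch) != NULL) {
		batch = p->next;
		p->next = NULL;
		input(p);
	}
	return (n);
}
```

The 'limit' argument plays the role that rx_process_limit plays in the discussion: it bounds how long the batch grows, which is the latency concern raised against bulk processing earlier in the thread.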
From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 16:32:12 2012 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 951D4F8D; Mon, 15 Oct 2012 16:32:12 +0000 (UTC) (envelope-from glebius@FreeBSD.org) Received: from cell.glebius.int.ru (glebius.int.ru [81.19.64.117]) by mx1.freebsd.org (Postfix) with ESMTP id 02F0C8FC0C; Mon, 15 Oct 2012 16:32:11 +0000 (UTC) Received: from cell.glebius.int.ru (localhost [127.0.0.1]) by cell.glebius.int.ru (8.14.5/8.14.5) with ESMTP id q9FGWA3Y020752; Mon, 15 Oct 2012 20:32:10 +0400 (MSK) (envelope-from glebius@FreeBSD.org) Received: (from glebius@localhost) by cell.glebius.int.ru (8.14.5/8.14.5/Submit) id q9FGWAT9020751; Mon, 15 Oct 2012 20:32:10 +0400 (MSK) (envelope-from glebius@FreeBSD.org) X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to glebius@FreeBSD.org using -f Date: Mon, 15 Oct 2012 20:32:10 +0400 From: Gleb Smirnoff To: John Baldwin Subject: Re: ixgbe & if_igb RX ring locking Message-ID: <20121015163210.GW89655@FreeBSD.org> References: <5079A9A1.4070403@FreeBSD.org> <507C1960.6050500@FreeBSD.org> <201210150904.27567.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline In-Reply-To: <201210150904.27567.jhb@freebsd.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-net@FreeBSD.org, "Alexander V. 
Chernikov" , Luigi Rizzo , Jack Vogel , net@FreeBSD.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Mon, 15 Oct 2012 16:32:12 -0000

On Mon, Oct 15, 2012 at 09:04:27AM -0400, John Baldwin wrote:
J> > 3) in practice taskqueue routine is a nightmare for many people since
J> > there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after
J> > some traffic burst happens: once it is called it starts to schedule
J> > itself more and more replacing original ISR routine. Additionally,
J> > increasing rx_process_limit does not help since taskqueue is called with
J> > the same limit. Finally, currently netisr taskq threads are not bound to
J> > any CPU which makes the process even more uncontrollable.
J>
J> I think part of the problem here is that the taskqueue in ixgbe(4) is
J> bogusly rescheduled for TX handling. Instead, ixgbe_msix_que() should
J> just start transmitting packets directly.
J>
J> I fixed this in igb(4) here:
J>
J> http://svnweb.freebsd.org/base?view=revision&revision=233708

The problem Alexander describes in 3) definitely wasn't fixed in r233708. It is still present in head/, and it prevents me from doing good benchmarking of pf(4) on igb(4). The problem is related to RX handling, so I don't see how r233708 could fix it.

--
Totus tuus, Glebius.
From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 16:39:26 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 007C22DB; Mon, 15 Oct 2012 16:39:25 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id 768448FC1B; Mon, 15 Oct 2012 16:39:25 +0000 (UTC) Received: by mail-vb0-f54.google.com with SMTP id v11so6975292vbm.13 for ; Mon, 15 Oct 2012 09:39:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=fCEc0qdOMh4veUmcKP6NL+cBTZTtL7osxYL4tQsWYPo=; b=DB8n0AD7ueC51bAjJKoNBrLpFeiRKj/7AZhVe4AFs/fcymNDWrpmmCq8qgoPv9nOBA Ar/duWzq+N7AQMp83PTIe95gs2HE5iTZiSo3WOmjaN1SrPY6VGbOqrJiyrZea+HpX3Cw 8lG31GqcUaAizpQtQjsEoAUrfC1knuOsrTFs6V9CSaQHbsEB6Vi/GfLXL7VBy2UIrAJd mzOFfkpolsr00DMNgUhGM3TqnSq6UczUY8I8Re6lY8QBFcV+v8+kArMGrnaFJqxlq93f 3WF7uJAviu/U2r1Szl6qQ3aZSzwujQukr52mG0zUcHz4/8UcSInsP4c7+8DUyzME9QBI VHrg== MIME-Version: 1.0 Received: by 10.58.1.101 with SMTP id 5mr7344899vel.40.1350319164396; Mon, 15 Oct 2012 09:39:24 -0700 (PDT) Received: by 10.58.68.8 with HTTP; Mon, 15 Oct 2012 09:39:24 -0700 (PDT) In-Reply-To: <20121015162926.GV89655@FreeBSD.org> References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org> Date: Mon, 15 Oct 2012 09:39:24 -0700 Message-ID: Subject: Re: ixgbe & if_igb RX ring locking From: Jack Vogel To: Gleb Smirnoff Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: "Alexander V. 
Chernikov" , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2012 16:39:26 -0000 On Mon, Oct 15, 2012 at 9:29 AM, Gleb Smirnoff wrote: > On Sat, Oct 13, 2012 at 09:49:21PM +0400, Alexander V. Chernikov wrote: > A> Packets receiving code for both ixgbe and if_igb looks like the > following: > A> ixgbe_msix_que > A> > A> -- ixgbe_rxeof() > A> { > A> IXGBE_RX_LOCK(rxr); > A> while > A> { > A> get_packet; > A> > A> -- ixgbe_rx_input() > A> { > A> ++ IXGBE_RX_UNLOCK(rxr); > A> if_input(packet); > A> ++ IXGBE_RX_LOCK(rxr); > A> } > A> > A> } > A> IXGBE_RX_UNLOCK(rxr); > A> } > A> > A> Lines marked with ++ appeared in r209068(igb) and r217593(ixgbe). > A> > A> These lines probably do LORs masking (if any) well. > A> However, such change introduce quite significant performance drop: > A> > A> On my routing setup (nearly the same from previous -Intel 10G thread in > A> -net) adding lock/unlock causes 2.8MPPS decrease to 2.3MPPS which is > A> nearly 20%. > A> > A> So my questions are: > A> > A> Can any real LORs happen in some complex setup? (I can't imagine any). > A> If so: maybe we can somehow avoid/workaround such cases? (and consider > A> removing those locks). > > To me this unlock/lock looks like a legacy from times, when the driver > had a single mutex for both TX and RX parts. > > And removing this re-locking in foo_rxeof() was one of the aims for > separate > TX/RX locking. > > Really, lurking through history shows that once driver had split its > locking > to separate RX and TX part, these unlock/lock was removed. However, later > this unlock/lock was added back: > > > http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_igb.c?revision=209068&view=markup > > , without any comments for the reason it is added back. 
> > I did not want to add it back, there were problems that constrained me to do so, although its been some time, I'd be happy to do some testing again without and see. Jack From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 16:50:20 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53]) by hub.freebsd.org (Postfix) with ESMTP id 0C81C5F0; Mon, 15 Oct 2012 16:50:20 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from dhcp170-36-red.yandex.net (freefall.freebsd.org [8.8.178.135]) by mx2.freebsd.org (Postfix) with ESMTP id 226B83B655C; Mon, 15 Oct 2012 16:50:18 +0000 (UTC) Message-ID: <507C3E8B.1000307@FreeBSD.org> Date: Mon, 15 Oct 2012 20:49:15 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:13.0) Gecko/20120627 Thunderbird/13.0.1 MIME-Version: 1.0 To: Jack Vogel Subject: Re: ixgbe & if_igb RX ring locking References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2012 16:50:20 -0000 On 15.10.2012 20:39, Jack Vogel wrote: > > > > I did not want to add it back, there were problems that constrained me > to do so, although its > been some time, I'd be happy to do some testing again without and see. > We've got more than hundred routers/firewalls running under heavy load without this lock (pre- 2.3.8 version, modified drivers) on both ixgbe / igb. 
> Jack > > -- WBR, Alexander From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 16:58:30 2012 Return-Path: Delivered-To: net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 895EE87F; Mon, 15 Oct 2012 16:58:30 +0000 (UTC) (envelope-from glebius@FreeBSD.org) Received: from cell.glebius.int.ru (glebius.int.ru [81.19.64.117]) by mx1.freebsd.org (Postfix) with ESMTP id F31658FC08; Mon, 15 Oct 2012 16:58:29 +0000 (UTC) Received: from cell.glebius.int.ru (localhost [127.0.0.1]) by cell.glebius.int.ru (8.14.5/8.14.5) with ESMTP id q9FGwS3d020954; Mon, 15 Oct 2012 20:58:28 +0400 (MSK) (envelope-from glebius@FreeBSD.org) Received: (from glebius@localhost) by cell.glebius.int.ru (8.14.5/8.14.5/Submit) id q9FGwSTn020953; Mon, 15 Oct 2012 20:58:28 +0400 (MSK) (envelope-from glebius@FreeBSD.org) X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to glebius@FreeBSD.org using -f Date: Mon, 15 Oct 2012 20:58:28 +0400 From: Gleb Smirnoff To: Jack Vogel Subject: Re: ixgbe & if_igb RX ring locking Message-ID: <20121015165828.GX89655@glebius.int.ru> References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: "Alexander V. Chernikov" , net@FreeBSD.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2012 16:58:30 -0000 On Mon, Oct 15, 2012 at 09:39:24AM -0700, Jack Vogel wrote: J> > To me this unlock/lock looks like a legacy from times, when the driver J> > had a single mutex for both TX and RX parts. J> > J> > And removing this re-locking in foo_rxeof() was one of the aims for J> > separate J> > TX/RX locking. 
J> > J> > Really, lurking through history shows that once driver had split its J> > locking J> > to separate RX and TX part, these unlock/lock was removed. However, later J> > this unlock/lock was added back: J> > J> > J> > http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_igb.c?revision=209068&view=markup J> > J> > , without any comments for the reason it is added back. J> > J> > I did not want to add it back, there were problems that constrained me to J> do so, although its J> been some time, I'd be happy to do some testing again without and see. Can you please dig through mail archives to identify these problems? I can't imagine any. -- Totus tuus, Glebius. From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 17:27:13 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9CD2BBF6; Mon, 15 Oct 2012 17:27:13 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id 24B7B8FC1A; Mon, 15 Oct 2012 17:27:13 +0000 (UTC) Received: by mail-vb0-f54.google.com with SMTP id v11so7047457vbm.13 for ; Mon, 15 Oct 2012 10:27:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=4MNchQ5xQdhoam5MbxDbY7jC8+4pI2r+OrtoDo+4/0c=; b=wb2tnoQoG2bVaTe/QkKUArXmvc0KY+WKYdF4Glo/OAfBZx0zXSRJpizyYF7VvpklTA hnGncBqxXcvnr22r9yevNNKEa6BtcrlEmg5Fu6Kt/G9/CEWm5RQF45lz0xdYFNG/7Xe7 ykKs1XzpgvdmU8XydjGX1tcMmtmrD/esJq5YTRffnt2GhErsHVdi1rtgyUdAUvyhMWvS zS0SDd/EccH/GaTzrY9gmMoHH/+CYntyLM6FWKFO6C3Dl1tE2PXCTVBjmpQeY57oAnH+ dE3bqWLQwkM1yov3sBwEIoujg7n2mS9RMGBKD6J60lXoMIglVGFPdOJdHuo4IR3pJz3Z zXNw== MIME-Version: 1.0 Received: by 10.52.65.147 with SMTP id x19mr5858273vds.113.1350322032593; Mon, 15 Oct 2012 10:27:12 -0700 (PDT) Received: by 10.58.68.8 with HTTP; Mon, 15 Oct 
2012 10:27:12 -0700 (PDT) In-Reply-To: <20121015165828.GX89655@glebius.int.ru> References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org> <20121015165828.GX89655@glebius.int.ru> Date: Mon, 15 Oct 2012 10:27:12 -0700 Message-ID: Subject: Re: ixgbe & if_igb RX ring locking From: Jack Vogel To: Gleb Smirnoff Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: "Alexander V. Chernikov" , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2012 17:27:13 -0000 On Mon, Oct 15, 2012 at 9:58 AM, Gleb Smirnoff wrote: > On Mon, Oct 15, 2012 at 09:39:24AM -0700, Jack Vogel wrote: > J> > To me this unlock/lock looks like a legacy from times, when the driver > J> > had a single mutex for both TX and RX parts. > J> > > J> > And removing this re-locking in foo_rxeof() was one of the aims for > J> > separate > J> > TX/RX locking. > J> > > J> > Really, lurking through history shows that once driver had split its > J> > locking > J> > to separate RX and TX part, these unlock/lock was removed. However, > later > J> > this unlock/lock was added back: > J> > > J> > > J> > > http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_igb.c?revision=209068&view=markup > J> > > J> > , without any comments for the reason it is added back. > J> > > J> > I did not want to add it back, there were problems that constrained > me to > J> do so, although its > J> been some time, I'd be happy to do some testing again without and see. > > Can you please dig through mail archives to identify these problems? I > can't imagine any. > > It may not be in email, there were tests going on internally here that I often was working with... 
At this point it doesn't matter, Alexander says its running without, I will have some more testing on current code and go from there. Jack From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 19:23:26 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E7535319; Mon, 15 Oct 2012 19:23:26 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id B7F6A8FC0A; Mon, 15 Oct 2012 19:23:26 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 1BBEBB911; Mon, 15 Oct 2012 15:23:26 -0400 (EDT) From: John Baldwin To: Gleb Smirnoff Subject: Re: ixgbe & if_igb RX ring locking Date: Mon, 15 Oct 2012 14:14:27 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p20; KDE/4.5.5; amd64; ; ) References: <5079A9A1.4070403@FreeBSD.org> <201210150904.27567.jhb@freebsd.org> <20121015163210.GW89655@FreeBSD.org> In-Reply-To: <20121015163210.GW89655@FreeBSD.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="koi8-r" Content-Transfer-Encoding: 7bit Message-Id: <201210151414.27318.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 15 Oct 2012 15:23:26 -0400 (EDT) Cc: freebsd-net@freebsd.org, "Alexander V. 
Chernikov" , Luigi Rizzo , Jack Vogel , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2012 19:23:27 -0000 On Monday, October 15, 2012 12:32:10 pm Gleb Smirnoff wrote: > On Mon, Oct 15, 2012 at 09:04:27AM -0400, John Baldwin wrote: > J> > 3) in practice taskqueue routine is a nightmare for many people since > J> > there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after > J> > some traffic burst happens: once it is called it starts to schedule > J> > itself more and more replacing original ISR routine. Additionally, > J> > increasing rx_process_limit does not help since taskqueue is called with > J> > the same limit. Finally, currently netisr taskq threads are not bound to > J> > any CPU which makes the process even more uncontrollable. > J> > J> I think part of the problem here is that the taskqueue in ixgbe(4) is > J> bogusly rescheduled for TX handling. Instead, ixgbe_msix_que() should > J> just start transmitting packets directly. > J> > J> I fixed this in igb(4) here: > J> > J> http://svnweb.freebsd.org/base?view=revision&revision=233708 > > The problem Alexander describes in 3) definitely wasn't fixed in r233708. > > It is still present in head/, and it prevents me to do good benchmarking > of pf(4) on igb(4). > > The problem is related to RX handling, so I don't see how r233708 could > fix it. Before 233708, if you had a single TX packet waiting to go out and an RX interrupt arrived, the task queue would be constantly rescheduled, causing it to effectively spin at 100% until the TX packet was completely transmitted and the hardware had updated the descriptor to mark it as complete. 
In fact, as long as you have any pending TX packets at all it will keep spinning until it gets into a state where you have no pending TX packets (so a steady stream of TX packets, including, say, ACKs, would cause the taskqueue to run forever). In general I think that with MSI-X you should just use an RX processing limit of -1. Anything else is just adding overhead in the form of extra context switches. Neither the task nor the MSI-X interrupt handler is on a thread that is shared with any other tasks or handlers, so all that scheduling (or rescheduling) the task will do is result in the task being immediately run (after either a context switch or returning back to the main loop of the taskqueue thread). If you look at the drivers, if a burst of RX traffic ends, the taskqueue should stop running and stop polling the hardware. It is only the TX side that gets stuck needlessly polling. The watchdog timer rescheduling the handler once a second when there is no watchdog condition doesn't help matters either, but I think that is unique to ixgbe(4). It would be good if you could determine exactly why igb thinks it needs to reschedule the taskqueue in your test case on igb(4) post 233708. 
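
The point about the RX processing limit can be illustrated with a toy model. This is only a sketch under stated assumptions, not code from igb(4) or ixgbe(4): the names toy_rxring, toy_rxeof and toy_passes_to_drain are hypothetical, and the "ring" is just a packet count. It shows that a positive limit turns one burst into several task passes (each one a reschedule), while a limit of -1 drains the burst in a single pass.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an RX ring: only the number of waiting packets. */
struct toy_rxring {
	int pending;
};

/*
 * Process up to `limit` packets; a negative limit means "no limit".
 * Returns true when packets remain, i.e. when the caller would have
 * to reschedule the task to finish the work.
 */
static bool
toy_rxeof(struct toy_rxring *rxr, int limit)
{
	int budget;

	budget = (limit < 0 || limit > rxr->pending) ? rxr->pending : limit;
	rxr->pending -= budget;
	return (rxr->pending > 0);
}

/*
 * Count the task passes needed to drain a burst of `pending` packets.
 * Returns -1 for a limit of 0, which would never make progress.
 */
static int
toy_passes_to_drain(int pending, int limit)
{
	struct toy_rxring rxr = { .pending = pending };
	int passes = 1;

	assert(pending >= 0);
	if (limit == 0)
		return (-1);
	while (toy_rxeof(&rxr, limit))
		passes++;	/* each extra pass models one reschedule */
	return (passes);
}
```

In this model a burst of 1000 packets takes 10 passes with a limit of 100 and one pass with a limit of -1. The model says nothing about what each extra pass costs; that part is the context-switch overhead described above.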
-- John Baldwin From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 20:48:35 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AE5D6F29; Mon, 15 Oct 2012 20:48:35 +0000 (UTC) (envelope-from rysto32@gmail.com) Received: from mail-vc0-f182.google.com (mail-vc0-f182.google.com [209.85.220.182]) by mx1.freebsd.org (Postfix) with ESMTP id 3ACF38FC08; Mon, 15 Oct 2012 20:48:34 +0000 (UTC) Received: by mail-vc0-f182.google.com with SMTP id fw7so8162445vcb.13 for ; Mon, 15 Oct 2012 13:48:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=9vwLYR0bQ5U/NIC0hJlBagJoyCUEI16v7Zr36M4Sugk=; b=No1NjvzBl1SSS9FhzOz9Ad3Ce0hr8ff9bN6LjgsWns9+NWylt3/j/2w48GH3ukwMNA n5GRYDsVeTkEQRKup1qhoPQItKjYTcFTq17qG88Hxy/GQil/HUNfrdB5MN3vkBW4vMVy rliXgPoT5fgGkKjXlWRHOi5225mcULdLKH/Na+udYLlv+maa1Sm7z1BnfUsEtJRltMof 8f9wK0nVciDqTkRu+MVbgJmq7BxAT8RDQx3nDVx1l3bPmFFIjEABjrb/3RjJhRpiYq+s X4VwZ4nKF2ZeDSBhx34I9G9+tleaatGXPo2lvn7nIXpLIVWVHS4FV9qjiKKOYOtoZkEv yAHQ== MIME-Version: 1.0 Received: by 10.52.155.199 with SMTP id vy7mr6121885vdb.54.1350334114196; Mon, 15 Oct 2012 13:48:34 -0700 (PDT) Received: by 10.58.207.114 with HTTP; Mon, 15 Oct 2012 13:48:34 -0700 (PDT) In-Reply-To: <20121015162926.GV89655@FreeBSD.org> References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org> Date: Mon, 15 Oct 2012 16:48:34 -0400 Message-ID: Subject: Re: ixgbe & if_igb RX ring locking From: Ryan Stone To: Gleb Smirnoff Content-Type: text/plain; charset=ISO-8859-1 Cc: "Alexander V. 
Chernikov" , Jack Vogel , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2012 20:48:35 -0000 On Mon, Oct 15, 2012 at 12:29 PM, Gleb Smirnoff wrote: > To me this unlock/lock looks like a legacy from times, when the driver > had a single mutex for both TX and RX parts. > > And removing this re-locking in foo_rxeof() was one of the aims for separate > TX/RX locking. > > Really, lurking through history shows that once driver had split its locking > to separate RX and TX part, these unlock/lock was removed. However, later > this unlock/lock was added back: > > http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_igb.c?revision=209068&view=markup > > , without any comments for the reason it is added back. There's a convoluted LOR if you call into the stack with the RX lock held which is described here: http://lists.freebsd.org/pipermail/freebsd-net/2012-September/033371.html From owner-freebsd-net@FreeBSD.ORG Mon Oct 15 22:36:58 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 18D8B52B; Mon, 15 Oct 2012 22:36:58 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-pb0-f54.google.com (mail-pb0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id C171B8FC0A; Mon, 15 Oct 2012 22:36:57 +0000 (UTC) Received: by mail-pb0-f54.google.com with SMTP id rp8so5740118pbb.13 for ; Mon, 15 Oct 2012 15:36:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=NBeUqkxXqFANk8PmvxmkSkAK7q/Xh5Ri1zXO4hpRjz4=; b=q4DUF1Hs0bpzF44xS+NaYIkjvnNknLLQkoUlWn7lEQsaw7NsbCjv5iQz3pyqVpiJS3 
MqEishawiHO4OCp11L/jcojcbDAtxuYNicKNPbkKr2kAgsgYZlTj6BM0tXnohiZtzTFU mCdqfXFPLuFM1oPuSKRVqsWyth1ugSyYAYFW5IRXeLGtEN26fuY+ir2SnD3iJozCPgAq 9wukRk5pI/1ZyKdJwcmyzkk5obX1oR1yydgwj9/TXlCBRctFCx4xhIOgPRvPjPNikZbh 4WdW7XVYMGqc+dXMRJKCESvoVJV3FKVPuC1HNJAJ/m48INUsNJ316D+5oJf8cZKaOIG0 4NXg== MIME-Version: 1.0 Received: by 10.66.80.133 with SMTP id r5mr10706792pax.24.1350340617342; Mon, 15 Oct 2012 15:36:57 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.68.146.233 with HTTP; Mon, 15 Oct 2012 15:36:57 -0700 (PDT) In-Reply-To: <201210151414.27318.jhb@freebsd.org> References: <5079A9A1.4070403@FreeBSD.org> <201210150904.27567.jhb@freebsd.org> <20121015163210.GW89655@FreeBSD.org> <201210151414.27318.jhb@freebsd.org> Date: Mon, 15 Oct 2012 15:36:57 -0700 X-Google-Sender-Auth: KyTp3ym2n1JjTzY8n01J-YeRRuM Message-ID: Subject: Re: ixgbe & if_igb RX ring locking From: Adrian Chadd To: John Baldwin Content-Type: text/plain; charset=ISO-8859-1 Cc: "Alexander V. Chernikov" , freebsd-net@freebsd.org, Jack Vogel , net@freebsd.org, Luigi Rizzo X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2012 22:36:58 -0000 The reason why I've started moving net80211 and ath _away_ from using direct dispatch (for now) and to using a taskqueue for TX (and RX) is because it's too freaking annoying right now to deal with all the crazy long-held locks to guarantee consistency between multiple transmitting threads. 
Considering that the driver and net80211 stack: * sometimes is PCI, sometimes is USB (with all the differing thread models that exist there); * sometimes bridge traffic, sometimes route traffic, sometimes source or terminate TCP/UDP connections; * sometimes has one sender, sometimes has multiple senders, with some other modules in between (bridge, pf, ipfw, etc) with locks being held here and there; * since the stack(s) like doing direct dispatch, RX very often causes TX to occur, which for some drivers will block on a long-held driver lock (with all the LORs that occur) - and drivers that do this (eg iwn) will simply drop the lock before passing the packet up. Dropping the lock before passing net80211_input*() .. is just plain silly. Now, I'd _like_ to eventually make net80211/ath support direct dispatch, but that also requires making sure only -one- transmitter is working at once. I'd like to not have the extra context switch overhead, but I haven't seen a better way of doing it yet. It's fun to see the gige/10ge driver have lots of long held locks with lots of concurrent sender processes possibly blocking until TX completes.. so I wonder if that has scaling issues for lots of connections/sending processes. 
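
The "drop the lock before passing the packet up" pattern discussed in this thread can be sketched in userland C. This is a minimal sketch under loud assumptions: it uses POSIX mutexes instead of the kernel's mtx(9) locks, and toy_rxring, toy_ring_init, toy_stack_input and toy_rxeof are hypothetical names, not driver code. The stand-in for the stack uses pthread_mutex_trylock() to record whether the ring lock was free during the upcall.

```c
#include <assert.h>
#include <pthread.h>

#define TOY_RING_SIZE	8

/* Hypothetical RX ring protected by a single lock. */
struct toy_rxring {
	pthread_mutex_t	lock;
	int		ring[TOY_RING_SIZE];
	int		head;
	int		tail;
	int		lock_free_upcalls;	/* upcalls that found the lock free */
};

typedef void (*toy_input_fn)(struct toy_rxring *, int);

static void
toy_ring_init(struct toy_rxring *rxr, int npkts)
{
	assert(npkts >= 0 && npkts < TOY_RING_SIZE);
	pthread_mutex_init(&rxr->lock, NULL);
	rxr->head = 0;
	rxr->tail = npkts;
	rxr->lock_free_upcalls = 0;
	for (int i = 0; i < npkts; i++)
		rxr->ring[i] = i;
}

/*
 * Stand-in for if_input(): checks whether the ring lock could be
 * taken from inside the "stack" while this packet is delivered.
 */
static void
toy_stack_input(struct toy_rxring *rxr, int pkt)
{
	(void)pkt;
	if (pthread_mutex_trylock(&rxr->lock) == 0) {
		rxr->lock_free_upcalls++;
		pthread_mutex_unlock(&rxr->lock);
	}
}

/*
 * Drain the ring. With drop_lock != 0 the lock is released around
 * every upcall (two extra lock operations per packet); with
 * drop_lock == 0 the lock is held for the whole loop.
 * Returns the number of packets delivered.
 */
static int
toy_rxeof(struct toy_rxring *rxr, toy_input_fn input, int drop_lock)
{
	int delivered = 0;

	pthread_mutex_lock(&rxr->lock);
	while (rxr->head != rxr->tail) {
		int pkt = rxr->ring[rxr->head];

		rxr->head = (rxr->head + 1) % TOY_RING_SIZE;
		if (drop_lock)
			pthread_mutex_unlock(&rxr->lock);
		input(rxr, pkt);
		if (drop_lock)
			pthread_mutex_lock(&rxr->lock);
		delivered++;
	}
	pthread_mutex_unlock(&rxr->lock);
	return (delivered);
}
```

With drop_lock set, every upcall finds the lock free, so a path that comes back into the driver from the stack could take it. Without it, the lock stays held across the upcall and the per-packet unlock/lock pair disappears. The sketch does not reproduce the measured throughput difference or any specific LOR.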
Adrian
From owner-freebsd-net@FreeBSD.ORG Tue Oct 16 02:11:38 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5AC6F212; Tue, 16 Oct 2012 02:11:38 +0000 (UTC) (envelope-from gnn@neville-neil.com) Received: from vps.hungerhost.com (vps.hungerhost.com [216.38.53.176]) by mx1.freebsd.org (Postfix) with ESMTP id 152158FC0A; Tue, 16 Oct 2012 02:11:36 +0000 (UTC) Received: from pool-96-250-5-62.nycmny.fios.verizon.net ([96.250.5.62]:58711 helo=minion.home) by vps.hungerhost.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.80) (envelope-from ) id 1TNwd6-0008Nm-Jt; Mon, 15 Oct 2012 22:11:36 -0400 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: Dropping TCP options from retransmitted SYNs considered harmful From: George Neville-Neil In-Reply-To: <201210150908.36498.jhb@freebsd.org> Date: Mon, 15 Oct 2012 22:11:41 -0400 Content-Transfer-Encoding: quoted-printable Message-Id: <70397C6E-202A-4FAE-AF53-6A5A1D89FAAC@neville-neil.com> References: <201210121213.11152.jhb@freebsd.org> <201210150908.36498.jhb@freebsd.org> To: John Baldwin X-Mailer: Apple Mail (2.1499) X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - vps.hungerhost.com X-AntiAbuse: Original Domain - freebsd.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - neville-neil.com Cc: freebsd-net@freebsd.org, net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2012 02:11:38 -0000

On Oct 15, 2012, at 09:08 , John Baldwin wrote:

> On Friday, October 12, 2012 12:13:11 pm John Baldwin wrote:
>> Back in 2001 FreeBSD added a hack to strip TCP options from retransmitted SYNs
>> starting with the 3rd SYN in this block in tcp_timer.c:
>>
>>     /*
>>      * Disable rfc1323 if we haven't got any response to
>>      * our third SYN to work-around some broken terminal servers
>>      * (most of which have hopefully been retired) that have bad VJ
>>      * header compression code which trashes TCP segments containing
>>      * unknown-to-them TCP options.
>>      */
>>     if ((tp->t_state == TCPS_SYN_SENT) && (tp->t_rxtshift == 3))
>>         tp->t_flags &= ~(TF_REQ_SCALE|TF_REQ_TSTMP);
>>
>> There is even a PR for the original bug report: kern/1689
>>
>> However, there is an unintended consequence of this change that can be
>> disastrous. Specifically, suppose you have a FreeBSD client connecting to a
>> server, and that the SYNs are arriving at the server successfully, but the
>> first few return SYN/ACKs are dropped. Eventually a SYN/ACK makes it through
>> and the connection is established.
>>
>> The server (based on the first SYN it saw) believes it has negotiated window
>> scaling with the client. The client, however, has broken what it promised in
>> that first SYN and believes it is not using any window scaling at all. This
>> causes two forms of breakage:
>>
>> 1) When the server advertises a scaled window (e.g. '8' for a 64k window
>>    scaled at 13), the client thinks it is an unscaled window ('8') and
>>    sends data to the server very slowly.
>>
>> 2) When the client advertises an unscaled window (e.g. '65535' for a 64k
>>    window), the server thinks it has a huge window (65535 << 13 == 511MB)
>>    to send into.
>>
>> I'm not sure that 2) is a problem per se, but I have definitely seen instances
>> of 1) (and examined the 'struct tcpcb' in kgdb on both the server and client
>> end of the connections to verify they disagreed on the scaling).
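
The arithmetic behind cases 1) and 2) can be checked with a few lines of C. This is a stand-alone sketch, not kernel code: `effective_window()` is a hypothetical helper, and `TCP_MAX_WINSHIFT` is redefined locally to the RFC 1323 limit of 14 so that the block does not depend on any system header.

```c
#include <assert.h>
#include <stdint.h>

/* RFC 1323 caps the window scale shift at 14 (defined locally here). */
#define TCP_MAX_WINSHIFT 14

/*
 * Window in bytes implied by the 16-bit window field of a segment, as
 * computed by a host that believes `shift` was negotiated for the sender
 * of that segment (0 means "no window scaling").
 */
uint32_t
effective_window(uint16_t wire_window, unsigned shift)
{
	assert(shift <= TCP_MAX_WINSHIFT);
	return ((uint32_t)wire_window << shift);
}
```

With the numbers from the message: the server means `effective_window(8, 13)`, which is 65536 bytes, but the client computes `effective_window(8, 0)`, which is 8 bytes, hence the very slow sender in case 1). In case 2) the client means 65535 bytes while the server computes `effective_window(65535, 13)`, which is 536862720 bytes, just under 512 MiB.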
>>
>> The original motivation of this change is to work around broken terminal
>> servers that were old when this change was added in 2001. Over 10 years later
>> I think we should at least have an option to turn this work-around off, and
>> possibly disable it by default.
>>
>> Thoughts?
>
> How about this:
>
> Index: tcp_timer.c
> ===================================================================
> --- tcp_timer.c (revision 241579)
> +++ tcp_timer.c (working copy)
> @@ -118,6 +118,11 @@ SYSCTL_INT(_net_inet_tcp, OID_AUTO, keepcnt, CTLFL
>  /* max idle probes */
>  int tcp_maxpersistidle;
>
> +static int tcp_rexmit_drop_options = 0;
> +SYSCTL_INT(_net_inet_tcp, OID_AUTO, rexmit_drop_options, CTLFLAG_RW,
> +    &tcp_rexmit_drop_options, 0,
> +    "Drop TCP options from 3rd and later retransmitted SYN");
> +
>  static int per_cpu_timers = 0;
>  SYSCTL_INT(_net_inet_tcp, OID_AUTO, per_cpu_timers, CTLFLAG_RW,
>      &per_cpu_timers , 0, "run tcp timers on all cpus");
> @@ -578,7 +583,8 @@ tcp_timer_rexmt(void * xtp)
>       * header compression code which trashes TCP segments containing
>       * unknown-to-them TCP options.
>       */
> -    if ((tp->t_state == TCPS_SYN_SENT) && (tp->t_rxtshift == 3))
> +    if (tcp_rexmit_drop_options && (tp->t_state == TCPS_SYN_SENT) &&
> +        (tp->t_rxtshift == 3))
>          tp->t_flags &= ~(TF_REQ_SCALE|TF_REQ_TSTMP);
>      /*
>       * If we backed off this far, our srtt estimate is probably bogus.
>
> Any other suggestions on the sysctl name?

The name's fine. Commit that sucker and turn it off.
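
The effect of the proposed knob can be tried outside the kernel with a small stand-alone C sketch. Everything prefixed `sketch_` or `SKETCH_` below is a hypothetical stand-in: the struct carries only the three fields the patched condition reads, and the constants are placeholder values chosen for this example, not copied from the kernel headers. Only the shape of the condition follows the patch.

```c
#include <assert.h>

/* Placeholder values for this sketch only. */
#define SKETCH_TCPS_SYN_SENT	2
#define SKETCH_TF_REQ_SCALE	0x20u
#define SKETCH_TF_REQ_TSTMP	0x80u

/* Only the fields the condition looks at. */
struct sketch_tcpcb {
	int		t_state;
	int		t_rxtshift;
	unsigned	t_flags;
};

/* Stand-in for the proposed sysctl variable; 0 keeps options on SYNs. */
int sketch_rexmit_drop_options = 0;

/* Same condition as the patched block in tcp_timer_rexmt(). */
void
sketch_rexmt_adjust(struct sketch_tcpcb *tp)
{
	assert(tp != 0);
	if (sketch_rexmit_drop_options &&
	    (tp->t_state == SKETCH_TCPS_SYN_SENT) && (tp->t_rxtshift == 3))
		tp->t_flags &= ~(SKETCH_TF_REQ_SCALE | SKETCH_TF_REQ_TSTMP);
}
```

With the knob at 0 a connection in SYN_SENT keeps requesting window scaling and timestamps on its third retransmit, so client and server cannot end up disagreeing about the scale; setting it to 1 restores the old stripping behaviour.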
Best, George
From owner-freebsd-net@FreeBSD.ORG Tue Oct 16 10:06:25 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 516A049B for ; Tue, 16 Oct 2012 10:06:25 +0000 (UTC) (envelope-from ozkan.kirik@gmail.com) Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id EFAFF8FC1A for ; Tue, 16 Oct 2012 10:06:24 +0000 (UTC) Received: by mail-vb0-f54.google.com with SMTP id v11so7980738vbm.13 for ; Tue, 16 Oct 2012 03:06:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=LZBs9dsfL8c7GiE5aOo8ya46oQOR3I6OEpiLpdCxO1s=; b=yd4zi5/Nbqk6Vzbu2C/3Jt8l7S8swkf3Tno/owe3Wt+Nc6NpS8mqxWhLfCRf+5atP0 WaXTPogfirB6SXAFDIvAoP7lVbbe57ORpLbXRLu7y/8/Awa56jqtkvSWMgTB50AHo3lg /Fy7/KQCMACBFjVURYuQ9SpYnjqWyrq9btoptXQuCsvSwP8kIIu/CL1vn5jNn4mbVOqL AV/gzl8pZA1Ui4EscyQMbc0gf3kGHjD1/gyQnEe4EmEGp4gKlyM8t40VR/TENnPZ2fdX /DgIcCebiEhPRU84N2GnSbpkjaHswKLsoUmRu2r//2XySNJs3vOiCJa1wg29QZvL47yN C70Q== MIME-Version: 1.0 Received: by 10.220.150.14 with SMTP id w14mr8366656vcv.13.1350381984272; Tue, 16 Oct 2012 03:06:24 -0700 (PDT) Received: by 10.58.56.135 with HTTP; Tue, 16 Oct 2012 03:06:24 -0700 (PDT) In-Reply-To: <506EEE46.1000604@airnet.opole.pl> References: <2DE61B0869B7484997BCA012845482C7EBE8E280DB@WIN2008.Domnt.abi.ca> <5068AC17.8020704@FreeBSD.org> <2DE61B0869B7484997BCA012845482C7EBE8E280DC@WIN2008.Domnt.abi.ca> <5068ADCC.5030105@FreeBSD.org> <2DE61B0869B7484997BCA012845482C7EBE8E280DD@WIN2008.Domnt.abi.ca> <5068B48E.2070303@FreeBSD.org> <20121004160240.GA1967@funkthat.com> <506DC933.7080307@airnet.opole.pl> <20121004222327.GA40357@in-addr.com> <506E7BE7.2080104@airnet.opole.pl> <506ED2AD.8000408@airnet.opole.pl> <2DE61B0869B7484997BCA012845482C7EBE8E28162@WIN2008.Domnt.abi.ca>
<506EEE46.1000604@airnet.opole.pl> Date: Tue, 16 Oct 2012 13:06:24 +0300 Message-ID: Subject: Re: Default route destination changing without warning follow-up From: Özkan KIRIK To: Krzysztof Barcikowski Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2012 10:06:25 -0000

I reported this behaviour before:
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/157796

On Fri, Oct 5, 2012 at 5:27 PM, Krzysztof Barcikowski wrote:
> On 2012-10-05 16:22, Dominic Blais wrote:
>
>> Hi,
>>
>> I'm using GENERIC. Everything else is added as a loaded module.
>>
>> Here's my kldstat:
>>
>
> I forgot about modules, here they are:
>
> Id Refs Address            Size     Name
>  1   13 0xffffffff80200000 12200c8  kernel
>  2    1 0xffffffff81421000 215f8    geom_mirror.ko
>  3    1 0xffffffff81443000 29e8     coretemp.ko
>  4    1 0xffffffff81446000 17450    dummynet.ko

From owner-freebsd-net@FreeBSD.ORG Tue Oct 16 12:13:08 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53]) by hub.freebsd.org (Postfix) with ESMTP id 6A9293A5; Tue, 16 Oct 2012 12:13:08 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from dhcp170-36-red.yandex.net (freefall.freebsd.org [8.8.178.135]) by mx2.freebsd.org (Postfix) with ESMTP id C7F9A3B5C58; Tue, 16 Oct 2012 12:13:06 +0000 (UTC) Message-ID: <507D4F11.2030704@FreeBSD.org> Date: Tue, 16 Oct 2012 16:12:01 +0400 From: "Alexander V.
Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:13.0) Gecko/20120627 Thunderbird/13.0.1 MIME-Version: 1.0 To: Ryan Stone Subject: Re: ixgbe & if_igb RX ring locking References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Jack Vogel , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2012 12:13:08 -0000

On 16.10.2012 00:48, Ryan Stone wrote:
> On Mon, Oct 15, 2012 at 12:29 PM, Gleb Smirnoff wrote:
>> To me this unlock/lock looks like a legacy from times, when the driver
>> had a single mutex for both TX and RX parts.
>>
>> And removing this re-locking in foo_rxeof() was one of the aims for separate
>> TX/RX locking.
>>
>> Really, lurking through history shows that once driver had split its locking
>> to separate RX and TX part, these unlock/lock was removed. However, later
>> this unlock/lock was added back:
>>
>> http://svnweb.freebsd.org/base/head/sys/dev/e1000/if_igb.c?revision=209068&view=markup
>>
>> , without any comments for the reason it is added back.
>
> There's a convoluted LOR if you call into the stack with the RX lock
> held which is described here:
>
> http://lists.freebsd.org/pipermail/freebsd-net/2012-September/033371.html

Are you using the stock ixgbe driver?

lock order reversal:
 1st 0xffffff800153c138 ix:rx (ix:rx) @ src/sys/dev/ixgbe/ixgbe.c:7113
 2nd 0xffffffff80af9c48 udp (udp) @ src/sys/netinet/udp_usrreq.c:471

It seems to me that ixgbe.c has always been around 5.5k lines of code, so line 7113 looks a bit suspicious.

 2nd 0xffffff8001539400 ixgbe0 (IXGBE Core Lock) @ src/sys/dev/ixgbe/ixgbe.c:1725

The nearest IXGBE_CORE_LOCK() in r217917 (8.2-R) is at line 905.

Maybe I'm missing something obvious?
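
As background on what a "lock order reversal" report means, here is a tiny C sketch of the underlying check. It is not how WITNESS is implemented; `struct lor_tracker`, `lor_record()` and `MAX_LOCKS` are hypothetical names, locks are reduced to small integer ids, and only direct pairs are tracked (no transitive cycles).

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LOCKS 8	/* hypothetical number of lock classes */

/* seen[a][b] is true once lock b has been acquired while holding lock a. */
struct lor_tracker {
	bool	seen[MAX_LOCKS][MAX_LOCKS];
};

/*
 * Record "acquired `next` while holding `held`".  Returns true when the
 * opposite order was recorded earlier, i.e. a lock order reversal.
 */
bool
lor_record(struct lor_tracker *t, int held, int next)
{
	assert(held >= 0 && held < MAX_LOCKS);
	assert(next >= 0 && next < MAX_LOCKS);
	t->seen[held][next] = true;
	return (held != next && t->seen[next][held]);
}
```

If one code path takes lock 0 and then lock 1, and another path later takes lock 1 and then lock 0, the second `lor_record()` call returns true. The "1st" and "2nd" lines in the report quoted above name such a pair of locks and where each was acquired.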
> -- WBR, Alexander From owner-freebsd-net@FreeBSD.ORG Tue Oct 16 12:40:49 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8F1E1BFB; Tue, 16 Oct 2012 12:40:49 +0000 (UTC) (envelope-from rysto32@gmail.com) Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id 043958FC19; Tue, 16 Oct 2012 12:40:48 +0000 (UTC) Received: by mail-vb0-f54.google.com with SMTP id v11so8170582vbm.13 for ; Tue, 16 Oct 2012 05:40:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=zwyWvomooeuEE0Zoo4Y3zt4wEDvk3xDhL397lId9CSg=; b=zy5Ts+0hjnsjtWYIKd+nkIMsFCIaqXW2lMk4s7bAXO04QSkgCRkxhHCsUJ63umtrTz MLNljwjZXTtPFVUF4VT8Of8vo+BJQShCPD20C4Kplrs3zNGCCHiQlqe8osRiQSTrkqBu 9Qooc35yC/zRFaS0oB3/cHa24IooV21g/JplzObiE58Sin7nrNWbE/ELvYKr7OhF96OE DxY/eSnMcU6Ay9DraJFFcGZ7wWhFCktexU6ZqQAyUVRUqf5M7s6eKFjOm9BBZXYFGgAL Bz6O+QrzspHMDV7ZQ/UIkO4NyiVPXRjvj6mIE7SNnH9f3KWZoRPtXylGLPG9rTlgTdEg z3sw== MIME-Version: 1.0 Received: by 10.220.154.6 with SMTP id m6mr8478035vcw.51.1350391247952; Tue, 16 Oct 2012 05:40:47 -0700 (PDT) Received: by 10.58.207.114 with HTTP; Tue, 16 Oct 2012 05:40:47 -0700 (PDT) In-Reply-To: <507D4F11.2030704@FreeBSD.org> References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org> <507D4F11.2030704@FreeBSD.org> Date: Tue, 16 Oct 2012 08:40:47 -0400 Message-ID: Subject: Re: ixgbe & if_igb RX ring locking From: Ryan Stone To: "Alexander V. 
Chernikov" Content-Type: text/plain; charset=ISO-8859-1 Cc: Jack Vogel , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2012 12:40:49 -0000 On Tue, Oct 16, 2012 at 8:12 AM, Alexander V. Chernikov wrote: > Are you using stock ixgbe driver? Pay no attention to the line numbers behind the curtain. :) I don't believe that I've changed the locking order at all in the driver, but you are right, that wasn't taken from the stock driver. From owner-freebsd-net@FreeBSD.ORG Tue Oct 16 12:47:36 2012 Return-Path: Delivered-To: net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 737D7EB3; Tue, 16 Oct 2012 12:47:36 +0000 (UTC) (envelope-from glebius@FreeBSD.org) Received: from cell.glebius.int.ru (glebius.int.ru [81.19.64.117]) by mx1.freebsd.org (Postfix) with ESMTP id DE4F28FC0C; Tue, 16 Oct 2012 12:47:35 +0000 (UTC) Received: from cell.glebius.int.ru (localhost [127.0.0.1]) by cell.glebius.int.ru (8.14.5/8.14.5) with ESMTP id q9GClYep033268; Tue, 16 Oct 2012 16:47:34 +0400 (MSK) (envelope-from glebius@FreeBSD.org) Received: (from glebius@localhost) by cell.glebius.int.ru (8.14.5/8.14.5/Submit) id q9GClXZh033267; Tue, 16 Oct 2012 16:47:33 +0400 (MSK) (envelope-from glebius@FreeBSD.org) X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to glebius@FreeBSD.org using -f Date: Tue, 16 Oct 2012 16:47:33 +0400 From: Gleb Smirnoff To: Ryan Stone Subject: Re: ixgbe & if_igb RX ring locking Message-ID: <20121016124733.GC89655@glebius.int.ru> References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org> <507D4F11.2030704@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: "Alexander V. 
Chernikov" , Jack Vogel , net@FreeBSD.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2012 12:47:36 -0000

On Tue, Oct 16, 2012 at 08:40:47AM -0400, Ryan Stone wrote:
R> > Are you using stock ixgbe driver?
R>
R> Pay no attention to the line numbers behind the curtain. :)
R>
R> I don't believe that I've changed the locking order at all in the
R> driver, but you are right, that wasn't taken from the stock driver.

Can you please give a hint as to how SIOCADDMULTI can lead to taking the RX lock in the stock driver? Sorry if I'm missing something obvious.

--
Totus tuus, Glebius.
From owner-freebsd-net@FreeBSD.ORG Tue Oct 16 12:47:57 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53]) by hub.freebsd.org (Postfix) with ESMTP id 1AC86F4B; Tue, 16 Oct 2012 12:47:57 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from dhcp170-36-red.yandex.net (freefall.freebsd.org [8.8.178.135]) by mx2.freebsd.org (Postfix) with ESMTP id 80DEC3B4C86; Tue, 16 Oct 2012 12:47:55 +0000 (UTC) Message-ID: <507D5739.70509@FreeBSD.org> Date: Tue, 16 Oct 2012 16:46:49 +0400 From: "Alexander V.
Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:13.0) Gecko/20120627 Thunderbird/13.0.1 MIME-Version: 1.0 To: John Baldwin Subject: Re: ixgbe & if_igb RX ring locking References: <5079A9A1.4070403@FreeBSD.org> <201210150904.27567.jhb@freebsd.org> <20121015163210.GW89655@FreeBSD.org> <201210151414.27318.jhb@freebsd.org> In-Reply-To: <201210151414.27318.jhb@freebsd.org> Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org, Luigi Rizzo , Jack Vogel , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2012 12:47:57 -0000 On 15.10.2012 22:14, John Baldwin wrote: > On Monday, October 15, 2012 12:32:10 pm Gleb Smirnoff wrote: >> On Mon, Oct 15, 2012 at 09:04:27AM -0400, John Baldwin wrote: >> J> > 3) in practice taskqueue routine is a nightmare for many people since >> J> > there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after >> J> > some traffic burst happens: once it is called it starts to schedule >> J> > itself more and more replacing original ISR routine. Additionally, >> J> > increasing rx_process_limit does not help since taskqueue is called with >> J> > the same limit. Finally, currently netisr taskq threads are not bound to >> J> > any CPU which makes the process even more uncontrollable. >> J> >> J> I think part of the problem here is that the taskqueue in ixgbe(4) is >> J> bogusly rescheduled for TX handling. Instead, ixgbe_msix_que() should >> J> just start transmitting packets directly. >> J> >> J> I fixed this in igb(4) here: >> J> >> J> http://svnweb.freebsd.org/base?view=revision&revision=233708 >> >> The problem Alexander describes in 3) definitely wasn't fixed in r233708. 
>>
>> It is still present in head/, and it prevents me to do good benchmarking
>> of pf(4) on igb(4).
>>
>> The problem is related to RX handling, so I don't see how r233708 could
>> fix it.
>
> Before 233708, if you had a single TX packet waiting to go out and an RX
> interrupt arrived, the task queue would be constantly reschedule causing
> it to effectively spin at 100% until the TX packet was completely transmitted
> and the hardware had updated the descriptor to mark it as complete. In fact,
> as long as you have any pending TX packets at all it will keep spinning until
> it gets into a state where you have no pending TX packets (so a steady stream
> of TX packets, including, say ACKs would cause the taskqueue to run forever).
>
> In general I think that with MSI-X you should just use an RX processing limit
> of -1. Anything else is just adding overhead in the form of extra context

Yes, this is the obvious next step after binding threads to CPUs.

> switches. Neither the task or the MSI-X interrupt handler are on a thread
> that is shared with any other tasks or handlers, so all that scheduling (or
> rescheduling) the task will do is result in the task being immediately run
> (after either a context switch or returning back to the main loop of the
> taskqueue thread).
>
> If you look at the drivers, if a burst of RX traffic ends, the taskqueue

It is questionable whether this behavior is good during a burst:
1) Due to RX locking, the taskqueue eats a significant share (if not all) of the RX packets from a given queue.
2) The taskqueue can run on any CPU, so this introduces possible out-of-order packets within a connection, which is bad for forwarding (and there were some problems in our TCP stack in the past).

Additionally, this behavior is totally uncontrollable and unscalable (we run _one_ task _instead_ of the RX handler) and leads to significant performance flapping on heavily loaded forwarding setups.

> should stop running and stop polling the hardware. It is only the TX side
> that gets stuck needlessly polling. The watchdog timer rescheduling the

Unfortunately, as long as at least one call from the driver to this function remains, a potential traffic burst can be consumed by the taskqueue (especially if a large rx_processing_limit is set).

If there are reasons not to change the taskqueue RX processing behavior, maybe adding an additional sysctl like:

ix.0.loop_forever = 1

can be a compromise? E.g. the main processing loop does not decrease the 'count' variable if loop_forever is set, and the taskqueue invocation limit remains controlled by the current rx_processing_limit. Nothing is changed by default, but people wishing to get predictable results simply set loop_forever to 1 and rx_processing_limit to 1 (or 0).

> handler once a second when there is no watchdog condition doesn't help
> matters either, but I think that is unique to ixgbe(4).
>
> It would be good if you could determine exactly why igb thinks it needs to
> reschedule the taskqueue in your test case on igb(4) post 233708.
>

--
WBR, Alexander
From owner-freebsd-net@FreeBSD.ORG Tue Oct 16 12:47:57 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53]) by hub.freebsd.org (Postfix) with ESMTP id 1AC86F4B; Tue, 16 Oct 2012 12:47:57 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from dhcp170-36-red.yandex.net (freefall.freebsd.org [8.8.178.135]) by mx2.freebsd.org (Postfix) with ESMTP id 80DEC3B4C86; Tue, 16 Oct 2012 12:47:55 +0000 (UTC) Message-ID: <507D5739.70509@FreeBSD.org> Date: Tue, 16 Oct 2012 16:46:49 +0400 From: "Alexander V.
Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:13.0) Gecko/20120627 Thunderbird/13.0.1 MIME-Version: 1.0 To: John Baldwin Subject: Re: ixgbe & if_igb RX ring locking References: <5079A9A1.4070403@FreeBSD.org> <201210150904.27567.jhb@freebsd.org> <20121015163210.GW89655@FreeBSD.org> <201210151414.27318.jhb@freebsd.org> In-Reply-To: <201210151414.27318.jhb@freebsd.org> Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org, Luigi Rizzo , Jack Vogel , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2012 12:47:57 -0000 On 15.10.2012 22:14, John Baldwin wrote: > On Monday, October 15, 2012 12:32:10 pm Gleb Smirnoff wrote: >> On Mon, Oct 15, 2012 at 09:04:27AM -0400, John Baldwin wrote: >> J> > 3) in practice taskqueue routine is a nightmare for many people since >> J> > there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after >> J> > some traffic burst happens: once it is called it starts to schedule >> J> > itself more and more replacing original ISR routine. Additionally, >> J> > increasing rx_process_limit does not help since taskqueue is called with >> J> > the same limit. Finally, currently netisr taskq threads are not bound to >> J> > any CPU which makes the process even more uncontrollable. >> J> >> J> I think part of the problem here is that the taskqueue in ixgbe(4) is >> J> bogusly rescheduled for TX handling. Instead, ixgbe_msix_que() should >> J> just start transmitting packets directly. >> J> >> J> I fixed this in igb(4) here: >> J> >> J> http://svnweb.freebsd.org/base?view=revision&revision=233708 >> >> The problem Alexander describes in 3) definitely wasn't fixed in r233708. 
>> >> It is still present in head/, and it prevents me to do good benchmarking >> of pf(4) on igb(4). >> >> The problem is related to RX handling, so I don't see how r233708 could >> fix it. > > Before 233708, if you had a single TX packet waiting to go out and an RX > interrupt arrived, the task queue would be constantly reschedule causing > it to effectively spin at 100% until the TX packet was completely transmitted > and the hardware had updated the descriptor to mark it as complete. In fact, > as long as you have any pending TX packets at all it will keep spinning until > it gets into a state where you have no pending TX packets (so a steady stream > of TX packets, including, say ACKs would cause the taskqueue to run forever). > > In general I think that with MSI-X you should just use an RX processing limit > of -1. Anything else is just adding overhead in the form of extra context Yes, this is the obvious next step after binding threads to CPUs. > switches. Neither the task or the MSI-X interrupt handler are on a thread > that is shared with any other tasks or handlers, so all that scheduling (or > rescheduling) the task will do is result in the task being immediately run > (after either a context switch or returning back to the main loop of the > taskqueue thread). > > If you look at the drivers, if a burst of RX traffic ends, the taskqueue It is questionable if this behavior is good during burst: 1) Due to RX locking taskq eats signifficant (if not all) RX packets from given queue 2) Tasq can run on any cpu so this introduces possible out-of-order packets within connection which is bad for forwarding (and there were some problems in our TCP stack in the past). Additionally, this behavior is totally uncontrollable and unscalable (we run _one_ task _instead_ of RX handler) and leads to significant performance flapping on heavy-loaded forwarding setups. > should stop running and stop polling the hardware. 
> It is only the TX side
> that gets stuck needlessly polling. The watchdog timer rescheduling the

Unfortunately, as long as at least one call from the driver to this function
remains, a traffic burst can still be consumed by the taskq (especially if a
large rx_processing_limit is set).

If there are reasons not to change the taskq RX processing behavior, maybe
adding an additional sysctl like:

ix.0.loop_forever = 1

can be a compromise? That is, the main processing loop does not decrease the
'count' variable if loop_forever is set, and the taskq invocation limit remains
controlled by the current rx_processing_limit. Nothing changes by default,
but people wishing to get predictable results simply set loop_forever to 1 and
rx_processing_limit to 1 (or 0).

> handler once a second when there is no watchdog condition doesn't help
> matters either, but I think that is unique to ixgbe(4).
>
> It would be good if you could determine exactly why igb thinks it needs to
> reschedule the taskqueue in your test case on igb(4) post 233708.
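
For illustration only (this is not code from ixgbe(4) or from anyone in the
thread): a minimal C sketch of how the proposed knob could interact with the
RX budget, assuming a heavily simplified RX loop. The names rx_ring, rxeof()
and the loop_forever parameter are hypothetical, and the ring is modelled as a
plain packet counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one RX ring: only a count of pending packets. */
struct rx_ring {
    int pending;    /* descriptors the "hardware" has filled */
    int handled;    /* packets passed up the stack so far */
};

/*
 * Process packets until the ring is empty or the budget runs out.
 * limit < 0 means "no limit" (the -1 setting discussed in the thread).
 * loop_forever models the proposed knob: when set, the budget is ignored,
 * so the caller drains the ring itself.
 * Returns true if packets remain, i.e. the caller would have to
 * reschedule the taskqueue to finish the work.
 */
static bool
rxeof(struct rx_ring *r, int limit, bool loop_forever)
{
    int count = limit;

    while (r->pending > 0) {
        /* Budget check; skipped for a negative limit or loop_forever. */
        if (!loop_forever && limit >= 0) {
            if (count == 0)
                break;
            count--;
        }
        r->pending--;
        r->handled++;
    }
    return (r->pending > 0);
}
```

With 10 pending packets, a limit of 4 leaves 6 packets behind and asks for a
reschedule, while a limit of -1, or any limit combined with loop_forever,
drains the ring in one call and never hands work to the taskqueue.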
> -- WBR, Alexander From owner-freebsd-net@FreeBSD.ORG Tue Oct 16 12:49:19 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7455311C; Tue, 16 Oct 2012 12:49:19 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 42D758FC14; Tue, 16 Oct 2012 12:49:19 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 8BF06B980; Tue, 16 Oct 2012 08:49:18 -0400 (EDT) From: John Baldwin To: Adrian Chadd Subject: Re: ixgbe & if_igb RX ring locking Date: Tue, 16 Oct 2012 08:38:17 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p20; KDE/4.5.5; amd64; ; ) References: <5079A9A1.4070403@FreeBSD.org> <201210151414.27318.jhb@freebsd.org> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201210160838.17741.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Tue, 16 Oct 2012 08:49:18 -0400 (EDT) Cc: "Alexander V. Chernikov" , freebsd-net@freebsd.org, Jack Vogel , net@freebsd.org, Luigi Rizzo X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2012 12:49:19 -0000 On Monday, October 15, 2012 6:36:57 pm Adrian Chadd wrote: > The reason why I've started moving net80211 and ath _away_ from using > direct dispatch (for now) and to using a taskqueue for TX (and RX) is > because it's too freaking annoying right now to deal with all the > crazy long-held locks to guarantee consistency between multiple > transmitting threads. 
>
> Considering that the driver and net80211 stack:
>
> * sometimes is PCI, sometimes is USB (with all the differing thread
> models that exist there);
> * sometimes bridge traffic, sometimes route traffic, sometimes source
> or terminate TCP/UDP connections;
> * sometimes has one sender, sometimes has multiple senders, with some
> other modules in between (bridge, pf, ipfw, etc) with locks being held
> here and there;
> * since the stack(s) like doing direct dispatch, RX very often causes
> TX to occur, which for some drivers will block on a long-held driver
> lock (with all the LORs that occur) - and drivers that do this (eg
> iwn) will simply drop the lock before passing the packet up. Dropping
> the lock before passing net80211_input*() .. is just plain silly.
>
> Now, I'd _like_ to eventually make net80211/ath support direct
> dispatch, but that also requires making sure only -one- transmitter is
> working at once. I'd like to not have the extra context switch
> overhead, but I haven't seen a better way of doing it yet.
>
> It's fun to see the gige/10ge driver have lots of long held locks with
> lots of concurrent sender processes possibly blocking until TX
> completes.. so I wonder if that has scaling issues for lots of
> connections/sending processes.

I don't follow how this is related to this thread at all (which has more to
do with ixgbe scheduling duplicate work). However, is your issue that the
stack locks (e.g. socket and protocol layer locks) are held across
if_start/if_transmit?

-- John Baldwin From owner-freebsd-net@FreeBSD.ORG Tue Oct 16 13:20:05 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id D591ADD5; Tue, 16 Oct 2012 13:20:05 +0000 (UTC) (envelope-from rysto32@gmail.com) Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id 531228FC0C; Tue, 16 Oct 2012 13:20:05 +0000 (UTC) Received: by mail-vb0-f54.google.com with SMTP id v11so8234287vbm.13 for ; Tue, 16 Oct 2012 06:20:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=1bT0WmRPa07jbTKhIxzJLWQzJTt2QP7ibYzX+c+Afc8=; b=FclWLbXi1zlsY+fNTQ8D8SNHM7dlHiQk1TxSVacObLCpzo1YPBJ03dLpuLxbgC8TQB qk/1IYFWnHU+CN2NkoCmYghG/GRXrm9oLVxwzX7MGsG2CdLB5pz+bT/XW13liMYyIvT5 /P7qCwICj46d6o369cmR14Vu894XFDmHdspmGWAHwn5T48ZeDcLCoh4YPI6uPDQFr/mI a4rgS2KLck44C8dYlEOx/yizJnGKuX0kK33kK5Srzk3Z5rx87h041vCZo66oLLZUhAnH lnVC8wxoOsKAmlIIMLU4X8/CXaRYkkH+jns4sNpYGTDzUn/CbSx7IQekxM5eXugvjzJR Vl2w== MIME-Version: 1.0 Received: by 10.52.68.7 with SMTP id r7mr7101958vdt.96.1350393604422; Tue, 16 Oct 2012 06:20:04 -0700 (PDT) Received: by 10.58.207.114 with HTTP; Tue, 16 Oct 2012 06:20:04 -0700 (PDT) In-Reply-To: <20121016124733.GC89655@glebius.int.ru> References: <5079A9A1.4070403@FreeBSD.org> <20121015162926.GV89655@FreeBSD.org> <507D4F11.2030704@FreeBSD.org> <20121016124733.GC89655@glebius.int.ru> Date: Tue, 16 Oct 2012 09:20:04 -0400 Message-ID: Subject: Re: ixgbe & if_igb RX ring locking From: Ryan Stone To: Gleb Smirnoff Content-Type: text/plain; charset=ISO-8859-1 Cc: "Alexander V. 
Chernikov" , Jack Vogel , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2012 13:20:05 -0000 On Tue, Oct 16, 2012 at 8:47 AM, Gleb Smirnoff wrote: > Can you please provide hints how can SIOCADDMULTI lead to obtaining RX > lock in the stock driver? It doesn't. But it does acquire the core lock, and the core lock is acquired before the RX lock (in ixgbe_init, for instance). From owner-freebsd-net@FreeBSD.ORG Tue Oct 16 15:27:34 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B1896E1D for ; Tue, 16 Oct 2012 15:27:34 +0000 (UTC) (envelope-from s.khanchi@gmail.com) Received: from mail-ie0-f182.google.com (mail-ie0-f182.google.com [209.85.223.182]) by mx1.freebsd.org (Postfix) with ESMTP id 6F82E8FC17 for ; Tue, 16 Oct 2012 15:27:34 +0000 (UTC) Received: by mail-ie0-f182.google.com with SMTP id k10so13112009iea.13 for ; Tue, 16 Oct 2012 08:27:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:from:date:x-google-sender-auth:message-id :subject:to:content-type; bh=eAbprt1hU2bXTmYJ3M2Sf1efHtGNNBI92a5uMrb483A=; b=FFdmbyOCko1Cy2yH3WDdJ+J2cIjjV232UzVU56H/jfV4ZAkGQ9ApQeoelbT6vCeBtj HDoBo2gid0+6d54RiFIqKpdvHiDn7RtW5lEhJEG9FOakDp9CQIseOT8bCEekL9lSnZER cn5j/lzmMc9XDG8EW2AXWBfv6qsd9aZ/Zbf3k8hz0M1HMiaToom5r9zdbDg9Lo+z/Dw5 64n/aJl9OVHbI8YWqjwflHV8pwoUMJYSmoq3dJ6lDMtNWVJIAVUkBEwF1t0xF+wk9Ln5 7HD3vEDBAO0IdL1jzCaVTSnUeWmV+KWwKSHAu7IKQ4oG1rbvhrkDwxr+xJDYyOeqp2ip DfEw== Received: by 10.50.171.5 with SMTP id aq5mr680433igc.36.1350401250787; Tue, 16 Oct 2012 08:27:30 -0700 (PDT) MIME-Version: 1.0 Sender: s.khanchi@gmail.com Received: by 10.64.51.234 with HTTP; Tue, 16 Oct 2012 08:27:10 -0700 (PDT) From: h bagade 
Date: Tue, 16 Oct 2012 18:57:10 +0330 X-Google-Sender-Auth: Q6-xiu0KfuqcM3BJWPqnE6-Vhos Message-ID: Subject: TCP_DROP_SYNFIN kernel option side effects?! To: freebsd-net@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2012 15:27:34 -0000

Hi all,

I need to add this option to the kernel in order to defeat Nmap OS
fingerprinting. My system runs as a web server and is also the gateway on
the network. I want to know whether setting this option has any side effects
on other parts of the system. Is there any situation in which both the SYN
and FIN bits are set in a TCP packet? Is that a normal situation?

Any help or comments would be really appreciated.

From owner-freebsd-net@FreeBSD.ORG Tue Oct 16 16:17:31 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C99F3D02; Tue, 16 Oct 2012 16:17:31 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 701C28FC17; Tue, 16 Oct 2012 16:17:31 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id CF5A2B911; Tue, 16 Oct 2012 12:17:30 -0400 (EDT) From: John Baldwin To: "Alexander V.
Chernikov" Subject: Re: ixgbe & if_igb RX ring locking Date: Tue, 16 Oct 2012 12:09:55 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p20; KDE/4.5.5; amd64; ; ) References: <5079A9A1.4070403@FreeBSD.org> <201210151414.27318.jhb@freebsd.org> <507D5739.70509@FreeBSD.org> In-Reply-To: <507D5739.70509@FreeBSD.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="koi8-r" Content-Transfer-Encoding: 7bit Message-Id: <201210161209.55979.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Tue, 16 Oct 2012 12:17:30 -0400 (EDT) Cc: freebsd-net@freebsd.org, Luigi Rizzo , Jack Vogel , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2012 16:17:31 -0000 On Tuesday, October 16, 2012 8:46:49 am Alexander V. Chernikov wrote: > On 15.10.2012 22:14, John Baldwin wrote: > > On Monday, October 15, 2012 12:32:10 pm Gleb Smirnoff wrote: > >> On Mon, Oct 15, 2012 at 09:04:27AM -0400, John Baldwin wrote: > >> J> > 3) in practice taskqueue routine is a nightmare for many people since > >> J> > there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after > >> J> > some traffic burst happens: once it is called it starts to schedule > >> J> > itself more and more replacing original ISR routine. Additionally, > >> J> > increasing rx_process_limit does not help since taskqueue is called with > >> J> > the same limit. Finally, currently netisr taskq threads are not bound to > >> J> > any CPU which makes the process even more uncontrollable. > >> J> > >> J> I think part of the problem here is that the taskqueue in ixgbe(4) is > >> J> bogusly rescheduled for TX handling. Instead, ixgbe_msix_que() should > >> J> just start transmitting packets directly. 
> >> J>
> >> J> I fixed this in igb(4) here:
> >> J>
> >> J> http://svnweb.freebsd.org/base?view=revision&revision=233708
> >>
> >> The problem Alexander describes in 3) definitely wasn't fixed in r233708.
> >>
> >> It is still present in head/, and it prevents me from doing good benchmarking
> >> of pf(4) on igb(4).
> >>
> >> The problem is related to RX handling, so I don't see how r233708 could
> >> fix it.
> >
> > Before 233708, if you had a single TX packet waiting to go out and an RX
> > interrupt arrived, the task queue would be constantly rescheduled, causing
> > it to effectively spin at 100% until the TX packet was completely transmitted
> > and the hardware had updated the descriptor to mark it as complete. In fact,
> > as long as you have any pending TX packets at all it will keep spinning until
> > it gets into a state where you have no pending TX packets (so a steady stream
> > of TX packets, including, say, ACKs would cause the taskqueue to run forever).
> >
> > In general I think that with MSI-X you should just use an RX processing limit
> > of -1. Anything else is just adding overhead in the form of extra context
> Yes, this is the obvious next step after binding threads to CPUs.
> > switches. Neither the task nor the MSI-X interrupt handler is on a thread
> > that is shared with any other tasks or handlers, so all that scheduling (or
> > rescheduling) the task will do is result in the task being immediately run
> > (after either a context switch or returning back to the main loop of the
> > taskqueue thread).
> >
>
> > If you look at the drivers, if a burst of RX traffic ends, the taskqueue
> It is questionable whether this behavior is good during a burst:
>
> 1) Due to RX locking, the taskq eats a significant share (if not all) of the
>    RX packets from the given queue.
> 2) The taskq can run on any CPU, so this introduces possible out-of-order
>    packets within a connection, which is bad for forwarding (and there were
>    some problems in our TCP stack in the past).
> Additionally, this behavior is totally uncontrollable and unscalable (we run
> _one_ task _instead_ of the RX handler) and leads to significant performance
> flapping on heavily loaded forwarding setups.

The taskqueue and interrupt handler should never run concurrently. If they
are doing so now, that is a _bug_, and my patch fixes some of those already,
just as r233708 fixed similar bugs in igb. Normally the interrupt handler
should disable the specific MSI-X interrupt when it schedules the task, and
the interrupt is not re-enabled until the task decides it doesn't need to
reschedule itself. If this is done correctly, then you shouldn't see RX lock
contention unless someone is doing 'ifconfig' or something else that triggers
an ioctl. Anything else is just papering over these bugs (which are quite bad
since they result in out-of-order handling besides the lock contention).

In fact, my original motivation for using a separate TX-only task for the
if_transmit case for igb was specifically to avoid out-of-order processing on
RX, not to prevent lock contention.

Can you describe the specific situation in which you now see both the task
and the interrupt handler running concurrently? Do you have KTR traces from
KTR_SCHED perhaps?
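
For illustration only (not code from igb(4) or ixgbe(4)): a minimal C sketch
of the handoff described above, in which the interrupt handler masks its own
vector when it schedules the task and the task unmasks it only once it no
longer needs to reschedule itself. All names (struct queue, rx_drain(),
que_intr(), que_task()) are hypothetical, the "hardware" is a plain counter,
and the real handlers do more work than this, such as processing packets in
the handler before deferring to the task.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-queue state; none of these names come from the drivers. */
struct queue {
    bool irq_enabled;       /* is this queue's MSI-X vector unmasked? */
    bool task_scheduled;    /* is the deferred task waiting to run? */
    int pending;            /* packets waiting in the RX ring */
    int handled;            /* packets processed so far */
};

/* Process at most 'limit' packets; return true if work remains. */
static bool
rx_drain(struct queue *q, int limit)
{
    while (q->pending > 0 && limit-- > 0) {
        q->pending--;
        q->handled++;
    }
    return (q->pending > 0);
}

/* Interrupt handler: mask this vector and hand the work to the task. */
static void
que_intr(struct queue *q)
{
    assert(q->irq_enabled);     /* a masked vector cannot fire */
    q->irq_enabled = false;
    q->task_scheduled = true;
}

/* Task: keep rescheduling while work remains, unmask only when done. */
static void
que_task(struct queue *q, int limit)
{
    assert(!q->irq_enabled);    /* handler and task never overlap */
    q->task_scheduled = rx_drain(q, limit);
    if (!q->task_scheduled)
        q->irq_enabled = true;
}
```

In this model the vector is never unmasked while the task is still scheduled,
so the handler and the task cannot both be working on the same ring; that is
the property the reply above says a correct driver should have.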
-- John Baldwin From owner-freebsd-net@FreeBSD.ORG Tue Oct 16 20:36:02 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 30CA5AB7 for ; Tue, 16 Oct 2012 20:36:02 +0000 (UTC) (envelope-from mariano.cediel@gmail.com) Received: from mail-ye0-f182.google.com (mail-ye0-f182.google.com [209.85.213.182]) by mx1.freebsd.org (Postfix) with ESMTP id E1DA28FC0A for ; Tue, 16 Oct 2012 20:36:01 +0000 (UTC) Received: by mail-ye0-f182.google.com with SMTP id l8so163477yen.13 for ; Tue, 16 Oct 2012 13:35:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=liH4heV9VsDVN0fvTbaoZibpsxBSHJeM7DnI7WLwJFY=; b=tr+ptc1ZMzaV09qJuQkyga1oFpcmC4Vj1X+AJAWErw3uidIrfrF/BEvy9g+Qfw+rEH JSd22gfQ9TLFVL4Yuro0ezSNxCBzOvexodMHtx2W9z7Sn3ibyqZGp88gM+ZUQJkLAjW4 N8LtBZrk+9zXHC4P6H2mKC6ydOmwcgrc5X8JTP0UAyBJreihFlnOGxeL0oAviTqfRsMf mF96zWsRoE5RDOwlZbqimm1FJjA2r9SuuzeBfxLI+xmqv9Uef7GQZSn49xLYe86Ggcc/ 2Fp7AHcoFBw9qYm5fKLnGQ2Oe539xFS2Mf6+OehYslUzyCtqWMDkRX834SX9Xduwmyyt a8bQ== MIME-Version: 1.0 Received: by 10.52.75.72 with SMTP id a8mr7518537vdw.66.1350419755026; Tue, 16 Oct 2012 13:35:55 -0700 (PDT) Received: by 10.58.102.197 with HTTP; Tue, 16 Oct 2012 13:35:55 -0700 (PDT) Date: Tue, 16 Oct 2012 22:35:55 +0200 Message-ID: Subject: one physical interface -> n virtual interfaces From: Mariano Cediel To: freebsd-net@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2012 20:36:02 -0000 How do I create, from a physical interface, n virtual interfaces, but all effects are real, their MAC different, on which we can do individually NAT, etc, etc.? 
I need one external interface has 2 public IPs, and I'll do every NAT over every (with ipfw and divert) individually (each of them has its own gateway) A little help to start researching ..... Greetings. (sorry for my poor english) -- [o - - - - - - (\ | u d t ( \_('> c c s (__(=_) s o ? -"= From owner-freebsd-net@FreeBSD.ORG Tue Oct 16 21:54:54 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 0571BE92; Tue, 16 Oct 2012 21:54:54 +0000 (UTC) (envelope-from eric@vangyzen.net) Received: from aussmtpmrkpc120.us.dell.com (aussmtpmrkpc120.us.dell.com [143.166.82.159]) by mx1.freebsd.org (Postfix) with ESMTP id BEBE58FC16; Tue, 16 Oct 2012 21:54:53 +0000 (UTC) X-Loopcount0: from 64.238.244.148 X-IronPort-AV: E=Sophos;i="4.80,595,1344229200"; d="scan'208";a="7198298" Message-ID: <507DD768.7000803@vangyzen.net> Date: Tue, 16 Oct 2012 16:53:44 -0500 From: Eric van Gyzen User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:14.0) Gecko/20120822 Thunderbird/14.0 MIME-Version: 1.0 To: net@FreeBSD.org, "Bjoern A. Zeeb" Subject: Tahi "Redirected On-link" Test Case Content-Type: text/plain; charset="ISO-8859-1"; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2012 21:54:54 -0000 I am currently working on a fix for kern/152791 (Tahi IPv6 Ready Logo test case #169: Redirected On-link). I have a change to add the host route, and it works for test case 169. However, the route never gets removed, so all subsequent test cases fail (because they first verify that the Node Under Test thinks the destination is off-link). How/When should I clean up the route? 
Each test case runs a common cleanup procedure, which sends a RA with a Router Lifetime of zero and a Prefix Information option with a Valid Lifetime and Preferred Lifetime of zero. This deprecates the NUT's only global address, by which it reaches the newly-on-link destination. However, it doesn't seem rational to use this event to trigger a cleanup of the route. The only other trigger I can imagine is the transition of the Destination Cache entry to the Stale state. That also doesn't make complete sense. (It probably also wouldn't work, since in my testing, test case 170 begins immediately after test case 169 ends.) I'm assuming a certain amount of familiarity (on your part) with these tests. If you'd like, I can explain them in more detail. Thanks in advance for any advice, Eric From owner-freebsd-net@FreeBSD.ORG Tue Oct 16 22:03:46 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5CB381B7 for ; Tue, 16 Oct 2012 22:03:46 +0000 (UTC) (envelope-from pprocacci@datapipe.com) Received: from EXFESMQ04.datapipe-corp.net (exfesmq04.datapipe.com [64.27.120.68]) by mx1.freebsd.org (Postfix) with ESMTP id 101BC8FC1A for ; Tue, 16 Oct 2012 22:03:45 +0000 (UTC) Received: from nat.myhome (192.168.128.103) by EXFESMQ04.datapipe-corp.net (192.168.128.29) with Microsoft SMTP Server (TLS) id 14.2.318.1; Tue, 16 Oct 2012 18:02:35 -0400 Date: Tue, 16 Oct 2012 17:02:59 -0500 From: "Paul A. 
Procacci" To: Mariano Cediel Subject: Re: one physical interface -> n virtual interfaces Message-ID: <20121016220258.GI7125@nat.myhome> References: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-Originating-IP: [192.168.128.103] Content-Transfer-Encoding: quoted-printable Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Oct 2012 22:03:46 -0000 On Tue, Oct 16, 2012 at 10:35:55PM +0200, Mariano Cediel wrote: > How do I create, from a physical interface, n virtual interfaces, but > all effects are real, their MAC different, on which we can do > individually NAT, etc, etc.? > > I need one external interface has 2 public IPs, and I'll do every NAT > over every (with ipfw and divert) > individually (each of them has its own gateway) > > A little help to start researching ..... > Greetings. http://freebsd.1045724.n5.nabble.com/Virtual-Network-Interface-Card-td4005109.html The above was posted in late 2010. It has one example of creating virtual interfaces using the netgraph module. 3rd post from the top. I'm not entirely sure if this is the current _correct_ way, but I imagine it is still accurate and can be used to get you started. ~Paul ________________________________ This message may contain confidential or privileged information. If you are not the intended recipient, please advise us immediately and delete this message. See http://www.datapipe.com/legal/email_disclaimer/ for further information on confidentiality and the risks of non-secure electronic communication. If you cannot access these links, please notify us by reply message and we will send the contents to you.
From owner-freebsd-net@FreeBSD.ORG Wed Oct 17 01:26:41 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id F410B436; Wed, 17 Oct 2012 01:26:40 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-pa0-f54.google.com (mail-pa0-f54.google.com [209.85.220.54]) by mx1.freebsd.org (Postfix) with ESMTP id A73438FC0C; Wed, 17 Oct 2012 01:26:40 +0000 (UTC) Received: by mail-pa0-f54.google.com with SMTP id bi1so6983971pad.13 for ; Tue, 16 Oct 2012 18:26:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=bmRWDQgAacYcgW3VizvX9o+evO04s1b8d0cJgL5So04=; b=MHVlLiuSQdqTM3ZE8/tTznTHRf0zABwVBNh9LhNoOozPYrugsgabD9PpyZ4SfsO4ZZ ebU5mpNi/VeeHJIAvDHeln4VGQCsEtNYSMr1Eh5DWF0GD4LSJtpBAuPaljYmFIjKZLdc Lj+Nzb4fmdKVvfg5Nz8XJn/lo5jDCfKGsQY61DyvqlN7Xf764jvfD9uz1+3qSJdNvHlg HBnts6BlDoxG/p7qecU2S0r5XRjlCWFnc0i0qvBwMUXASpiN+MvDQDypaplcDHuPKMW/ PgiWMqvs5TWt0HQSTjAjDZ2MSSde6dSdieLTHSzNMG92+wzEHK0YqC6C0TAOC7vApwUi rfrA== MIME-Version: 1.0 Received: by 10.66.86.129 with SMTP id p1mr6638350paz.39.1350437200422; Tue, 16 Oct 2012 18:26:40 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.68.146.233 with HTTP; Tue, 16 Oct 2012 18:26:40 -0700 (PDT) In-Reply-To: <201210160838.17741.jhb@freebsd.org> References: <5079A9A1.4070403@FreeBSD.org> <201210151414.27318.jhb@freebsd.org> <201210160838.17741.jhb@freebsd.org> Date: Tue, 16 Oct 2012 18:26:40 -0700 X-Google-Sender-Auth: GoMxUKHM6DlvJHVOZ6ge9-Pu1BU Message-ID: Subject: Re: ixgbe & if_igb RX ring locking From: Adrian Chadd To: John Baldwin Content-Type: text/plain; charset=ISO-8859-1 Cc: "Alexander V. 
Chernikov" , freebsd-net@freebsd.org, Jack Vogel , net@freebsd.org, Luigi Rizzo X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2012 01:26:41 -0000 On 16 October 2012 05:38, John Baldwin wrote: > I don't follow how this is related to this thread at all (which has more to do > with ixgbe scheduling duplicate work). However, is your issue that the stack > locks (e.g. socket and protocol layer locks) are held across > if_start/if_transmit? It's a comment on the larger scale architectural problem. Since if_transmit and if_start are called from multiple thread contexts, the current ways drivers implement this are:
* support direct dispatch to hardware, but wrap the whole sending process in one enormous lock, to prevent packet reordering issues; or
* drop TX and TX completion into a TX taskqueue (or multiple, one per hardware send queue) and push frames into that taskqueue via some queue and then wake said taskqueue up; or
* some bastardised version of both.
For the intel drivers, the locks are held for a (potentially) very long time. Both igb and ixgbe hold the locks for the entirety of the TX process. It's not protecting something like a queue operation, it's effectively serialising the entirety of the TX and TX completion process. That works ok-ish for ethernet drivers which are "send and ignore", but for wireless drivers where the stack implements a lot more state, it really does quite suck. And since wireless drivers have a top level idea of sequence and encryption (ie, it's not per-TCP stream, it's across multiple sending streams to a given node), I can't model the locking and serialisation on what the TCP/UDP code does.
I wish we had a better way of implementing "serialisation without long, long held locks" but short of stuffing everything into a taskqueue and only locking the send queue involved, I can't really think of anything. Adrian From owner-freebsd-net@FreeBSD.ORG Wed Oct 17 03:18:38 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 06D37768 for ; Wed, 17 Oct 2012 03:18:38 +0000 (UTC) (envelope-from rfg@tristatelogic.com) Received: from outgoing.tristatelogic.com (segfault.tristatelogic.com [69.62.255.118]) by mx1.freebsd.org (Postfix) with ESMTP id B3BFE8FC14 for ; Wed, 17 Oct 2012 03:18:37 +0000 (UTC) Received: from segfault-nmh-helo.tristatelogic.com (localhost [127.0.0.1]) by segfault.tristatelogic.com (Postfix) with ESMTP id 7FA275081A for ; Tue, 16 Oct 2012 20:18:29 -0700 (PDT) To: freebsd-net@freebsd.org Subject: Wireless Networking Bug(s) in 9.1-RC2 (?) Date: Tue, 16 Oct 2012 20:18:29 -0700 Message-ID: <15066.1350443909@tristatelogic.com> From: "Ronald F. Guilmette" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2012 03:18:38 -0000 Greerings, I am currently running 9.1-RC2 on my laptop, and I'm wondering what the proper procedure is for reporting bugs in not-yet-released releases. Could somebody please tell me? Should I just file a regular PR? (I've never done this before for anything that's not an official -RELEASE, and I don't want to be busting anybody's chops over something that isn't considered ready-for-prine-time anyway.) So anyway, I'll give the issue to you in a nutshell... This laptop has both wired ethernet and wireless (11{b,g,n}) capabilities. I have a Linksys E1000 which I had this thing successfully talking to/with (using 11n) under 9.0-RELEASE.
(The Linksys is set to speak `N-Only'.) Now however, it does appear to me that in 9.1-RC2 there may perhaps be a problem which is causing the iwn0 interface to want to speak to the Linksys using 11b, of all things. (I would have thought that if it was giving up on `N' it would have fallen back to `G' next.) I include below relevant portions of my /etc/rc.conf file and the output I am now getting from ifconfig -a. Guidance would be appreciated. Should I be filing a PR? Is my rc.conf goofed? Regards, rfg P.S. Actually, I've never tried running _both_ the wired & wireless stuff on this laptop in parallel before now. Is that part of the problem? And anyway, how exactly does the system establish a default route to 192.168.1.1 when there are two (or more) ways to get there from here? rc.conf: ============================================================================= hostname="slim.tristatelogic.com" ifconfig_re0="inet 192.168.1.23 netmask 255.255.255.0" defaultrouter="192.168.1.1" # wlans_iwn0="wlan0" ifconfig_wlan0="WPA inet 192.168.1.21 netmask 255.255.255.0 ssid ronair2-1" ============================================================================= ifconfig -a: ============================================================================= re0: flags=8843 metric 0 mtu 1500 options=8209b ether 00:24:21:65:ad:a0 inet 192.168.1.23 netmask 0xffffff00 broadcast 192.168.1.255 inet6 fe80::224:21ff:fe65:ada0%re0 prefixlen 64 scopeid 0x4 nd6 options=29 media: Ethernet autoselect (100baseTX ) status: active iwn0: flags=8803 metric 0 mtu 2290 ether 00:22:fb:76:6d:18 nd6 options=29 media: IEEE 802.11 Wireless Ethernet autoselect mode 11b status: associated lo0: flags=8049 metric 0 mtu 16384 options=600003 inet6 ::1 prefixlen 128 inet6 fe80::1%lo0 prefixlen 64 scopeid 0xa inet 127.0.0.1 netmask 0xff000000 nd6 options=21 wlan0: flags=8803 metric 0 mtu 1500 ether 00:22:fb:76:6d:18 inet 192.168.1.21 netmask 0xffffff00 broadcast 192.168.1.255 inet6 fe80::222:fbff:fe76:6d18%wlan0
prefixlen 64 tentative scopeid 0xb nd6 options=29 media: IEEE 802.11 Wireless Ethernet autoselect (autoselect) status: no carrier ssid ronair2-1 channel 1 (2412 MHz 11b) country US authmode WPA1+WPA2/802.11i privacy OFF txpower 15 bmiss 10 scanvalid 450 bgscan bgscanintvl 300 bgscanidle 250 roam:rssi 7 roam:rate 1 wme roaming MANUAL bintval 0 ============================================================================= From owner-freebsd-net@FreeBSD.ORG Wed Oct 17 04:21:27 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1763E2EC for ; Wed, 17 Oct 2012 04:21:27 +0000 (UTC) (envelope-from kob6558@gmail.com) Received: from mail-we0-f182.google.com (mail-we0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id 9EF018FC08 for ; Wed, 17 Oct 2012 04:21:26 +0000 (UTC) Received: by mail-we0-f182.google.com with SMTP id x43so5203396wey.13 for ; Tue, 16 Oct 2012 21:21:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=EVk0+JpVZR97dDKwYc1f/u2nwwRLlFzR5orOm0fhA6U=; b=TNr064IOj8i1KDLR2GBc2omcLI0CVueiztYi+RXJv9L6JX+dYEiIKf7msoM6ppDuv3 kDNE6gNnnphyWDldNrgc5wqVE1hWHTm9acqbgTm0LRl72rNoIF0In1fB7NFjhoWVLEeV QI24y6zJdsrmvU/VEU224CsuZNBpoc/Rfm6ztiJMNEg5RdnKAH5acG3UqN5gPO62RA7A wqhbcmBfezYoz5kfL5fagMfE4KUNS8ML8b40P3d7og7Vmh2B8yTcxZpgnbK3QGxib9jL HbogWnHCrU1oMJaO6v7YhY4CUjpgg8pW5zqY4V14Y1lPiPKeyCNORk25dxLmafyWf46b R6Lw== MIME-Version: 1.0 Received: by 10.216.197.104 with SMTP id s82mr10013089wen.62.1350447685564; Tue, 16 Oct 2012 21:21:25 -0700 (PDT) Received: by 10.223.66.194 with HTTP; Tue, 16 Oct 2012 21:21:25 -0700 (PDT) In-Reply-To: <15066.1350443909@tristatelogic.com> References: <15066.1350443909@tristatelogic.com> Date: Tue, 16 Oct 2012 21:21:25 -0700 Message-ID: Subject: Re: Wireless Networking Bug(s) in 9.1-RC2 
(?) From: Kevin Oberman To: "Ronald F. Guilmette" Content-Type: text/plain; charset=UTF-8 Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2012 04:21:27 -0000 On Tue, Oct 16, 2012 at 8:18 PM, Ronald F. Guilmette wrote: > > > Greerings, > > I am currently running 9.1-RC2 on my laptop, and I'm wondering what the > proper procedure is for reporting bugs in not-yet-released releases. > Could somebody please tell me? Should I just file a regular PR? (I've > never done this before for anything that's not an official -RELEASE, > and I don't want to be busting anybody's chops over something that isn't > considered ready-for-prine-time anyway.) I think stable@ is probably the best choice. wireless@ would also be an appropriate place. > So anyway, I'll give the issue to you in a nutshell... This laptop has > both wired ethernet and wireless (11{b,g,n}) capabilities. I have a > Linksys E1000 which I had this thing successfully talking to/with > (using 11n) under 9.0-RELEASE. (The Linksys is set to speak `N-Only'.) > > Now however, it does appear to me that in 9.1-RC2 there may perhaps be > a problem which is causing the iwn0 interface to want to speak to the > Linksys using 11b, of all things. (I would have though that if it was > giving up on `N' it would have fallen back to `G' next.) > > I include below relevant portions of my /etc/rc.conf file and the output > I am now getting from ifconfig -a. > > Guidance would be appreciated. Should I be filing a PR? Is my rc.conf > goofed? > > > Regards, > rfg > > > P.S. Actually, I've never tried running _both_ the wired & wireless stuff > on this laptop in parallel before now. Is that part of the problem? 
And > anyway, how exactly does the system establish a default route to 192.168.1.1 > when there are two (or more) ways to get there from here? > > > rc.conf: > ============================================================================= > hostname="slim.tristatelogic.com" > ifconfig_re0="inet 192.168.1.23 netmask 255.255.255.0" > defaultrouter="192.168.1.1" > # > wlans_iwn0="wlan0" > ifconfig_wlan0="WPA inet 192.168.1.21 netmask 255.255.255.0 ssid ronair2-1" > ============================================================================= > > ifconfig -a: > ============================================================================= > re0: flags=8843 metric 0 mtu 1500 > options=8209b > ether 00:24:21:65:ad:a0 > inet 192.168.1.23 netmask 0xffffff00 broadcast 192.168.1.255 > inet6 fe80::224:21ff:fe65:ada0%re0 prefixlen 64 scopeid 0x4 > nd6 options=29 > media: Ethernet autoselect (100baseTX ) > status: active > iwn0: flags=8803 metric 0 mtu 2290 > ether 00:22:fb:76:6d:18 > nd6 options=29 > media: IEEE 802.11 Wireless Ethernet autoselect mode 11b > status: associated > lo0: flags=8049 metric 0 mtu 16384 > options=600003 > inet6 ::1 prefixlen 128 > inet6 fe80::1%lo0 prefixlen 64 scopeid 0xa > inet 127.0.0.1 netmask 0xff000000 > nd6 options=21 > wlan0: flags=8803 metric 0 mtu 1500 > ether 00:22:fb:76:6d:18 > inet 192.168.1.21 netmask 0xffffff00 broadcast 192.168.1.255 > inet6 fe80::222:fbff:fe76:6d18%wlan0 prefixlen 64 tentative scopeid 0xb > nd6 options=29 > media: IEEE 802.11 Wireless Ethernet autoselect (autoselect) > status: no carrier > ssid ronair2-1 channel 1 (2412 MHz 11b) > country US authmode WPA1+WPA2/802.11i privacy OFF txpower 15 bmiss 10 > scanvalid 450 bgscan bgscanintvl 300 bgscanidle 250 roam:rssi 7 > roam:rate 1 wme roaming MANUAL bintval 0 > ============================================================================= I don't see any real issue with your configuration, but I do see something odd and it may be tied to the problem you are seeing. 
FWIW, I also have an agn iwn card, but I only have a G access point at this time and it runs fine in G. The oddity is that you specify your ssid in the rc.conf file while using WPA. I've never seen that before. It's in my wpa_supplicant.conf file. It seems more reasonable for a laptop that may need to associate with a home and a work SSID as well as ones at conferences and, in my case, alternate work and home SSIDs. When it is in the rc.conf file, it requires a change with every relocation. In any case, you might try moving the SSID into the wpa_supplicant.conf file, but my bet is it is N specific. Paging Adrian. -- R. Kevin Oberman, Network Engineer E-mail: kob6558@gmail.com From owner-freebsd-net@FreeBSD.ORG Wed Oct 17 07:41:15 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 0C3A58F for ; Wed, 17 Oct 2012 07:41:15 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-pb0-f54.google.com (mail-pb0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id D0A2F8FC14 for ; Wed, 17 Oct 2012 07:41:14 +0000 (UTC) Received: by mail-pb0-f54.google.com with SMTP id rp8so7380627pbb.13 for ; Wed, 17 Oct 2012 00:41:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=NDN9LWemtqHNDLHQUZL1CygDjX8I0jBqLtFZX+POpvo=; b=u/Yo3xhdYTI+pqV7aIdgsQY282lvPopLUyhiJzFF8ZhTnNjOCeEtYc7vVnAomgTcwR ObFoOBCLdDEF2KFARTHil0yLVSLsq6jcXRe2cUAc0CJsXR4VUfTyS7+WwJLryg2Um6xo lNBE5Cnc3l/+HD0un1kU6vwX/PMPpHFJe9TD6kxc/9keQQYKIc8OUGn0YqnMKQSaG7Hb eXRDQyIawIJuik8YqEEv9bi8KWdq4LM9oi+zz7ozmJqeVMzylnylBWyuTVt4UKWwpW1W 7N6DgkohxWsFRiT1lLbVLnNETU5iGB8mnz21jCtbktpFgXNCezkr7rnKCqtTgZeURDay 2Y7w== MIME-Version: 1.0 Received: by 10.68.218.226 with SMTP id pj2mr54538138pbc.33.1350459674258; Wed, 17 Oct 2012 00:41:14
-0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.68.146.233 with HTTP; Wed, 17 Oct 2012 00:41:14 -0700 (PDT) In-Reply-To: References: <15066.1350443909@tristatelogic.com> Date: Wed, 17 Oct 2012 00:41:14 -0700 X-Google-Sender-Auth: sN8-E_rIn3uQ_tOtbmi5XHWS9zk Message-ID: Subject: Re: Wireless Networking Bug(s) in 9.1-RC2 (?) From: Adrian Chadd To: Kevin Oberman Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-net@freebsd.org, "Ronald F. Guilmette" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2012 07:41:15 -0000 For wifi - you need to configure /etc/wpa_supplicant.conf as well, right? You don't need the ssid in the ifconfig line; wpa_supplicant will scan and find your AP. The driver should fall back to non-n and non-g if need be. As for the config - erm, you have two interfaces on the same L2. That's going to confuse things, right? What does 'netstat -rn' show?
Adrian From owner-freebsd-net@FreeBSD.ORG Wed Oct 17 07:42:07 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 4DA7B13F for ; Wed, 17 Oct 2012 07:42:07 +0000 (UTC) (envelope-from remi.pauchet@netasq.com) Received: from work.netasq.com (gwlille.netasq.com [91.212.116.1]) by mx1.freebsd.org (Postfix) with ESMTP id C01798FC16 for ; Wed, 17 Oct 2012 07:42:06 +0000 (UTC) Received: from [10.2.9.2] (unknown [91.212.116.2]) by work.netasq.com (Postfix) with ESMTPSA id 8917027053AC; Wed, 17 Oct 2012 09:42:04 +0200 (CEST) Subject: Re: ixgbe and ixgbevf drivers are not working in virtualization environment Mime-Version: 1.0 (Apple Message framework v1283) Content-Type: multipart/signed; boundary="Apple-Mail=_1D8C5446-7BBA-41AB-B251-E7E24D126B93"; protocol="application/pkcs7-signature"; micalg=sha1 From: Rémi Pauchet In-Reply-To: Date: Wed, 17 Oct 2012 09:42:03 +0200 Message-Id: References: <792D5931-19E7-4239-A3E8-5D2BC90F03FD@netasq.com> To: Jack Vogel X-Mailer: Apple Mail (2.1283) X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2012 07:42:07 -0000 --Apple-Mail=_1D8C5446-7BBA-41AB-B251-E7E24D126B93 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=iso-8859-1 Hi, My interface is configured, UP and running, and I still can't get a link. Can you help me with this issue? Regards, Rémi
On 12 Oct 2012, at 09:38, Rémi Pauchet wrote: > Hi, > > Unfortunately not: > > ix0: flags=8843 metric 0 mtu 1500 > options=401bb > ether 00:e0:ed:1c:99:4e > inet 172.16.255.254 netmask 0xffff0000 broadcast 172.16.255.255 > inet6 fe80::2e0:edff:fe1c:994e%ix0 prefixlen 64 scopeid 0x2 > nd6 options=29 > media: Ethernet autoselect > status: no carrier > ix1: flags=8843 metric 0 mtu 1500 > options=401bb > ether 00:e0:ed:1c:99:4f > inet 172.17.255.254 netmask 0xffff0000 broadcast 172.17.255.255 > inet6 fe80::2e0:edff:fe1c:994f%ix1 prefixlen 64 scopeid 0x3 > nd6 options=29 > media: Ethernet autoselect > status: no carrier > > Regards, > Rémi > > On 11 Oct 2012, at 18:25, Jack Vogel wrote: > >> The ixgbe device will not get link until you have run init, so assign it an address or just do an ifconfig up. >> >> I have never used the driver using a passthru type setup but I believe its been done successfully if >> memory serves. >> >> Jack >> >> >> On Thu, Oct 11, 2012 at 8:39 AM, Rémi Pauchet wrote: >> Hi, >> >> I'm trying to use the ixgbe (10Gb) driver in a FreeBSD virtual machine on an esxi 5 using DirectPath (PCI Passthrough) and the card is detected, but I can't get a link (status: no carrier) >> >> ix0: mem 0xd2420000-0xd243ffff,0xd2400000-0xd2403fff irq 18 at device 0.0 on pci3 >> >> ix0: flags=8802 metric 0 mtu 1500 >> options=401bb >> ether 00:e0:ed:1c:99:4e >> nd6 options=29 >> media: Ethernet autoselect >> status: no carrier >> >> I have also tested with XenServer 6, using SR-IOV (ixgbevf driver) with the same result: the driver is loading, but no link detected. >> >> In both case (VMWare DirectPath and XenServer SR-IOV), I tested Linux with success.
>> >> >> The card is an Intel 82599EB, the motherboard is an Intel X58 (supermicro X8ST3) with a Xeon W3680 and I've tested FreeBSD 8.3 and 9.0 >> >> I've found a forum thread with the same issue: http://forums.freebsd.org/showthread.php?t=29855 and no answer :) >> >> >> Please find in attachment the dmesg (boot -v) with the ix driver compiled with DEBUG flags using vmware. >> >> >> Can anyone provide feedback about this issue ? >> >> Regards, >> Rémi Pauchet >> >> >> > --Apple-Mail=_1D8C5446-7BBA-41AB-B251-E7E24D126B93-- From owner-freebsd-net@FreeBSD.ORG Wed Oct 17 07:59:18 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8A5A378B for ; Wed, 17 Oct 2012 07:59:18 +0000 (UTC) (envelope-from rfg@tristatelogic.com) Received: from outgoing.tristatelogic.com (segfault.tristatelogic.com [69.62.255.118]) by mx1.freebsd.org (Postfix) with ESMTP id 5D5768FC14 for ; Wed, 17 Oct 2012 07:59:18 +0000 (UTC) Received: from segfault-nmh-helo.tristatelogic.com (localhost [127.0.0.1]) by segfault.tristatelogic.com (Postfix) with ESMTP id B17725081A; Wed, 17 Oct 2012 00:59:15 -0700 (PDT) To: Kevin Oberman Subject: Re: Wireless Networking Bug(s) in 9.1-RC2 (?) In-Reply-To: Date: Wed, 17 Oct 2012 00:59:15 -0700 Message-ID: <16376.1350460755@tristatelogic.com> From: "Ronald F. Guilmette" Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2012 07:59:18 -0000 In message , you wrote: >I wrote: >> P.S. Actually, I've never tried running _both_ the wired & wireless stuff >> on this laptop in parallel before now. Is that part of the problem?
And >> anyway, how exactly does the system establish a default route to 192.168.1.1 >> when there are two (or more) ways to get there from here? >>... >I don't see any real issue with your configuration, but I do see >something odd and it may be tied to the problem you are seeing. FWIW, >I also have an agn iwn card, but I only have a G access point at this >time and it runs fine in G. Yes, as I mentioned, when I was running 9.0-RELEASE, my iwn0 was talking just fine to my Linksys. (That was mostly `N', but I think that I may have had the two playing nice together with `G' also.) >The oddity is that you specify your ssid in the rc.conf file while >using WPA. I've never seen that before. Well, see, the instructions on this page: http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/network-wireless.html are not really all that clear. Some of the examples have the ssid clause in the ifconfig_XXX= lines in the rc.conf file, while others don't. One example that I would have liked very much to have seen in there would have been an example showing what to put in rc.conf in the case where one wants to do WPA, but with static IPs, rather than DHCP. The closest thing to that is under Section 32.3.3.1.2.4, and in the example there, as you can see, there is an ssid clause in the ifconfig_wlan0= line. (I assumed that was necessary in case there were multiple ssid/password pairs within the wpa_supplicant.conf file, and obviously, in such a case, set up of the interface has to pick one of them from among the available alternatives.) What is correct? Beats the hell out of me! I am not in any sense an expert of this stuff. All I can say is that the examples on this page are confusing. >It's in my wpa_supplicant.conf file. Yes, I have the ssid name in there too. >It seems more reasonable for a laptop that may need to associate >with a home and a work SSID as well as ones at conferences and... Well, no. 
Actually, at the moment, I *only* have an interest in connecting to my own local Linksys... nothing else. (That's part of why I'm using a static IP... this is effectively just a static connection... minus the wires and the drilling of holes through the walls.) >in any case, you might try moving the SID into the wpa_supplicant.conf file. That kinda reminds me of that old Ragu spaghetti sauce TV commercial... "It's in there!" :-) >but my bet is it is N specific. I doubt it. I think I had the same questionable setup when I was running `G' on 9.0-RELEASE. But I would like to find out what the Right Answer is also. >Paging Adrian. Yes, please. Regards, rfg P.S. What about my routing question? If I have one machine and it has two independent connections to 192.168.1.1 and the rc.conf file says: defaultrouter="192.168.1.1" then how does FreeBSD decide (or figure out) which of the two interfaces packets going to some random IPv4 address elsewhere will flow out of? For me at least, this is really puzzling. From owner-freebsd-net@FreeBSD.ORG Wed Oct 17 08:19:04 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3DACCE17; Wed, 17 Oct 2012 08:19:04 +0000 (UTC) (envelope-from rfg@tristatelogic.com) Received: from outgoing.tristatelogic.com (segfault.tristatelogic.com [69.62.255.118]) by mx1.freebsd.org (Postfix) with ESMTP id 0D5958FC08; Wed, 17 Oct 2012 08:19:03 +0000 (UTC) Received: from segfault-nmh-helo.tristatelogic.com (localhost [127.0.0.1]) by segfault.tristatelogic.com (Postfix) with ESMTP id 4DA5F5081B; Wed, 17 Oct 2012 01:19:03 -0700 (PDT) To: Adrian Chadd Subject: Re: Wireless Networking Bug(s) in 9.1-RC2 (?) In-Reply-To: Date: Wed, 17 Oct 2012 01:19:03 -0700 Message-ID: <16534.1350461943@tristatelogic.com> From: "Ronald F.
Guilmette" Cc: freebsd-net@freebsd.org, Kevin Oberman X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2012 08:19:04 -0000

In message , you wrote:

>for wifi - you need to configure /etc/wpa_supplicant.conf as well,
>right?

Did that. Yes.

>You don't need the ssid in the ifconfig line;

OK. If you say so. (See my prior e-mail where I wondered aloud if there are circumstances where the ssid might have to appear in both places.)

wpa_supplicant 9
>will scan and find your AP.
>
>The driver should fall back to non-n and non-g if needs be.
>
>As for the config - erm, you have two interfaces on the same L2.
>That's going to confuse things, right?

Well, I can't speak for the hardware, but it sure as hell does confuse *me*. (1/2 :-)

>What's 'netstat -rn' show?

Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            192.168.1.1        UGS         0   104122    re0
127.0.0.1          link#10            UH          0        0    lo0
192.168.1.0/24     link#4             U           0    23515    re0
192.168.1.21       link#11            UHS         0        0    lo0
192.168.1.23       link#4             UHS         0        0    lo0

Internet6:
Destination                       Gateway                        Flags      Netif Expire
::/96                             ::1                            UGRS        lo0
::1                               link#10                        UH          lo0
::ffff:0.0.0.0/96                 ::1                            UGRS        lo0
fe80::/10                         ::1                            UGRS        lo0
fe80::%re0/64                     link#4                         U           re0
fe80::224:21ff:fe65:ada0%re0      link#4                         UHS         lo0
fe80::%lo0/64                     link#10                        U           lo0
fe80::1%lo0                       link#10                        UHS         lo0
fe80::%wlan0/64                   link#11                        U         wlan0
fe80::222:fbff:fe76:6d18%wlan0    link#11                        UHS         lo0
ff01::%re0/32                     fe80::224:21ff:fe65:ada0%re0   U           re0
ff01::%lo0/32                     ::1                            U           lo0
ff01::%wlan0/32                   fe80::222:fbff:fe76:6d18%wlan0 U         wlan0
ff02::/16                         ::1                            UGRS        lo0
ff02::%re0/32                     fe80::224:21ff:fe65:ada0%re0   U           re0
ff02::%lo0/32                     ::1                            U           lo0
ff02::%wlan0/32                   fe80::222:fbff:fe76:6d18%wlan0 U         wlan0

P.S. I ain't using IPv6... like not at all.
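
For what it's worth, the routing question above ("which interface do packets leave by?") can be read straight off the IPv4 table: the most specific matching entry wins, and in this particular output every non-loopback IPv4 entry, including the default route, points at re0. The following is only a minimal userland sketch of that longest-prefix-match rule, not the kernel's actual lookup code (which uses a radix tree). The table is hard-coded from the netstat output above, and the names `struct route`, `parse_ipv4` and `lookup_netif` are hypothetical, made up for this illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One IPv4 routing entry; hypothetical structure for illustration only. */
struct route {
    const char *dest;      /* network or host address */
    int         prefixlen; /* 0 = default route, 32 = host route */
    const char *netif;     /* outgoing interface from the Netif column */
};

/* Hard-coded copy of the "Internet:" section of the netstat -rn output. */
static const struct route table[] = {
    { "0.0.0.0",       0, "re0" }, /* default via 192.168.1.1 */
    { "127.0.0.1",    32, "lo0" },
    { "192.168.1.0",  24, "re0" },
    { "192.168.1.21", 32, "lo0" },
    { "192.168.1.23", 32, "lo0" },
};

/* Parse dotted-quad text into a host-order integer; returns 0 on success. */
static int parse_ipv4(const char *s, uint32_t *out)
{
    struct in_addr a;

    if (inet_pton(AF_INET, s, &a) != 1)
        return -1;
    *out = ntohl(a.s_addr);
    return 0;
}

/*
 * Return the interface of the longest matching prefix for dst,
 * or NULL if dst is not a valid IPv4 address.
 */
const char *lookup_netif(const char *dst)
{
    const char *best = NULL;
    int best_len = -1;
    uint32_t addr, net, mask;
    size_t i;

    if (parse_ipv4(dst, &addr) != 0)
        return NULL;

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (parse_ipv4(table[i].dest, &net) != 0)
            continue;
        /* Avoid an undefined 32-bit shift for the /0 default route. */
        mask = (table[i].prefixlen == 0) ?
            0 : 0xffffffffu << (32 - table[i].prefixlen);
        if ((addr & mask) == (net & mask) &&
            table[i].prefixlen > best_len) {
            best_len = table[i].prefixlen;
            best = table[i].netif;
        }
    }
    return best;
}

#ifdef DEMO_MAIN
/* Build with -DDEMO_MAIN to run the small demonstration below. */
int main(void)
{
    const char *targets[] = { "8.8.8.8", "192.168.1.1", "192.168.1.21" };
    size_t i;

    for (i = 0; i < sizeof(targets) / sizeof(targets[0]); i++)
        printf("%-14s -> %s\n", targets[i], lookup_netif(targets[i]));

    assert(strcmp(lookup_netif("8.8.8.8"), "re0") == 0);
    return 0;
}
#endif
```

With this table, an arbitrary outside address such as 8.8.8.8 only matches the default entry and so resolves to re0, while the router itself (192.168.1.1) matches the 192.168.1.0/24 entry, which is also on re0. Nothing in the IPv4 section names wlan0 as an outgoing interface.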
From owner-freebsd-net@FreeBSD.ORG Wed Oct 17 13:58:44 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B6FC01B8; Wed, 17 Oct 2012 13:58:44 +0000 (UTC) (envelope-from guy.helmer@gmail.com) Received: from mail-ie0-f182.google.com (mail-ie0-f182.google.com [209.85.223.182]) by mx1.freebsd.org (Postfix) with ESMTP id 585628FC1C; Wed, 17 Oct 2012 13:58:44 +0000 (UTC) Received: by mail-ie0-f182.google.com with SMTP id k10so15445297iea.13 for ; Wed, 17 Oct 2012 06:58:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=content-type:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer; bh=uBkU6J6yz2QtLFjXXnukFNjQ++pfiG7l0wHcQUIVeEU=; b=EqBfrGxUafgz1idFOfr+iqdSMibRQdUJBVRzPbz6bk5oTqbJXkfkIwSlTXadzFEH7J sRW4tWqoLJFOjJr2Pj447JWP0E6bAhd0rVEI+EQKrdvSLpW90vKVWimNvPyK3CC6XAu6 TUmaNWN1lGyhz0IaQku2ALThX1wnbjQP9dRudkcwbcvdfv+vauG102Ob7f+5IkaLtVUk 5FafdIvi3y/szeR8VrCv7vBzdrOgrFHqlX4c+C8HisiYPQt4wGDCSGFmZ4DNYusNl79Y 0Q5eZC2OGZfkUoO5xTGA0QkZX2cQskEIzs6bjujnKLUZDiedSp2Bq4xP5aobf68KPwSt GRQg== Received: by 10.50.190.232 with SMTP id gt8mr1578197igc.69.1350482321182; Wed, 17 Oct 2012 06:58:41 -0700 (PDT) Received: from guysmbp.dyn.palisadesys.com ([216.81.189.9]) by mx.google.com with ESMTPS id az4sm3015212igb.2.2012.10.17.06.58.39 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 17 Oct 2012 06:58:40 -0700 (PDT) Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: 8.3: kernel panic in bpf.c catchpacket() From: Guy Helmer In-Reply-To: <1EDA1615-2CDE-405A-A725-AF7CC7D3E273@gmail.com> Date: Wed, 17 Oct 2012 08:58:42 -0500 Content-Transfer-Encoding: quoted-printable Message-Id: <381E3EEC-7EDB-428B-A724-434443E51A53@gmail.com> References: <4B5399BF-4EE0-4182-8297-3BB97C4AA884@gmail.com> <59F9A36E-3DB2-4F6F-BB2A-A4C9DA76A70C@gmail.com> 
<5075C05E.9070800@FreeBSD.org> <1EDA1615-2CDE-405A-A725-AF7CC7D3E273@gmail.com> To: "Alexander V. Chernikov" X-Mailer: Apple Mail (2.1499) Cc: freebsd-net@freebsd.org, FreeBSD Stable X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2012 13:58:44 -0000

On Oct 12, 2012, at 8:54 AM, Guy Helmer wrote:

>
> On Oct 10, 2012, at 1:37 PM, Alexander V. Chernikov wrote:
>
>> On 10.10.2012 00:36, Guy Helmer wrote:
>>>
>>> On Oct 8, 2012, at 8:09 AM, Guy Helmer wrote:
>>>
>>>> I'm seeing a consistent new kernel panic in FreeBSD 8.3:
>>>> I'm not seeing how bd_sbuf would be NULL here. Any ideas?
>>>
>>> Since I've not had any replies, I hope nobody minds if I reply with more information.
>>>
>>> This panic seems to be occasionally triggered now that my user land code is changing the packet filter a while after the bpf device has been opened and an initial packet filter was set (previously, my code did not change the filter after it was initially set).
>>>
>>> I'm focusing on bpf_setf() since that seems to be the place that could be tickling a problem, and I see that bpf_setf() calls reset_d(d) to clear the hold buffer. I have manually verified that the BPFD lock is held during the call to reset_d(), and the lock is held every other place that the buffers are manipulated, so I haven't been able to find any place that seems vulnerable to losing one of the bpf buffers. Still searching, but any help would be appreciated.
>>
>> Can you please check this code on -current?
>> Locking has changed quite significantly some time ago, so there is good chance that you can get rid of this panic (or discover different one which is really "new") :).
>
> I'm not ready to run this app on current, so I have merged revs 229898, 233937, 233938, 233946, 235744, 235745, 235746, 235747, 236231, 236251, 236261, 236262, 236559, and 236806 to my 8.3 checkout to get code that should be virtually identical to current without the timestamp changes.
>
> Unfortunately, I have only been able to trigger the panic in my test lab once -- so I'm not sure whether a lack of problems with the updated code will be indicative of likely success in the field where this has been triggered regularly at some sites…
>
> Thanks,
> Guy
>

FWIW, I was able to trigger the panic with the original 8.3 code again in my test lab. With these changes resulting from merging the revs mentioned above, I have not seen any panics in my test lab setup in two days of load testing, and AFAIK, packet capturing seems to be working fine.

I've included the diffs for reference for anyone encountering the issue.

Thanks, Alexander!

Guy

Index: net/bpf.c
===================================================================
--- net/bpf.c (revision 239830)
+++ net/bpf.c (working copy)
@@ -43,6 +43,8 @@ #include #include +#include +#include #include #include #include
@@ -66,6 +68,7 @@ #include #include +#define BPF_INTERNAL #include #include #ifdef BPF_JITTER
@@ -139,6 +142,7 @@ static void bpf_attachd(struct bpf_d *, struct bpf_if *); static void bpf_detachd(struct bpf_d *); +static void bpf_detachd_locked(struct bpf_d *); static void bpf_freed(struct bpf_d *); static int bpf_movein(struct uio *, int, struct ifnet *, struct mbuf **, struct sockaddr *, int *, struct bpf_insn *);
@@ -150,7 +154,7 @@ void (*)(struct bpf_d *, caddr_t, u_int, void *, u_int), struct timeval *); static void reset_d(struct bpf_d *); -static int bpf_setf(struct bpf_d *, struct bpf_program *, u_long cmd); +static
int bpf_setf(struct bpf_d *, struct bpf_program *, u_long = cmd); static int bpf_getdltlist(struct bpf_d *, struct bpf_dltlist *); static int bpf_setdlt(struct bpf_d *, u_int); static void filt_bpfdetach(struct knote *); @@ -168,6 +172,12 @@ SYSCTL_NODE(_net_bpf, OID_AUTO, stats, CTLFLAG_MPSAFE | CTLFLAG_RW, bpf_stats_sysctl, "bpf statistics portal"); =20 +static VNET_DEFINE(int, bpf_optimize_writers) =3D 0; +#define V_bpf_optimize_writers VNET(bpf_optimize_writers) +SYSCTL_VNET_INT(_net_bpf, OID_AUTO, optimize_writers, + CTLFLAG_RW, &VNET_NAME(bpf_optimize_writers), 0, + "Do not send packets until BPF program is set"); + static d_open_t bpfopen; static d_read_t bpfread; static d_write_t bpfwrite; @@ -189,7 +199,38 @@ static struct filterops bpfread_filtops =3D { 1, NULL, filt_bpfdetach, filt_bpfread }; =20 +eventhandler_tag bpf_ifdetach_cookie =3D NULL; + /* + * LOCKING MODEL USED BY BPF: + * Locks: + * 1) global lock (BPF_LOCK). Mutex, used to protect interface = addition/removal, + * some global counters and every bpf_if reference. + * 2) Interface lock. Rwlock, used to protect list of BPF descriptors = and their filters. + * 3) Descriptor lock. Mutex, used to protect BPF buffers and various = structure fields + * used by bpf_mtap code. + * + * Lock order: + * + * Global lock, interface lock, descriptor lock + * + * We have to acquire interface lock before descriptor main lock due to = BPF_MTAP[2] + * working model. In many places (like bpf_detachd) we start with BPF = descriptor + * (and we need to at least rlock it to get reliable interface = pointer). This + * gives us potential LOR. As a result, we use global lock to protect = from bpf_if + * change in every such place. + * + * Changing d->bd_bif is protected by 1) global lock, 2) interface lock = and + * 3) descriptor main wlock. + * Reading bd_bif can be protected by any of these locks, typically = global lock. 
+ * + * Changing read/write BPF filter is protected by the same three locks, + * the same applies for reading. + * + * Sleeping in global lock is not allowed due to bpfdetach() using it. + */ + +/* * Wrapper functions for various buffering methods. If the set of = buffer * modes expands, we will probably want to introduce a switch data = structure * similar to protosw, et. @@ -282,7 +323,6 @@ static int bpf_canwritebuf(struct bpf_d *d) { - BPFD_LOCK_ASSERT(d); =20 switch (d->bd_bufmode) { @@ -561,18 +601,93 @@ static void bpf_attachd(struct bpf_d *d, struct bpf_if *bp) { + int op_w; + + BPF_LOCK_ASSERT(); + /* - * Point d at bp, and add d to the interface's list of = listeners. - * Finally, point the driver's bpf cookie at the interface so - * it will divert packets to bpf. + * Save sysctl value to protect from sysctl change + * between reads */ - BPFIF_LOCK(bp); + op_w =3D V_bpf_optimize_writers; + + if (d->bd_bif !=3D NULL) + bpf_detachd_locked(d); + /* + * Point d at bp, and add d to the interface's list. + * Since there are many applicaiotns using BPF for + * sending raw packets only (dhcpd, cdpd are good examples) + * we can delay adding d to the list of active listeners until + * some filter is configured. + */ + + BPFIF_WLOCK(bp); + BPFD_LOCK(d); + d->bd_bif =3D bp; - LIST_INSERT_HEAD(&bp->bif_dlist, d, bd_next); =20 + if (op_w !=3D 0) { + /* Add to writers-only list */ + LIST_INSERT_HEAD(&bp->bif_wlist, d, bd_next); + /* + * We decrement bd_writer on every filter set operation. + * First BIOCSETF is done by pcap_open_live() to set up + * snap length. After that appliation usually sets its = own filter + */ + d->bd_writer =3D 2; + } else + LIST_INSERT_HEAD(&bp->bif_dlist, d, bd_next); + + BPFD_UNLOCK(d); + BPFIF_WUNLOCK(bp); + bpf_bpfd_cnt++; - BPFIF_UNLOCK(bp); =20 + CTR3(KTR_NET, "%s: bpf_attach called by pid %d, adding to %s = list", + __func__, d->bd_pid, d->bd_writer ? 
"writer" : "active"); + + if (op_w =3D=3D 0) + EVENTHANDLER_INVOKE(bpf_track, bp->bif_ifp, bp->bif_dlt, = 1); +} + +/* + * Add d to the list of active bp filters. + * Reuqires bpf_attachd() to be called before + */ +static void +bpf_upgraded(struct bpf_d *d) +{ + struct bpf_if *bp; + + BPF_LOCK_ASSERT(); + + bp =3D d->bd_bif; + + /* + * Filter can be set several times without specifying interface. + * Mark d as reader and exit. + */ + if (bp =3D=3D NULL) { + BPFD_LOCK(d); + d->bd_writer =3D 0; + BPFD_UNLOCK(d); + return; + } + + BPFIF_WLOCK(bp); + BPFD_LOCK(d); + + /* Remove from writers-only list */ + LIST_REMOVE(d, bd_next); + LIST_INSERT_HEAD(&bp->bif_dlist, d, bd_next); + /* Mark d as reader */ + d->bd_writer =3D 0; + + BPFD_UNLOCK(d); + BPFIF_WUNLOCK(bp); + + CTR2(KTR_NET, "%s: upgrade required by pid %d", __func__, = d->bd_pid); + EVENTHANDLER_INVOKE(bpf_track, bp->bif_ifp, bp->bif_dlt, 1); } =20 @@ -582,27 +697,48 @@ static void bpf_detachd(struct bpf_d *d) { + BPF_LOCK(); + bpf_detachd_locked(d); + BPF_UNLOCK(); +} + +static void +bpf_detachd_locked(struct bpf_d *d) +{ int error; struct bpf_if *bp; struct ifnet *ifp; =20 - bp =3D d->bd_bif; - BPFIF_LOCK(bp); + CTR2(KTR_NET, "%s: detach required by pid %d", __func__, = d->bd_pid); + + BPF_LOCK_ASSERT(); + + /* Check if descriptor is attached */ + if ((bp =3D d->bd_bif) =3D=3D NULL) + return; + + BPFIF_WLOCK(bp); BPFD_LOCK(d); - ifp =3D d->bd_bif->bif_ifp; =20 + /* Save bd_writer value */ + error =3D d->bd_writer; + /* * Remove d from the interface's descriptor list. */ LIST_REMOVE(d, bd_next); =20 - bpf_bpfd_cnt--; + ifp =3D bp->bif_ifp; d->bd_bif =3D NULL; BPFD_UNLOCK(d); - BPFIF_UNLOCK(bp); + BPFIF_WUNLOCK(bp); =20 - EVENTHANDLER_INVOKE(bpf_track, ifp, bp->bif_dlt, 0); + bpf_bpfd_cnt--; =20 + /* Call event handler iff d is attached */ + if (error =3D=3D 0) + EVENTHANDLER_INVOKE(bpf_track, ifp, bp->bif_dlt, 0); + /* * Check if this descriptor had requested promiscuous mode. * If so, turn it off. 
@@ -640,10 +776,7 @@ d->bd_state =3D BPF_IDLE; BPFD_UNLOCK(d); funsetown(&d->bd_sigio); - mtx_lock(&bpf_mtx); - if (d->bd_bif) - bpf_detachd(d); - mtx_unlock(&bpf_mtx); + bpf_detachd(d); #ifdef MAC mac_bpfdesc_destroy(d); #endif /* MAC */ @@ -663,7 +796,7 @@ bpfopen(struct cdev *dev, int flags, int fmt, struct thread *td) { struct bpf_d *d; - int error; + int error, size; =20 d =3D malloc(sizeof(*d), M_BPF, M_WAITOK | M_ZERO); error =3D devfs_set_cdevpriv(d, bpf_dtor); @@ -681,15 +814,19 @@ d->bd_bufmode =3D BPF_BUFMODE_BUFFER; d->bd_sig =3D SIGIO; d->bd_direction =3D BPF_D_INOUT; - d->bd_pid =3D td->td_proc->p_pid; + BPF_PID_REFRESH(d, td); #ifdef MAC mac_bpfdesc_init(d); mac_bpfdesc_create(td->td_ucred, d); #endif - mtx_init(&d->bd_mtx, devtoname(dev), "bpf cdev lock", MTX_DEF); - callout_init_mtx(&d->bd_callout, &d->bd_mtx, 0); - knlist_init_mtx(&d->bd_sel.si_note, &d->bd_mtx); + mtx_init(&d->bd_lock, devtoname(dev), "bpf cdev lock", MTX_DEF); + callout_init_mtx(&d->bd_callout, &d->bd_lock, 0); + knlist_init_mtx(&d->bd_sel.si_note, &d->bd_lock); =20 + /* Allocate default buffers */ + size =3D d->bd_bufsize; + bpf_buffer_ioctl_sblen(d, &size); + return (0); } =20 @@ -718,7 +855,7 @@ non_block =3D ((ioflag & O_NONBLOCK) !=3D 0); =20 BPFD_LOCK(d); - d->bd_pid =3D curthread->td_proc->p_pid; + BPF_PID_REFRESH_CUR(d); if (d->bd_bufmode !=3D BPF_BUFMODE_BUFFER) { BPFD_UNLOCK(d); return (EOPNOTSUPP); @@ -764,7 +901,7 @@ BPFD_UNLOCK(d); return (EWOULDBLOCK); } - error =3D msleep(d, &d->bd_mtx, PRINET|PCATCH, + error =3D msleep(d, &d->bd_lock, PRINET|PCATCH, "bpf", d->bd_rtout); if (error =3D=3D EINTR || error =3D=3D ERESTART) { BPFD_UNLOCK(d); @@ -881,8 +1018,9 @@ if (error !=3D 0) return (error); =20 - d->bd_pid =3D curthread->td_proc->p_pid; + BPF_PID_REFRESH_CUR(d); d->bd_wcount++; + /* XXX: locking required */ if (d->bd_bif =3D=3D NULL) { d->bd_wdcount++; return (ENXIO); @@ -903,6 +1041,7 @@ bzero(&dst, sizeof(dst)); m =3D NULL; hlen =3D 0; + /* XXX: bpf_movein() can 
sleep */ error =3D bpf_movein(uio, (int)d->bd_bif->bif_dlt, ifp, &m, &dst, &hlen, d->bd_wfilter); if (error) { @@ -962,7 +1101,7 @@ reset_d(struct bpf_d *d) { =20 - mtx_assert(&d->bd_mtx, MA_OWNED); + BPFD_LOCK_ASSERT(d); =20 if ((d->bd_hbuf !=3D NULL) && (d->bd_bufmode !=3D BPF_BUFMODE_ZBUF || bpf_canfreebuf(d))) = { @@ -1028,7 +1167,7 @@ * Refresh PID associated with this descriptor. */ BPFD_LOCK(d); - d->bd_pid =3D td->td_proc->p_pid; + BPF_PID_REFRESH(d, td); if (d->bd_state =3D=3D BPF_WAITING) callout_stop(&d->bd_callout); d->bd_state =3D BPF_IDLE; @@ -1079,7 +1218,9 @@ case BIOCGDLTLIST32: case BIOCGRTIMEOUT32: case BIOCSRTIMEOUT32: + BPFD_LOCK(d); d->bd_compat32 =3D 1; + BPFD_UNLOCK(d); } #endif =20 @@ -1124,7 +1265,9 @@ * Get buffer len [for read()]. */ case BIOCGBLEN: + BPFD_LOCK(d); *(u_int *)addr =3D d->bd_bufsize; + BPFD_UNLOCK(d); break; =20 /* @@ -1179,10 +1322,12 @@ * Get current data link type. */ case BIOCGDLT: + BPF_LOCK(); if (d->bd_bif =3D=3D NULL) error =3D EINVAL; else *(u_int *)addr =3D d->bd_bif->bif_dlt; + BPF_UNLOCK(); break; =20 /* @@ -1197,6 +1342,7 @@ list32 =3D (struct bpf_dltlist32 *)addr; dltlist.bfl_len =3D list32->bfl_len; dltlist.bfl_list =3D PTRIN(list32->bfl_list); + BPF_LOCK(); if (d->bd_bif =3D=3D NULL) error =3D EINVAL; else { @@ -1204,31 +1350,37 @@ if (error =3D=3D 0) list32->bfl_len =3D = dltlist.bfl_len; } + BPF_UNLOCK(); break; } #endif =20 case BIOCGDLTLIST: + BPF_LOCK(); if (d->bd_bif =3D=3D NULL) error =3D EINVAL; else error =3D bpf_getdltlist(d, (struct bpf_dltlist = *)addr); + BPF_UNLOCK(); break; =20 /* * Set data link type. */ case BIOCSDLT: + BPF_LOCK(); if (d->bd_bif =3D=3D NULL) error =3D EINVAL; else error =3D bpf_setdlt(d, *(u_int *)addr); + BPF_UNLOCK(); break; =20 /* * Get interface name. */ case BIOCGETIF: + BPF_LOCK(); if (d->bd_bif =3D=3D NULL) error =3D EINVAL; else { @@ -1238,13 +1390,16 @@ strlcpy(ifr->ifr_name, ifp->if_xname, sizeof(ifr->ifr_name)); } + BPF_UNLOCK(); break; =20 /* * Set interface. 
*/ case BIOCSETIF: + BPF_LOCK(); error =3D bpf_setif(d, (struct ifreq *)addr); + BPF_UNLOCK(); break; =20 /* @@ -1327,7 +1482,9 @@ * Set immediate mode. */ case BIOCIMMEDIATE: + BPFD_LOCK(d); d->bd_immediate =3D *(u_int *)addr; + BPFD_UNLOCK(d); break; =20 case BIOCVERSION: @@ -1343,21 +1500,27 @@ * Get "header already complete" flag */ case BIOCGHDRCMPLT: + BPFD_LOCK(d); *(u_int *)addr =3D d->bd_hdrcmplt; + BPFD_UNLOCK(d); break; =20 /* * Set "header already complete" flag */ case BIOCSHDRCMPLT: + BPFD_LOCK(d); d->bd_hdrcmplt =3D *(u_int *)addr ? 1 : 0; + BPFD_UNLOCK(d); break; =20 /* * Get packet direction flag */ case BIOCGDIRECTION: + BPFD_LOCK(d); *(u_int *)addr =3D d->bd_direction; + BPFD_UNLOCK(d); break; =20 /* @@ -1372,7 +1535,9 @@ case BPF_D_IN: case BPF_D_INOUT: case BPF_D_OUT: + BPFD_LOCK(d); d->bd_direction =3D direction; + BPFD_UNLOCK(d); break; default: error =3D EINVAL; @@ -1381,26 +1546,38 @@ break; =20 case BIOCFEEDBACK: + BPFD_LOCK(d); d->bd_feedback =3D *(u_int *)addr; + BPFD_UNLOCK(d); break; =20 case BIOCLOCK: + BPFD_LOCK(d); d->bd_locked =3D 1; + BPFD_UNLOCK(d); break; =20 case FIONBIO: /* Non-blocking I/O */ break; =20 case FIOASYNC: /* Send signal on receive packets */ + BPFD_LOCK(d); d->bd_async =3D *(int *)addr; + BPFD_UNLOCK(d); break; =20 case FIOSETOWN: + /* + * XXX: Add some sort of locking here? + * fsetown() can sleep. + */ error =3D fsetown(*(int *)addr, &d->bd_sigio); break; =20 case FIOGETOWN: + BPFD_LOCK(d); *(int *)addr =3D fgetown(&d->bd_sigio); + BPFD_UNLOCK(d); break; =20 /* This is deprecated, FIOSETOWN should be used instead. 
*/ @@ -1421,16 +1598,23 @@ =20 if (sig >=3D NSIG) error =3D EINVAL; - else + else { + BPFD_LOCK(d); d->bd_sig =3D sig; + BPFD_UNLOCK(d); + } break; } case BIOCGRSIG: + BPFD_LOCK(d); *(u_int *)addr =3D d->bd_sig; + BPFD_UNLOCK(d); break; =20 case BIOCGETBUFMODE: + BPFD_LOCK(d); *(u_int *)addr =3D d->bd_bufmode; + BPFD_UNLOCK(d); break; =20 case BIOCSETBUFMODE: @@ -1485,95 +1669,130 @@ /* * Set d's packet filter program to fp. If this file already has a = filter, * free it and replace it. Returns EINVAL for bogus requests. + * + * Note we need global lock here to serialize bpf_setf() and = bpf_setif() calls + * since reading d->bd_bif can't be protected by d or interface lock = due to + * lock order. + * + * Additionally, we have to acquire interface write lock due to = bpf_mtap() uses + * interface read lock to read all filers. + * */ static int bpf_setf(struct bpf_d *d, struct bpf_program *fp, u_long cmd) { +#ifdef COMPAT_FREEBSD32 + struct bpf_program fp_swab; + struct bpf_program32 *fp32; +#endif struct bpf_insn *fcode, *old; - u_int wfilter, flen, size; #ifdef BPF_JITTER - bpf_jit_filter *ofunc; + bpf_jit_filter *jfunc, *ofunc; #endif + size_t size; + u_int flen; + int need_upgrade; + #ifdef COMPAT_FREEBSD32 - struct bpf_program32 *fp32; - struct bpf_program fp_swab; - - if (cmd =3D=3D BIOCSETWF32 || cmd =3D=3D BIOCSETF32 || cmd =3D=3D = BIOCSETFNR32) { + switch (cmd) { + case BIOCSETF32: + case BIOCSETWF32: + case BIOCSETFNR32: fp32 =3D (struct bpf_program32 *)fp; fp_swab.bf_len =3D fp32->bf_len; fp_swab.bf_insns =3D (struct bpf_insn = *)(uintptr_t)fp32->bf_insns; fp =3D &fp_swab; - if (cmd =3D=3D BIOCSETWF32) + switch (cmd) { + case BIOCSETF32: + cmd =3D BIOCSETF; + break; + case BIOCSETWF32: cmd =3D BIOCSETWF; + break; + } + break; } #endif - if (cmd =3D=3D BIOCSETWF) { - old =3D d->bd_wfilter; - wfilter =3D 1; + + fcode =3D NULL; #ifdef BPF_JITTER - ofunc =3D NULL; + jfunc =3D ofunc =3D NULL; #endif - } else { - wfilter =3D 0; - old =3D d->bd_rfilter; 
-#ifdef BPF_JITTER - ofunc =3D d->bd_bfilter; -#endif - } - if (fp->bf_insns =3D=3D NULL) { - if (fp->bf_len !=3D 0) + need_upgrade =3D 0; + + /* + * Check new filter validness before acquiring any locks. + * Allocate memory for new filter, if needed. + */ + flen =3D fp->bf_len; + if (flen > bpf_maxinsns || (fp->bf_insns =3D=3D NULL && flen !=3D = 0)) + return (EINVAL); + size =3D flen * sizeof(*fp->bf_insns); + if (size > 0) { + /* We're setting up new filter. Copy and check actual = data. */ + fcode =3D malloc(size, M_BPF, M_WAITOK); + if (copyin(fp->bf_insns, fcode, size) !=3D 0 || + !bpf_validate(fcode, flen)) { + free(fcode, M_BPF); return (EINVAL); - BPFD_LOCK(d); - if (wfilter) - d->bd_wfilter =3D NULL; - else { - d->bd_rfilter =3D NULL; -#ifdef BPF_JITTER - d->bd_bfilter =3D NULL; -#endif - if (cmd =3D=3D BIOCSETF) - reset_d(d); } - BPFD_UNLOCK(d); - if (old !=3D NULL) - free((caddr_t)old, M_BPF); #ifdef BPF_JITTER - if (ofunc !=3D NULL) - bpf_destroy_jit_filter(ofunc); + /* Filter is copied inside fcode and is perfectly valid. = */ + jfunc =3D bpf_jitter(fcode, flen); #endif - return (0); } - flen =3D fp->bf_len; - if (flen > bpf_maxinsns) - return (EINVAL); =20 - size =3D flen * sizeof(*fp->bf_insns); - fcode =3D (struct bpf_insn *)malloc(size, M_BPF, M_WAITOK); - if (copyin((caddr_t)fp->bf_insns, (caddr_t)fcode, size) =3D=3D 0 = && - bpf_validate(fcode, (int)flen)) { - BPFD_LOCK(d); - if (wfilter) - d->bd_wfilter =3D fcode; - else { - d->bd_rfilter =3D fcode; + BPF_LOCK(); + + /* + * Set up new filter. + * Protect filter change by interface lock. + * Additionally, we are protected by global lock here. 
+ */ + if (d->bd_bif !=3D NULL) + BPFIF_WLOCK(d->bd_bif); + BPFD_LOCK(d); + if (cmd =3D=3D BIOCSETWF) { + old =3D d->bd_wfilter; + d->bd_wfilter =3D fcode; + } else { + old =3D d->bd_rfilter; + d->bd_rfilter =3D fcode; #ifdef BPF_JITTER - d->bd_bfilter =3D bpf_jitter(fcode, flen); + ofunc =3D d->bd_bfilter; + d->bd_bfilter =3D jfunc; #endif - if (cmd =3D=3D BIOCSETF) - reset_d(d); + if (cmd =3D=3D BIOCSETF) + reset_d(d); + + if (fcode !=3D NULL) { + /* + * Do not require upgrade by first BIOCSETF + * (used to set snaplen) by pcap_open_live(). + */ + if (d->bd_writer !=3D 0 && --d->bd_writer =3D=3D = 0) + need_upgrade =3D 1; + CTR4(KTR_NET, "%s: filter function set by pid = %d, " + "bd_writer counter %d, need_upgrade %d", + __func__, d->bd_pid, d->bd_writer, = need_upgrade); } - BPFD_UNLOCK(d); - if (old !=3D NULL) - free((caddr_t)old, M_BPF); + } + BPFD_UNLOCK(d); + if (d->bd_bif !=3D NULL) + BPFIF_WUNLOCK(d->bd_bif); + if (old !=3D NULL) + free(old, M_BPF); #ifdef BPF_JITTER - if (ofunc !=3D NULL) - bpf_destroy_jit_filter(ofunc); + if (ofunc !=3D NULL) + bpf_destroy_jit_filter(ofunc); #endif =20 - return (0); - } - free((caddr_t)fcode, M_BPF); - return (EINVAL); + /* Move d to active readers list. */ + if (need_upgrade) + bpf_upgraded(d); + + BPF_UNLOCK(); + return (0); } =20 /* @@ -1587,28 +1806,30 @@ struct bpf_if *bp; struct ifnet *theywant; =20 + BPF_LOCK_ASSERT(); + theywant =3D ifunit(ifr->ifr_name); if (theywant =3D=3D NULL || theywant->if_bpf =3D=3D NULL) return (ENXIO); =20 bp =3D theywant->if_bpf; =20 + /* Check if interface is not being detached from BPF */ + BPFIF_RLOCK(bp); + if (bp->flags & BPFIF_FLAG_DYING) { + BPFIF_RUNLOCK(bp); + return (ENXIO); + } + BPFIF_RUNLOCK(bp); + /* * Behavior here depends on the buffering model. If we're using * kernel memory buffers, then we can allocate them here. If = we're * using zero-copy, then the user process must have registered * buffers by the time we get here. If not, return an error. 
- * - * XXXRW: There are locking issues here with multi-threaded use: = what - * if two threads try to set the interface at once? */ switch (d->bd_bufmode) { case BPF_BUFMODE_BUFFER: - if (d->bd_sbuf =3D=3D NULL) - bpf_buffer_alloc(d); - KASSERT(d->bd_sbuf !=3D NULL, ("bpf_setif: bd_sbuf = NULL")); - break; - case BPF_BUFMODE_ZBUF: if (d->bd_sbuf =3D=3D NULL) return (EINVAL); @@ -1617,15 +1838,8 @@ default: panic("bpf_setif: bufmode %d", d->bd_bufmode); } - if (bp !=3D d->bd_bif) { - if (d->bd_bif) - /* - * Detach if attached to something else. - */ - bpf_detachd(d); - + if (bp !=3D d->bd_bif) bpf_attachd(d, bp); - } BPFD_LOCK(d); reset_d(d); BPFD_UNLOCK(d); @@ -1653,7 +1867,7 @@ */ revents =3D events & (POLLOUT | POLLWRNORM); BPFD_LOCK(d); - d->bd_pid =3D td->td_proc->p_pid; + BPF_PID_REFRESH(d, td); if (events & (POLLIN | POLLRDNORM)) { if (bpf_ready(d)) revents |=3D events & (POLLIN | POLLRDNORM); @@ -1688,7 +1902,7 @@ * Refresh PID associated with this descriptor. */ BPFD_LOCK(d); - d->bd_pid =3D curthread->td_proc->p_pid; + BPF_PID_REFRESH_CUR(d); kn->kn_fop =3D &bpfread_filtops; kn->kn_hook =3D d; knlist_add(&d->bd_sel.si_note, kn, 1); @@ -1744,9 +1958,19 @@ struct timeval tv; =20 gottime =3D 0; - BPFIF_LOCK(bp); + + BPFIF_RLOCK(bp); + LIST_FOREACH(d, &bp->bif_dlist, bd_next) { - BPFD_LOCK(d); + /* + * We are not using any locks for d here because: + * 1) any filter change is protected by interface + * write lock + * 2) destroying/detaching d is protected by interface + * write lock, too + */ + + /* XXX: Do not protect counter for the sake of = performance. */ ++d->bd_rcount; /* * NB: We dont call BPF_CHECK_DIRECTION() here since = there is no @@ -1762,6 +1986,11 @@ #endif slen =3D bpf_filter(d->bd_rfilter, pkt, pktlen, pktlen); if (slen !=3D 0) { + /* + * Filter matches. Let's to acquire write lock. 
+ */ + BPFD_LOCK(d); + d->bd_fcount++; if (!gottime) { microtime(&tv); @@ -1772,10 +2001,10 @@ #endif catchpacket(d, pkt, pktlen, slen, bpf_append_bytes, &tv); + BPFD_UNLOCK(d); } - BPFD_UNLOCK(d); } - BPFIF_UNLOCK(bp); + BPFIF_RUNLOCK(bp); } =20 #define BPF_CHECK_DIRECTION(d, r, i) = \ @@ -1784,6 +2013,7 @@ =20 /* * Incoming linkage from device drivers, when packet is in an mbuf = chain. + * Locking model is explained in bpf_tap(). */ void bpf_mtap(struct bpf_if *bp, struct mbuf *m) @@ -1806,11 +2036,11 @@ =20 pktlen =3D m_length(m, NULL); =20 - BPFIF_LOCK(bp); + BPFIF_RLOCK(bp); + LIST_FOREACH(d, &bp->bif_dlist, bd_next) { if (BPF_CHECK_DIRECTION(d, m->m_pkthdr.rcvif, = bp->bif_ifp)) continue; - BPFD_LOCK(d); ++d->bd_rcount; #ifdef BPF_JITTER bf =3D bpf_jitter_enable !=3D 0 ? d->bd_bfilter : NULL; @@ -1821,6 +2051,8 @@ #endif slen =3D bpf_filter(d->bd_rfilter, (u_char *)m, pktlen, = 0); if (slen !=3D 0) { + BPFD_LOCK(d); + d->bd_fcount++; if (!gottime) { microtime(&tv); @@ -1831,10 +2063,10 @@ #endif catchpacket(d, (u_char *)m, pktlen, = slen, bpf_append_mbuf, &tv); + BPFD_UNLOCK(d); } - BPFD_UNLOCK(d); } - BPFIF_UNLOCK(bp); + BPFIF_RUNLOCK(bp); } =20 /* @@ -1869,14 +2101,17 @@ mb.m_len =3D dlen; pktlen +=3D dlen; =20 - BPFIF_LOCK(bp); + + BPFIF_RLOCK(bp); + LIST_FOREACH(d, &bp->bif_dlist, bd_next) { if (BPF_CHECK_DIRECTION(d, m->m_pkthdr.rcvif, = bp->bif_ifp)) continue; - BPFD_LOCK(d); ++d->bd_rcount; slen =3D bpf_filter(d->bd_rfilter, (u_char *)&mb, = pktlen, 0); if (slen !=3D 0) { + BPFD_LOCK(d); + d->bd_fcount++; if (!gottime) { microtime(&tv); @@ -1887,10 +2122,10 @@ #endif catchpacket(d, (u_char *)&mb, pktlen, = slen, bpf_append_mbuf, &tv); + BPFD_UNLOCK(d); } - BPFD_UNLOCK(d); } - BPFIF_UNLOCK(bp); + BPFIF_RUNLOCK(bp); } =20 #undef BPF_CHECK_DIRECTION @@ -2040,7 +2275,7 @@ } if (d->bd_wfilter !=3D NULL) free((caddr_t)d->bd_wfilter, M_BPF); - mtx_destroy(&d->bd_mtx); + mtx_destroy(&d->bd_lock); } =20 /* @@ -2070,15 +2305,16 @@ panic("bpfattach"); =20 
LIST_INIT(&bp->bif_dlist); + LIST_INIT(&bp->bif_wlist); bp->bif_ifp =3D ifp; bp->bif_dlt =3D dlt; - mtx_init(&bp->bif_mtx, "bpf interface lock", NULL, MTX_DEF); + rw_init(&bp->bif_lock, "bpf interface lock"); KASSERT(*driverp =3D=3D NULL, ("bpfattach2: driverp already = initialized")); *driverp =3D bp; =20 - mtx_lock(&bpf_mtx); + BPF_LOCK(); LIST_INSERT_HEAD(&bpf_iflist, bp, bif_next); - mtx_unlock(&bpf_mtx); + BPF_UNLOCK(); =20 /* * Compute the length of the bpf header. This is not = necessarily @@ -2093,10 +2329,9 @@ } =20 /* - * Detach bpf from an interface. This involves detaching each = descriptor - * associated with the interface, and leaving bd_bif NULL. Notify each - * descriptor as it's detached so that any sleepers wake up and get - * ENXIO. + * Detach bpf from an interface. This involves detaching each = descriptor + * associated with the interface. Notify each descriptor as it's = detached + * so that any sleepers wake up and get ENXIO. */ void bpfdetach(struct ifnet *ifp) @@ -2109,31 +2344,45 @@ ndetached =3D 0; #endif =20 + BPF_LOCK(); /* Find all bpf_if struct's which reference ifp and detach them. = */ do { - mtx_lock(&bpf_mtx); LIST_FOREACH(bp, &bpf_iflist, bif_next) { if (ifp =3D=3D bp->bif_ifp) break; } if (bp !=3D NULL) LIST_REMOVE(bp, bif_next); - mtx_unlock(&bpf_mtx); =20 if (bp !=3D NULL) { #ifdef INVARIANTS ndetached++; #endif while ((d =3D LIST_FIRST(&bp->bif_dlist)) !=3D = NULL) { - bpf_detachd(d); + bpf_detachd_locked(d); BPFD_LOCK(d); bpf_wakeup(d); BPFD_UNLOCK(d); } - mtx_destroy(&bp->bif_mtx); - free(bp, M_BPF); + /* Free writer-only descriptors */ + while ((d =3D LIST_FIRST(&bp->bif_wlist)) !=3D = NULL) { + bpf_detachd_locked(d); + BPFD_LOCK(d); + bpf_wakeup(d); + BPFD_UNLOCK(d); + } + + /* + * Delay freing bp till interface is detached + * and all routes through this interface are = removed. + * Mark bp as detached to restrict new = consumers. 
+		 */
+		BPFIF_WLOCK(bp);
+		bp->flags |= BPFIF_FLAG_DYING;
+		BPFIF_WUNLOCK(bp);
 		}
 	} while (bp != NULL);
+	BPF_UNLOCK();
 
 #ifdef INVARIANTS
 	if (ndetached == 0)
@@ -2142,6 +2391,37 @@
 }
 
 /*
+ * Interface departure handler.
+ * Note departure event does not guarantee interface is going down.
+ */
+static void
+bpf_ifdetach(void *arg __unused, struct ifnet *ifp)
+{
+	struct bpf_if *bp;
+
+	BPF_LOCK();
+	if ((bp = ifp->if_bpf) == NULL) {
+		BPF_UNLOCK();
+		return;
+	}
+
+	/* Check if bpfdetach() was called previously */
+	if ((bp->flags & BPFIF_FLAG_DYING) == 0) {
+		BPF_UNLOCK();
+		return;
+	}
+
+	CTR3(KTR_NET, "%s: freing BPF instance %p for interface %p",
+	    __func__, bp, ifp);
+
+	ifp->if_bpf = NULL;
+	BPF_UNLOCK();
+
+	rw_destroy(&bp->bif_lock);
+	free(bp, M_BPF);
+}
+
+/*
  * Get a list of available data link type of the interface.
  */
 static int
@@ -2151,24 +2431,22 @@
 	struct ifnet *ifp;
 	struct bpf_if *bp;
 
+	BPF_LOCK_ASSERT();
+
 	ifp = d->bd_bif->bif_ifp;
 	n = 0;
 	error = 0;
-	mtx_lock(&bpf_mtx);
 	LIST_FOREACH(bp, &bpf_iflist, bif_next) {
 		if (bp->bif_ifp != ifp)
 			continue;
 		if (bfl->bfl_list != NULL) {
-			if (n >= bfl->bfl_len) {
-				mtx_unlock(&bpf_mtx);
+			if (n >= bfl->bfl_len)
 				return (ENOMEM);
-			}
 			error = copyout(&bp->bif_dlt,
 			    bfl->bfl_list + n, sizeof(u_int));
 		}
 		n++;
 	}
-	mtx_unlock(&bpf_mtx);
 	bfl->bfl_len = n;
 	return (error);
 }
@@ -2183,18 +2461,19 @@
 	struct ifnet *ifp;
 	struct bpf_if *bp;
 
+	BPF_LOCK_ASSERT();
+
 	if (d->bd_bif->bif_dlt == dlt)
 		return (0);
 	ifp = d->bd_bif->bif_ifp;
-	mtx_lock(&bpf_mtx);
+
 	LIST_FOREACH(bp, &bpf_iflist, bif_next) {
 		if (bp->bif_ifp == ifp && bp->bif_dlt == dlt)
 			break;
 	}
-	mtx_unlock(&bpf_mtx);
+
 	if (bp != NULL) {
 		opromisc = d->bd_promisc;
-		bpf_detachd(d);
 		bpf_attachd(d, bp);
 		BPFD_LOCK(d);
 		reset_d(d);
@@ -2223,6 +2502,11 @@
 	dev = make_dev(&bpf_cdevsw, 0, UID_ROOT, GID_WHEEL, 0600, "bpf");
 	/* For compatibility */
 	make_dev_alias(dev, "bpf0");
+
+	/* Register interface departure handler */
+	bpf_ifdetach_cookie = EVENTHANDLER_REGISTER(
+		    ifnet_departure_event, bpf_ifdetach, NULL,
+		    EVENTHANDLER_PRI_ANY);
 }
 
 /*
@@ -2236,9 +2520,9 @@
 	struct bpf_if *bp;
 	struct bpf_d *bd;
 
-	mtx_lock(&bpf_mtx);
+	BPF_LOCK();
 	LIST_FOREACH(bp, &bpf_iflist, bif_next) {
-		BPFIF_LOCK(bp);
+		BPFIF_RLOCK(bp);
 		LIST_FOREACH(bd, &bp->bif_dlist, bd_next) {
 			BPFD_LOCK(bd);
 			bd->bd_rcount = 0;
@@ -2249,11 +2533,14 @@
 			bd->bd_zcopy = 0;
 			BPFD_UNLOCK(bd);
 		}
-		BPFIF_UNLOCK(bp);
+		BPFIF_RUNLOCK(bp);
 	}
-	mtx_unlock(&bpf_mtx);
+	BPF_UNLOCK();
 }
 
+/*
+ * Fill filter statistics
+ */
 static void
 bpfstats_fill_xbpf(struct xbpf_d *d, struct bpf_d *bd)
 {
@@ -2261,6 +2548,7 @@
 	bzero(d, sizeof(*d));
 	BPFD_LOCK_ASSERT(bd);
 	d->bd_structsize = sizeof(*d);
+	/* XXX: reading should be protected by global lock */
 	d->bd_immediate = bd->bd_immediate;
 	d->bd_promisc = bd->bd_promisc;
 	d->bd_hdrcmplt = bd->bd_hdrcmplt;
@@ -2285,6 +2573,9 @@
 	d->bd_bufmode = bd->bd_bufmode;
 }
 
+/*
+ * Handle `netstat -B' stats request
+ */
 static int
 bpf_stats_sysctl(SYSCTL_HANDLER_ARGS)
 {
@@ -2322,24 +2613,31 @@
 	if (bpf_bpfd_cnt == 0)
 		return (SYSCTL_OUT(req, 0, 0));
 	xbdbuf = malloc(req->oldlen, M_BPF, M_WAITOK);
-	mtx_lock(&bpf_mtx);
+	BPF_LOCK();
 	if (req->oldlen < (bpf_bpfd_cnt * sizeof(*xbd))) {
-		mtx_unlock(&bpf_mtx);
+		BPF_UNLOCK();
 		free(xbdbuf, M_BPF);
 		return (ENOMEM);
 	}
 	index = 0;
 	LIST_FOREACH(bp, &bpf_iflist, bif_next) {
-		BPFIF_LOCK(bp);
+		BPFIF_RLOCK(bp);
+		/* Send writers-only first */
+		LIST_FOREACH(bd, &bp->bif_wlist, bd_next) {
+			xbd = &xbdbuf[index++];
+			BPFD_LOCK(bd);
+			bpfstats_fill_xbpf(xbd, bd);
+			BPFD_UNLOCK(bd);
+		}
 		LIST_FOREACH(bd, &bp->bif_dlist, bd_next) {
 			xbd = &xbdbuf[index++];
 			BPFD_LOCK(bd);
 			bpfstats_fill_xbpf(xbd, bd);
 			BPFD_UNLOCK(bd);
 		}
-		BPFIF_UNLOCK(bp);
+		BPFIF_RUNLOCK(bp);
 	}
-	mtx_unlock(&bpf_mtx);
+	BPF_UNLOCK();
 	error = SYSCTL_OUT(req, xbdbuf, index * sizeof(*xbd));
 	free(xbdbuf, M_BPF);
 	return (error);
Index: net/bpf.h
===================================================================
--- net/bpf.h	(revision 239830)
+++ net/bpf.h	(working copy)
@@ -917,14 +917,21 @@
 
 /*
  * Descriptor associated with each attached hardware interface.
+ * FIXME: this structure is exposed to external callers to speed up
+ * bpf_peers_present() call. However we cover all fields not needed by
+ * this function via BPF_INTERNAL define
  */
 struct bpf_if {
 	LIST_ENTRY(bpf_if)	bif_next;	/* list of all interfaces */
 	LIST_HEAD(, bpf_d)	bif_dlist;	/* descriptor list */
+#ifdef BPF_INTERNAL
 	u_int bif_dlt;				/* link layer type */
 	u_int bif_hdrlen;		/* length of header (with padding) */
 	struct ifnet *bif_ifp;		/* corresponding interface */
-	struct mtx	bif_mtx;	/* mutex for interface */
+	struct rwlock	bif_lock;	/* interface lock */
+	LIST_HEAD(, bpf_d)	bif_wlist;	/* writer-only list */
+	int flags;			/* Interface flags */
+#endif
 };
 
 void	 bpf_bufheld(struct bpf_d *d);
Index: net/bpf_buffer.c
===================================================================
--- net/bpf_buffer.c	(revision 239830)
+++ net/bpf_buffer.c	(working copy)
@@ -93,21 +93,6 @@
 SYSCTL_INT(_net_bpf, OID_AUTO, maxbufsize, CTLFLAG_RW,
     &bpf_maxbufsize, 0, "Default capture buffer in bytes");
 
-void
-bpf_buffer_alloc(struct bpf_d *d)
-{
-
-	KASSERT(d->bd_fbuf == NULL, ("bpf_buffer_alloc: bd_fbuf != NULL"));
-	KASSERT(d->bd_sbuf == NULL, ("bpf_buffer_alloc: bd_sbuf != NULL"));
-	KASSERT(d->bd_hbuf == NULL, ("bpf_buffer_alloc: bd_hbuf != NULL"));
-
-	d->bd_fbuf = (caddr_t)malloc(d->bd_bufsize, M_BPF, M_WAITOK);
-	d->bd_sbuf = (caddr_t)malloc(d->bd_bufsize, M_BPF, M_WAITOK);
-	d->bd_hbuf = NULL;
-	d->bd_slen = 0;
-	d->bd_hlen = 0;
-}
-
 /*
  * Simple data copy to the current kernel buffer.
  */
@@ -183,18 +168,42 @@
 bpf_buffer_ioctl_sblen(struct bpf_d *d, u_int *i)
 {
 	u_int size;
+	caddr_t fbuf, sbuf;
 
+	size = *i;
+	if (size > bpf_maxbufsize)
+		*i = size = bpf_maxbufsize;
+	else if (size < BPF_MINBUFSIZE)
+		*i = size = BPF_MINBUFSIZE;
+
+	/* Allocate buffers immediately */
+	fbuf = (caddr_t)malloc(size, M_BPF, M_WAITOK);
+	sbuf = (caddr_t)malloc(size, M_BPF, M_WAITOK);
+
 	BPFD_LOCK(d);
 	if (d->bd_bif != NULL) {
+		/* Interface already attached, unable to change buffers */
 		BPFD_UNLOCK(d);
+		free(fbuf, M_BPF);
+		free(sbuf, M_BPF);
 		return (EINVAL);
 	}
-	size = *i;
-	if (size > bpf_maxbufsize)
-		*i = size = bpf_maxbufsize;
-	else if (size < BPF_MINBUFSIZE)
-		*i = size = BPF_MINBUFSIZE;
+
+	/* Free old buffers if set */
+	if (d->bd_fbuf != NULL)
+		free(d->bd_fbuf, M_BPF);
+	if (d->bd_sbuf != NULL)
+		free(d->bd_sbuf, M_BPF);
+
+	/* Fill in new data */
 	d->bd_bufsize = size;
+	d->bd_fbuf = fbuf;
+	d->bd_sbuf = sbuf;
+
+	d->bd_hbuf = NULL;
+	d->bd_slen = 0;
+	d->bd_hlen = 0;
+
 	BPFD_UNLOCK(d);
 	return (0);
 }
Index: net/bpf_buffer.h
===================================================================
--- net/bpf_buffer.h	(revision 239830)
+++ net/bpf_buffer.h	(working copy)
@@ -36,7 +36,6 @@
 #error "no user-serviceable parts inside"
 #endif
 
-void	bpf_buffer_alloc(struct bpf_d *d);
 void	bpf_buffer_append_bytes(struct bpf_d *d, caddr_t buf, u_int offset,
 	    void *src, u_int len);
 void	bpf_buffer_append_mbuf(struct bpf_d *d, caddr_t buf, u_int offset,
Index: net/bpfdesc.h
===================================================================
--- net/bpfdesc.h	(revision 239830)
+++ net/bpfdesc.h	(working copy)
@@ -79,6 +79,7 @@
 	u_char		bd_promisc;	/* true if listening promiscuously */
 	u_char		bd_state;	/* idle, waiting, or timed out */
 	u_char		bd_immediate;	/* true to return on packet arrival */
+	u_char		bd_writer;	/* non-zero if d is writer-only */
 	int		bd_hdrcmplt;	/* false to fill in src lladdr automatically */
 	int		bd_direction;	/* select packet direction */
 	int		bd_feedback;	/* true to feed back sent packets */
@@ -86,7 +87,7 @@
 	int		bd_sig;		/* signal to send upon packet reception */
 	struct sigio *	bd_sigio;	/* information for async I/O */
 	struct selinfo	bd_sel;		/* bsd select info */
-	struct mtx	bd_mtx;		/* mutex for this descriptor */
+	struct mtx	bd_lock;	/* per-descriptor lock */
 	struct callout	bd_callout;	/* for BPF timeouts with select */
 	struct label	*bd_label;	/* MAC label for descriptor */
 	u_int64_t	bd_fcount;	/* number of packets which matched filter */
@@ -105,10 +106,16 @@
 #define BPF_WAITING	1		/* waiting for read timeout in select */
 #define BPF_TIMED_OUT	2		/* read timeout has expired in select */
 
-#define BPFD_LOCK(bd)		mtx_lock(&(bd)->bd_mtx)
-#define BPFD_UNLOCK(bd)		mtx_unlock(&(bd)->bd_mtx)
-#define BPFD_LOCK_ASSERT(bd)	mtx_assert(&(bd)->bd_mtx, MA_OWNED)
+#define BPFD_LOCK(bd)		mtx_lock(&(bd)->bd_lock)
+#define BPFD_UNLOCK(bd)		mtx_unlock(&(bd)->bd_lock)
+#define BPFD_LOCK_ASSERT(bd)	mtx_assert(&(bd)->bd_lock, MA_OWNED)
 
+#define BPF_PID_REFRESH(bd, td)	(bd)->bd_pid = (td)->td_proc->p_pid
+#define BPF_PID_REFRESH_CUR(bd)	(bd)->bd_pid = curthread->td_proc->p_pid
+
+#define BPF_LOCK()		mtx_lock(&bpf_mtx)
+#define BPF_UNLOCK()		mtx_unlock(&bpf_mtx)
+#define BPF_LOCK_ASSERT()	mtx_assert(&bpf_mtx, MA_OWNED)
 /*
  * External representation of the bpf descriptor
  */
@@ -143,7 +150,11 @@
 	u_int64_t	bd_spare[4];
 };
 
-#define BPFIF_LOCK(bif)		mtx_lock(&(bif)->bif_mtx)
-#define BPFIF_UNLOCK(bif)	mtx_unlock(&(bif)->bif_mtx)
+#define BPFIF_RLOCK(bif)	rw_rlock(&(bif)->bif_lock)
+#define BPFIF_RUNLOCK(bif)	rw_runlock(&(bif)->bif_lock)
+#define BPFIF_WLOCK(bif)	rw_wlock(&(bif)->bif_lock)
+#define BPFIF_WUNLOCK(bif)	rw_wunlock(&(bif)->bif_lock)
 
+#define BPFIF_FLAG_DYING	1	/* Reject new bpf consumers */
+
 #endif

From owner-freebsd-net@FreeBSD.ORG Wed Oct 17 16:50:05 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id DAD3154C for ; Wed, 17 Oct 2012 16:50:05 +0000 (UTC) (envelope-from ming.fu@netsweeper.com) Received: from mail.netsweeper.com (mail.netsweeper.com [216.171.98.87]) by mx1.freebsd.org (Postfix) with ESMTP id A81658FC0C for ; Wed, 17 Oct 2012 16:50:05 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mail.netsweeper.com (Postfix) with ESMTP id 8EC8F1E410FB for ; Wed, 17 Oct 2012 12:41:45 -0400 (EDT) X-Virus-Scanned: amavisd-new at mail.netsweeper.com Received: from mail.netsweeper.com ([127.0.0.1]) by localhost (mail.netsweeper.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id YEur7bOfJTI7 for ; Wed, 17 Oct 2012 12:41:45 -0400 (EDT) Received: from [192.168.4.202] (unknown [216.171.98.93]) by mail.netsweeper.com (Postfix) with ESMTPSA id 66A661E410A6 for ; Wed, 17 Oct 2012 12:41:45 -0400 (EDT) Message-ID: <507EDFCE.3060702@netsweeper.com> Date: Wed, 17 Oct 2012 12:41:50 -0400 From: Ming Fu User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: netmap NETMAP_SW_RING or NETMAP_HW_RING Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2012 16:50:05 -0000

Hi,

What is the difference between NETMAP_SW_RING and NETMAP_HW_RING.
When using netmap_open() in the example code to create a netmap fdesc, one of these two needs to be ORed with the queue ID in order to bind only one RX queue.

The netmap code was updated from FreeBSD RELENG_9.

Regards,
Ming

From owner-freebsd-net@FreeBSD.ORG Wed Oct 17 17:33:07 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 081E0337 for ; Wed, 17 Oct 2012 17:33:07 +0000 (UTC) (envelope-from ming.fu@netsweeper.com) Received: from mail.netsweeper.com (mail.netsweeper.com [216.171.98.87]) by mx1.freebsd.org (Postfix) with ESMTP id C206C8FC08 for ; Wed, 17 Oct 2012 17:33:06 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mail.netsweeper.com (Postfix) with ESMTP id EEFED1E40E75 for ; Wed, 17 Oct 2012 13:32:54 -0400 (EDT) X-Virus-Scanned: amavisd-new at mail.netsweeper.com Received: from mail.netsweeper.com ([127.0.0.1]) by localhost (mail.netsweeper.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ZCdPBCdv4-zz for ; Wed, 17 Oct 2012 13:32:54 -0400 (EDT) Received: from [192.168.4.202] (unknown [216.171.98.93]) by mail.netsweeper.com (Postfix) with ESMTPSA id BD9901E40DD9 for ; Wed, 17 Oct 2012 13:32:54 -0400 (EDT) Message-ID: <507EEBD1.3020703@netsweeper.com> Date: Wed, 17 Oct 2012 13:33:05 -0400 From: Ming Fu User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: netmap NETMAP_SW_RING or NETMAP_HW_RING References: <507EDFCE.3060702@netsweeper.com> In-Reply-To: <507EDFCE.3060702@netsweeper.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2012 17:33:07 -0000

After a second look at the netmap_open code, I believe NETMAP_HW_RING is the right choice.

My next trouble is receiving packets. The program spins off 8 threads, and each thread tries to bind to one of the queues (0-7) on an igb card. The outcome depends on how I call netmap_open().

If I call
netmap_open(&me, id /*| NETMAP_HW_RING */, 1);
the program is able to receive packets, but of course each thread receives packets from all 8 queues.

If I call
netmap_open(&me, id | NETMAP_HW_RING, 1)
none of the threads is able to receive packets.

I came across the following lines of code in nm_util.c:
} else if (ringid & NETMAP_HW_RING) {
D("XXX check multiple threads");

What does it suggest? Is there any special requirement for multi-threaded programs?

Regards,
Ming

On 10/17/2012 12:41 PM, Ming Fu wrote:
> Hi,
>
> What is the difference between NETMAP_SW_RING and NETMAP_HW_RING.
> When using netmap_open() in the example code to create a netmap fdesc,
> one of these two need to be ORed to the queue ID, in order to bind
> only one RX queue.
>
> netmap code updated from FreeBSD RELENG_9.
> > Regards, > Ming > > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Wed Oct 17 18:46:11 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id DD99FBA; Wed, 17 Oct 2012 18:46:10 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id A71D38FC0A; Wed, 17 Oct 2012 18:46:10 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 0FF4DB91E; Wed, 17 Oct 2012 14:46:10 -0400 (EDT) From: John Baldwin To: freebsd-net@freebsd.org Subject: Re: ixgbe & if_igb RX ring locking Date: Wed, 17 Oct 2012 10:06:51 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p20; KDE/4.5.5; amd64; ; ) References: <5079A9A1.4070403@FreeBSD.org> <507C1960.6050500@FreeBSD.org> <201210150904.27567.jhb@freebsd.org> In-Reply-To: <201210150904.27567.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201210171006.51214.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Wed, 17 Oct 2012 14:46:10 -0400 (EDT) Cc: "Alexander V. Chernikov" , Luigi Rizzo , Jack Vogel , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2012 18:46:11 -0000 On Monday, October 15, 2012 9:04:27 am John Baldwin wrote: > On Monday, October 15, 2012 10:10:40 am Alexander V. 
Chernikov wrote: > > On 13.10.2012 23:24, Jack Vogel wrote: > > > On Sat, Oct 13, 2012 at 11:22 AM, Luigi Rizzo wrote: > > > > >> > > >> one option could be (same as it is done in the timer > > >> routine in dummynet) to build a list of all the packets > > >> that need to be sent to if_input(), and then call > > >> if_input with the entire list outside the lock. > > >> > > >> It would be even easier if we modify the various *_input() > > >> routines to handle a list of mbufs instead of just one. > > > > Bulk processing is generally a good idea we probably should implement. > > Probably starting from driver queue ending with marked mbufs > > (OURS/forward/legacy processing (appletalk and similar))? > > > > This can minimize an impact for all > > locks on RX side: > > L2 > > * rx PFIL hook > > L3 (both IPv4 and IPv6) > > * global IF_ADDR_RLOCK (currently commented out) > > * Per-interface ADDR_RLOCK > > * PFIL hook > > > > From the first glance, there can be problems with: > > * Increased latency (we should have some kind of rx_process_limit), but > > still > > * reader locks being acquired for much longer amount of time > > > > >> > > >> cheers > > >> luigi > > >> > > >> Very interesting idea Luigi, will have to get that some thought. > > > > > > Jack > > > > Returning to original post topic: > > > > Given > > 1) we are currently binding ixgbe ithreads to CPU cores > > 2) RX queue lock is used by (indirectly) in only 2 places: > > a) ISR routine (msix or legacy irq) > > b) taskqueue routine which is scheduled if some packets remains in RX > > queue and rx_process_limit ended OR we need something to TX > > > > 3) in practice taskqueue routine is a nightmare for many people since > > there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after > > some traffic burst happens: once it is called it starts to schedule > > itself more and more replacing original ISR routine. 
Additionally, > > increasing rx_process_limit does not help since taskqueue is called with > > the same limit. Finally, currently netisr taskq threads are not bound to > > any CPU which makes the process even more uncontrollable. > > I think part of the problem here is that the taskqueue in ixgbe(4) is > bogusly rescheduled for TX handling. Instead, ixgbe_msix_que() should > just start transmitting packets directly. > > I fixed this in igb(4) here: > > http://svnweb.freebsd.org/base?view=revision&revision=233708 > > You can try this for ixgbe(4). It also comments out a spurious taskqueue > reschedule from the watchdog handler that might also lower the taskqueue > usage. You can try changing that #if 0 to an #if 1 to test just the txeof > changes: Is anyone able to test this btw to see if it improves things on ixgbe at all? (I don't have any ixgbe hardware.) -- John Baldwin
From owner-freebsd-net@FreeBSD.ORG Wed Oct 17 19:29:52 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E56CAF0; Wed, 17 Oct 2012 19:29:52 +0000 (UTC) (envelope-from eric@vangyzen.net) Received: from aussmtpmrkpc120.us.dell.com (aussmtpmrkpc120.us.dell.com [143.166.82.159]) by mx1.freebsd.org (Postfix) with ESMTP id A79AB8FC16; Wed, 17 Oct 2012 19:29:52 +0000 (UTC) X-Loopcount0: from 64.238.244.148 X-IronPort-AV: E=Sophos;i="4.80,602,1344229200"; d="scan'208";a="7329248" Message-ID: <507F072F.6080707@vangyzen.net> Date: Wed, 17 Oct 2012 14:29:51 -0500 From: Eric van Gyzen User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:14.0) Gecko/20120822 Thunderbird/14.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org, "Bjoern A. Zeeb" Subject: Re: Tahi "Redirected On-link" Test Case References: <507DD768.7000803@vangyzen.net> In-Reply-To: <507DD768.7000803@vangyzen.net> Content-Type: text/plain; charset="ISO-8859-1"; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2012 19:29:53 -0000 On 10/16/2012 16:53, Eric van Gyzen wrote: > I am currently working on a fix for kern/152791 (Tahi IPv6 Ready Logo > test case #169: Redirected On-link). I have a change to add the host > route, and it works for test case 169. However, the route never gets > removed, so all subsequent test cases fail (because they first verify > that the Node Under Test thinks the destination is off-link). > > How/When should I clean up the route? > > Each test case runs a common cleanup procedure, which sends a RA with > a Router Lifetime of zero and a Prefix Information option with a Valid > Lifetime and Preferred Lifetime of zero.
This deprecates the NUT's > only global address, by which it reaches the newly-on-link > destination. However, it doesn't seem rational to use this event to > trigger a cleanup of the route. > > The only other trigger I can imagine is the transition of the > Destination Cache entry to the Stale state. That also doesn't make > complete sense. (It probably also wouldn't work, since in my testing, > test case 170 begins immediately after test case 169 ends.) > > I'm assuming a certain amount of familiarity (on your part) with these > tests. If you'd like, I can explain them in more detail. > > Thanks in advance for any advice, > > Eric Ignore me. I was working with incomplete information. The common cleanup procedure also includes packets that trigger NUD to delete the entry from the Neighbor Cache. So, the now-obvious answer to my question is to delete the route on this event. Eric From owner-freebsd-net@FreeBSD.ORG Wed Oct 17 21:55:40 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5EDD7532; Wed, 17 Oct 2012 21:55:40 +0000 (UTC) (envelope-from guy.helmer@gmail.com) Received: from mail-ie0-f182.google.com (mail-ie0-f182.google.com [209.85.223.182]) by mx1.freebsd.org (Postfix) with ESMTP id EF2968FC0A; Wed, 17 Oct 2012 21:55:39 +0000 (UTC) Received: by mail-ie0-f182.google.com with SMTP id k10so16544767iea.13 for ; Wed, 17 Oct 2012 14:55:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=content-type:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer; bh=+VPzoL2TvdeKPfNnawln5jqNkmyBemf0SdUXnawLnFU=; b=dc9CcfcZkWorCHQz0jYbalZ/Abbkc1ThndzcXkgem+YVc4iCI2B87NSN0g2ncB9fBt t2zo45pXmpONFezcy73vn36oz5G1iqwehaRg6iH/9fAGSe8bmiZskZgCslLlBdaxVVp3 DWuZ23DVaTP1n4JU4Px3HpLAequHW3R5VUEb+pWAkw0rtswaIfpdlxykxuIqwh9ds68P 
hMYYp+C+zIy6SpReSmBf7YdJpPVQldTJ9s8L3O6Ax8YeL4jcs8nm2+Ew082UDTUzg6Fr l2PHgC+2tN3wbFpGAMjiJDs6eVGk4n8aR2s31172dKdeQyma4IwQIgNMaldmb9JGhcrO IqMg== Received: by 10.50.135.38 with SMTP id pp6mr3069810igb.36.1350510939583; Wed, 17 Oct 2012 14:55:39 -0700 (PDT) Received: from [192.168.221.99] ([216.81.189.9]) by mx.google.com with ESMTPS id yf6sm12681211igb.0.2012.10.17.14.55.37 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 17 Oct 2012 14:55:38 -0700 (PDT) Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: 8.3: kernel panic in bpf.c catchpacket() From: Guy Helmer In-Reply-To: <381E3EEC-7EDB-428B-A724-434443E51A53@gmail.com> Date: Wed, 17 Oct 2012 16:55:40 -0500 Content-Transfer-Encoding: quoted-printable Message-Id: References: <4B5399BF-4EE0-4182-8297-3BB97C4AA884@gmail.com> <59F9A36E-3DB2-4F6F-BB2A-A4C9DA76A70C@gmail.com> <5075C05E.9070800@FreeBSD.org> <1EDA1615-2CDE-405A-A725-AF7CC7D3E273@gmail.com> <381E3EEC-7EDB-428B-A724-434443E51A53@gmail.com> To: "Alexander V. Chernikov" X-Mailer: Apple Mail (2.1499) Cc: freebsd-net@freebsd.org, FreeBSD Stable X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Oct 2012 21:55:40 -0000

On Oct 17, 2012, at 8:58 AM, Guy Helmer wrote:

> On Oct 12, 2012, at 8:54 AM, Guy Helmer wrote:
>
>>
>> On Oct 10, 2012, at 1:37 PM, Alexander V. Chernikov wrote:
>>
>>> On 10.10.2012 00:36, Guy Helmer wrote:
>>>>
>>>> On Oct 8, 2012, at 8:09 AM, Guy Helmer wrote:
>>>>
>>>>> I'm seeing a consistent new kernel panic in FreeBSD 8.3:
>>>>> I'm not seeing how bd_sbuf would be NULL here. Any ideas?
>>>>
>>>> Since I've not had any replies, I hope nobody minds if I reply with more information.
>>>>
>>>> This panic seems to be occasionally triggered now that my user land code is changing the packet filter a while after the bpf device has been opened and an initial packet filter was set (previously, my code did not change the filter after it was initially set).
>>>>
>>>> I'm focusing on bpf_setf() since that seems to be the place that could be tickling a problem, and I see that bpf_setf() calls reset_d(d) to clear the hold buffer. I have manually verified that the BPFD lock is held during the call to reset_d(), and the lock is held every other place that the buffers are manipulated, so I haven't been able to find any place that seems vulnerable to losing one of the bpf buffers. Still searching, but any help would be appreciated.
>>>
>>> Can you please check this code on -current?
>>> Locking has changed quite significantly some time ago, so there is good chance that you can get rid of this panic (or discover different one which is really "new") :).
>>
>> I'm not ready to run this app on current, so I have merged revs 229898, 233937, 233938, 233946, 235744, 235745, 235746, 235747, 236231, 236251, 236261, 236262, 236559, and 236806 to my 8.3 checkout to get code that should be virtually identical to current without the timestamp changes.
>>
>> Unfortunately, I have only been able to trigger the panic in my test lab once -- so I'm not sure whether a lack of problems with the updated code will be indicative of likely success in the field where this has been triggered regularly at some sites...
>>
>> Thanks,
>> Guy
>>
>
>
> FWIW, I was able to trigger the panic with the original 8.3 code again in my test lab. With these changes resulting from merging the revs mentioned above, I have not seen any panics in my test lab setup in two days of load testing, and AFAIK, packet capturing seems to be working fine.

Of course, the test system panic'ed with the same problem in catchpacket() an hour after I wrote this.

(kgdb) where
#0  doadump () at pcpu.h:224
#1  0xffffffff804c8280 in boot (howto=260) at ../../../kern/kern_shutdown.c:441
#2  0xffffffff804c8703 in panic (fmt=0x0) at ../../../kern/kern_shutdown.c:614
#3  0xffffffff8069ffad in trap_fatal (frame=0xffffffff809edbc0, eva=Variable "eva" is not available.
) at ../../../amd64/amd64/trap.c:825
#4  0xffffffff806a02e1 in trap_pfault (frame=0xffffff800014a8a0, usermode=0) at ../../../amd64/amd64/trap.c:741
#5  0xffffffff806a06bf in trap (frame=0xffffff800014a8a0) at ../../../amd64/amd64/trap.c:478
#6  0xffffffff80687cd4 in calltrap () at ../../../amd64/amd64/exception.S:228
#7  0xffffffff8069dc06 in bcopy () at ../../../amd64/amd64/support.S:124
#8  0xffffffff8056f69e in catchpacket (d=0xffffff005aaaf000,
    pkt=0xffffff0001f46200 "", pktlen=522, snaplen=Variable "snaplen" is not available.
) at ../../../net/bpf.c:2240
#9  0xffffffff8056fc66 in bpf_mtap (bp=0xffffff0001be8c80,
    m=0xffffff0001f46200) at ../../../net/bpf.c:2064
#10 0xffffffff80579c15 in ether_input (ifp=0xffffff0001b73800,
    m=0xffffff0001f46200) at ../../../net/if_ethersubr.c:635
#11 0xffffffff802b694a in em_rxeof (rxr=0xffffff0001bca200, count=99, done=0x0) at ../../../dev/e1000/if_em.c:4404
#12 0xffffffff802b6db8 in em_handle_que (context=Variable "context" is not available.
) at ../../../dev/e1000/if_em.c:1494
#13 0xffffffff80506d85 in taskqueue_run_locked (queue=0xffffff0001be1580) at ../../../kern/subr_taskqueue.c:250
---Type <return> to continue, or q <return> to quit---q
Quit
(kgdb) frame 8
#8  0xffffffff8056f69e in catchpacket (d=0xffffff005aaaf000,
    pkt=0xffffff0001f46200 "", pktlen=522, snaplen=Variable "snaplen" is not available.
) at ../../../net/bpf.c:2240
warning: Source file is more recent than executable.
2240	bpf_append_bytes(d, d->bd_sbuf, curlen, &hdr, sizeof(hdr));
(kgdb) print *d
$1 = {bd_next = {le_next = 0xffffff0023fff400, le_prev = 0xffffff0001be8c90},
  bd_sbuf = 0x0, bd_hbuf = 0xffffff8000ffa000 "??~P", bd_fbuf = 0x0,
  bd_slen = 0, bd_hlen = 2068, bd_bufsize = 8388608,
  bd_bif = 0xffffff0001be8c80, bd_rtout = 1, bd_rfilter = 0xffffff0001e6f580,
  bd_wfilter = 0x0, bd_bfilter = 0x0, bd_rcount = 7, bd_dcount = 0,
  bd_promisc = 1 '\001', bd_state = 0 '\0', bd_immediate = 1 '\001',
  bd_writer = 0 '\0', bd_hdrcmplt = 1, bd_direction = 1, bd_feedback = 0,
  bd_async = 0, bd_sig = 23, bd_sigio = 0x0, bd_sel = {si_tdlist = {
      tqh_first = 0x0, tqh_last = 0x0}, si_note = {kl_list = {
        slh_first = 0x0}, kl_lock = 0xffffffff80497920 ,
      kl_unlock = 0xffffffff804978f0 ,
      kl_assert_locked = 0xffffffff804945d0 ,
      kl_assert_unlocked = 0xffffffff804945e0 ,
      kl_lockarg = 0xffffff005aaaf0d8}, si_mtx = 0x0}, bd_lock = {
    lock_object = {lo_name = 0xffffff0001a5fce0 "bpf", lo_flags = 16973824,
      lo_data = 0, lo_witness = 0x0}, mtx_lock = 18446742974226712768},
  bd_callout = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0,
        tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0,
    c_lock = 0xffffff005aaaf0d8, c_flags = 0, c_cpu = 0}, bd_label = 0x0,
  bd_fcount = 7, bd_pid = 89517, bd_locked = 0, bd_bufmode = 1, bd_wcount = 0,
  bd_wfcount = 0, bd_wdcount = 0, bd_zcopy = 0, bd_compat32 = 0 '\0'}

Now, I am thinking the malloc() of the sbuf is failing but not sure how/why -- I thought malloc(size, M_BPF, M_WAITOK) should not fail?
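
A note for readers of the dump above: bd_sbuf and bd_fbuf are both NULL while bd_hbuf is still set. The sketch below is a user-space model only, not the kernel code. It assumes that the buffer rotation in bpf.c moves the store buffer into the hold slot and promotes the free buffer to the store slot; the names model_d, model_rotate and model_can_rotate are hypothetical and exist only for this illustration. Under that assumption, a rotation performed while no free buffer is available would leave this same pointer pattern. Whether that is what happened in this panic is not established anywhere in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the three buffer pointers of struct bpf_d. */
struct model_d {
	char	*sbuf;	/* store buffer: incoming packets are appended here */
	char	*hbuf;	/* hold buffer: full, waiting to be read */
	char	*fbuf;	/* free buffer: becomes the next store buffer */
	size_t	 slen;	/* bytes used in the store buffer */
	size_t	 hlen;	/* bytes used in the hold buffer */
};

/*
 * Assumed rotation order: store -> hold, free -> store, free slot emptied.
 * This mirrors the behaviour described in the text, not verified kernel code.
 */
static void
model_rotate(struct model_d *d)
{
	d->hbuf = d->sbuf;
	d->hlen = d->slen;
	d->sbuf = d->fbuf;
	d->slen = 0;
	d->fbuf = NULL;
}

/* A rotation is only safe when there is a free buffer to promote. */
static int
model_can_rotate(const struct model_d *d)
{
	return (d->fbuf != NULL);
}
```

Calling model_rotate() twice on a descriptor that starts with a store and a free buffer, without handing the hold buffer back in between, ends with sbuf == NULL, hbuf != NULL and fbuf == NULL. A caller that checks model_can_rotate() first never reaches that state.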
Guy

From owner-freebsd-net@FreeBSD.ORG Thu Oct 18 01:41:06 2012 Return-Path: Delivered-To: freebsd-net@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 84C7EE24; Thu, 18 Oct 2012 01:41:06 +0000 (UTC) (envelope-from yongari@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135]) by mx1.freebsd.org (Postfix) with ESMTP id 534F38FC0A; Thu, 18 Oct 2012 01:41:06 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9I1f6Cf052543; Thu, 18 Oct 2012 01:41:06 GMT (envelope-from yongari@freefall.freebsd.org) Received: (from yongari@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9I1f53s052539; Thu, 18 Oct 2012 01:41:05 GMT (envelope-from yongari) Date: Thu, 18 Oct 2012 01:41:05 GMT Message-Id: <201210180141.q9I1f53s052539@freefall.freebsd.org> To: nevzorovn@gmail.com, yongari@FreeBSD.org, freebsd-net@FreeBSD.org, yongari@FreeBSD.org From: yongari@FreeBSD.org Subject: Re: kern/171520: [alc] alc network driver + tso + vlan does not work. X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2012 01:41:06 -0000 Synopsis: [alc] alc network driver + tso + vlan does not work. State-Changed-From-To: open->feedback State-Changed-By: yongari State-Changed-When: Thu Oct 18 01:40:32 UTC 2012 State-Changed-Why: I'm pretty sure TSO over VLAN worked well on my box. Could you share your exact network configuration and let me know how I can reproduce it? Responsible-Changed-From-To: freebsd-net->yongari Responsible-Changed-By: yongari Responsible-Changed-When: Thu Oct 18 01:40:32 UTC 2012 Responsible-Changed-Why: Grab.
http://www.freebsd.org/cgi/query-pr.cgi?pr=171520 From owner-freebsd-net@FreeBSD.ORG Thu Oct 18 01:49:54 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 03C3B232 for ; Thu, 18 Oct 2012 01:49:53 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) by mx1.freebsd.org (Postfix) with ESMTP id BE2398FC0C for ; Thu, 18 Oct 2012 01:49:53 +0000 (UTC) Received: from JRE-MBP-2.local (c-50-143-149-146.hsd1.ca.comcast.net [50.143.149.146]) (authenticated bits=0) by vps1.elischer.org (8.14.5/8.14.5) with ESMTP id q9I1nkd0057884 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Wed, 17 Oct 2012 18:49:47 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <507F603A.4050808@freebsd.org> Date: Wed, 17 Oct 2012 18:49:46 -0700 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: Mariano Cediel Subject: Re: one physical interface -> n virtual interfaces References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2012 01:49:54 -0000 On 10/16/12 1:35 PM, Mariano Cediel wrote: > How do I create, from a physical interface, n virtual interfaces, but > all effects are real, their MAC different, on which we can do > individually NAT, etc, etc.? > > I need one external interface has 2 public IPs, and I'll do every NAT > over every (with ipfw and divert) > individually (each of them has its own gateway) > > A little help to start researching ..... > Greetings. 
> > (sorry for my poor english) use netgraph ng_eiface, ng_bridge and ng_ether From owner-freebsd-net@FreeBSD.ORG Thu Oct 18 02:05:28 2012 Return-Path: Delivered-To: freebsd-net@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 156FB590; Thu, 18 Oct 2012 02:05:28 +0000 (UTC) (envelope-from yongari@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135]) by mx1.freebsd.org (Postfix) with ESMTP id C529B8FC12; Thu, 18 Oct 2012 02:05:27 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9I25RAh053831; Thu, 18 Oct 2012 02:05:27 GMT (envelope-from yongari@freefall.freebsd.org) Received: (from yongari@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9I25QLO053825; Thu, 18 Oct 2012 02:05:26 GMT (envelope-from yongari) Date: Thu, 18 Oct 2012 02:05:26 GMT Message-Id: <201210180205.q9I25QLO053825@freefall.freebsd.org> To: rich@enterprisesystems.net, yongari@FreeBSD.org, freebsd-net@FreeBSD.org, yongari@FreeBSD.org From: yongari@FreeBSD.org Subject: Re: kern/169399: [re] RealTek RTL8168/8111/8111c network interface not working X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2012 02:05:28 -0000 Synopsis: [re] RealTek RTL8168/8111/8111c network interface not working State-Changed-From-To: open->closed State-Changed-By: yongari State-Changed-When: Thu Oct 18 02:03:08 UTC 2012 State-Changed-Why: Support for RTL8168E-VL was added after releasing 7.4-RELEASE. Update to latest stable/7 or use newer FreeBSD releases. Responsible-Changed-From-To: freebsd-net->yongari Responsible-Changed-By: yongari Responsible-Changed-When: Thu Oct 18 02:03:08 UTC 2012 Responsible-Changed-Why: Track. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=169399 From owner-freebsd-net@FreeBSD.ORG Thu Oct 18 02:12:00 2012 Return-Path: Delivered-To: freebsd-net@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E65388D5; Thu, 18 Oct 2012 02:12:00 +0000 (UTC) (envelope-from yongari@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135]) by mx1.freebsd.org (Postfix) with ESMTP id B6BE88FC0C; Thu, 18 Oct 2012 02:12:00 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9I2C0ZW054095; Thu, 18 Oct 2012 02:12:00 GMT (envelope-from yongari@freefall.freebsd.org) Received: (from yongari@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9I2C0O7054091; Thu, 18 Oct 2012 02:12:00 GMT (envelope-from yongari) Date: Thu, 18 Oct 2012 02:12:00 GMT Message-Id: <201210180212.q9I2C0O7054091@freefall.freebsd.org> To: Felko1982@web.de, yongari@FreeBSD.org, freebsd-net@FreeBSD.org, yongari@FreeBSD.org From: yongari@FreeBSD.org Subject: Re: kern/161381: [re] RTL8169SC - re0: PHY write failed X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2012 02:12:01 -0000 Synopsis: [re] RTL8169SC - re0: PHY write failed State-Changed-From-To: open->feedback State-Changed-By: yongari State-Changed-When: Thu Oct 18 02:11:29 UTC 2012 State-Changed-Why: Most errors of this kind come from broken hardware or an unstable power supply. If your re(4) device is a stand-alone PCI controller, would you firmly reseat the controller and try again? Knowing how other operating systems work on this device would be a good idea too.
Responsible-Changed-From-To: freebsd-net->yongari Responsible-Changed-By: yongari Responsible-Changed-When: Thu Oct 18 02:11:29 UTC 2012 Responsible-Changed-Why: Grab. http://www.freebsd.org/cgi/query-pr.cgi?pr=161381 From owner-freebsd-net@FreeBSD.ORG Thu Oct 18 05:10:11 2012 Return-Path: Delivered-To: freebsd-net@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 67C9476A; Thu, 18 Oct 2012 05:10:11 +0000 (UTC) (envelope-from yongari@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [8.8.178.135]) by mx1.freebsd.org (Postfix) with ESMTP id 34EBF8FC17; Thu, 18 Oct 2012 05:10:11 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9I5AB4i007998; Thu, 18 Oct 2012 05:10:11 GMT (envelope-from yongari@freefall.freebsd.org) Received: (from yongari@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9I5AA0A007994; Thu, 18 Oct 2012 05:10:10 GMT (envelope-from yongari) Date: Thu, 18 Oct 2012 05:10:10 GMT Message-Id: <201210180510.q9I5AA0A007994@freefall.freebsd.org> To: universite@ukr.net, yongari@FreeBSD.org, freebsd-net@FreeBSD.org, yongari@FreeBSD.org From: yongari@FreeBSD.org Subject: Re: kern/168152: [xl] Periodically, the network card xl0 stops working -- xl0: watchdog timeout X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2012 05:10:11 -0000 Synopsis: [xl] Periodically, the network card xl0 stops working -- xl0: watchdog timeout State-Changed-From-To: open->feedback State-Changed-By: yongari State-Changed-When: Thu Oct 18 05:08:25 UTC 2012 State-Changed-Why: http://people.freebsd.org/~yongari/xl/xl.watchdog.diff Would you give the above patch a spin and let me know how it goes?
Responsible-Changed-From-To: freebsd-net->yongari Responsible-Changed-By: yongari Responsible-Changed-When: Thu Oct 18 05:08:25 UTC 2012 Responsible-Changed-Why: Grab. http://www.freebsd.org/cgi/query-pr.cgi?pr=168152 From owner-freebsd-net@FreeBSD.ORG Thu Oct 18 06:02:16 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9227A247 for ; Thu, 18 Oct 2012 06:02:16 +0000 (UTC) (envelope-from saeedeh.motlagh@gmail.com) Received: from mail-qa0-f47.google.com (mail-qa0-f47.google.com [209.85.216.47]) by mx1.freebsd.org (Postfix) with ESMTP id C79D38FC0A for ; Thu, 18 Oct 2012 06:02:15 +0000 (UTC) Received: by mail-qa0-f47.google.com with SMTP id i29so1164408qaf.13 for ; Wed, 17 Oct 2012 23:02:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=xfiqw0/qqWm0Az66P0owQNuVuKKI4ieuT6Tg70uL67c=; b=RLFTDBQrRD+HmIXqdzlK0i1JILk1wXg83uDVvpoFbqYZZCr5gxbXrTqf9WA/BV6w/I +pLt9VxG7nee7zydTxiz6xiP793ba+6Ziq9Mf0KQ3xzbIU0kwWNQlXD2jB8w+QGrG1/Y WzkNrQqixuSceAf4QYhgEYrvGRBtPCfBjGH+bXMVmSJkCZARxiBWXfLBRe5RzJ8Opnv+ r5xwE+GcyZXJMjBEtpLANI3gy4tvGjFli8Nebpe+Gz1WbzmBcz5GqJ6uzk/setAzEYuz Jy9m9oPCaXwrxxdXTUu8SYzYT9w1SkaFzZgJdKB2gnU7Wac8Z4BBrZuoI+7AVsGMcSmy QgFA== Received: by 10.229.172.10 with SMTP id j10mr9252192qcz.97.1350540135160; Wed, 17 Oct 2012 23:02:15 -0700 (PDT) MIME-Version: 1.0 Received: by 10.49.105.71 with HTTP; Wed, 17 Oct 2012 23:01:34 -0700 (PDT) In-Reply-To: References: From: saeedeh motlagh Date: Thu, 18 Oct 2012 09:31:34 +0330 Message-ID: Subject: Re: TCP_DROP_SYNFIN kernel option side effects?! 
To: h bagade Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2012 06:02:16 -0000 I know that in RFC 1644 (T/TCP) TCP packets can have both the SYN and FIN flags set, for some testing purposes, but I am not sure whether that combination is used for anything else. On Tue, Oct 16, 2012 at 6:57 PM, h bagade wrote: > Hi all, > > I need to add this option to kernel in order to defeating Nmap > OS-Fingerprinting. My system is running as Web Server and also it is the > gateway on the network. > I want to know if setting this option has any side effects on other parts > of the system? Is there any situation that SYN and FIN bits are set both in > TCP packets? Is it a normal situation? > > Any helps or comments are really appreciated.
> _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From owner-freebsd-net@FreeBSD.ORG Thu Oct 18 13:20:20 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7F017E12 for ; Thu, 18 Oct 2012 13:20:20 +0000 (UTC) (envelope-from oppermann@networx.ch) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id D06F58FC0C for ; Thu, 18 Oct 2012 13:20:19 +0000 (UTC) Received: (qmail 13162 invoked from network); 18 Oct 2012 14:59:13 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 18 Oct 2012 14:59:13 -0000 Message-ID: <5080020E.1010603@networx.ch> Date: Thu, 18 Oct 2012 15:20:14 +0200 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: Luigi Rizzo Subject: Re: ixgbe & if_igb RX ring locking References: <5079A9A1.4070403@FreeBSD.org> <20121013182223.GA73341@onelab2.iet.unipi.it> In-Reply-To: <20121013182223.GA73341@onelab2.iet.unipi.it> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "Alexander V. Chernikov" , Jack Vogel , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2012 13:20:20 -0000 On 13.10.2012 20:22, Luigi Rizzo wrote: > On Sat, Oct 13, 2012 at 09:49:21PM +0400, Alexander V. Chernikov wrote: >> Hello list! 
>>
>>
>> Packets receiving code for both ixgbe and if_igb looks like the following:
>>
>>
>> ixgbe_msix_que
>>
>> -- ixgbe_rxeof()
>> {
>>    IXGBE_RX_LOCK(rxr);
>>      while
>>      {
>>         get_packet;
>>
>>         -- ixgbe_rx_input()
>>         {
>>            ++ IXGBE_RX_UNLOCK(rxr);
>>            if_input(packet);
>>            ++ IXGBE_RX_LOCK(rxr);
>>         }
>>
>>      }
>>    IXGBE_RX_UNLOCK(rxr);
>> }
>>
>> Lines marked with ++ appeared in r209068(igb) and r217593(ixgbe).
>>
>> These lines probably do LORs masking (if any) well.
>> However, such change introduce quite significant performance drop:
>>
>> On my routing setup (nearly the same from previous -Intel 10G thread in
>> -net) adding lock/unlock causes 2.8MPPS decrease to 2.3MPPS which is
>> nearly 20%.
>
> one option could be (same as it is done in the timer
> routine in dummynet) to build a list of all the packets
> that need to be sent to if_input(), and then call
> if_input with the entire list outside the lock.
>
> It would be even easier if we modify the various *_input()
> routines to handle a list of mbufs instead of just one.

Not really. You'd just run into tons of layering complexity.
Somewhere the decomposition and serialization has to be done.

Perhaps the right place is to dequeue a batch of packets from
the HW ring and then have a task/thread send it up the stack
one by one.
-- Andre From owner-freebsd-net@FreeBSD.ORG Thu Oct 18 13:26:59 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 49E8FFEF for ; Thu, 18 Oct 2012 13:26:59 +0000 (UTC) (envelope-from oppermann@networx.ch) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id CF4D78FC12 for ; Thu, 18 Oct 2012 13:26:58 +0000 (UTC) Received: (qmail 13207 invoked from network); 18 Oct 2012 15:05:53 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 18 Oct 2012 15:05:53 -0000 Message-ID: <5080039E.9070202@networx.ch> Date: Thu, 18 Oct 2012 15:26:54 +0200 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: h bagade Subject: Re: TCP_DROP_SYNFIN kernel option side effects?! References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2012 13:26:59 -0000 On 16.10.2012 17:27, h bagade wrote: > Hi all, > > I need to add this option to kernel in order to defeating Nmap > OS-Fingerprinting. My system is running as Web Server and also it is the > gateway on the network. > I want to know if setting this option has any side effects on other parts > of the system? Is there any situation that SYN and FIN bits are set both in > TCP packets? Is it a normal situation? SYN and FIN is not normal. Doing TCP_DROP_SYNFIN is not RFC compliant but doesn't cause any problems. 
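The test that such an option implies can be sketched in a few lines of C. This is a minimal illustration, assuming only the flag bit values from the TCP header definition in RFC 793; the SKETCH_* macros and sketch_is_synfin() are hypothetical names invented here, not the kernel's identifiers, and the sketch does not reproduce the actual FreeBSD code path.

```c
#include <stdbool.h>
#include <stdint.h>

/* TCP header flag bits as laid out in RFC 793 (names are hypothetical). */
#define SKETCH_TH_FIN 0x01
#define SKETCH_TH_SYN 0x02
#define SKETCH_TH_ACK 0x10

/*
 * Hypothetical helper: returns true when a segment carries both SYN and
 * FIN, the combination a "drop SYN+FIN" policy would discard.
 */
static bool
sketch_is_synfin(uint8_t thflags)
{
	const uint8_t both = SKETCH_TH_SYN | SKETCH_TH_FIN;

	return ((thflags & both) == both);
}
```

A plain SYN or a plain FIN does not match, so ordinary connection setup and teardown are unaffected by a check of this shape; only segments with both bits set are caught.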
-- Andre From owner-freebsd-net@FreeBSD.ORG Thu Oct 18 14:09:40 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 0C01A3A7 for ; Thu, 18 Oct 2012 14:09:40 +0000 (UTC) (envelope-from oppermann@networx.ch) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 6B1758FC0A for ; Thu, 18 Oct 2012 14:09:37 +0000 (UTC) Received: (qmail 13412 invoked from network); 18 Oct 2012 15:48:31 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 18 Oct 2012 15:48:31 -0000 Message-ID: <50800D9D.1090705@networx.ch> Date: Thu, 18 Oct 2012 16:09:33 +0200 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: Vijay Singh Subject: Re: A small cleanup patch References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2012 14:09:40 -0000 On 05.10.2012 01:21, Vijay Singh wrote: > Folks, I came up with this while going through the lltable code. Thank you. I just purged a larger number of stray spl* from the net*/* directories. This stuff won't be backported to 9-STABLE though. 
-- Andre > kong@[/u/vijay/bsd/CODE/cur/sys]# svn diff net/if.c > Index: net/if.c > =================================================================== > --- net/if.c (revision 241169) > +++ net/if.c (working copy) > @@ -691,12 +691,9 @@ > if_attachdomain(void *dummy) > { > struct ifnet *ifp; > - int s; > > - s = splnet(); > TAILQ_FOREACH(ifp, &V_ifnet, if_link) > if_attachdomain1(ifp); > - splx(s); > } > SYSINIT(domainifattach, SI_SUB_PROTO_IFATTACHDOMAIN, SI_ORDER_SECOND, > if_attachdomain, NULL); > @@ -705,22 +702,17 @@ > if_attachdomain1(struct ifnet *ifp) > { > struct domain *dp; > - int s; > > - s = splnet(); > - > /* > * Since dp->dom_ifattach calls malloc() with M_WAITOK, we > * cannot lock ifp->if_afdata initialization, entirely. > */ > if (IF_AFDATA_TRYLOCK(ifp) == 0) { > - splx(s); > return; > } > if (ifp->if_afdata_initialized >= domain_init_status) { > IF_AFDATA_UNLOCK(ifp); > - splx(s); > - printf("if_attachdomain called more than once on %s\n", > + log(LOG_WARNING, "if_attachdomain called more than once on %s\n", > ifp->if_xname); > return; > } > @@ -734,8 +726,6 @@ > ifp->if_afdata[dp->dom_family] = > (*dp->dom_ifattach)(ifp); > } > - > - splx(s); > } > > /* > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > From owner-freebsd-net@FreeBSD.ORG Thu Oct 18 17:00:43 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AE74DA32 for ; Thu, 18 Oct 2012 17:00:43 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 254668FC16 for ; Thu, 18 Oct 2012 17:00:42 +0000 (UTC) Received: from skuns.kiev.zoral.com.ua (localhost [127.0.0.1]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id 
q9IH0pFU081891; Thu, 18 Oct 2012 20:00:51 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5) with ESMTP id q9IH0dnf087071; Thu, 18 Oct 2012 20:00:39 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5/Submit) id q9IH0duu087070; Thu, 18 Oct 2012 20:00:39 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 18 Oct 2012 20:00:39 +0300 From: Konstantin Belousov To: Andre Oppermann Subject: Re: A small cleanup patch Message-ID: <20121018170039.GS35915@deviant.kiev.zoral.com.ua> References: <50800D9D.1090705@networx.ch> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="WQ/nOZqcqYGY8uZi" Content-Disposition: inline In-Reply-To: <50800D9D.1090705@networx.ch> User-Agent: Mutt/1.5.21 (2010-09-15) X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.0 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: net@freebsd.org, Vijay Singh X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2012 17:00:43 -0000 --WQ/nOZqcqYGY8uZi Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Oct 18, 2012 at 04:09:33PM +0200, Andre Oppermann wrote: > On 05.10.2012 01:21, Vijay Singh wrote: > > Folks, I came up with this while going through the lltable code. > > Thank you.
I just purged a larger number of stray spl* from the > net*/* directories. This stuff won't be backported to 9-STABLE > though. Why ? What is the value of having the fossils in the actively maintained stable tree ? --WQ/nOZqcqYGY8uZi Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (FreeBSD) iEYEARECAAYFAlCANbcACgkQC3+MBN1Mb4iUggCg6D6yJWMjj5xWSZ8XBJpSdgMZ uIMAnieSq76yERz2C9XACrD1e+aTKJ0g =yG+l -----END PGP SIGNATURE----- --WQ/nOZqcqYGY8uZi-- From owner-freebsd-net@FreeBSD.ORG Thu Oct 18 18:09:17 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3C4B06DA; Thu, 18 Oct 2012 18:09:17 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id C52A18FC0A; Thu, 18 Oct 2012 18:09:16 +0000 (UTC) Received: by mail-vb0-f54.google.com with SMTP id v11so11593029vbm.13 for ; Thu, 18 Oct 2012 11:09:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=6xBmignxqv8e3WBsSItXa3RxiYoe1WRrJZ/N6CKnPh8=; b=K1V3I5qLhH+QphFD5pRElKzR4kE1ApJgsWEi6ap6vVIJtJbdh3AOAm1HrysxL65y9k Ya61cXCXaV+8GXg/W+1xRnrxyJeNOvv8st6kyynPZRVBZbXcRwXVnJnptYf14IxjuXbD ZiPPDUmmnXqhu4lZOHCCBQQBBRJ9tLYcQv8pGJPUgJ9pJaWk9keuhd3QWpdGRnw2lkMH g2R6409yOY9pQmFxjeK7nLmmx3vcUVwYDrFtY8K9BOL7nuesl9nrmrbCC9pMOHDIJHvW vAdzGew8OGbuMxTmbOmRtfpdU+SRJGHEoFYJmBmiF684fa87yLBj0QIQQqtoiXdiSQX2 +RVg== MIME-Version: 1.0 Received: by 10.52.34.37 with SMTP id w5mr13188544vdi.86.1350583756089; Thu, 18 Oct 2012 11:09:16 -0700 (PDT) Received: by 10.58.68.8 with HTTP; Thu, 18 Oct 2012 11:09:16 -0700 (PDT) In-Reply-To: <5080020E.1010603@networx.ch> References: <5079A9A1.4070403@FreeBSD.org> <20121013182223.GA73341@onelab2.iet.unipi.it> 
<5080020E.1010603@networx.ch> Date: Thu, 18 Oct 2012 11:09:16 -0700 Message-ID: Subject: Re: ixgbe & if_igb RX ring locking From: Jack Vogel To: Andre Oppermann Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: "Alexander V. Chernikov" , Luigi Rizzo , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2012 18:09:17 -0000 On Thu, Oct 18, 2012 at 6:20 AM, Andre Oppermann wrote: > On 13.10.2012 20:22, Luigi Rizzo wrote: > >> On Sat, Oct 13, 2012 at 09:49:21PM +0400, Alexander V. Chernikov wrote: >> >>> Hello list! >>> >>> >>> Packets receiving code for both ixgbe and if_igb looks like the >>> following: >>> >>> >>> ixgbe_msix_que >>> >>> -- ixgbe_rxeof() >>> { >>> IXGBE_RX_LOCK(rxr); >>> while >>> { >>> get_packet; >>> >>> -- ixgbe_rx_input() >>> { >>> ++ IXGBE_RX_UNLOCK(rxr); >>> if_input(packet); >>> ++ IXGBE_RX_LOCK(rxr); >>> } >>> >>> } >>> IXGBE_RX_UNLOCK(rxr); >>> } >>> >>> Lines marked with ++ appeared in r209068(igb) and r217593(ixgbe). >>> >>> These lines probably do LORs masking (if any) well. >>> However, such change introduce quite significant performance drop: >>> >>> On my routing setup (nearly the same from previous -Intel 10G thread in >>> -net) adding lock/unlock causes 2.8MPPS decrease to 2.3MPPS which is >>> nearly 20%. >>> >> >> one option could be (same as it is done in the timer >> routine in dummynet) to build a list of all the packets >> that need to be sent to if_input(), and then call >> if_input with the entire list outside the lock. >> >> It would be even easier if we modify the various *_input() >> routines to handle a list of mbufs instead of just one. >> > > Not really. You'd just run into tons of layering complexity. > Somewhere the decomposition and serialization has to be done. 
> > Perhaps the right place is to dequeue a batch of packets from > the HW ring and then have a task/thread send it up the stack > one by one. > I was thinking about how to code this, something like what I did with the refresh routine, in any case I will experiment with it. Jack From owner-freebsd-net@FreeBSD.ORG Thu Oct 18 18:43:56 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C1973F9C; Thu, 18 Oct 2012 18:43:56 +0000 (UTC) (envelope-from luigi@onelab2.iet.unipi.it) Received: from onelab2.iet.unipi.it (onelab2.iet.unipi.it [131.114.59.238]) by mx1.freebsd.org (Postfix) with ESMTP id 7D7788FC1B; Thu, 18 Oct 2012 18:43:56 +0000 (UTC) Received: by onelab2.iet.unipi.it (Postfix, from userid 275) id 005EF73029; Thu, 18 Oct 2012 21:04:21 +0200 (CEST) Date: Thu, 18 Oct 2012 21:04:20 +0200 From: Luigi Rizzo To: Andre Oppermann Subject: Re: ixgbe & if_igb RX ring locking Message-ID: <20121018190420.GB98348@onelab2.iet.unipi.it> References: <5079A9A1.4070403@FreeBSD.org> <20121013182223.GA73341@onelab2.iet.unipi.it> <5080020E.1010603@networx.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5080020E.1010603@networx.ch> User-Agent: Mutt/1.4.2.3i Cc: "Alexander V. Chernikov" , Jack Vogel , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2012 18:43:56 -0000 On Thu, Oct 18, 2012 at 03:20:14PM +0200, Andre Oppermann wrote: > On 13.10.2012 20:22, Luigi Rizzo wrote: > >On Sat, Oct 13, 2012 at 09:49:21PM +0400, Alexander V. Chernikov wrote: > >>Hello list! 
> >> > >> > >>Packets receiving code for both ixgbe and if_igb looks like the following: > >> > >> > >>ixgbe_msix_que > >> > >>-- ixgbe_rxeof() > >> { > >> IXGBE_RX_LOCK(rxr); > >> while > >> { > >> get_packet; > >> > >> -- ixgbe_rx_input() > >> { > >> ++ IXGBE_RX_UNLOCK(rxr); > >> if_input(packet); > >> ++ IXGBE_RX_LOCK(rxr); > >> } > >> > >> } > >> IXGBE_RX_UNLOCK(rxr); > >> } > >> > >>Lines marked with ++ appeared in r209068(igb) and r217593(ixgbe). > >> > >>These lines probably do LORs masking (if any) well. > >>However, such change introduce quite significant performance drop: > >> > >>On my routing setup (nearly the same from previous -Intel 10G thread in > >>-net) adding lock/unlock causes 2.8MPPS decrease to 2.3MPPS which is > >>nearly 20%. > > > >one option could be (same as it is done in the timer > >routine in dummynet) to build a list of all the packets > >that need to be sent to if_input(), and then call > >if_input with the entire list outside the lock. > > > >It would be even easier if we modify the various *_input() > >routines to handle a list of mbufs instead of just one. > > Not really. You'd just run into tons of layering complexity. > Somewhere the decomposition and serialization has to be done. > > Perhaps the right place is to dequeue a batch of packets from > the HW ring and then have a task/thread send it up the stack > one by one. this is exactly what the dummynet code does -- collect a batch of packets, release the lock, and then loop over the batch to feed ip_input/ip_output or other things. My point was, however, that instead of having to write an explicit loop in all clients of ether_input(), we could make ether_input() itself (or ether_input_batch(), does not really matter) able to handle the batch and in turn call the main function. 
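The pattern described above (collect a batch under the lock, drop the lock, then walk the batch) can be sketched in userland C as follows. This is a minimal sketch under stated assumptions, not driver code: every sketch_* name is hypothetical, the "lock" is only a counter that records acquisitions, packets are plain integers, and sketch_if_input() merely stands in for handing a packet up the stack.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_RING_SIZE 64
#define SKETCH_BATCH_MAX 16

/* Stand-in for the RX ring lock: it only counts acquisitions. */
struct sketch_lock {
	int		held;
	unsigned long	acquisitions;
};

/* Stand-in for an RX ring; packets are plain integers here. */
struct sketch_ring {
	struct sketch_lock lock;
	int	pkts[SKETCH_RING_SIZE];
	size_t	head;			/* next slot to consume */
	size_t	tail;			/* next slot to fill */
};

static unsigned long sketch_delivered;	/* packets handed upward */

static void
sketch_lock_acquire(struct sketch_lock *l)
{
	assert(!l->held);
	l->held = 1;
	l->acquisitions++;
}

static void
sketch_lock_release(struct sketch_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Stand-in for if_input(): must run without the ring lock held. */
static void
sketch_if_input(struct sketch_ring *r, int pkt)
{
	(void)pkt;
	assert(!r->lock.held);
	sketch_delivered++;
}

/* Drain the ring in batches: one lock round trip per batch, not per packet. */
static void
sketch_rxeof_batched(struct sketch_ring *r)
{
	int batch[SKETCH_BATCH_MAX];
	size_t i, n;

	do {
		/* Phase 1: dequeue up to SKETCH_BATCH_MAX packets under the lock. */
		sketch_lock_acquire(&r->lock);
		for (n = 0; n < SKETCH_BATCH_MAX && r->head != r->tail; n++) {
			batch[n] = r->pkts[r->head];
			r->head = (r->head + 1) % SKETCH_RING_SIZE;
		}
		sketch_lock_release(&r->lock);

		/* Phase 2: deliver the whole batch with the lock dropped. */
		for (i = 0; i < n; i++)
			sketch_if_input(r, batch[i]);
	} while (n == SKETCH_BATCH_MAX);	/* a full batch may mean more is queued */
}
```

With 40 queued packets and a batch size of 16, this loop takes the lock three times (batches of 16, 16 and 8), whereas an unlock/relock around every delivery would cost one extra round trip per packet. The batch size is an arbitrary choice for the sketch.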
The frontend then could apply some smarts to try and group packets (not too different from TCP Receive Side Coalescing/Large Receive Offload) within the batch, and this could be done without locking/unlocking on each packet. Furthermore, chances are that you can pass batches from one layer to the next one in this way, something that wouldn't work if your workflow can only handle one packet at a time. And finally, the good thing is that implementation can be incremental and on a case-by-case basis. The VALE bridge uses this strategy http://info.iet.unipi.it/~luigi/vale/ and moving batches instead of single packets brings the forwarding rate from 4 to 17~Mpps. At high rates, it really pays off. cheers luigi > -- > Andre > From owner-freebsd-net@FreeBSD.ORG Thu Oct 18 21:00:30 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id D942C351 for ; Thu, 18 Oct 2012 21:00:30 +0000 (UTC) (envelope-from rafaelhfaria@cenadigital.com.br) Received: from mail-pb0-f54.google.com (mail-pb0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id A35A18FC1A for ; Thu, 18 Oct 2012 21:00:30 +0000 (UTC) Received: by mail-pb0-f54.google.com with SMTP id rp8so9418181pbb.13 for ; Thu, 18 Oct 2012 14:00:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cenadigital.com.br; s=mail; h=mime-version:from:date:message-id:subject:to:content-type; bh=29S3XOJqxqmp3+6lnfKAIWouKSzQu5OsydZjCeuhDtU=; b=GV+wbq2YwOA59oefF4s1fUdeIGoK5uC20NlrAsLQQVZF9Ax1UdRIToZiYGMvpmQ6pf GbGIZWVTLFQcDug2TNbVz0owDeD2By3Do5JXUh3oSW+6aOGd9gbJmjttMiP3/X9Na7bW 0ARbMjOjpWRDhCi0+RzebRJqWPPRAG8oSZLZc= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:from:date:message-id:subject:to:content-type :x-gm-message-state; bh=29S3XOJqxqmp3+6lnfKAIWouKSzQu5OsydZjCeuhDtU=; b=IHZs3xP5cJ+iL71D0K3aT8abD6bKTkOPMBCZt5EAat73DVVzKWu072nl7sCFLI+sqm 
a23rvNbVQOcyBm/gzZ8r5PjOrvQ+NKwIwNVKFFAZK8Isyco0nmglzek7hGyclPqN+eh5 advIU7Y523rLoF0zImtstFEFd+JaTy/4+fwicPo/5P6Hjxhm+HQvo3ZlCJdCZFYZPGwo cJcHActh/4decbvTd2LBh3lnkeBimphMtCMUL0Sr/Ly0B8HPkjKmZMzttWkicfvtUXwU CqxNjrRe2WGtx96SNc+P9szY5xYLBGT+36DK9POVnaToSu4BTU1AjyFGKNvVQCEwM+JA 92VQ== Received: by 10.68.189.138 with SMTP id gi10mr69387676pbc.165.1350594029708; Thu, 18 Oct 2012 14:00:29 -0700 (PDT) MIME-Version: 1.0 Received: by 10.66.11.166 with HTTP; Thu, 18 Oct 2012 13:59:49 -0700 (PDT) From: Rafael Henrique Faria Date: Thu, 18 Oct 2012 17:59:49 -0300 Message-ID: Subject: CARP on vSwitch To: freebsd-net@freebsd.org X-Gm-Message-State: ALoCoQmy4hdv0DoZOq3zaNyKSghNoECon+XyCkPpTlveAWM1r7qh/QPNccsmjSBTs7FVMgJQGAD2 Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2012 21:00:31 -0000

Hi,

I'm trying to use CARP on two FreeBSD servers in an ESX environment, but
it's not working.

The problem is that every frame sent by CARP comes back to the same host.
This is an old problem:

http://www.mail-archive.com/freebsd-net@freebsd.org/msg30562.html

There is already a patch, but it's 3 years old and has not been committed
yet. Is there any reason for this?
I have always used freebsd-update to keep the servers updated, and don't
want to compile a kernel just to use CARP.

Does anyone have a suggestion or a fix for this problem?

Thanks in advance.
-- Rafael Henrique da Silva Faria From owner-freebsd-net@FreeBSD.ORG Thu Oct 18 21:41:17 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id ABF11DFA for ; Thu, 18 Oct 2012 21:41:17 +0000 (UTC) (envelope-from david@catwhisker.org) Received: from albert.catwhisker.org (m209-73.dsl.rawbw.com [198.144.209.73]) by mx1.freebsd.org (Postfix) with ESMTP id 640678FC17 for ; Thu, 18 Oct 2012 21:41:17 +0000 (UTC) Received: from albert.catwhisker.org (localhost [127.0.0.1]) by albert.catwhisker.org (8.14.5/8.14.5) with ESMTP id q9ILfB65018156 for ; Thu, 18 Oct 2012 14:41:11 -0700 (PDT) (envelope-from david@albert.catwhisker.org) Received: (from david@localhost) by albert.catwhisker.org (8.14.5/8.14.5/Submit) id q9ILfBgB018155 for net@freebsd.org; Thu, 18 Oct 2012 14:41:11 -0700 (PDT) (envelope-from david) Date: Thu, 18 Oct 2012 14:41:10 -0700 From: David Wolfskill To: net@freebsd.org Subject: Seeing "rn_addmask: mask impossibly already in tree" on console Message-ID: <20121018214110.GE1817@albert.catwhisker.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="gRZ38brEgCoUohoa" Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: net@freebsd.org, David Wolfskill List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 18 Oct 2012 21:41:17 -0000 --gRZ38brEgCoUohoa Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Running: FreeBSD g1-227.catwhisker.org 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #273 24= 1679M: Thu Oct 18 04:52:24 PDT 2012 root@g1-227.catwhisker.org:/usr/obj= /usr/src/sys/CANARY i386 on my laptop, at times when I switch from X to vty0, I see (e.g.): 
Oct 18 08:27:07 g1-227 kernel: rn_addmask: mask impossibly already in treern_addmask: mask impossibly already in treern_addmask: mask impossibly already in treern_addmask: mask impossibly already in treern_addmask: mask impossibly already in treern_addmask: mask impossibly already in treern_addmask: mask impossibly already in tree...

I see where the message is issued (sys/net/radix.c:539 @r210122, last
updated 2010-07-15 14:41:59Z).

As this is a laptop, and thus subject to being connected to networks I
do not control, I run a packet filter on it, and the one with which I
happen to be most familiar is ipfw. Thus that's what I use.

So it's possible that either ipfw or routing is driving rn_addmask().

However, I'm unclear on:

* What (specifically) is actually causing this.

* Whether or not it's enough of an issue or problem that I should take
  evasive action. Put differently: what is my exposure here?

* If so, what sort of evasive action is appropriate for me to take.

I suppose I could try a packet-capture, but the lack of timestamps
makes correlating the message-issuance with particular packets a little
more challenging than I'd prefer.

I note, too, that I'm running a very similar ipfw configuration on the
packet-filter machine here at home; while I find the above-quoted
whines in /var/log/console.log on the laptop, I do not find them
mentioned in that file for the packet filter machine.

Clues?

[Please include me in the recipient list, as I'm not subscribed to
net@; I've set Reply-To as a hint.]

Thanks!

Peace,
david
-- 
David H. Wolfskill david@catwhisker.org
Taliban: Evil men with guns afraid of truth from a 14-year old girl.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
--gRZ38brEgCoUohoa Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlCAd3UACgkQmprOCmdXAD1x+wCfTfmvzlGbIzPcoS7pv4HhdvJC VecAn1FFqYD2W2Zkjb9rjwNxNfqg8wOE =DPnx -----END PGP SIGNATURE----- --gRZ38brEgCoUohoa-- From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 00:05:32 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 51F579DF; Fri, 19 Oct 2012 00:05:32 +0000 (UTC) (envelope-from kob6558@gmail.com) Received: from mail-lb0-f182.google.com (mail-lb0-f182.google.com [209.85.217.182]) by mx1.freebsd.org (Postfix) with ESMTP id 8EF8E8FC1B; Fri, 19 Oct 2012 00:05:30 +0000 (UTC) Received: by mail-lb0-f182.google.com with SMTP id b5so7765432lbd.13 for ; Thu, 18 Oct 2012 17:05:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=IIsw37PqoV2e/719pra0mZrpmVMAiycBPKw0J9ZxOGA=; b=GU77uSZwLBpY3wL6i+pEDs54arD/azBdiTd7k0BPm1Sem3Un8qmHkEpi33HMnTw9rX 1pjF/X4Fm7wKWAiopKmB+TMHWOUl22uyIcvSsgm/QGlc9i4qYh2LbiqFWiZmSbNAGg1d F5vg9hnEFWRl8SuYPyi3gqqMCyayjVXk/XRoXmKr4CFJ2QA6LLDNwnLB4QuERS3k79g6 yFqG+tjAfr5evZqphq5GJsdsQIyegv3VmaxnR5mqdjM8ioA6SALLWgDcSN3IHioaCgcD y0PKJ272dGBBysIn2s8Wt5RXVWHOkP8EcUTWwARd1PGKpdKosvFHcYQV3BD/x+YutTJX AlAA== MIME-Version: 1.0 Received: by 10.112.103.106 with SMTP id fv10mr8411346lbb.8.1350605129973; Thu, 18 Oct 2012 17:05:29 -0700 (PDT) Received: by 10.112.4.227 with HTTP; Thu, 18 Oct 2012 17:05:29 -0700 (PDT) In-Reply-To: <16534.1350461943@tristatelogic.com> References: <16534.1350461943@tristatelogic.com> Date: Thu, 18 Oct 2012 17:05:29 -0700 Message-ID: Subject: Re: Wireless Networking Bug(s) in 9.1-RC2 (?) From: Kevin Oberman To: "Ronald F. 
Guilmette" Content-Type: text/plain; charset=UTF-8 Cc: freebsd-net@freebsd.org, Adrian Chadd X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 00:05:32 -0000 On Wed, Oct 17, 2012 at 1:19 AM, Ronald F. Guilmette wrote: > > In message > , you wrote: > >>for wifi - you need to configure /etc/wpa_supplicant.conf as well, >>right? > > Did that. Yes. > >>You don't need the ssid in the ifconfig line; > > OK. If you say so. (See my prior e-mail where I wondered aloud if there > are circumstances where the ssid might have to appear in both places.) > > wpa_supplicant > 9 >>will scan and find your AP. >> >>The driver should call back to non-n and non-g if needs be. >> >>As for the config - erm, you have two interfaces on the same L2. >>That's going to confuse things, right? > > Well, I can't speak for the hardware, but it sure as hell does confuse > *me*. (1/2 :-) > >>What's 'netstat -rn' show? 
>
>
> Routing tables
>
> Internet:
> Destination        Gateway            Flags    Refs      Use  Netif Expire
> default            192.168.1.1        UGS         0   104122    re0
> 127.0.0.1          link#10            UH          0        0    lo0
> 192.168.1.0/24     link#4             U           0    23515    re0
> 192.168.1.21       link#11            UHS         0        0    lo0
> 192.168.1.23       link#4             UHS         0        0    lo0
>
> Internet6:
> Destination                       Gateway                         Flags  Netif Expire
> ::/96                             ::1                             UGRS   lo0
> ::1                               link#10                         UH     lo0
> ::ffff:0.0.0.0/96                 ::1                             UGRS   lo0
> fe80::/10                         ::1                             UGRS   lo0
> fe80::%re0/64                     link#4                          U      re0
> fe80::224:21ff:fe65:ada0%re0      link#4                          UHS    lo0
> fe80::%lo0/64                     link#10                         U      lo0
> fe80::1%lo0                       link#10                         UHS    lo0
> fe80::%wlan0/64                   link#11                         U      wlan0
> fe80::222:fbff:fe76:6d18%wlan0    link#11                         UHS    lo0
> ff01::%re0/32                     fe80::224:21ff:fe65:ada0%re0    U      re0
> ff01::%lo0/32                     ::1                             U      lo0
> ff01::%wlan0/32                   fe80::222:fbff:fe76:6d18%wlan0  U      wlan0
> ff02::/16                         ::1                             UGRS   lo0
> ff02::%re0/32                     fe80::224:21ff:fe65:ada0%re0    U      re0
> ff02::%lo0/32                     ::1                             U      lo0
> ff02::%wlan0/32                   fe80::222:fbff:fe76:6d18%wlan0  U      wlan0

To use WPA and a static address, you need something like:
ifconfig_wlan0="WPA inet 192.168.1.21/24"
so that was OK.

Now, you seem to have both interfaces on the same /24 with a /24
netmask. This is probably going to result in some poorly defined
behavior.

I'm not sure just what you are trying to do, but I suspect that it is
not what you are doing. If you are trying to allow the system to use
wired when it is connected and wireless when disconnected, this is the
wrong way. You should put both interfaces into a lagg and have a
single IP on the lagg interface. As it is, there is no way to be sure
which outgoing interface will be used when both are connected.

This said, I am not sure how this might cause the interface to fail to
associate. I'm guessing that you are simply not associating and the
scan is falling back to 'B' after failing to find an AP in faster
modes. The question is "why?".

What is the output of "ifconfig wlan0 list aps"?
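
For the lagg suggestion in the previous paragraphs, a minimal /etc/rc.conf sketch might look like the lines below (they are ordinary sh variable assignments). Several things here are assumptions rather than facts from this thread: that the wireless parent device is iwn0, that "failover" is the lagg protocol wanted, and that the MAC address shown is the wlan0 address implied by the fe80::222:fbff:fe76:6d18 entry in the quoted routing table. Check the real values with ifconfig first; this has not been tested on the system in question.

```sh
# /etc/rc.conf fragment: wired/wireless failover through lagg(4).
# All values are illustrative assumptions; adjust to the real hardware.

# The wired port takes on the wireless MAC address, so the lagg keeps
# one link-layer address whichever port is currently active.
ifconfig_re0="ether 00:22:fb:76:6d:18 up"

# Create wlan0 on the (assumed) iwn0 parent; wpa_supplicant handles WPA.
wlans_iwn0="wlan0"
ifconfig_wlan0="WPA"

# The single IP address lives on lagg0. re0 is added first, so it is
# the preferred port; wlan0 carries traffic only when re0 loses link.
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto failover laggport re0 laggport wlan0 inet 192.168.1.21/24"
defaultrouter="192.168.1.1"
```

With something like this there is only one interface (lagg0) holding an address in 192.168.1.0/24, so the question of which of two equally specific routes gets used should not come up.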
One thing I see is: country US authmode WPA1+WPA2/802.11i privacy OFF For home users I would normally expect WPA-PSK to be used. What key_mgmt are you specifying? It looks like authentication might be failing. You might try running the supplicant manually (after stopping any that is running) and see what you get. > P.S. I ain't using IPv6... like not at all. Unfortunate, but I can't run it at home, either, as Comcast is not yet offering it in my area. (Nor is Verizon who will be my provider next month.) -- R. Kevin Oberman, Network Engineer E-mail: kob6558@gmail.com From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 04:24:44 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 05D71B1D; Fri, 19 Oct 2012 04:24:44 +0000 (UTC) (envelope-from rfg@tristatelogic.com) Received: from outgoing.tristatelogic.com (segfault.tristatelogic.com [69.62.255.118]) by mx1.freebsd.org (Postfix) with ESMTP id B35228FC17; Fri, 19 Oct 2012 04:24:43 +0000 (UTC) Received: from segfault-nmh-helo.tristatelogic.com (localhost [127.0.0.1]) by segfault.tristatelogic.com (Postfix) with ESMTP id 3B00250821; Thu, 18 Oct 2012 21:24:36 -0700 (PDT) To: Kevin Oberman Subject: Re: Wireless Networking Bug(s) in 9.1-RC2 (?) In-Reply-To: Date: Thu, 18 Oct 2012 21:24:36 -0700 Message-ID: <2529.1350620676@tristatelogic.com> From: "Ronald F. Guilmette" Cc: freebsd-net@freebsd.org, Adrian Chadd X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 04:24:44 -0000 In message , Kevin Oberman wrote: >To use WPA and a static address, you need something like: >ifconfig_wlan0 ="WPA inet 192.168.1.21/24" >so that was OK. Yea, actually I did already have the static+WPA working. 
>Now, you seem to have both interfaces on the same /24 with a /24
>netmask. This is probably going to result in some poorly defined
>behavior.

:-) I think that's the polite way of putting it.

>I'm not sure just what you are trying to do...

That's OK. That makes two of us. (1/2 :-)

>but I suspect that it is not what you are doing.

Actually, I wasn't trying to achieve much of anything, specifically, when
I had _two_ ifconfig_XXX= lines in rc.conf for _both_ my wired and wireless
interfaces. I was just being lazy, not taking the ifconfig for the wired
out when I started using the wireless. And then I looked at it and realized
pretty much what you said, which is basically: Who the hell knows where the
packets will go if there are multiple routes from one machine to someplace
else, and if none of them is more specific than the other.

It's definitely an enigma. Does this produce Heisenbergian packet flow?
(I was rather hoping that you FreeBSD networking gurus would enlighten me
on this somewhat interesting, even if obscure point.)

>This said, I am not sure how this might cause the interface to fail to
>associate.

I am with you. I don't think it would or should.

>I'm guessing that you are simply not associating and the
>scan is falling back to 'B' after failing to find an AP in faster
>modes. The question is "why?".

Yea, it seems kind of odd.

>What is the output of "ifconfig wlan0 list aps"?

Umm... well... I've rebooted since I mailed/posted earlier, and now things
are looking rather different. In particular, it appears that I have `G'
now on the wireless link _and_ from elsewhere on my network I can
successfully ping _both_ 192.168.1.23 (the wired connection) _and_ also
192.168.1.21 (the wireless connection). And traceroute says that those are
both one hop away from my other server which is at 192.168.1.2.
============================================================================== re0: flags=8843 metric 0 mtu 1500 options=8209b ether 00:24:21:65:ad:a0 inet 192.168.1.23 netmask 0xffffff00 broadcast 192.168.1.255 inet6 fe80::224:21ff:fe65:ada0%re0 prefixlen 64 scopeid 0x4 nd6 options=29 media: Ethernet autoselect (100baseTX ) status: active iwn0: flags=8843 metric 0 mtu 2290 ether 00:22:fb:76:6d:18 nd6 options=29 media: IEEE 802.11 Wireless Ethernet autoselect mode 11ng status: associated lo0: flags=8049 metric 0 mtu 16384 options=600003 inet6 ::1 prefixlen 128 inet6 fe80::1%lo0 prefixlen 64 scopeid 0xa inet 127.0.0.1 netmask 0xff000000 nd6 options=21 wlan0: flags=8843 metric 0 mtu 1500 ether 00:22:fb:76:6d:18 inet 192.168.1.21 netmask 0xffffff00 broadcast 192.168.1.255 inet6 fe80::222:fbff:fe76:6d18%wlan0 prefixlen 64 scopeid 0xb nd6 options=29 media: IEEE 802.11 Wireless Ethernet OFDM/36Mbps mode 11ng status: associated ssid ronair2-1 channel 11 (2462 MHz 11g ht/40-) bssid c0:c1:c0:8b:4b:f3 country US authmode WPA2/802.11i privacy ON deftxkey UNDEF AES-CCM 2:128-bit AES-CCM 3:128-bit txpower 15 bmiss 10 scanvalid 450 bgscan bgscanintvl 300 bgscanidle 250 roam:rssi 7 roam:rate 64 protmode CTS ampdulimit 64k ampdudensity 8 -amsdutx amsdurx shortgi wme roaming MANUAL ============================================================================== Here's the stuff that you specifically asked for, although I don't know if it is even relevant anymore: ============================================================================== % ifconfig wlan0 list aps SSID/MESH ID BSSID CHAN RATE S:N INT CAPS Cisco 58:6d:8f:7e:6c:5d 11 54M -74:-95 100 EP RSN HTCAP WPS WPA WME ronair2-1 c0:c1:c0:8b:4b:f3 11 54M -65:-95 100 EP RSN HTCAP WPS WME belkin.194 08:86:3b:6f:91:94 11 54M -81:-95 100 EP HTCAP WPA RSN WME WPS Cisco 58:6d:8f:7e:6c:5e 36 54M -86:-95 100 EP RSN HTCAP WPS WPA WME Fluff c0:3f:0e:78:3e:f5 2 54M -82:-95 100 EPS RSN WPA WME HTCAP ATH WPS linksysLA 00:18:f8:e6:4b:58 5 54M 
-90:-95 100 EP RSN HTCAP WPA WME
belkin.194....  08:86:3b:6f:91:96  149  54M  -90:-95  100  EP   WPS HTCAP WPA RSN WME
erikadoming...  a0:21:b7:9d:0f:98    5  54M  -83:-95  100  EPS  RSN WPA WME HTCAP ATH WPS
dcz_26          00:1b:2f:02:40:de   11  54M  -91:-95  100  EPS  WPA
==============================================================================

(My AP is "ronair2-1". As you can see, I live in a busy neighborhood.)

>One thing I see is:
>country US authmode WPA1+WPA2/802.11i privacy OFF

Huh?? Where are you seeing THAT?

Oh! I see. I guess it must have been in the ifconfig -a output that I
sent earlier. Well, as you can see above, that appears to have changed
now too.

>For home users I would normally expect WPA-PSK to be used.

Indeed. And as far as I know, that _is_ what I _am_ using.

>What key_mgmt are you specifying?

Sorry. I don't understand the question.

Anyway, the bottom line here is that it appears that I no longer have any
bug report to file. Whatever was causing me to get `B' (or was it `A'? I
forget now) before seems to have sorted itself out on its own.

As I was telling my neighbor just the other day, it never ceases to amaze
me what vast numbers of problems are so often cured by a simple hard reset
(i.e. power cycle).

Anyway, if I ever see the problem arise again, I'll let somebody know.
Regards, rfg From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 06:53:22 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id F30B1F3F for ; Fri, 19 Oct 2012 06:53:21 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-pa0-f54.google.com (mail-pa0-f54.google.com [209.85.220.54]) by mx1.freebsd.org (Postfix) with ESMTP id C0EF48FC0A for ; Fri, 19 Oct 2012 06:53:21 +0000 (UTC) Received: by mail-pa0-f54.google.com with SMTP id bi1so148914pad.13 for ; Thu, 18 Oct 2012 23:53:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=uEnKUqssRp4PjGfZ0pIJ93+U5sajYSJCSzW/wTkqC9s=; b=cfCFhw0lvTgvx8b4kcaH7qPugDGVsij8DninMpZVl4BCpu3rPMDgEDN7zpKEgYTiqC 7l/RIKGTAzVK/ByhmaIuPs6gJS+X7uLYj5o3oW/nNIqnfxyaOjgoRHBiIi5Q7RsEVrah Kn5sYcW3oDoRHptk7Iv8Tbwo/AWiSO2JhcZS6c6UM9+cjS5jcuSWOETHFB5INoN2if3Q iJFn7o8WXS7pHtZ1rEOSt/kv/NENZGrWznikhQgac99w88Xv6cvlZBXb/6nDly/0JUU4 wCMELqLUP1EorvGvbLvN1KiCUEC5n3yp798xGZ05gYT/B/+SDVNMz47RqckzIHFASSfh kCyA== MIME-Version: 1.0 Received: by 10.68.229.138 with SMTP id sq10mr2509879pbc.126.1350629601239; Thu, 18 Oct 2012 23:53:21 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.68.146.233 with HTTP; Thu, 18 Oct 2012 23:53:21 -0700 (PDT) In-Reply-To: <2529.1350620676@tristatelogic.com> References: <2529.1350620676@tristatelogic.com> Date: Thu, 18 Oct 2012 23:53:21 -0700 X-Google-Sender-Auth: tJqo2dGfIP6XUI-VIWyWmZotzH0 Message-ID: Subject: Re: Wireless Networking Bug(s) in 9.1-RC2 (?) From: Adrian Chadd To: "Ronald F. 
Guilmette" Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-net@freebsd.org, Kevin Oberman X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 06:53:22 -0000 The obscure answer has to do with what the L2 adjacency stuff is doing. Because it adds that default route out a specific interface, it will send ARP requests out that. Even if the other interface goes down, it'll still throw them out that interface. It's just a side effect of how the L2 adjacency stuff works. Adrian From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 07:53:07 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 73401864; Fri, 19 Oct 2012 07:53:07 +0000 (UTC) (envelope-from fabien.thomas@netasq.com) Received: from work.netasq.com (gwlille.netasq.com [91.212.116.1]) by mx1.freebsd.org (Postfix) with ESMTP id E61AA8FC14; Fri, 19 Oct 2012 07:53:06 +0000 (UTC) Received: from [10.2.1.1] (unknown [10.2.1.1]) by work.netasq.com (Postfix) with ESMTPSA id 54D6E2705764; Fri, 19 Oct 2012 09:53:05 +0200 (CEST) Subject: Re: ixgbe & if_igb RX ring locking Mime-Version: 1.0 (Apple Message framework v1283) From: Fabien Thomas In-Reply-To: Date: Fri, 19 Oct 2012 09:53:07 +0200 Message-Id: <390AF360-AEC3-495E-881A-1ACCFEF42815@netasq.com> References: <5079A9A1.4070403@FreeBSD.org> <20121013182223.GA73341@onelab2.iet.unipi.it> <5080020E.1010603@networx.ch> To: Jack Vogel X-Mailer: Apple Mail (2.1283) Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: "Alexander V. 
Chernikov" , Luigi Rizzo , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 07:53:07 -0000

On 18 Oct 2012, at 20:09, Jack Vogel wrote:

> On Thu, Oct 18, 2012 at 6:20 AM, Andre Oppermann wrote:
>
>> On 13.10.2012 20:22, Luigi Rizzo wrote:
>>
>>> On Sat, Oct 13, 2012 at 09:49:21PM +0400, Alexander V. Chernikov wrote:
>>>
>>>> Hello list!
>>>>
>>>>
>>>> Packets receiving code for both ixgbe and if_igb looks like the
>>>> following:
>>>>
>>>>
>>>> ixgbe_msix_que
>>>>
>>>> -- ixgbe_rxeof()
>>>> {
>>>>     IXGBE_RX_LOCK(rxr);
>>>>     while
>>>>     {
>>>>         get_packet;
>>>>
>>>>         -- ixgbe_rx_input()
>>>>         {
>>>> ++          IXGBE_RX_UNLOCK(rxr);
>>>>             if_input(packet);
>>>> ++          IXGBE_RX_LOCK(rxr);
>>>>         }
>>>>
>>>>     }
>>>>     IXGBE_RX_UNLOCK(rxr);
>>>> }
>>>>
>>>> Lines marked with ++ appeared in r209068(igb) and r217593(ixgbe).
>>>>
>>>> These lines probably do LORs masking (if any) well.
>>>> However, such change introduce quite significant performance drop:
>>>>
>>>> On my routing setup (nearly the same from previous -Intel 10G thread in
>>>> -net) adding lock/unlock causes 2.8MPPS decrease to 2.3MPPS which is
>>>> nearly 20%.
>>>>
>>>
>>> one option could be (same as it is done in the timer
>>> routine in dummynet) to build a list of all the packets
>>> that need to be sent to if_input(), and then call
>>> if_input with the entire list outside the lock.
>>>
>>> It would be even easier if we modify the various *_input()
>>> routines to handle a list of mbufs instead of just one.
>>>
>>
>> Not really. You'd just run into tons of layering complexity.
>> Somewhere the decomposition and serialization has to be done.
>>
>> Perhaps the right place is to dequeue a batch of packets from
>> the HW ring and then have a task/thread send it up the stack
>> one by one.
>>
>
> I was thinking about how to code this, something like what I did with
> the refresh routine, in any case I will experiment with it.

This modified version for mq polling creates a list of packets that are
injected later (mc is the list).

http://www.gitorious.org/~fabient/freebsd/fabient-freebsd/blobs/work/pollng_mq_stable_8/sys/dev/ixgbe/ixgbe.c#line4615

>
> Jack
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 08:01:57 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 4CA7AA5B for ; Fri, 19 Oct 2012 08:01:57 +0000 (UTC) (envelope-from sunkeyong@gmail.com) Received: from mail-qc0-f182.google.com (mail-qc0-f182.google.com [209.85.216.182]) by mx1.freebsd.org (Postfix) with ESMTP id E43258FC14 for ; Fri, 19 Oct 2012 08:01:56 +0000 (UTC) Received: by mail-qc0-f182.google.com with SMTP id l39so152036qcs.13 for ; Fri, 19 Oct 2012 01:01:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=T6awshakab+GOnOXmQ4FgrLRfJkjuzE5OBRl0WZUy4o=; b=02s/f2icWbXPCIUS7qcEBmbeL/hb8ckSS27nBIFN8R32TQscgT3OSlEo9uhX0CDV4W rYFYineIycpZClSDqtYKcedSK99KoVMLdgRp2QuQYqfp1ZB+QxRN3NyVIypEQqYBdrWR zXTG7ob8h4snVenMSjS5n5CAbYokQERlzSWoUImZWNA77oj6cT1wc+oS8lOAaLvi8K5v vO3rnonvwYBf7pwJaBZpkb+HublxQxPBsmBjSo1RUOO1kFJndv1iSZZrPAhulRqUeukX 5VE9+5ZxjQITXATCX0e69bTdk8syptPMRvUgCHdPO15TJvjodCdIEXKUwjVqRl0qjBlp GLAA== MIME-Version: 1.0 Received: by 10.224.213.10 with SMTP id gu10mr384338qab.10.1350633716181; Fri, 19 Oct
2012 01:01:56 -0700 (PDT) Received: by 10.49.39.167 with HTTP; Fri, 19 Oct 2012 01:01:56 -0700 (PDT) Date: Fri, 19 Oct 2012 16:01:56 +0800 Message-ID: Subject: kern/94020: [tcp] tcp_timer_2msl_tw NULL pointer dereference panic From: Sun Keyong To: freebsd-net@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 08:01:57 -0000

Hi,

I recently hit the tcp_timer_2msl_tw NULL pointer dereference panic. I
found PR 94020 for it, but its status is closed. Could anyone point me to
the fix, or tell me where I can find out how to fix it?

Thanks a lot
KY

From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 09:01:31 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 53175AE1 for ; Fri, 19 Oct 2012 09:01:31 +0000 (UTC) (envelope-from simond@irrelevant.org) Received: from mail-lb0-f182.google.com (mail-lb0-f182.google.com [209.85.217.182]) by mx1.freebsd.org (Postfix) with ESMTP id BBC988FC14 for ; Fri, 19 Oct 2012 09:01:29 +0000 (UTC) Received: by mail-lb0-f182.google.com with SMTP id b5so243180lbd.13 for ; Fri, 19 Oct 2012 02:01:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.org; s=irrelevant; h=mime-version:x-originating-ip:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=xJduYT4hlOPmQbFunLrA0S42mDzxMrivIORS1PXqfZg=; b=RWRXDiJpMHk5dSpNPzXkDBTzMFxTwNToTaXgyCOW4ke//ZWzhECLYsRsoAEb++IvVt JHz6JRt7ncbg00vIgLr1Q5x+JVT7DekCHuZJv0rXgUiVtI+AVFjK39G5F9SlQuH/5q69 70Mbscd6RHj52TLpQJJ3dEoJ8iLsVJjRB0qm4= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113;
h=mime-version:x-originating-ip:in-reply-to:references:date :message-id:subject:from:to:cc:content-type:x-gm-message-state; bh=xJduYT4hlOPmQbFunLrA0S42mDzxMrivIORS1PXqfZg=; b=C1F2yonZCe93v8HHdN5JEC4BdEKRGRlwyoEPXH8wRqIMhTwUsc/y7CfMA2VdDwzyco QKPdQ5kEh/2nHrT0VpMhZ538rlAAnsfq1gCaB+Ta4OPfBfsudFYKCyNSjB2Bgyo4Bigw w0dCxgjaiT9wgUGb0SzJOnk1qsUNMa4BWwQmcg5TBYERm7fp8i3j3o0FX74Utyf7zuoR HwKE+n4JmPLtX9xoitSo8YxkqCzRUJxLzFwIDpn3N8mv8PpQmXd4erZiVZtSovkgRp7F cBQk945WunFbkDLcTtUpvxLuNBdL/VUOx9f1ruKuJHaLnsMEV6xqxBioGs2uX7Pi4jWA Ob/g== MIME-Version: 1.0 Received: by 10.152.103.38 with SMTP id ft6mr528732lab.40.1350637288567; Fri, 19 Oct 2012 02:01:28 -0700 (PDT) Received: by 10.114.63.83 with HTTP; Fri, 19 Oct 2012 02:01:28 -0700 (PDT) X-Originating-IP: [94.31.26.5] In-Reply-To: References: Date: Fri, 19 Oct 2012 10:01:28 +0100 Message-ID: Subject: Re: CARP on vSwitch From: Simon Dick To: Rafael Henrique Faria Content-Type: text/plain; charset=UTF-8 X-Gm-Message-State: ALoCoQk68fXbUovceY4leu3RUOR5xLyPMu3yymydjHJwGORMxbBq33qAMmZ14jeU20Ve30q585tW Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 09:01:31 -0000 On 18 October 2012 21:59, Rafael Henrique Faria wrote: > Hi, I'm trying to use CARP on two FreeBSD servers in a ESX environment. But > it's not working. > > The problem is that every frame sent from CARP gets back to the same host. > This is an old problem: > > http://www.mail-archive.com/freebsd-net@freebsd.org/msg30562.html > > And already have a patch, but its 3 years old. And not yet commit-ed. There > is any reason for this? > I always used freebsd-update to keep the servers updated, and don't want to > compile a kernel just to use the CARP. > > Someone have any suggestion or correction to this problem? 
I found the following page useful when getting CARP working between ESXi servers: http://doc.pfsense.org/index.php/CARP_Configuration_Troubleshooting#ESX_VDS_Promisc_Workaround From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 09:15:06 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9D1BF290 for ; Fri, 19 Oct 2012 09:15:06 +0000 (UTC) (envelope-from oppermann@networx.ch) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id F0C258FC0C for ; Fri, 19 Oct 2012 09:15:05 +0000 (UTC) Received: (qmail 29979 invoked from network); 19 Oct 2012 10:53:51 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 19 Oct 2012 10:53:51 -0000 Message-ID: <50811A14.5080903@networx.ch> Date: Fri, 19 Oct 2012 11:15:00 +0200 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: Konstantin Belousov Subject: Re: A small cleanup patch References: <50800D9D.1090705@networx.ch> <20121018170039.GS35915@deviant.kiev.zoral.com.ua> In-Reply-To: <20121018170039.GS35915@deviant.kiev.zoral.com.ua> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: net@freebsd.org, Vijay Singh X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 09:15:06 -0000 On 18.10.2012 19:00, Konstantin Belousov wrote: > On Thu, Oct 18, 2012 at 04:09:33PM +0200, Andre Oppermann wrote: >> On 05.10.2012 01:21, Vijay Singh wrote: >>> Folks, I came up with this while going through the lltable code. >> >> Thank you. 
I just purged a larger number of stray spl* from the >> net*/* directories. This stuff won't be backported to 9-STABLE >> though. > > Why ? What is the value of having the fossils in the actively maintained > stable tree ? To avoid churn within a stable release track. -- Andre From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 11:25:39 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53]) by hub.freebsd.org (Postfix) with ESMTP id 633F0D03; Fri, 19 Oct 2012 11:25:39 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from [127.0.0.1] (hub.freebsd.org [8.8.178.136]) by mx2.freebsd.org (Postfix) with ESMTP id 19E8D3B53B7; Fri, 19 Oct 2012 11:25:36 +0000 (UTC) Message-ID: <508138A4.5030901@FreeBSD.org> Date: Fri, 19 Oct 2012 15:25:24 +0400 From: "Andrey V. Elsukov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:15.0) Gecko/20121010 Thunderbird/15.0.1 MIME-Version: 1.0 To: net@freebsd.org Subject: [RFC] Enabling IPFIREWALL_FORWARD in run-time X-Enigmail-Version: 1.4.3 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enigCA59115641B47F6217D4A48C" Cc: ipfw@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 11:25:39 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigCA59115641B47F6217D4A48C Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Hi All, Many years ago i have already proposed this feature, but at that time several people were against, because as they said, it could affect performance. Now, when we have high speed network adapters, SMP kernel and network stack, several locks acquired in the path of each packet, and i have an ability to test this in the lab. 
So, i prepared the patch, that removes IPFIREWALL_FORWARD option from the kernel and makes this functionality always build-in, but it is turned off by default and can be enabled via the sysctl(8) variable net.pfil.forward=1. http://people.freebsd.org/~ae/pfil_forward.diff Also we have done some tests with the ixia traffic generator connected via 10G network adapter. Tests have show that there is no visible difference, and there is no visible performance degradation. Any objections? -- WBR, Andrey V. Elsukov --------------enigCA59115641B47F6217D4A48C Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iQEcBAEBAgAGBQJQgTisAAoJEAHF6gQQyKF6+UwH/2xemnR6Si2AtLcRJrB0HpXa Kr8r2BCyTulAdBsYBznduCj4cvhpiVrXNhqIf9y1mrY4LMz0Ci98OClOTaom82t/ /1msCig4nt61ZV5X21aQ19xzWUqu/Njx1gGz63v2dBKAyhngdJ3EjGa5sU1L2RU2 wJnJ4/iSmq1IT9Y6x0iFXG+1LZTs/Kg+/9j5G8qnTJDRP0sIRwopG4Imd5MdHOLM KrnpCm2HMxvxq6xls4phaBy20p/Yy5LDl7iDgJLyK7Ro8TA05me6zVBzz9hnuJjJ zN65HAMlhZsfeXb5VxRfKh11QcS8jdYhHATUSYuHIlGibdAa4Pj+hZlWzVKTS1E= =9ra7 -----END PGP SIGNATURE----- --------------enigCA59115641B47F6217D4A48C-- From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 11:47:11 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id EC093305; Fri, 19 Oct 2012 11:47:10 +0000 (UTC) (envelope-from zam4ever@gmail.com) Received: from mail-qa0-f54.google.com (mail-qa0-f54.google.com [209.85.216.54]) by mx1.freebsd.org (Postfix) with ESMTP id 655C58FC08; Fri, 19 Oct 2012 11:47:10 +0000 (UTC) Received: by mail-qa0-f54.google.com with SMTP id p27so67209qat.13 for ; Fri, 19 Oct 2012 04:47:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to
:cc:content-type; bh=KCt2rbM2J0//nzGbRbVss9xuEKCsfdhls/n/5ye2RJE=; b=ysfoB5JETcdWj7wpywQ8gZJG837a3N1HcUGudlE1YaInSeL3tXjMtDS5pH2m0M1zVA OBAxH8ouBtesyLr2Tu2eDAJbktBrZSLJzAH7QJr/R/oRQy8UG5/e481saxOTKiPvqpWU i4eTXO7MOeakEo705nqHo4eoAFp5Lv2J6v/5cmE19aNHB8rStwVetSC1KtrCkw2gQoWR Xq+5KY2+C6JEfLpkkSFdEgNtzDwQlsM+mOlEY8Utwb8pCAcCv6USKoTbEpNW+hUxMKYF D+UdnAAyctuXHNRZiE7gFKmglLRhbAYkImso16z2hIg1EyeHk14aVxEaRdPs1NqQilaD 24LA== MIME-Version: 1.0 Received: by 10.224.168.136 with SMTP id u8mr602590qay.17.1350647223650; Fri, 19 Oct 2012 04:47:03 -0700 (PDT) Received: by 10.49.117.134 with HTTP; Fri, 19 Oct 2012 04:47:03 -0700 (PDT) Received: by 10.49.117.134 with HTTP; Fri, 19 Oct 2012 04:47:03 -0700 (PDT) In-Reply-To: <508138A4.5030901@FreeBSD.org> References: <508138A4.5030901@FreeBSD.org> Date: Fri, 19 Oct 2012 19:47:03 +0800 Message-ID: Subject: Re: [RFC] Enabling IPFIREWALL_FORWARD in run-time From: Zamri Besar To: "Andrey V. Elsukov" Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: ipfw@freebsd.org, net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 11:47:11 -0000 On Oct 19, 2012 7:25 PM, "Andrey V. Elsukov" wrote: > > Hi All, > > Many years ago i have already proposed this feature, but at that time > several people were against, because as they said, it could affect > performance. Now, when we have high speed network adapters, SMP kernel > and network stack, several locks acquired in the path of each packet, > and i have an ability to test this in the lab. > > So, i prepared the patch, that removes IPFIREWALL_FORWARD option from > the kernel and makes this functionality always build-in, but it is > turned off by default and can be enabled via the sysctl(8) variable > net.pfil.forward=1. 
> > http://people.freebsd.org/~ae/pfil_forward.diff > > Also we have done some tests with the ixia traffic generator connected > via 10G network adapter. Tests have show that there is no visible > difference, and there is no visible performance degradation. > > Any objections? > > -- > WBR, Andrey V. Elsukov > This is what I want many years ago too... ;) I vote for "yes" From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 11:47:11 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 675BF307 for ; Fri, 19 Oct 2012 11:47:11 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id D34C48FC0A for ; Fri, 19 Oct 2012 11:47:10 +0000 (UTC) Received: from skuns.kiev.zoral.com.ua (localhost [127.0.0.1]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id q9JBlGrI018455; Fri, 19 Oct 2012 14:47:16 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5) with ESMTP id q9JBl42Q093798; Fri, 19 Oct 2012 14:47:04 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.5/8.14.5/Submit) id q9JBl4Di093797; Fri, 19 Oct 2012 14:47:04 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Fri, 19 Oct 2012 14:47:04 +0300 From: Konstantin Belousov To: Andre Oppermann Subject: Re: A small cleanup patch Message-ID: <20121019114704.GY35915@deviant.kiev.zoral.com.ua> References: <50800D9D.1090705@networx.ch> <20121018170039.GS35915@deviant.kiev.zoral.com.ua> <50811A14.5080903@networx.ch> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="j7Mt6kSMu9nlWjvx" 
Content-Disposition: inline In-Reply-To: <50811A14.5080903@networx.ch> User-Agent: Mutt/1.5.21 (2010-09-15) X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.0 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: net@freebsd.org, Vijay Singh X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 11:47:11 -0000 --j7Mt6kSMu9nlWjvx Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Oct 19, 2012 at 11:15:00AM +0200, Andre Oppermann wrote: > On 18.10.2012 19:00, Konstantin Belousov wrote: > > On Thu, Oct 18, 2012 at 04:09:33PM +0200, Andre Oppermann wrote: > >> On 05.10.2012 01:21, Vijay Singh wrote: > >>> Folks, I came up with this while going through the lltable code. > >> > >> Thank you. I just purged a larger number of stray spl* from the > >> net*/* directories. This stuff won't be backported to 9-STABLE > >> though. > > > > Why ? What is the value of having the fossils in the actively maintained > > stable tree ? > > To avoid churn within a stable release track. IMO, this is the wrong argument. The artificial difference, esp. due to the nop and garbage, makes both code reading and MFC harder.
--j7Mt6kSMu9nlWjvx Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (FreeBSD) iEYEARECAAYFAlCBPbcACgkQC3+MBN1Mb4g2bwCgiSCaKpxMvb8SDGt/seLlFpEd t0sAoLbvwk45lSW8OxbVSy5bL+05wMKU =jAmM -----END PGP SIGNATURE----- --j7Mt6kSMu9nlWjvx-- From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 12:02:53 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id A4292A89 for ; Fri, 19 Oct 2012 12:02:53 +0000 (UTC) (envelope-from oppermann@networx.ch) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id E969E8FC0C for ; Fri, 19 Oct 2012 12:02:52 +0000 (UTC) Received: (qmail 35284 invoked from network); 19 Oct 2012 13:41:36 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 19 Oct 2012 13:41:36 -0000 Message-ID: <50814166.1000602@networx.ch> Date: Fri, 19 Oct 2012 14:02:46 +0200 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: "Andrey V. Elsukov" Subject: Re: [RFC] Enabling IPFIREWALL_FORWARD in run-time References: <508138A4.5030901@FreeBSD.org> In-Reply-To: <508138A4.5030901@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: ipfw@freebsd.org, net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 12:02:53 -0000 On 19.10.2012 13:25, Andrey V. Elsukov wrote: > Hi All, > > Many years ago i have already proposed this feature, but at that time > several people were against, because as they said, it could affect > performance. 
Now, when we have high speed network adapters, SMP kernel > and network stack, several locks acquired in the path of each packet, > and i have an ability to test this in the lab. > > So, i prepared the patch, that removes IPFIREWALL_FORWARD option from > the kernel and makes this functionality always build-in, but it is > turned off by default and can be enabled via the sysctl(8) variable > net.pfil.forward=1. > > http://people.freebsd.org/~ae/pfil_forward.diff > > Also we have done some tests with the ixia traffic generator connected > via 10G network adapter. Tests have show that there is no visible > difference, and there is no visible performance degradation. > > Any objections? No objection as such. However I don't entirely agree with the naming of pfil_forward. The functionality is specific to IPFW and TCP, it's doing transparent interjected termination of tcp connections on the local host while keeping the original IP addresses and port numbers visible in netstat output. So it's a feature of IPFW/IP and should be fitted in there for sysctl name and .h files instead of pfil. -- Andre From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 12:18:56 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53]) by hub.freebsd.org (Postfix) with ESMTP id 64A37E39; Fri, 19 Oct 2012 12:18:56 +0000 (UTC) (envelope-from ae@FreeBSD.org) Received: from [127.0.0.1] (hub.freebsd.org [8.8.178.136]) by mx2.freebsd.org (Postfix) with ESMTP id 232A43B4F7F; Fri, 19 Oct 2012 12:18:54 +0000 (UTC) Message-ID: <50814523.2070002@FreeBSD.org> Date: Fri, 19 Oct 2012 16:18:43 +0400 From: "Andrey V. 
Elsukov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:15.0) Gecko/20121010 Thunderbird/15.0.1 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: [RFC] Enabling IPFIREWALL_FORWARD in run-time References: <508138A4.5030901@FreeBSD.org> <50814166.1000602@networx.ch> In-Reply-To: <50814166.1000602@networx.ch> X-Enigmail-Version: 1.4.3 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enigC2F9C7A14662BA4A777BD6AB" Cc: ipfw@freebsd.org, net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 12:18:56 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigC2F9C7A14662BA4A777BD6AB Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable On 19.10.2012 16:02, Andre Oppermann wrote: >> http://people.freebsd.org/~ae/pfil_forward.diff >> >> Also we have done some tests with the ixia traffic generator connected >> via 10G network adapter. Tests have show that there is no visible >> difference, and there is no visible performance degradation. >> >> Any objections? > > No objection as such. However I don't entirely agree with the > naming of pfil_forward. The functionality is specific to IPFW > and TCP, it's doing transparent interjected termination of tcp > connections on the local host while keeping the original IP > addresses and port numbers visible in netstat output. > > So it's a feature of IPFW/IP and should be fitted in there for > sysctl name and .h files instead of pfil. Actually it can be used not only by ipfw. We already have net.inet.ip.forwarding and net.inet6.ip6.forwarding variables, and placing it into net.inet.ip.fw is undesirable, because we can have kernel without ipfw. So, i decided to choose pfil, because it could not work without pfil.
-- WBR, Andrey V. Elsukov --------------enigC2F9C7A14662BA4A777BD6AB Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iQEcBAEBAgAGBQJQgUUqAAoJEAHF6gQQyKF6pyMIAILQkM9tSI6KL5bdG7qotu/Q ulM49kdqP6eHNGt2FMCy634r6uM7HNPK0oY3cZq9acxbUFXf/es8PViz/ELCFmcL V1BUAoDj2J6PBx4n1oGNf+efV9J/s/7YHLj93RH1hgFWVOIOoPdzlyhm/bIs5Dz2 HQ7Nw92GfMCIFREEcZZ55H5ai9xUJoP4BOYDrJ/za9I/XpxTTzqoGUrEJFJUKJP9 ASArYTggA5UrESKTMg/WV2/pIlmwkfEtgAjzAkjweeUi4N3T6QRjY8w8lbz7aZn0 GOq60Ia6cmmrwDZkmTmJ9NJGNKQ7yRlheprcLh9pmoWPEKpgZedcYeDcTLkrprk= =fWAC -----END PGP SIGNATURE----- --------------enigC2F9C7A14662BA4A777BD6AB-- From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 13:56:54 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id CDBD5FC0; Fri, 19 Oct 2012 13:56:54 +0000 (UTC) (envelope-from smithi@nimnet.asn.au) Received: from sola.nimnet.asn.au (paqi.nimnet.asn.au [115.70.110.159]) by mx1.freebsd.org (Postfix) with ESMTP id 09EE58FC08; Fri, 19 Oct 2012 13:56:53 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by sola.nimnet.asn.au (8.14.2/8.14.2) with ESMTP id q9JDujNr096880; Sat, 20 Oct 2012 00:56:45 +1100 (EST) (envelope-from smithi@nimnet.asn.au) Date: Sat, 20 Oct 2012 00:56:45 +1100 (EST) From: Ian Smith To: "Andrey V.
Elsukov" Subject: Re: [RFC] Enabling IPFIREWALL_FORWARD in run-time In-Reply-To: <508138A4.5030901@FreeBSD.org> Message-ID: <20121020002249.X88114@sola.nimnet.asn.au> References: <508138A4.5030901@FreeBSD.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Cc: ipfw@freebsd.org, net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 13:56:54 -0000 On Fri, 19 Oct 2012 15:25:24 +0400, Andrey V. Elsukov wrote: > Hi All, > > Many years ago i have already proposed this feature, but at that time > several people were against, because as they said, it could affect > performance. Now, when we have high speed network adapters, SMP kernel > and network stack, several locks acquired in the path of each packet, > and i have an ability to test this in the lab. > > So, i prepared the patch, that removes IPFIREWALL_FORWARD option from > the kernel and makes this functionality always build-in, but it is > turned off by default and can be enabled via the sysctl(8) variable > net.pfil.forward=1. > > http://people.freebsd.org/~ae/pfil_forward.diff > > Also we have done some tests with the ixia traffic generator connected > via 10G network adapter. Tests have show that there is no visible > difference, and there is no visible performance degradation. > > Any objections? Looks great. I'll no longer have to tell people on questions@ that using ipfw fwd is the only reason left not to just load the module. Taking the code on trust, only to comment on the documentation: ipfw.8: ======= To enable .Cm fwd -a custom kernel needs to be compiled with the option -.Cd "options IPFIREWALL_FORWARD" . +the +.Xr sysctl 8 +variable +.Va net.pfil.forward +should be set to 1. 
NOTES: ======= -# IPFIREWALL_FORWARD enables changing of the packet destination either -# to do some sort of policy routing or transparent proxying. Used by -# ``ipfw forward''. All redirections apply to locally generated -# packets too. Because of this great care is required when -# crafting the ruleset. ipfw(8) could perhaps incorporate that description (and warning) from NOTES in the entry under SYSCTLS where net.pfil.forward (or whatever :) would be expected to be described, apart from sysctl -d ? cheers, Ian > WBR, Andrey V. Elsukov From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 14:05:50 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 40CDF3ED for ; Fri, 19 Oct 2012 14:05:50 +0000 (UTC) (envelope-from oppermann@networx.ch) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 994FB8FC14 for ; Fri, 19 Oct 2012 14:05:48 +0000 (UTC) Received: (qmail 35725 invoked from network); 19 Oct 2012 15:44:32 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 19 Oct 2012 15:44:32 -0000 Message-ID: <50815E36.6010703@networx.ch> Date: Fri, 19 Oct 2012 16:05:42 +0200 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: "Andrey V. 
Elsukov" Subject: Re: [RFC] Enabling IPFIREWALL_FORWARD in run-time References: <508138A4.5030901@FreeBSD.org> <50814166.1000602@networx.ch> <50814523.2070002@FreeBSD.org> In-Reply-To: <50814523.2070002@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: ipfw@freebsd.org, net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 14:05:50 -0000 On 19.10.2012 14:18, Andrey V. Elsukov wrote: > On 19.10.2012 16:02, Andre Oppermann wrote:>> > http://people.freebsd.org/~ae/pfil_forward.diff >>> >>> Also we have done some tests with the ixia traffic generator connected >>> via 10G network adapter. Tests have show that there is no visible >>> difference, and there is no visible performance degradation. >>> >>> Any objections? >> >> No objection as such. However I don't entirely agree with the >> naming of pfil_forward. The functionality is specific to IPFW >> and TCP, it's doing transparent interjected termination of tcp >> connections on the local host while keeping the original IP >> addresses and port numbers visible in netstat output. >> >> So it's a feature of IPFW/IP and should be fitted in there for >> sysctl name and .h files instead of pfil. > > Actually it can be used not only by ipfw. We already have > net.inet.ip.forwarding and net.inet6.ip6.forwarding variables, and > placing it into net.inet.ip.fw is undesirable, because we can have > kernel without ipfw. So, i decided to choose pfil, because it could not > work without pfil. Again, it's not a property of pfil. It's a property of IP and it should live there from a configuration point of view. Other firewalls than ipfw don't make use of it. You could rename it to transparent connection proxy or some such. 
-- Andre From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 15:17:44 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53]) by hub.freebsd.org (Postfix) with ESMTP id C7959854; Fri, 19 Oct 2012 15:17:44 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from dhcp170-36-red.yandex.net (freefall.freebsd.org [8.8.178.135]) by mx2.freebsd.org (Postfix) with ESMTP id E46CB3B6660; Fri, 19 Oct 2012 15:17:41 +0000 (UTC) Message-ID: <50816ECE.4020002@FreeBSD.org> Date: Fri, 19 Oct 2012 19:16:30 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:13.0) Gecko/20120627 Thunderbird/13.0.1 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: [RFC] Enabling IPFIREWALL_FORWARD in run-time References: <508138A4.5030901@FreeBSD.org> <50814166.1000602@networx.ch> <50814523.2070002@FreeBSD.org> <50815E36.6010703@networx.ch> In-Reply-To: <50815E36.6010703@networx.ch> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "Andrey V. Elsukov" , ipfw@freebsd.org, net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 15:17:44 -0000 On 19.10.2012 18:05, Andre Oppermann wrote: > On 19.10.2012 14:18, Andrey V. Elsukov wrote: >> On 19.10.2012 16:02, Andre Oppermann wrote:>> >> http://people.freebsd.org/~ae/pfil_forward.diff >>>> >>>> Also we have done some tests with the ixia traffic generator connected >>>> via 10G network adapter. Tests have show that there is no visible >>>> difference, and there is no visible performance degradation. >>>> >>>> Any objections? >>> >>> No objection as such. However I don't entirely agree with the >>> naming of pfil_forward. 
The functionality is specific to IPFW >>> and TCP, it's doing transparent interjected termination of tcp >>> connections on the local host while keeping the original IP >>> addresses and port numbers visible in netstat output. >>> >>> So it's a feature of IPFW/IP and should be fitted in there for >>> sysctl name and .h files instead of pfil. >> >> Actually it can be used not only by ipfw. We already have >> net.inet.ip.forwarding and net.inet6.ip6.forwarding variables, and >> placing it into net.inet.ip.fw is undesirable, because we can have >> kernel without ipfw. So, i decided to choose pfil, because it could not >> work without pfil. > > Again, it's not a property of pfil. It's a property of IP and it Not exactly. It is currently supported in both IPv4 and IPv6. > should live there from a configuration point of view. Other firewalls > than ipfw don't make use of it. > > You could rename it to transparent connection proxy or some such. fwd is widely used as policy-based routing, so it is not just upper-layer TCP feature. > -- WBR, Alexander From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 15:24:15 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53]) by hub.freebsd.org (Postfix) with ESMTP id 82BA4B13; Fri, 19 Oct 2012 15:24:15 +0000 (UTC) (envelope-from melifaro@FreeBSD.org) Received: from dhcp170-36-red.yandex.net (freefall.freebsd.org [8.8.178.135]) by mx2.freebsd.org (Postfix) with ESMTP id 2BB123B4F81; Fri, 19 Oct 2012 15:24:13 +0000 (UTC) Message-ID: <50817057.3090200@FreeBSD.org> Date: Fri, 19 Oct 2012 19:23:03 +0400 From: "Alexander V. 
Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:13.0) Gecko/20120627 Thunderbird/13.0.1 MIME-Version: 1.0 To: John Baldwin Subject: Re: ixgbe & if_igb RX ring locking References: <5079A9A1.4070403@FreeBSD.org> <507C1960.6050500@FreeBSD.org> <201210150904.27567.jhb@freebsd.org> <201210171006.51214.jhb@freebsd.org> In-Reply-To: <201210171006.51214.jhb@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org, Luigi Rizzo , Jack Vogel , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 15:24:15 -0000 On 17.10.2012 18:06, John Baldwin wrote: > On Monday, October 15, 2012 9:04:27 am John Baldwin wrote: >> On Monday, October 15, 2012 10:10:40 am Alexander V. Chernikov wrote: >>> On 13.10.2012 23:24, Jack Vogel wrote: >>>> On Sat, Oct 13, 2012 at 11:22 AM, Luigi Rizzo wrote: >>> >>>>> >>>>> one option could be (same as it is done in the timer >>>>> routine in dummynet) to build a list of all the packets >>>>> that need to be sent to if_input(), and then call >>>>> if_input with the entire list outside the lock. >>>>> >>>>> It would be even easier if we modify the various *_input() >>>>> routines to handle a list of mbufs instead of just one. >>> >>> Bulk processing is generally a good idea we probably should implement. >>> Probably starting from driver queue ending with marked mbufs >>> (OURS/forward/legacy processing (appletalk and similar))? 
>>> >>> This can minimize an impact for all >>> locks on RX side: >>> L2 >>> * rx PFIL hook >>> L3 (both IPv4 and IPv6) >>> * global IF_ADDR_RLOCK (currently commented out) >>> * Per-interface ADDR_RLOCK >>> * PFIL hook >>> >>> From the first glance, there can be problems with: >>> * Increased latency (we should have some kind of rx_process_limit), but >>> still >>> * reader locks being acquired for much longer amount of time >>> >>>>> >>>>> cheers >>>>> luigi >>>>> >>>>> Very interesting idea Luigi, will have to get that some thought. >>>> >>>> Jack >>> >>> Returning to original post topic: >>> >>> Given >>> 1) we are currently binding ixgbe ithreads to CPU cores >>> 2) RX queue lock is used by (indirectly) in only 2 places: >>> a) ISR routine (msix or legacy irq) >>> b) taskqueue routine which is scheduled if some packets remains in RX >>> queue and rx_process_limit ended OR we need something to TX >>> >>> 3) in practice taskqueue routine is a nightmare for many people since >>> there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after >>> some traffic burst happens: once it is called it starts to schedule >>> itself more and more replacing original ISR routine. Additionally, >>> increasing rx_process_limit does not help since taskqueue is called with >>> the same limit. Finally, currently netisr taskq threads are not bound to >>> any CPU which makes the process even more uncontrollable. >> >> I think part of the problem here is that the taskqueue in ixgbe(4) is >> bogusly rescheduled for TX handling. Instead, ixgbe_msix_que() should >> just start transmitting packets directly. >> >> I fixed this in igb(4) here: >> >> http://svnweb.freebsd.org/base?view=revision&revision=233708 >> >> You can try this for ixgbe(4). It also comments out a spurious taskqueue >> reschedule from the watchdog handler that might also lower the taskqueue >> usage. 
You can try changing that #if 0 to an #if 1 to test just the txeof >> changes: > > Is anyone able to test this btw to see if it improves things on ixgbe at all? > (I don't have any ixgbe hardware.) Yes. I'll try to to this next week (since ixgbe driver from at least 9-S fails to detect twinax cable which works in 8-S....)).
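[Archive note] The batching idea quoted in this thread (collect the received packets while the RX lock is held, then call if_input() on the whole list after the lock is dropped) can be sketched in plain userland C. This is only a rough illustration under stated assumptions, not the ixgbe(4) or igb(4) code: `struct pkt` stands in for an mbuf, `struct rx_ring` for a driver RX ring, `stack_input()` for if_input(), a pthread mutex for the RX ring lock, and `limit` plays the role of rx_process_limit. All of these names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf: a packet with a list link. */
struct pkt {
	struct pkt *next;
	int id;
};

/* Hypothetical stand-in for a driver RX ring and its lock. */
struct rx_ring {
	pthread_mutex_t lock;
	struct pkt *head;	/* packets waiting in the ring, oldest first */
	int lock_held;		/* instrumentation for this sketch only */
	int delivered;		/* packets handed to the stack */
	int delivered_locked;	/* deliveries made with the lock held */
	int last_id;		/* id of the most recently delivered packet */
};

/* Stand-in for if_input(): records what it saw, nothing more. */
static void
stack_input(struct rx_ring *r, struct pkt *p)
{
	if (r->lock_held)
		r->delivered_locked++;
	r->delivered++;
	r->last_id = p->id;
}

/*
 * Move up to `limit` packets from the ring to a private list under the
 * lock, then deliver them in order after the lock has been released.
 * Returns the number of packets delivered.
 */
int
rx_drain(struct rx_ring *r, int limit)
{
	struct pkt *batch = NULL, **tail = &batch;
	int n = 0;

	assert(limit > 0);

	pthread_mutex_lock(&r->lock);
	r->lock_held = 1;
	while (r->head != NULL && n < limit) {
		struct pkt *p = r->head;

		r->head = p->next;
		p->next = NULL;
		*tail = p;		/* append, so arrival order is kept */
		tail = &p->next;
		n++;
	}
	r->lock_held = 0;
	pthread_mutex_unlock(&r->lock);

	/* The stack runs without the ring lock held. */
	while (batch != NULL) {
		struct pkt *p = batch;

		batch = p->next;
		p->next = NULL;
		stack_input(r, p);
	}
	return (n);
}
```

The budget bounds how long the lock is held per call, which is one way to address the latency concern raised in the thread; whether the real drivers should batch this way is exactly what is being discussed, so treat the sketch as a reading aid only.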
> -- WBR, Alexander From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 15:48:03 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9ADE129E; Fri, 19 Oct 2012 15:48:03 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-vc0-f182.google.com (mail-vc0-f182.google.com [209.85.220.182]) by mx1.freebsd.org (Postfix) with ESMTP id E74748FC12; Fri, 19 Oct 2012 15:48:02 +0000 (UTC) Received: by mail-vc0-f182.google.com with SMTP id fw7so820958vcb.13 for ; Fri, 19 Oct 2012 08:48:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=LyQpCVIq+rlEOBpqfsniQevQqr2BzqLmlbe5ZKcEjGI=; b=XqPrwuIvrS2KLuRGcw8wwZnWDT4JsuYhRRsytdSrxFRmyh7v4Hji0Q7neaWOSeqJX/ DtAl54DqTKVLqbjQ7QXeC1MHT6L+ic6ors2/TXpJGtS2kjHT2VCoee9VvittH3vPIU1z P3hFxgO9+5m3XEVHjMC1m5WYbaNj/RN9g3HSNeMSXVKTdt6epgf5aB1CZwcXGJaRshYf 6GKgUmSElKrLtyBvJJkT4PRSUAO60A58TZC54BJpchK5XGIK4BNx7vhWJVf0PchM0rsB 8v/lN+wTHmqtUoxzz7rbdj6Jewq6h0Tdu0a7khpd3tcd5/3BlL0fvOtB71Xevbu6GBzI UJ4Q== MIME-Version: 1.0 Received: by 10.52.75.70 with SMTP id a6mr1762200vdw.5.1350661681798; Fri, 19 Oct 2012 08:48:01 -0700 (PDT) Received: by 10.58.68.8 with HTTP; Fri, 19 Oct 2012 08:48:01 -0700 (PDT) In-Reply-To: <50817057.3090200@FreeBSD.org> References: <5079A9A1.4070403@FreeBSD.org> <507C1960.6050500@FreeBSD.org> <201210150904.27567.jhb@freebsd.org> <201210171006.51214.jhb@freebsd.org> <50817057.3090200@FreeBSD.org> Date: Fri, 19 Oct 2012 08:48:01 -0700 Message-ID: Subject: Re: ixgbe & if_igb RX ring locking From: Jack Vogel To: "Alexander V. 
Chernikov" Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-net@freebsd.org, Luigi Rizzo , John Baldwin , net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 15:48:03 -0000 On Fri, Oct 19, 2012 at 8:23 AM, Alexander V. Chernikov < melifaro@freebsd.org> wrote: > On 17.10.2012 18:06, John Baldwin wrote: > >> On Monday, October 15, 2012 9:04:27 am John Baldwin wrote: >> >>> On Monday, October 15, 2012 10:10:40 am Alexander V. Chernikov wrote: >>> >>>> On 13.10.2012 23:24, Jack Vogel wrote: >>>> >>>>> On Sat, Oct 13, 2012 at 11:22 AM, Luigi Rizzo >>>>> wrote: >>>>> >>>> >>>> >>>>>> one option could be (same as it is done in the timer >>>>>> routine in dummynet) to build a list of all the packets >>>>>> that need to be sent to if_input(), and then call >>>>>> if_input with the entire list outside the lock. >>>>>> >>>>>> It would be even easier if we modify the various *_input() >>>>>> routines to handle a list of mbufs instead of just one. >>>>>> >>>>> >>>> Bulk processing is generally a good idea we probably should implement. >>>> Probably starting from driver queue ending with marked mbufs >>>> (OURS/forward/legacy processing (appletalk and similar))? >>>> >>>> This can minimize an impact for all >>>> locks on RX side: >>>> L2 >>>> * rx PFIL hook >>>> L3 (both IPv4 and IPv6) >>>> * global IF_ADDR_RLOCK (currently commented out) >>>> * Per-interface ADDR_RLOCK >>>> * PFIL hook >>>> >>>> From the first glance, there can be problems with: >>>> * Increased latency (we should have some kind of rx_process_limit), but >>>> still >>>> * reader locks being acquired for much longer amount of time >>>> >>>> >>>>>> cheers >>>>>> luigi >>>>>> >>>>>> Very interesting idea Luigi, will have to get that some thought. 
>>>>>> >>>>> >>>>> Jack >>>>> >>>> >>>> Returning to original post topic: >>>> >>>> Given >>>> 1) we are currently binding ixgbe ithreads to CPU cores >>>> 2) RX queue lock is used (indirectly) in only 2 places: >>>> a) ISR routine (msix or legacy irq) >>>> b) taskqueue routine which is scheduled if some packets remain in RX >>>> queue and rx_process_limit ended OR we need something to TX >>>> >>>> 3) in practice taskqueue routine is a nightmare for many people since >>>> there is no way to stop "kernel {ix0 que}" thread eating 100% cpu after >>>> some traffic burst happens: once it is called it starts to schedule >>>> itself more and more replacing original ISR routine. Additionally, >>>> increasing rx_process_limit does not help since taskqueue is called with >>>> the same limit. Finally, currently netisr taskq threads are not bound to >>>> any CPU which makes the process even more uncontrollable. >>>> >>> >>> I think part of the problem here is that the taskqueue in ixgbe(4) is >>> bogusly rescheduled for TX handling. Instead, ixgbe_msix_que() should >>> just start transmitting packets directly. >>> >>> I fixed this in igb(4) here: >>> >>> http://svnweb.freebsd.org/base?view=revision&revision=233708 >>> >>> You can try this for ixgbe(4). It also comments out a spurious taskqueue >>> reschedule from the watchdog handler that might also lower the taskqueue >>> usage. You can try changing that #if 0 to an #if 1 to test just the >>> txeof >>> changes: >>> >> >> Is anyone able to test this btw to see if it improves things on ixgbe at >> all? >> (I don't have any ixgbe hardware.) >> > Yes. I'll try to do this next week (since ixgbe driver from at least 9-S > fails to detect twinax cable which works in 8-S....)).
> >> >> > If you have a major problem like this you might want to put it in a bug report or at least an email with that specific topic rather than bury it in an unrelated thread in a parenthetic remark :( This is the first I've heard of this, did you check the code on HEAD to see if it also has the issue? Jack From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 20:06:30 2012 Return-Path: Delivered-To: net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 160597AE; Fri, 19 Oct 2012 20:06:30 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) by mx1.freebsd.org (Postfix) with ESMTP id D70C88FC08; Fri, 19 Oct 2012 20:06:29 +0000 (UTC) Received: from JRE-MBP-2.local (c-50-143-149-146.hsd1.ca.comcast.net [50.143.149.146]) (authenticated bits=0) by vps1.elischer.org (8.14.5/8.14.5) with ESMTP id q9JK6L8K068127 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Fri, 19 Oct 2012 13:06:23 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <5081B2BD.3090103@freebsd.org> Date: Fri, 19 Oct 2012 13:06:21 -0700 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: "Andrey V. Elsukov" Subject: Re: [RFC] Enabling IPFIREWALL_FORWARD in run-time References: <508138A4.5030901@FreeBSD.org> In-Reply-To: <508138A4.5030901@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: ipfw@freebsd.org, net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 20:06:30 -0000 On 10/19/12 4:25 AM, Andrey V. 
Elsukov wrote: > Hi All, > > Many years ago I proposed this feature, but at that time > several people were against it because, as they said, it could affect > performance. Now we have high-speed network adapters, an SMP kernel > and network stack, and several locks acquired in the path of each packet, > and I am able to test this in the lab. > > So I prepared a patch that removes the IPFIREWALL_FORWARD option from > the kernel and makes this functionality always built in, but it is > turned off by default and can be enabled via the sysctl(8) variable > net.pfil.forward=1. > > http://people.freebsd.org/~ae/pfil_forward.diff > > We have also done some tests with an Ixia traffic generator connected > via a 10G network adapter. The tests have shown no visible > difference and no visible performance degradation. > > Any objections? > No objection from me. It was always my intention to "some day" either make it standard, or at least default it to 'on'. It looks to me as if a couple of your 'goto's might just be changed to {} From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 20:34:07 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 459BB1F7 for ; Fri, 19 Oct 2012 20:34:07 +0000 (UTC) (envelope-from guy.helmer@gmail.com) Received: from mail-ie0-f182.google.com (mail-ie0-f182.google.com [209.85.223.182]) by mx1.freebsd.org (Postfix) with ESMTP id 033188FC12 for ; Fri, 19 Oct 2012 20:34:06 +0000 (UTC) Received: by mail-ie0-f182.google.com with SMTP id k10so1693598iea.13 for ; Fri, 19 Oct 2012 13:34:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=content-type:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer; bh=8+2bZHNRYT0d1IjMjvWzeYnkzvwIGSp49vRgjPE9v+4=; b=znXaoCA3h/DT7SqCt36WWTqrc90Aj4QumJiFRnMxK8EgZE/QL8R2A3UDTOIysP6mtm
oT2jes56PFF+5Us/+xXtZbMjnKmat0pPhI5A86vIqKLtvSVoh7sW9mDqvpoit1eK5XVo AnWoeqgomzIOlHstiqnfi7w/4niMTY4rPYW7IZWKOu8QE2jumDiaF9OW1xa5gLPytp2S imLVu9HR3P8ukt2dTj9+hst+kmuQup1NoeaKh86uFYNI7viDRkzHVCceO8fQWovwfI8S tofHVS8zIoTjl7gpxt6tOJioTda2SRK60XeB/YGt4DoqBdPCQMuUmEPP5VCsxvTFxbjg iWxQ== Received: by 10.50.57.200 with SMTP id k8mr9935857igq.29.1350678846395; Fri, 19 Oct 2012 13:34:06 -0700 (PDT) Received: from [192.168.221.107] ([216.81.189.9]) by mx.google.com with ESMTPS id 7sm1235690igh.0.2012.10.19.13.34.04 (version=TLSv1/SSLv3 cipher=OTHER); Fri, 19 Oct 2012 13:34:05 -0700 (PDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: CARP on vSwitch From: Guy Helmer In-Reply-To: Date: Fri, 19 Oct 2012 15:34:00 -0500 Content-Transfer-Encoding: quoted-printable Message-Id: References: To: Rafael Henrique Faria X-Mailer: Apple Mail (2.1499) Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 20:34:07 -0000 On Oct 18, 2012, at 3:59 PM, Rafael Henrique Faria wrote: > Hi, I'm trying to use CARP on two FreeBSD servers in an ESX environment. But > it's not working. > > The problem is that every frame sent from CARP gets back to the same host. > This is an old problem: > > http://www.mail-archive.com/freebsd-net@freebsd.org/msg30562.html > > There is already a patch, but it's 3 years old and not yet committed. Is > there any reason for this? > I have always used freebsd-update to keep the servers updated, and don't want to > compile a kernel just to use CARP. > > Does anyone have a suggestion or a fix for this problem? > > Thanks in advance.
I have been using this ipfw rule pair to filter the CARP packets to work around this problem:

# Allow CARP advertisements out from me and in from anyone but me
${fwcmd} add allow carp from me to any out
${fwcmd} add deny carp from me to any in

Guy

From owner-freebsd-net@FreeBSD.ORG Fri Oct 19 22:00:35 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 50BAC3E7 for ; Fri, 19 Oct 2012 22:00:35 +0000 (UTC) (envelope-from steven@pyro.eu.org) Received: from falkenstein-2.sn.de.cluster.ok24.net (falkenstein-2.sn.de.cluster.ok24.net [IPv6:2002:4e2f:2f89:2::1]) by mx1.freebsd.org (Postfix) with ESMTP id C553D8FC0C for ; Fri, 19 Oct 2012 22:00:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=simple/simple; d=pyro.eu.org; s=10.2012; h=Content-Transfer-Encoding:Content-Type:In-Reply-To:References:Subject:CC:To:MIME-Version:From:Date:Message-ID; bh=2gU+Pj28jGmHVfDVRaUDRf9Dgc7Zf6wtlvYt5VN5CGM=; b=L2f6e5N2wObvI0z5SVzE7sbmkz1fGpTWsZuaHtatfYe7LXMVkqTyNM+bBNIJlI4ViK4Tp1xJHeX7I3Noel0MIwBZIpGTHCy7RarX8uZEKrVCKFODwuuQ0GWGFCwj+udrrcNAhKpC55lFmIWrsxzADW95dIvXacPtoA66J2sV5qk=; X-Spam-Status: No, score=-1.1 required=2.0 tests=ALL_TRUSTED, BAYES_00, DKIM_ADSP_DISCARD, TVD_RCVD_IP Received: from 188-220-33-66.zone11.bethere.co.uk ([188.220.33.66] helo=guisborough-1.rcc.uk.cluster.ok24.net) by falkenstein-2.sn.de.cluster.ok24.net with esmtp (Exim 4.72) (envelope-from ) id 1TPKcF-0000kq-HH; Fri, 19 Oct 2012 23:00:31 +0100 X-Spam-Status: No, score=-4.0 required=2.0 tests=ALL_TRUSTED, AWL, BAYES_00, DKIM_POLICY_SIGNALL Received: from [192.168.0.110] (helo=[192.168.0.9]) by guisborough-1.rcc.uk.cluster.ok24.net with esmtpsa (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.69) (envelope-from ) id 1TPKc5-0002nL-PT; Fri, 19 Oct 2012 23:00:27 +0100 Message-ID: <5081CD71.2050709@pyro.eu.org> Date: Fri, 19 Oct 2012 23:00:17 +0100 From: Steven Chamberlain User-Agent: Mozilla/5.0 (X11;
Linux x86_64; rv:10.0.7) Gecko/20120922 Icedove/10.0.7 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Debian Bug#690986: CVE-2012-5363 CVE-2012-5365 References: <20121019193436.5031.87058.reportbug@pisco.westfalen.local> In-Reply-To: <20121019193436.5031.87058.reportbug@pisco.westfalen.local> X-Enigmail-Version: 1.4.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: Moritz Muehlenhoff , 690986@bugs.debian.org, 690986-forwarded@bugs.debian.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Oct 2012 22:00:35 -0000 Hi, On 19/10/12 20:34, Moritz Muehlenhoff wrote: > Two security issues were found in the kfreebsd network stack: > http://www.openwall.com/lists/oss-security/2012/10/10/8 > Issue #1 was assigned CVE-2012-5363 > Issue #2 was assigned CVE-2012-5365 Thank you for mentioning it. Issue #2 seems similar to CVE-2011-2393, which I assumed was only relevant where we'd set net.inet6.ip6.accept_rtadv=1, which isn't the upstream FreeBSD default. Issue #1 however might affect any FreeBSD system acting as an IPv6 router. If this can actually be confirmed, then the worst case I can imagine, is if a FreeBSD box acts as an IPv6 router for multiple interfaces, perhaps serving different users; any one of them might flood with Neighbour Solicitations on their local link and create a DoS affecting other interfaces. I found some code committed to OpenBSD (in 2008, uh-oh), supposedly from KAME (but I can't find it in their repository?), implementing per-interface and global limits on the number of prefixes/routes accepted via RA. I imagine that's the best way to avoid some or all of these issues. 
> http://www.openbsd.org/cgi-bin/cvsweb/src/sys/netinet6/in6_proto.c?sortby=date#rev1.56 Just recently it seems this was also committed to NetBSD HEAD: "4 new sysctls to avoid ipv6 DoS attacks from OpenBSD". I don't know of an easier way to link to a whole CVS commit, but here are (hopefully all) the changes to individual files: > http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/ip6_input.c.diff?r1=1.138&r2=1.139&sortby=date&only_with_tag=MAIN > http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/ip6_var.h.diff?r1=1.58&r2=1.59&sortby=date&only_with_tag=MAIN > http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/nd6.c.diff?r1=1.142&r2=1.143&sortby=date&only_with_tag=MAIN > http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/nd6.h.diff?r1=1.56&r2=1.57&sortby=date&only_with_tag=MAIN > http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/icmp6.c.diff?r1=1.160&r2=1.161&sortby=date&only_with_tag=MAIN > http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/in6.c.diff?r1=1.160&r2=1.161&sortby=date&only_with_tag=MAIN > http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/in6_proto.c.diff?r1=1.96&r2=1.97&sortby=date&only_with_tag=MAIN > http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/in6_var.h.diff?r1=1.64&r2=1.65&sortby=date&only_with_tag=MAIN > http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/netinet6/nd6_rtr.c.diff?r1=1.82&r2=1.83&sortby=date&only_with_tag=MAIN Regards, -- Steven Chamberlain steven@pyro.eu.org From owner-freebsd-net@FreeBSD.ORG Sat Oct 20 13:06:38 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 478A5B32; Sat, 20 Oct 2012 13:06:38 +0000 (UTC) (envelope-from nevzorovn@gmail.com) Received: from mail-la0-f54.google.com (mail-la0-f54.google.com [209.85.215.54]) by mx1.freebsd.org (Postfix) with ESMTP id 4B6458FC12; Sat, 20 Oct 2012 13:06:37 +0000 (UTC) Received: by mail-la0-f54.google.com with SMTP id e12so1038047lag.13 for ; Sat, 20 Oct 2012 
06:06:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=4QX/jLEYeyNPTrmdWEKKeKOTt5XMQNl6ICXaaC1Zyh4=; b=cTM3MiwVt6UNtMgzN/8Eg+3oRoDzy+hZGtR58tJMKuUNEImFjBBnOeReuDGLcZOO7K obWspFKnoQ8HdmHw1wfVB0yZRDaF1qlCHBlJ2Zo2S3D8KZTm27DQaecxgnZm2/rbBjYP pIf6OYXQqAyt4yFykUyXUoUDquCZ9JvlhaL+us8b92UKIuHR9lJX9TqGJkYmAiNJKkJa QaYOlAL7jDRDC8zb6+3g+2RI2pRye9VK81dOlXNiXf2jY65mTYe0QBlxWl4jIzdC2mV4 v9uBaDpjZRpMpjh/x9HeU7vHnq5QdtHLc3imjdu1pb0fIFYiHW5EWvcsA/PcBxMdfXCF UthA== MIME-Version: 1.0 Received: by 10.112.14.107 with SMTP id o11mr1655206lbc.98.1350738390001; Sat, 20 Oct 2012 06:06:30 -0700 (PDT) Received: by 10.112.42.70 with HTTP; Sat, 20 Oct 2012 06:06:29 -0700 (PDT) In-Reply-To: <201210180141.q9I1f53s052539@freefall.freebsd.org> References: <201210180141.q9I1f53s052539@freefall.freebsd.org> Date: Sat, 20 Oct 2012 19:06:29 +0600 Message-ID: Subject: Re: kern/171520: [alc] alc network driver + tso + vlan does not work. From: Nikolay Nevzorov To: yongari@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 20 Oct 2012 13:06:38 -0000 On my netbook, TSO over VLAN doesn't work with either the GENERIC kernel or my custom kernel, in any network configuration.
#ifconfig alc0: flags=8843 metric 0 mtu 1500 options=c3098 ether 88:ae:1d:61:29:d2 inet6 fe80::8aae:1dff:fe61:29d2%alc0 prefixlen 64 scopeid 0x1 nd6 options=29 media: Ethernet autoselect (100baseTX ) status: active ath0: flags=8802 metric 0 mtu 2290 ether c4:46:19:3b:0d:cf nd6 options=29 media: IEEE 802.11 Wireless Ethernet autoselect (autoselect) status: no carrier ipfw0: flags=8801 metric 0 mtu 65536 nd6 options=29 lo0: flags=8049 metric 0 mtu 16384 options=3 inet6 ::1 prefixlen 128 inet6 fe80::1%lo0 prefixlen 64 scopeid 0x9 inet 127.0.0.1 netmask 0xff000000 inet 172.31.1.1 netmask 0xffffffff nd6 options=21 vlan2: flags=8843 metric 0 mtu 1500 ether 88:ae:1d:61:29:d2 inet 192.168.255.254 netmask 0xffffff00 broadcast 192.168.255.255 inet6 fe80::8aae:1dff:fe61:29d2%vlan2 prefixlen 64 scopeid 0xa nd6 options=29 media: Ethernet autoselect (100baseTX ) status: active vlan: 2 parent interface: alc0 vlan3: flags=8843 metric 0 mtu 1500 ether 88:ae:1d:61:29:d2 inet 10.196.179.142 netmask 0xffffff00 broadcast 10.196.179.255 nd6 options=29 media: Ethernet autoselect (100baseTX ) status: active vlan: 3 parent interface: alc0 vlan4: flags=8843 metric 0 mtu 1500 ether 88:ae:1d:61:29:d2 inet 192.168.84.254 netmask 0xffffff00 broadcast 192.168.84.255 inet6 fe80::8aae:1dff:fe61:29d2%vlan4 prefixlen 64 scopeid 0xc nd6 options=29 media: Ethernet autoselect (100baseTX ) status: active vlan: 4 parent interface: alc0 ng0: flags=88d1 metric 0 mtu 1400 inet 145.255.22.221 --> 79.140.16.89 netmask 0xffffffff nd6 options=29 #dmesg Copyright (c) 1992-2012 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. 
FreeBSD 9.0-RELEASE #1 r237140: Sun Jun 17 12:20:32 YEKT 2012 niko@louna:/usr/obj/usr/src/sys/LOUNA amd64 CPU: Intel(R) Atom(TM) CPU N450 @ 1.66GHz (1662.63-MHz K8-class CPU) Origin = "GenuineIntel" Id = 0x106ca Family = 6 Model = 1c Stepping = 10 Features=0xbfe9fbff Features2=0x40e39d AMD Features=0x20100800 AMD Features2=0x1 TSC: P-state invariant, performance statistics real memory = 1073741824 (1024 MB) avail memory = 1007714304 (961 MB) Event timer "LAPIC" quality 400 ACPI APIC Table: FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs FreeBSD/SMP: 1 package(s) x 1 core(s) x 2 HTT threads cpu0 (BSP): APIC ID: 0 cpu1 (AP/HT): APIC ID: 1 ioapic0: Changing APIC ID to 4 ioapic0 irqs 0-23 on motherboard kbd1 at kbdmux0 smbios0: at iomem 0xfe120-0xfe13e on motherboard smbios0: Version: 2.6, BCD Revision: 2.6 acpi0: on motherboard acpi0: Power Button (fixed) Timecounter "ACPI-fast" frequency 3579545 Hz quality 900 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0 cpu0: on acpi0 cpu1: on acpi0 acpi_ec0: port 0x62,0x66 on acpi0 acpi_button0: on acpi0 acpi_button1: on acpi0 acpi_lid0: on acpi0 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 vgapci0: port 0x60c0-0x60c7 mem 0x58180000-0x581fffff,0x40000000-0x4fffffff,0x58000000-0x580fffff irq 16 at device 2.0 on pci0 agp0: on vgapci0 agp0: aperture size is 256M, detected 8188k stolen memory vgapci1: mem 0x58100000-0x5817ffff at device 2.1 on pci0 hdac0: mem 0x58200000-0x58203fff irq 16 at device 27.0 on pci0 pcib1: at device 28.0 on pci0 pci1: on pcib1 alc0: port 0x5000-0x507f mem 0x57000000-0x5703ffff irq 16 at device 0.0 on pci1 alc0: 15872 Tx FIFO, 15360 Rx FIFO alc0: Using 1 MSI message(s). 
miibus0: on alc0 atphy0: PHY 0 on miibus0 atphy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow alc0: Ethernet address: 88:ae:1d:61:29:d2 pcib2: at device 28.1 on pci0 pci2: on pcib2 ath0: mem 0x56000000-0x5600ffff irq 17 at device 0.0 on pci2 ath0: [HT] enabling HT modes ath0: [HT] 1 RX streams; 1 TX streams ath0: AR9285 mac 192.2 RF5133 phy 14.0 uhci0: port 0x6080-0x609f irq 16 at device 29.0 on pci0 uhci0: LegSup = 0x2f00 usbus0: on uhci0 uhci1: port 0x6060-0x607f irq 17 at device 29.1 on pci0 uhci1: LegSup = 0x2f00 usbus1: on uhci1 uhci2: port 0x6040-0x605f irq 18 at device 29.2 on pci0 uhci2: LegSup = 0x2f00 usbus2: on uhci2 uhci3: port 0x6020-0x603f irq 19 at device 29.3 on pci0 uhci3: LegSup = 0x2f00 usbus3: on uhci3 ehci0: mem 0x58204400-0x582047ff irq 16 at device 29.7 on pci0 usbus4: EHCI version 1.0 usbus4: on ehci0 pcib3: at device 30.0 on pci0 pci5: on pcib3 isab0: at device 31.0 on pci0 isa0: on isab0 ahci0: port 0x60b8-0x60bf,0x60cc-0x60cf,0x60b0-0x60b7,0x60c8-0x60cb,0x60a0-0x60af mem 0x58204000-0x582043ff irq 17 at device 31.2 on pci0 ahci0: AHCI v1.10 with 4 3Gbps ports, Port Multiplier supported ahcich0: at channel 0 on ahci0 ahcich1: at channel 1 on ahci0 pci0: at device 31.3 (no driver attached) acpi_tz0: on acpi0 battery0: on acpi0 acpi_acad0: on acpi0 atrtc0: port 0x70-0x77 on acpi0 atrtc0: Warning: Couldn't map I/O. 
Event timer "RTC" frequency 32768 Hz quality 0 hpet0: iomem 0xfed00000-0xfed003ff irq 0,8 on acpi0 Timecounter "HPET" frequency 14318180 Hz quality 950 Event timer "HPET" frequency 14318180 Hz quality 450 Event timer "HPET1" frequency 14318180 Hz quality 440 Event timer "HPET2" frequency 14318180 Hz quality 440 attimer0: port 0x40-0x43,0x50-0x53 on acpi0 Timecounter "i8254" frequency 1193182 Hz quality 0 Event timer "i8254" frequency 1193182 Hz quality 100 atkbdc0: port 0x60,0x64 irq 1 on acpi0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] psm0: irq 12 on atkbdc0 psm0: [GIANT-LOCKED] psm0: model IntelliMouse, device ID 3 sc0: at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 coretemp0: on cpu0 est0: on cpu0 p4tcc0: on cpu0 coretemp1: on cpu1 est1: on cpu1 p4tcc1: on cpu1 Timecounters tick every 1.000 msec ipfw2 (+ipv6) initialized, divert loadable, nat loadable, rule-based forwarding enabled, default to accept, logging disabled DUMMYNET 0 with IPv6 initialized (100409) load_dn_sched dn_sched RR loaded load_dn_sched dn_sched WF2Q+ loaded load_dn_sched dn_sched FIFO loaded load_dn_sched dn_sched PRIO loaded load_dn_sched dn_sched QFQ loaded hdac0: HDA Codec #0: Realtek ALC272 pcm0: at cad 0 nid 1 on hdac0 pcm1: at cad 0 nid 1 on hdac0 usbus0: 12Mbps Full Speed USB v1.0 usbus1: 12Mbps Full Speed USB v1.0 usbus2: 12Mbps Full Speed USB v1.0 usbus3: 12Mbps Full Speed USB v1.0 usbus4: 480Mbps High Speed USB v2.0 ugen0.1: at usbus0 uhub0: on usbus0 ugen1.1: at usbus1 uhub1: on usbus1 ugen2.1: at usbus2 uhub2: on usbus2 ugen3.1: at usbus3 uhub3: on usbus3 ugen4.1: at usbus4 uhub4: on usbus4 ada0 at ahcich0 bus 0 scbus0 target 0 lun 0 ada0: ATA-8 SATA 2.x device ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) ada0: Command Queueing enabled ada0: 152627MB (312581808 512 byte sectors: 16H 63S/T 16383C) SMP: AP CPU #1 Launched! 
Root mount waiting for: usbus4 usbus3 usbus2 usbus1 usbus0 uhub0: 2 ports with 2 removable, self powered uhub1: 2 ports with 2 removable, self powered uhub2: 2 ports with 2 removable, self powered uhub3: 2 ports with 2 removable, self powered Root mount waiting for: usbus4 Root mount waiting for: usbus4 Root mount waiting for: usbus4 uhub4: 8 ports with 8 removable, self powered ugen4.2: at usbus4 Trying to mount root from ufs:/dev/ada0p2 [rw]... louna# cat /etc/rc.conf hostname="louna" sshd_enable="YES" ntpd_enable="YES" powerd_enable="YES" cloned_interfaces="vlan2 vlan3 vlan4" ifconfig_alc0="up -tso" ifconfig_vlan3="vlan 3 vlandev alc0 DHCP" ifconfig_vlan4="vlan 4 vlandev alc0 192.168.84.254/24" ifconfig_vlan2="vlan 2 vlandev alc0 192.168.255.254/24" ifconfig_lo0_alias0="inet 172.31.1.1/32" mpd_enable="YES" gateway_enable="YES" devfs_set_rulesets="/usr/local/etc/unbound/dev=unbound_ruleset" unbound_enable="YES" dhcpd_enable="YES" samba_enable="YES" kernel config^ louna# cat LOUNA cpu HAMMER ident LOUNA options SCHED_ULE # ULE scheduler options PREEMPTION # Enable kernel thread preemption options INET # InterNETworking options INET6 # IPv6 communications protocols options SCTP # Stream Control Transmission Protocol options FFS # Berkeley Fast Filesystem options SOFTUPDATES # Enable FFS soft updates support options UFS_ACL # Support for access control lists options UFS_DIRHASH # Improve performance on big directories options UFS_GJOURNAL # Enable gjournal-based UFS journaling options NFSCL # New Network Filesystem Client options NFSD # New Network Filesystem Server options NFSLOCKD # Network Lock Manager options NFS_ROOT # NFS usable as /, requires NFSCL options MSDOSFS # MSDOS Filesystem options PROCFS # Process filesystem (requires PSEUDOFS) options PSEUDOFS # Pseudo-filesystem framework options GEOM_PART_GPT # GUID Partition Tables. 
options GEOM_LABEL # Provides labelization
options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI
options KTRACE # ktrace(1) support
options STACK # stack(9) support
options SYSVSHM # SYSV-style shared memory
options SYSVMSG # SYSV-style message queues
options SYSVSEM # SYSV-style semaphores
options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
options PRINTF_BUFR_SIZE=128 # Prevent printf output being interspersed.
options KBD_INSTALL_CDEV # install a CDEV entry in /dev
options HWPMC_HOOKS # Necessary kernel hooks for hwpmc(4)
options AUDIT # Security event auditing
options MAC # TrustedBSD MAC Framework
options INCLUDE_CONFIG_FILE # Include this file in kernel
options KDB # Kernel debugger related code
options KDB_TRACE # Print a stack trace for a panic
options SMP # Symmetric MultiProcessor Kernel
device cpufreq
device acpi
device pci
device ahci # AHCI-compatible SATA controllers
device scbus # SCSI bus (required for ATA/SCSI)
device da # Direct Access (disks)
device atkbdc # AT keyboard controller
device atkbd # AT keyboard
device psm # PS/2 mouse
device kbdmux # keyboard multiplexer
device vga # VGA video card driver
device sc
options SC_PIXEL_MODE # add support for the raster text mode
device agp # support several AGP chipsets
device uart # Generic UART driver
device miibus # MII bus support
device alc # Atheros AR8131/AR8132 Ethernet
device wlan # 802.11 support
options IEEE80211_DEBUG # enable debug msgs
options IEEE80211_AMPDU_AGE # age frames in AMPDU reorder q's
options IEEE80211_SUPPORT_MESH # enable 802.11s draft support
device wlan_wep # 802.11 WEP support
device wlan_ccmp # 802.11 CCMP support
device wlan_tkip # 802.11 TKIP support
device wlan_amrr # AMRR transmit rate control algorithm
device ath # Atheros NIC's
device ath_pci # Atheros pci/cardbus glue
device ath_hal # pci/cardbus chip support
options AH_SUPPORT_AR5416 # enable AR5416 tx/rx descriptors
device ath_rate_sample # SampleRate tx rate control for ath
options ATH_ENABLE_11N
options ATH_DEBUG
options ATH_DIAGAPI
options IEEE80211_DEBUG
device loop # Network loopback
device random # Entropy device
device ether # Ethernet support
device vlan # 802.1Q VLAN support
device tun # Packet tunnel.
device pty # BSD-style compatibility pseudo ttys
device md # Memory "disks"
device gif # IPv6 and IPv4 tunneling
device faith # IPv6-to-IPv4 relaying (translation)
device firmware # firmware assist module
device bpf # Berkeley packet filter
options USB_DEBUG # enable debug msgs
device uhci # UHCI PCI->USB interface
device ohci # OHCI PCI->USB interface
device ehci # EHCI PCI->USB interface (USB 2.0)
device xhci # XHCI PCI->USB interface (USB 3.0)
device usb # USB Bus (required)
device uhid # "Human Interface Devices"
device ukbd # Keyboard
device umass # Disks/Mass storage - Requires scbus and da
device u3g # USB-based 3G modems (Option, Huawei, Sierra)
device uplcom # Prolific PL-2303 serial adapters
device uslcom # SI Labs CP2101/CP2102 serial adapters
device sound # Generic sound driver (required)
device snd_hda # Intel High Definition Audio
device snd_ich # Intel, NVidia and other ICH AC'97 Audio
options IPFIREWALL
options IPFIREWALL_DEFAULT_TO_ACCEPT
options IPFIREWALL_VERBOSE
options IPFIREWALL_FORWARD
options DEVICE_POLLING
options DUMMYNET
options HZ=1000
options LIBALIAS
options NETGRAPH
options NETGRAPH_IPFW
options NETGRAPH_PPP
options NETGRAPH_PPTPGRE
options NETGRAPH_KSOCKET
options NETGRAPH_IFACE
options NETGRAPH_TCPMSS
options NETGRAPH_CAR
options NETGRAPH_NAT
options NETGRAPH_SOCKET
options NETGRAPH_TEE
device smbios
device coretemp
device cpuctl
louna#

2012/10/18
> Synopsis: [alc] alc network driver + tso + vlan does not work.
>
> State-Changed-From-To: open->feedback
> State-Changed-By: yongari
> State-Changed-When: Thu Oct 18 01:40:32 UTC 2012
> State-Changed-Why:
> I'm pretty sure TSO over VLAN worked well on my box.
> Could you share your exact network configuration and let me know
> how I can reproduce it?
>
>
> Responsible-Changed-From-To: freebsd-net->yongari
> Responsible-Changed-By: yongari
> Responsible-Changed-When: Thu Oct 18 01:40:32 UTC 2012
> Responsible-Changed-Why:
> Grab.
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=171520
>

From owner-freebsd-net@FreeBSD.ORG Sat Oct 20 13:36:37 2012
Return-Path:
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
    by hub.freebsd.org (Postfix) with ESMTP id 0D84D37F;
    Sat, 20 Oct 2012 13:36:37 +0000 (UTC)
    (envelope-from universite@ukr.net)
Received: from ffe10.ukr.net (ffe10.ukr.net [195.214.192.60])
    by mx1.freebsd.org (Postfix) with ESMTP id A56708FC0C;
    Sat, 20 Oct 2012 13:36:36 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=ukr.net; s=ffe;
    h=Date:Message-Id:From:To:References:In-Reply-To:Subject:Cc:Content-Type:Content-Transfer-Encoding:MIME-Version;
    bh=v2rhJHGvJQav7sHTNTPV8saJ0U2EfHXkkOqvwplCpL4=;
    b=QKn2mPFuC+oMSKKDDEfqcadHNTFwhzEEAH6YagqcbHz6UMByjuhrMGQo+zBKN/VuH/0K87aaEoaYnDe5xZqsWZoHCPOFeiqBJI9lU1eCL/0zTHeLdxccNk11+xytl+vtXuiYO6sEpvlReyJvry+qdbBnu3QtvdT8UESRPxbBTUA=;
Received: from mail by ffe10.ukr.net with local ID 1TPYvN-000KJX-8A ;
    Sat, 20 Oct 2012 16:17:09 +0300
MIME-Version: 1.0
Content-Disposition: inline
Content-Transfer-Encoding: binary
Content-Type: text/plain; charset="windows-1251"
Subject: Re[2]: kern/171520: [alc] alc network driver + tso + vlan does not work.
In-Reply-To:
References: <201210180141.q9I1f53s052539@freefall.freebsd.org>
To: "Nikolay Nevzorov"
From: "Vladislav Prodan"
X-Mailer: freemail.ukr.net 4.0
Message-Id: <76980.1350739029.4178195639830380544@ffe10.ukr.net>
X-Browser: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20100101 Firefox/15.0.1
Date: Sat, 20 Oct 2012 16:17:09 +0300
Cc: freebsd-net@freebsd.org, yongari@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 20 Oct 2012 13:36:37 -0000

It works for me with these alc driver options:

ifconfig_alc0="inet 10.1.0.18/24 -tso media 100baseTX mediaopt full-duplex up"
vlans_alc0="255"
ifconfig_alc0_255="inet ZZZ.YYY.XXX.247/24"

--- Original message ---
From: "Nikolay Nevzorov"
To: yongari@freebsd.org
Date: 20 October 2012, 16:07:42
Subject: Re: kern/171520: [alc] alc network driver + tso + vlan does not work.

> On my netbook TSO over VLAN doesn't work on generic and my kernel in any network
> config.
>
> #ifconfig
> alc0: flags=8843 metric 0 mtu 1500
>

--
Vladislav V. Prodan
System & Network Administrator
http://support.od.ua
+380 67 4584408, +380 99 4060508
VVP88-RIPE

From owner-freebsd-net@FreeBSD.ORG Sat Oct 20 20:54:45 2012
Return-Path:
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
    by hub.freebsd.org (Postfix) with ESMTP id 2235BCD8;
    Sat, 20 Oct 2012 20:54:45 +0000 (UTC)
    (envelope-from egrosbein@rdtc.ru)
Received: from eg.sd.rdtc.ru (eg.sd.rdtc.ru [IPv6:2a03:3100:c:13::5])
    by mx1.freebsd.org (Postfix) with ESMTP id 6D4BC8FC08;
    Sat, 20 Oct 2012 20:54:43 +0000 (UTC)
Received: from eg.sd.rdtc.ru (localhost [127.0.0.1])
    by eg.sd.rdtc.ru (8.14.5/8.14.5) with ESMTP id q9KKseNZ028325;
    Sun, 21 Oct 2012 03:54:40 +0700 (NOVT)
    (envelope-from egrosbein@rdtc.ru)
Message-ID: <50830F8B.4030204@rdtc.ru>
Date: Sun, 21 Oct 2012 03:54:35 +0700
From: Eugene Grosbein
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; ru-RU; rv:1.9.2.13) Gecko/20110112 Thunderbird/3.1.7
MIME-Version: 1.0
To: Nikolay Nevzorov
Subject: Re: kern/171520: [alc] alc network driver + tso + vlan does not work.
References: <201210180141.q9I1f53s052539@freefall.freebsd.org>
In-Reply-To:
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, yongari@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 20 Oct 2012 20:54:45 -0000

20.10.2012 20:06, Nikolay Nevzorov wrote:

>> Synopsis: [alc] alc network driver + tso + vlan does not work.

It seems you use libalias-based NAT, don't you?
man ipfw says in the BUGS section:

     Due to the architecture of libalias(3), ipfw nat is not compatible with
     the TCP segmentation offloading (TSO). Thus, to reliably nat your
     network traffic, please disable TSO on your NICs using ifconfig(8).

Eugene Grosbein
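
For anyone following the thread: a minimal sketch of how one might check whether TSO is still enabled on an interface before suspecting libalias NAT. The helper name has_tso is hypothetical, and the check assumes ifconfig(8) prints enabled capabilities as an "options=HEX<FLAG,FLAG,...>" line with TSO4/TSO6 among the flags. The -tso flag is the one already used in the rc.conf posted above.

```shell
#!/bin/sh
# has_tso (hypothetical helper, not part of any FreeBSD tool):
# reads ifconfig(8) output on stdin and succeeds if TSO4 or TSO6 is
# listed in an "options=...<...>" capability line.
# VLAN_HWTSO is deliberately not matched; it is a separate capability.
has_tso() {
    grep -Eq 'options=[0-9a-f]+<([^>]*,)?TSO[46][,>]'
}

# Usage on a live system (alc0 is the interface from this thread):
#   ifconfig alc0 | has_tso && ifconfig alc0 -tso
```

It may be worth running the same check on each vlan interface as well (vlan2, vlan3 and vlan4 in the configuration above), since the options shown for them need not match those of alc0.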