From owner-freebsd-net@FreeBSD.ORG Sun Feb 2 01:16:51 2014
Date: Sat, 1 Feb 2014 17:16:48 -0800
Subject: Re: Errors using span interface on if_bridge(4)
From: hiren panchasara
To: "freebsd-net@freebsd.org"
List-Id: Networking and TCP/IP with FreeBSD

I rebuilt the kernel with the following change to check for failure:

Index: if_bridge.c
===================================================================
--- if_bridge.c (revision 260789)
+++ if_bridge.c (working copy)
@@ -2536,6 +2536,7 @@
 	struct bridge_iflist *bif;
 	struct ifnet *dst_if;
 	struct mbuf *mc;
+	int error = 0;
 
 	if (LIST_EMPTY(&sc->sc_spanlist))
 		return;
@@ -2552,7 +2553,9 @@
 			continue;
 		}
 
-		bridge_enqueue(sc, dst_if, mc);
+		error = bridge_enqueue(sc, dst_if, mc);
+		if (error)
+			printf("%s: bridge_enqueue failed\n", __func__);
 	}
 }

After this change and a reboot, I see packets on ix3 without those
bad-len errors seen before. (I also want to carefully inspect the packets
to make sure nothing else is going wrong.)

I never saw the "bridge_enqueue failed" message, so it's failing
somewhere else in the stack.

ix1: flags=8943 metric 0 mtu 1500
	options=8403bb
	ether 38:ea:a7:8b:af:c4
	inet6 fe80::3aea:a7ff:fe8b:afc4%ix1 prefixlen 64 scopeid 0x6
	inet 10.73.149.91 netmask 0xffffff00 broadcast 10.73.149.255
	nd6 options=29
	media: Ethernet autoselect (10Gbase-Twinax )
	status: active
ix2: flags=8943 metric 0 mtu 1500
	options=8407bb
	ether 90:e2:ba:30:73:40
	inet 192.168.0.2 netmask 0xffffff00 broadcast 192.168.0.255
	nd6 options=29
	media: Ethernet autoselect (10Gbase-Twinax )
	status: active
ix3: flags=8943 metric 0 mtu 1500
	options=8407bb
	ether 90:e2:ba:30:73:41
	inet 192.168.0.3 netmask 0xffffff00 broadcast 192.168.0.255
	nd6 options=29
	media: Ethernet autoselect (10Gbase-Twinax )
	status: active
bridge0: flags=8843 metric 0 mtu 1500
	ether 02:a1:25:9a:8f:00
	nd6 options=9
	id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
	maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
	root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
	member: ix1 flags=143
	        ifmaxaddr 0 port 6 priority 128 path cost 2000
	member: ix2 flags=8
	        ifmaxaddr 0 port 7 priority 128 path cost 2000

Interestingly I still see:

-bash-4.2$ sysctl -a dev.ix | grep total_pkts_rcvd
dev.ix.0.mac_stats.total_pkts_rcvd: 638917
dev.ix.1.mac_stats.total_pkts_rcvd: 3587878 <--- where I send traffic via iperf
dev.ix.2.mac_stats.total_pkts_rcvd: 1
dev.ix.3.mac_stats.total_pkts_rcvd: 3555298 <--- where I snoop traffic.

-bash-4.2$ sysctl -a | grep mac_stats.checksum_errs
dev.ix.0.mac_stats.checksum_errs: 0
dev.ix.1.mac_stats.checksum_errs: 0
dev.ix.2.mac_stats.checksum_errs: 0
dev.ix.3.mac_stats.checksum_errs: 3371119 <-- most of them fail checksum

I also noticed:

dev.ix.3.queue4.lro_queued: 50223
dev.ix.3.queue4.lro_flushed: 42632
dev.ix.3.queue5.lro_queued: 25356
dev.ix.3.queue5.lro_flushed: 25249

I disabled both TSO and LRO and stopped seeing these numbers increase,
but the general behavior remained the same.

cheers,
Hiren

From owner-freebsd-net@FreeBSD.ORG Sun Feb 2 02:29:21 2014
Date: Sat, 1 Feb 2014 18:29:19 -0800
Subject: Re: Errors using span interface on if_bridge(4)
From: hiren panchasara
To: "freebsd-net@freebsd.org"

On Sat, Feb 1, 2014 at 5:16 PM, hiren panchasara wrote:
> I rebuilt the kernel with following change to check failure:
>
> Index: if_bridge.c
> ===================================================================
> --- if_bridge.c (revision 260789)
> +++ if_bridge.c (working copy)
> @@ -2536,6 +2536,7 @@
>  	struct bridge_iflist *bif;
>  	struct ifnet *dst_if;
>  	struct mbuf *mc;
> +	int error = 0;
>
>  	if (LIST_EMPTY(&sc->sc_spanlist))
>  		return;
> @@ -2552,7 +2553,9 @@
>  			continue;
>  		}
>
> -		bridge_enqueue(sc, dst_if, mc);
> +		error = bridge_enqueue(sc, dst_if, mc);
> +		if (error)
> +			printf("%s: bridge_enqueue failed\n", __func__);
>  	}
>  }
>
> After this change and reboot, I see packets on ix3 without those
> bad-len errors seen before.

Just an update: it was not this change that made the difference. I
reverted it and things are still the same. I guess the reboot caused
those bad-len errors to go away.

I am still trying to find a way to narrow down what is causing these
checksum errors.
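One way to narrow this down is to recompute the IPv4 header checksum of a
few frames captured on the span port and compare the result with what the
NIC counter reports. The sketch below is a plain RFC 1071 checksum in C.
The helper names inet_checksum and ipv4_header_ok are made up for
illustration, and the sketch covers only the IPv4 header checksum; whether
mac_stats.checksum_errs also counts TCP/UDP checksum failures is not
established in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * RFC 1071 Internet checksum over a byte buffer (hypothetical helper).
 * The buffer is read as big-endian 16-bit words; the result is the
 * ones' complement of their ones' complement sum.
 */
static uint16_t
inet_checksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	/* Add up the 16-bit words. */
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	/* An odd trailing byte is padded with a zero byte on the right. */
	if (i < len)
		sum += (uint32_t)buf[i] << 8;
	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * A received IPv4 header is intact when the checksum computed over all
 * of its bytes, stored checksum field included, comes out as zero.
 */
static int
ipv4_header_ok(const uint8_t *hdr, size_t hdr_len)
{
	return inet_checksum(hdr, hdr_len) == 0;
}
```

If hand-checked headers pass while the counter keeps climbing, the
transport-layer checksums or the way the hardware reports errors would be
the next things to look at.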
cheers,
Hiren

From owner-freebsd-net@FreeBSD.ORG Sun Feb 2 02:33:32 2014
Date: Sat, 1 Feb 2014 21:33:24 -0500 (EST)
From: Rick Macklem
To: J David
Cc: freebsd-net@freebsd.org, Garrett Wollman
Message-ID: <2004489072.1389912.1391308404358.JavaMail.root@uoguelph.ca>
Subject: Re: Terrible NFS performance under 9.2-RELEASE?

J David wrote:
> On Fri, Jan 31, 2014 at 6:16 PM, Rick Macklem
> wrote:
> > You can certainly try "-o rsize=61440,wsize=61440" (assuming a 4K
> > page size)
> > for the mount, if you'd like.
>
> This has previously been tested with all 4k steps between 16k and
> 32k.
> All of them perform worse than
>
> With 61440, NFS fails outright on the random read test:
>
> $ iozone -e -I -s 1g -r 4k -i 0 -i 2
>
Just curious. Are you always using "-I" (which sets O_DIRECT, I think?)
or was it just this particular test?

rick

> Iozone: Performance Test of File I/O
> Version $Revision: 3.420 $
> Compiled for 64 bit mode.
> Build: freebsd
> [...]
> Include fsync in write timing
> O_DIRECT feature enabled
> File size set to 1048576 KB
> Record Size 4 KB
> Command line used: iozone -e -I -s 1g -r 4k -i 0 -i 2
> Output is in Kbytes/sec
> Time Resolution = 0.000005 seconds.
> Processor cache size set to 1024 Kbytes.
> Processor cache line size set to 32 bytes.
> File stride size set to 17 * record size.
>                                                          random   random     bkwd   record   stride
>        KB   reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
>   1048576        4    24688    23891
>
> Error reading block at 1073729536
> read: Bad file descriptor
>
> Upon using the -w option, which leaves the file intact on exit, it's
> possible to see that it's not even 1gig in length:
>
> $ ls -aln iozone.tmp
> -rw-r----- 1 1000 0 1073709056 Feb 1 01:18 iozone.tmp
>
> It's 32k short, which is a pretty surprising result.
>
> Thanks!
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to
> "freebsd-net-unsubscribe@freebsd.org"
>

From owner-freebsd-net@FreeBSD.ORG Sun Feb 2 03:54:05 2014
Date: Sat, 1 Feb 2014 22:53:48 -0500 (EST)
From: Rick Macklem
To: J David
Cc: freebsd-net@freebsd.org, Garrett Wollman
Message-ID: <251642279.1400386.1391313228315.JavaMail.root@uoguelph.ca>
Subject: Re: Terrible NFS performance under 9.2-RELEASE?

J David wrote:
> On Fri, Jan 31, 2014 at 6:16 PM, Rick Macklem
> wrote:
> > You can certainly try "-o rsize=61440,wsize=61440" (assuming a 4K
> > page size)
> > for the mount, if you'd like.
>
> This has previously been tested with all 4k steps between 16k and
> 32k.
> All of them perform worse than
>
> With 61440, NFS fails outright on the random read test:
>
> $ iozone -e -I -s 1g -r 4k -i 0 -i 2
>
Btw, if you do want to test with O_DIRECT ("-I"), you should enable
direct io in the client.

sysctl vfs.nfs.nfs_directio_enable=1

I just noticed that it is disabled by default. This means that your
"-I" was essentially being ignored by the FreeBSD client.

It also explains why Linux isn't doing a read before write, since
that wouldn't happen for direct I/O. You should test Linux without "-I"
and see if it still doesn't do the read before write, including a "-r 2k"
to avoid the "just happens to be a page size" case.

rick

> Iozone: Performance Test of File I/O
> Version $Revision: 3.420 $
> Compiled for 64 bit mode.
> Build: freebsd
> [...]
> Include fsync in write timing
> O_DIRECT feature enabled
> File size set to 1048576 KB
> Record Size 4 KB
> Command line used: iozone -e -I -s 1g -r 4k -i 0 -i 2
> Output is in Kbytes/sec
> Time Resolution = 0.000005 seconds.
> Processor cache size set to 1024 Kbytes.
> Processor cache line size set to 32 bytes.
> File stride size set to 17 * record size.
>                                                          random   random     bkwd   record   stride
>        KB   reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
>   1048576        4    24688    23891
>
> Error reading block at 1073729536
> read: Bad file descriptor
>
> Upon using the -w option, which leaves the file intact on exit, it's
> possible to see that it's not even 1gig in length:
>
> $ ls -aln iozone.tmp
> -rw-r----- 1 1000 0 1073709056 Feb 1 01:18 iozone.tmp
>
> It's 32k short, which is a pretty surprising result.
>
> Thanks!

From owner-freebsd-net@FreeBSD.ORG Sun Feb 2 05:54:51 2014
Date: Sat, 1 Feb 2014 21:54:48 -0800
Subject: Re: Errors using span interface on if_bridge(4)
From: hiren panchasara
To: "freebsd-net@freebsd.org"

On Fri, Jan 31, 2014 at 3:45 PM, hiren panchasara wrote:
>
> 23:30:01.308691 IP bad-hlen 0
> 23:30:01.308700 IP bad-hlen 0
> 23:30:01.308711 IP bad-hlen 0

rxcsum and txcsum were disabled when I saw these messages. If I enable
them, I do not see these bad-hlen messages anymore.
cheers,
Hiren

From owner-freebsd-net@FreeBSD.ORG Sun Feb 2 09:06:02 2014
Date: Sun, 2 Feb 2014 01:05:59 -0800
Subject: Re: Errors using span interface on if_bridge(4)
From: hiren panchasara
To: "freebsd-net@freebsd.org"

On Sat, Feb 1, 2014 at 6:29 PM, hiren panchasara wrote:
> On Sat, Feb 1, 2014 at 5:16 PM, hiren panchasara
> wrote:
> Trying to find how to narrow down what is causing this checksum errors.

From ixgbe.c: ixgbe_rx_checksum()

	if (status & IXGBE_RXD_STAT_IPCS) {
		if (!(errors & IXGBE_RXD_ERR_IPE)) {	/* xxx */
			/* IP Checksum Good */
			mp->m_pkthdr.csum_flags = CSUM_IP_CHECKED;
			mp->m_pkthdr.csum_flags |= CSUM_IP_VALID;

		} else {				/* yyy */
			mp->m_pkthdr.csum_flags = 0;
		}

	}

If I put a device_printf() at xxx, everything slows down and I see very
few packets making it through, with near-zero checksum errors.

A device_printf() at yyy (and nothing at xxx) never gets called. I was
expecting it to be called, as I thought checksum errors would cause that
code path to be traversed.

cheers,
Hiren

From owner-freebsd-net@FreeBSD.ORG Sun Feb 2 11:06:20 2014
Subject: Re: Terrible NFS performance under 9.2-RELEASE?
From: Daniel Braniss
In-Reply-To: <482557096.17290094.1390873872231.JavaMail.root@uoguelph.ca>
Date: Sun, 2 Feb 2014 13:04:31 +0200
References: <482557096.17290094.1390873872231.JavaMail.root@uoguelph.ca>
To: Rick Macklem
Cc: Pyun YongHyeon, FreeBSD Net, Adam McDougall, Jack Vogel

Hi Rick, et al.

I tried your patch but it didn't help; the server is stuck.

Just for fun, I tried a different client/host. This one has a Broadcom
NetXtreme II that was MFC'ed lately, and the results are worse than the
Intel (5hs instead of 4hs), but faster without TSO.

with TSO enabled and bs=32k:
5.09hs 18325.62 real 1109.23 user 4591.60 sys

without TSO:
4.75hs 17120.40 real 1114.08 user 3537.61 sys

So what is the advantage of using TSO? (No complaint here, just curious.)

I'll try to see if as a server it has the same TSO-related issues.

cheers,
danny

On Jan 28, 2014, at 3:51 AM, Rick Macklem wrote:

> Jack Vogel wrote:
>> That header file is for the VF driver :) which I don't believe is
>> being used in this case.
>> The driver is capable of handling 256K but its limited by the stack
>> to 64K (look in ixgbe.h), so its not a few bytes off due to the vlan
>> header.
>>
>> The scatter size is not an arbitrary one, its due to hardware
>> limitations in Niantic (82599). Turning off TSO in the 10G
>> environment is not practical, you will have trouble getting good
>> performance.
>>
>> Jack
>>
> Well, if you look at this thread, Daniel got much better performance
> by turning off TSO. However, I agree that this is not an ideal
> solution.
> http://docs.FreeBSD.org/cgi/mid.cgi?2C287272-7B57-4AAD-B22F-6A65D9F8677B
>
> rick
>
>>
>>
>> On Mon, Jan 27, 2014 at 4:58 PM, Yonghyeon PYUN
>> wrote:
>>
>>> On Mon, Jan 27, 2014 at 06:27:19PM -0500, Rick Macklem wrote:
>>>> pyunyh@gmail.com wrote:
>>>>> On Sun, Jan 26, 2014 at 09:16:54PM -0500, Rick Macklem wrote:
>>>>>> Adam McDougall wrote:
>>>>>>> Also try rsize=32768,wsize=32768 in your mount options,
>>>>>>> made a huge difference for me. I've noticed slow file
>>>>>>> transfers on NFS in 9 and finally did some searching a
>>>>>>> couple months ago, someone suggested it and they were on
>>>>>>> to something.
>>>>>>>
>>>>>> I have a "hunch" that might explain why 64K NFS reads/writes
>>>>>> perform poorly for some network environments.
>>>>>> A 64K NFS read reply/write request consists of a list of 34
>>>>>> mbufs when passed to TCP via sosend() and a total data length
>>>>>> of around 65680bytes.
>>>>>> Looking at a couple of drivers (virtio and ixgbe), they seem
>>>>>> to expect no more than 32-33 mbufs in a list for a 65535 byte
>>>>>> TSO xmit. I think (I don't have anything that does TSO to
>>>>>> confirm this) that NFS will pass a list that is longer (34
>>>>>> plus a TCP/IP header).
>>>>>> At a glance, it appears that the drivers call m_defrag() or
>>>>>> m_collapse() when the mbuf list won't fit in their scatter
>>>>>> table (32 or 33 elements) and if this fails, just silently
>>>>>> drop the data without sending it.
>>>>>> If I'm right, there would considerable overhead from
>>>>>> m_defrag()/m_collapse() and near disaster if they fail to fix
>>>>>> the problem and the data is silently dropped instead of
>>>>>> xmited.
>>>>>>
>>>>>
>>>>> I think the actual number of DMA segments allocated for the
>>>>> mbuf chain is determined by bus_dma(9). bus_dma(9) will
>>>>> coalesce current segment with previous segment if possible.
>>>>>
>>>> Ok, I'll have to take a look, but I thought that an array of
>>>> sized by "num_segs" is passed in as an argument. (And num_segs
>>>> is set to either IXGBE_82598_SCATTER (100) or
>>>> IXGBE_82599_SCATTER (32).)
>>>> It looked to me that the ixgbe driver called itself ix, so it
>>>> isn't obvious to me which we are talking about. (I know that
>>>> Daniel Braniss had an ix0 and ix1, which were fixed for NFS by
>>>> disabling TSO.)
>>>>
>>>
>>> It's ix(4). ixbge(4) is a different driver.
>>>
>>>> I'll admit I mostly looked at virtio's network driver, since that
>>>> was the one being used by J David.
>>>>
>>>> Problems w.r.t. TSO enabled for NFS using 64K rsize/wsize have
>>>> been cropping up for quite a while, and I am just trying to find
>>>> out why.
>>>> (I have no hardware/software that exhibits the problem, so I can
>>>> only look at the sources and ask others to try testing stuff.)
>>>>
>>>>> I'm not sure whether you're referring to ixgbe(4) or ix(4) but
>>>>> I see the total length of all segment size of ix(4) is 65535 so
>>>>> it has no room for ethernet/VLAN header of the mbuf chain. The
>>>>> driver should be fixed to transmit a 64KB datagram.
>>>> Well, if_hw_tsomax is set to 65535 by the generic code (the
>>>> driver doesn't set it) and the code in tcp_output() seems to
>>>> subtract the size of an tcp/ip header from that before passing
>>>> data to the driver, so I think the mbuf chain passed to the
>>>> driver will fit in one ip datagram. (I'd assume all sorts of
>>>> stuff would break for TSO enabled drivers if that wasn't the
>>>> case?)
>>>
>>> I believe the generic code is doing right. I'm under the
>>> impression the non-working TSO indicates a bug in driver. Some
>>> drivers didn't account for additional ethernet/VLAN header so the
>>> total size of DMA segments exceeded 65535. I've attached a diff
>>> for ix(4). It wasn't tested at all as I don't have hardware to
>>> test.
>>>
>>>>
>>>>> I think the use of m_defrag(9) in TSO is suboptimal. All TSO
>>>>> capable controllers are able to handle multiple TX buffers so
>>>>> it should have used m_collapse(9) rather than copying entire
>>>>> chain with m_defrag(9).
>>>>>
>>>> I haven't looked at these closely yet (plan on doing so to-day),
>>>> but even m_collapse() looked like it copied data between mbufs
>>>> and that is certainly suboptimal, imho. I don't see why a driver
>>>> can't split the mbuf list, if there are too many entries for the
>>>> scatter/gather and do it in two iterations (much like
>>>> tcp_output() does already, since the data length exceeds
>>>> 65535 - tcp/ip header size).
>>>>
>>>
>>> It can split the mbuf list if controllers supports increased number
>>> of TX buffers. Because controller shall consume the same number of
>>> DMA descriptors for the mbuf list, drivers tend to impose a limit
>>> on the number of TX buffers to save resources.
>>>
>>>> However, at this point, I just want to find out if the long chain
>>>> of mbufs is why TSO is problematic for these drivers, since I'll
>>>> admit I'm getting tired of telling people to disable TSO (and I
>>>> suspect some don't believe me and never try it).
>>>>
>>>
>>> TSO capable controllers tend to have various limitations(the first
>>> TX buffer should have complete ethernet/IP/TCP header, ip_len of IP
>>> header should be reset to 0, TCP pseudo checksum should be
>>> recomputed etc) and cheap controllers need more assistance from
>>> driver to let its firmware know various IP/TCP header offset
>>> location in the mbuf. Because this requires a IP/TCP header
>>> parsing, it's error prone and very complex.
>>>
>>>>>> Anyhow, I have attached a patch that makes NFS use
>>>>>> MJUMPAGESIZE clusters, so the mbuf count drops from 34 to 18.
>>>>>>
>>>>>
>>>>> Could we make it conditional on size?
>>>>>
>>>> Not sure what you mean? If you mean "the size of the read/write",
>>>> that would be possible for NFSv3, but less so for NFSv4. (The
>>>> read/write is just one Op. in the compound for NFSv4 and there is
>>>> no way to predict how much more data is going to be generated by
>>>> subsequent Ops.)
>>>>
>>>
>>> Sorry, I should have been more clearer. You already answered my
>>> question. Thanks.
>>>
>>>> If by "size" you mean amount of memory in the machine then, yes,
>>>> it certainly could be conditional on that. (I plan to try and
>>>> look at the allocator to-day as well, but if others know of
>>>> disadvantages with using MJUMPAGESIZE instead of MCLBYTES, please
>>>> speak up.)
>>>>
>>>> Garrett Wollman already alluded to the MCLBYTES case being
>>>> pre-allocated, but I'll admit I have no idea what the
>>>> implications of that are at this time.
>>>>
>>>>>> If anyone has a TSO scatter/gather enabled net interface and
>>>>>> can test this patch on it with NFS I/O (default of 64K
>>>>>> rsize/wsize) when TSO is enabled and see what effect it has,
>>>>>> that would be appreciated.
>>>>>>
>>>>>> Btw, thanks go to Garrett Wollman for suggesting the change
>>>>>> to MJUMPAGESIZE clusters.
>>>>>>
>>>>>> rick
>>>>>> ps: If the attachment doesn't make it through and you want
>>>>>> the patch, just email me and I'll send you a copy.
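The 34-to-18 figure quoted above can be sanity-checked with a ceiling
division. This is a rough sketch, assuming MCLBYTES is 2048 and
MJUMPAGESIZE is 4096 (the usual values with a 4K page size; neither
number is stated in the thread) and using the 65680-byte message size and
the 32-entry IXGBE_82599_SCATTER limit mentioned in the quoted
discussion. The helper names are hypothetical.

```c
#include <stddef.h>

enum {
	NFS_MSG_BYTES = 65680,	/* approximate 64K NFS message, from the thread */
	CLUSTER_2K    = 2048,	/* assumed value of MCLBYTES */
	CLUSTER_PAGE  = 4096,	/* assumed value of MJUMPAGESIZE (4K pages) */
	SCATTER_82599 = 32	/* IXGBE_82599_SCATTER, from the thread */
};

/* Smallest number of fixed-size clusters that can hold `bytes` bytes. */
static size_t
clusters_needed(size_t bytes, size_t cluster_size)
{
	return (bytes + cluster_size - 1) / cluster_size;
}

/* Nonzero when a chain of `mbuf_count` mbufs fits the scatter table. */
static int
fits_scatter(size_t mbuf_count, size_t scatter_limit)
{
	return mbuf_count <= scatter_limit;
}
```

clusters_needed(NFS_MSG_BYTES, CLUSTER_2K) gives 33 and
clusters_needed(NFS_MSG_BYTES, CLUSTER_PAGE) gives 17, each one below the
34 and 18 reported in the thread; the extra mbuf is presumably a header
mbuf, which is a guess. Either way, the 2K chain exceeds a 32-entry
scatter table and the page-sized chain does not.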
>>>>>>=20 >>>=20 >>> _______________________________________________ >>> freebsd-net@freebsd.org mailing list >>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>> To unsubscribe, send any mail to >>> "freebsd-net-unsubscribe@freebsd.org" >>>=20 >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to >> "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Sun Feb 2 11:49:47 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 1F158B9A for ; Sun, 2 Feb 2014 11:49:47 +0000 (UTC) Received: from mail-vc0-x232.google.com (mail-vc0-x232.google.com [IPv6:2607:f8b0:400c:c03::232]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id CF8C015ED for ; Sun, 2 Feb 2014 11:49:46 +0000 (UTC) Received: by mail-vc0-f178.google.com with SMTP id ik5so4208884vcb.9 for ; Sun, 02 Feb 2014 03:49:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=hSR4qZ2YiONrSAcX1P09qxoRN0RP8a7oE3LoHx7Cw8Y=; b=GsqeBNU7y7iv43NjntDnekC8AA2tSIwHuMdbEeLOk3ongh4OQnfI2PYe/520F2VAH6 XUzaSXjki/iR6FHs62LlF87Q5qyVupcwXvOGPyK2CcOA2OLv54zJ2DBnvK0ttDiJrL9f gPYrcudXXg4S2FlVXXnA/ATH/70lbW34m3ELFGqjQbn6oQo2iDuwxzqT17I8P1LLmwwc 0zBSHqmTqh9Jz/w/LOCjjMUxHkVZjTnYMacePj36xcBPdGngMCylsXTJmxOqmNMD8doK 1MmGswUD35ld1tEo03HS92Kvzj3LE+f3oHlozJ2/bhsiBM8xlyTAYk2kvLaXo4i8GL0g 5pAg== MIME-Version: 1.0 X-Received: by 10.58.66.137 with SMTP id f9mr24311566vet.11.1391341785965; Sun, 02 Feb 2014 03:49:45 -0800 (PST) Sender: ndenev@gmail.com Received: by 10.220.78.84 
with HTTP; Sun, 2 Feb 2014 03:49:45 -0800 (PST) In-Reply-To: <52EC573B.109@sentex.net> References: <52EC573B.109@sentex.net> Date: Sun, 2 Feb 2014 11:49:45 +0000 X-Google-Sender-Auth: kP41reusPt2zQgoMYMpXXwQ9Nkw Message-ID: Subject: Re: missing missing packets in igb stats ? From: Nikolay Denev To: Mike Tancsa Content-Type: text/plain; charset=ISO-8859-1 Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 02 Feb 2014 11:49:47 -0000

On Sat, Feb 1, 2014 at 2:08 AM, Mike Tancsa wrote:
> Hi Jack,
> I was testing out forwarding and firewalling speeds of the igb driver on
> RELENG_10 and noticed something odd.
>
> I have 2 boxes connected to a FreeBSD box in the middle
>
> FreeBSD-A(em1)-----------(igb1)Router-1(igb0)----------(em1)FreeBSD-B
>
> So Box A generates packets as fast as it can to FreeBSD box B's em1 nic.
> Router-1 is a FreeBSD box as releng10.
> Watching ifstat on Router-1 as I
> execute the command on FreeBSD-A
> # ./netblast 1.1.1.2 500 100 20
>
> start: 1391219372.477992294
> finish: 1391219392.496952108
> send calls: 10877557
> send errors: 0
> approx send rate: 543877
> approx error rate: 0
>
> I see on the router-1 box
>
>         igb0                igb1
>   Kbps in  Kbps out   Kbps in  Kbps out
>      0.00      0.00      0.00      0.00
>   1600.61  191639.1  280888.7      0.00
>   3669.84  434348.6  636134.9      0.00
>   3706.56  438636.7  596650.5      0.00
>   3755.10  444358.9  562814.3      0.00
>   3714.89  439478.5  562056.4      0.00
>   3796.79  449397.9  562042.9      0.00
>   3786.02  447957.2  577561.4      0.00
>   3629.18  429453.4  601285.7      0.00
>   3728.48  441312.7  597785.3      0.00
>   3806.67  450401.2  596247.0      0.00
>   3854.79  456150.2  597865.7      0.00
>   3690.11  436552.1  596695.8      0.00
>   3676.08  435002.6  596462.8      0.00
>   3730.35  441535.2  597132.1      0.00
>   3680.43  435518.2  596960.3      0.00
>   3741.41  442685.3  597750.8      0.00
>   3691.93  436870.6  596236.9      0.00
>   3627.31  429120.5  594116.8      0.00
>   3661.97  433492.7  595812.0      0.00
>   3693.86  437169.0  597826.9      0.00
>   2046.18  240635.3  331656.3      0.00
>      0.00      0.00      0.00      0.00
>      0.00      0.00      0.00      0.00
>
> Notice the rate of traffic coming in on igb1 is higher than what is going
> out on igb0. Box A thinks it sent traffic at some 536,616 pkts per second
> or 590Mb/s. However, traffic going out is slower, and what is seen at box B
> is less. It sees the traffic at 286Mb/s and 357,873 pps
>
> Given the lost packets, should this not show up somewhere in the igb
> statistics ?
> > dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 2.4.0 > dev.igb.0.%driver: igb > dev.igb.0.%location: slot=0 function=0 handle=\_SB_.PCI0.PEG0.PEGP > dev.igb.0.%pnpinfo: vendor=0x8086 device=0x10c9 subvendor=0x8086 > subdevice=0xa03c class=0x020000 > dev.igb.0.%parent: pci1 > dev.igb.0.nvm: -1 > dev.igb.0.enable_aim: 1 > dev.igb.0.fc: 3 > dev.igb.0.rx_processing_limit: 100 > dev.igb.0.link_irq: 2 > dev.igb.0.dropped: 0 > dev.igb.0.tx_dma_fail: 0 > dev.igb.0.rx_overruns: 0 > dev.igb.0.watchdog_timeouts: 0 > dev.igb.0.device_control: 1488978497 > dev.igb.0.rx_control: 67141634 > dev.igb.0.interrupt_mask: 4 > dev.igb.0.extended_int_mask: 2147483655 > dev.igb.0.tx_buf_alloc: 0 > dev.igb.0.rx_buf_alloc: 0 > dev.igb.0.fc_high_water: 58976 > dev.igb.0.fc_low_water: 58960 > dev.igb.0.queue0.no_desc_avail: 19682298 > dev.igb.0.queue0.tx_packets: 20962740 > dev.igb.0.queue0.rx_packets: 1101622 > dev.igb.0.queue0.rx_bytes: 66097424 > dev.igb.0.queue0.lro_queued: 0 > dev.igb.0.queue0.lro_flushed: 0 > dev.igb.0.queue1.no_desc_avail: 32582207 > dev.igb.0.queue1.tx_packets: 50082567 > dev.igb.0.queue1.rx_packets: 6598 > dev.igb.0.queue1.rx_bytes: 462728 > dev.igb.0.queue1.lro_queued: 0 > dev.igb.0.queue1.lro_flushed: 0 > dev.igb.0.mac_stats.excess_coll: 0 > dev.igb.0.mac_stats.single_coll: 0 > dev.igb.0.mac_stats.multiple_coll: 0 > dev.igb.0.mac_stats.late_coll: 0 > dev.igb.0.mac_stats.collision_count: 0 > dev.igb.0.mac_stats.symbol_errors: 0 > dev.igb.0.mac_stats.sequence_errors: 0 > dev.igb.0.mac_stats.defer_count: 138912 > dev.igb.0.mac_stats.missed_packets: 0 > dev.igb.0.mac_stats.recv_no_buff: 0 > dev.igb.0.mac_stats.recv_undersize: 0 > dev.igb.0.mac_stats.recv_fragmented: 0 > dev.igb.0.mac_stats.recv_oversize: 0 > dev.igb.0.mac_stats.recv_jabber: 0 > dev.igb.0.mac_stats.recv_errs: 0 > dev.igb.0.mac_stats.crc_errs: 0 > dev.igb.0.mac_stats.alignment_errs: 0 > dev.igb.0.mac_stats.coll_ext_errs: 0 > dev.igb.0.mac_stats.xon_recvd: 550808 > 
dev.igb.0.mac_stats.xon_txd: 0 > dev.igb.0.mac_stats.xoff_recvd: 550808 > dev.igb.0.mac_stats.xoff_txd: 0 > dev.igb.0.mac_stats.total_pkts_recvd: 1108220 > dev.igb.0.mac_stats.good_pkts_recvd: 6604 > dev.igb.0.mac_stats.bcast_pkts_recvd: 0 > dev.igb.0.mac_stats.mcast_pkts_recvd: 0 > dev.igb.0.mac_stats.rx_frames_64: 1 > dev.igb.0.mac_stats.rx_frames_65_127: 6603 > dev.igb.0.mac_stats.rx_frames_128_255: 0 > dev.igb.0.mac_stats.rx_frames_256_511: 0 > dev.igb.0.mac_stats.rx_frames_512_1023: 0 > dev.igb.0.mac_stats.rx_frames_1024_1522: 0 > dev.igb.0.mac_stats.good_octets_recvd: 489608 > dev.igb.0.mac_stats.good_octets_txd: 10120060648 > dev.igb.0.mac_stats.total_pkts_txd: 71045307 > dev.igb.0.mac_stats.good_pkts_txd: 71045307 > dev.igb.0.mac_stats.bcast_pkts_txd: 2 > dev.igb.0.mac_stats.mcast_pkts_txd: 0 > dev.igb.0.mac_stats.tx_frames_64: 2 > dev.igb.0.mac_stats.tx_frames_65_127: 5051081 > dev.igb.0.mac_stats.tx_frames_128_255: 65994224 > dev.igb.0.mac_stats.tx_frames_256_511: 0 > dev.igb.0.mac_stats.tx_frames_512_1023: 0 > dev.igb.0.mac_stats.tx_frames_1024_1522: 0 > dev.igb.0.mac_stats.tso_txd: 0 > dev.igb.0.mac_stats.tso_ctx_fail: 0 > dev.igb.0.interrupts.asserts: 6564060 > dev.igb.0.interrupts.rx_pkt_timer: 1108207 > dev.igb.0.interrupts.rx_abs_timer: 0 > dev.igb.0.interrupts.tx_pkt_timer: 0 > dev.igb.0.interrupts.tx_abs_timer: 1108220 > dev.igb.0.interrupts.tx_queue_empty: 71044772 > dev.igb.0.interrupts.tx_queue_min_thresh: 0 > dev.igb.0.interrupts.rx_desc_min_thresh: 0 > dev.igb.0.interrupts.rx_overrun: 0 > dev.igb.0.host.breaker_tx_pkt: 0 > dev.igb.0.host.host_tx_pkt_discard: 0 > dev.igb.0.host.rx_pkt: 13 > dev.igb.0.host.breaker_rx_pkts: 0 > dev.igb.0.host.breaker_rx_pkt_drop: 0 > dev.igb.0.host.tx_good_pkt: 535 > dev.igb.0.host.breaker_tx_pkt_drop: 0 > dev.igb.0.host.rx_good_bytes: 70993032 > dev.igb.0.host.tx_good_bytes: 10120060648 > dev.igb.0.host.length_errors: 0 > dev.igb.0.host.serdes_violation_pkt: 0 > dev.igb.0.host.header_redir_missed: 0 > 
dev.igb.0.wake: 0 > dev.igb.1.%desc: Intel(R) PRO/1000 Network Connection version - 2.4.0 > dev.igb.1.%driver: igb > dev.igb.1.%location: slot=0 function=1 > dev.igb.1.%pnpinfo: vendor=0x8086 device=0x10c9 subvendor=0x8086 > subdevice=0xa03c class=0x020000 > dev.igb.1.%parent: pci1 > dev.igb.1.nvm: -1 > dev.igb.1.enable_aim: 1 > dev.igb.1.fc: 3 > dev.igb.1.rx_processing_limit: 100 > dev.igb.1.link_irq: 2 > dev.igb.1.dropped: 0 > dev.igb.1.tx_dma_fail: 0 > dev.igb.1.rx_overruns: 0 > dev.igb.1.watchdog_timeouts: 0 > dev.igb.1.device_control: 1488978497 > dev.igb.1.rx_control: 67141634 > dev.igb.1.interrupt_mask: 4 > dev.igb.1.extended_int_mask: 2147483655 > dev.igb.1.tx_buf_alloc: 0 > dev.igb.1.rx_buf_alloc: 0 > dev.igb.1.fc_high_water: 58976 > dev.igb.1.fc_low_water: 58960 > dev.igb.1.queue0.no_desc_avail: 0 > dev.igb.1.queue0.tx_packets: 14 > dev.igb.1.queue0.rx_packets: 27770289 > dev.igb.1.queue0.rx_bytes: 3632804418 > dev.igb.1.queue0.lro_queued: 0 > dev.igb.1.queue0.lro_flushed: 0 > dev.igb.1.queue1.no_desc_avail: 0 > dev.igb.1.queue1.tx_packets: 6599 > dev.igb.1.queue1.rx_packets: 58098597 > dev.igb.1.queue1.rx_bytes: 8250006086 > dev.igb.1.queue1.lro_queued: 0 > dev.igb.1.queue1.lro_flushed: 0 > dev.igb.1.mac_stats.excess_coll: 0 > dev.igb.1.mac_stats.single_coll: 0 > dev.igb.1.mac_stats.multiple_coll: 0 > dev.igb.1.mac_stats.late_coll: 0 > dev.igb.1.mac_stats.collision_count: 0 > dev.igb.1.mac_stats.symbol_errors: 0 > dev.igb.1.mac_stats.sequence_errors: 0 > dev.igb.1.mac_stats.defer_count: 0 > dev.igb.1.mac_stats.missed_packets: 0 > dev.igb.1.mac_stats.recv_no_buff: 0 > dev.igb.1.mac_stats.recv_undersize: 0 > dev.igb.1.mac_stats.recv_fragmented: 0 > dev.igb.1.mac_stats.recv_oversize: 0 > dev.igb.1.mac_stats.recv_jabber: 0 > dev.igb.1.mac_stats.recv_errs: 0 > dev.igb.1.mac_stats.crc_errs: 0 > dev.igb.1.mac_stats.alignment_errs: 0 > dev.igb.1.mac_stats.coll_ext_errs: 0 > dev.igb.1.mac_stats.xon_recvd: 0 > dev.igb.1.mac_stats.xon_txd: 0 > 
dev.igb.1.mac_stats.xoff_recvd: 0 > dev.igb.1.mac_stats.xoff_txd: 0 > dev.igb.1.mac_stats.total_pkts_recvd: 85868886 > dev.igb.1.mac_stats.good_pkts_recvd: 85868886 > dev.igb.1.mac_stats.bcast_pkts_recvd: 31 > dev.igb.1.mac_stats.mcast_pkts_recvd: 0 > dev.igb.1.mac_stats.rx_frames_64: 5 > dev.igb.1.mac_stats.rx_frames_65_127: 6211527 > dev.igb.1.mac_stats.rx_frames_128_255: 79657327 > dev.igb.1.mac_stats.rx_frames_256_511: 27 > dev.igb.1.mac_stats.rx_frames_512_1023: 0 > dev.igb.1.mac_stats.rx_frames_1024_1522: 0 > dev.igb.1.mac_stats.good_octets_recvd: 12226286048 > dev.igb.1.mac_stats.good_octets_txd: 490260 > dev.igb.1.mac_stats.total_pkts_txd: 6613 > dev.igb.1.mac_stats.good_pkts_txd: 6613 > dev.igb.1.mac_stats.bcast_pkts_txd: 4 > dev.igb.1.mac_stats.mcast_pkts_txd: 0 > dev.igb.1.mac_stats.tx_frames_64: 8 > dev.igb.1.mac_stats.tx_frames_65_127: 6605 > dev.igb.1.mac_stats.tx_frames_128_255: 0 > dev.igb.1.mac_stats.tx_frames_256_511: 0 > dev.igb.1.mac_stats.tx_frames_512_1023: 0 > dev.igb.1.mac_stats.tx_frames_1024_1522: 0 > dev.igb.1.mac_stats.tso_txd: 0 > dev.igb.1.mac_stats.tso_ctx_fail: 0 > dev.igb.1.interrupts.asserts: 8707927 > dev.igb.1.interrupts.rx_pkt_timer: 85867976 > dev.igb.1.interrupts.rx_abs_timer: 0 > dev.igb.1.interrupts.tx_pkt_timer: 0 > dev.igb.1.interrupts.tx_abs_timer: 85868886 > dev.igb.1.interrupts.tx_queue_empty: 6613 > dev.igb.1.interrupts.tx_queue_min_thresh: 0 > dev.igb.1.interrupts.rx_desc_min_thresh: 0 > dev.igb.1.interrupts.rx_overrun: 0 > dev.igb.1.host.breaker_tx_pkt: 0 > dev.igb.1.host.host_tx_pkt_discard: 0 > dev.igb.1.host.rx_pkt: 910 > dev.igb.1.host.breaker_rx_pkts: 0 > dev.igb.1.host.breaker_rx_pkt_drop: 0 > dev.igb.1.host.tx_good_pkt: 0 > dev.igb.1.host.breaker_tx_pkt_drop: 0 > dev.igb.1.host.rx_good_bytes: 12226288092 > dev.igb.1.host.tx_good_bytes: 490260 > dev.igb.1.host.length_errors: 0 > dev.igb.1.host.serdes_violation_pkt: 0 > dev.igb.1.host.header_redir_missed: 0 > > > Motherboard is Intel > > Base Board Information > 
Manufacturer: Intel Corporation > Product Name: DH87RL > Version: AAG74240-401 > Serial Number: BQRL330000Q9 > > > NIC is dual port > > igb0@pci0:1:0:0: class=0x020000 card=0xa03c8086 chip=0x10c98086 > rev=0x01 hdr=0x00 > vendor = 'Intel Corporation' > device = '82576 Gigabit Network Connection' > class = network > subclass = ethernet > bar [10] = type Memory, range 32, base 0xf7c20000, size 131072, > enabled > bar [14] = type Memory, range 32, base 0xf7800000, size 4194304, > enabled > bar [18] = type I/O Port, range 32, base 0xe020, size 32, enabled > bar [1c] = type Memory, range 32, base 0xf7c44000, size 16384, enabled > cap 01[40] = powerspec 3 supports D0 D3 current D0 > cap 05[50] = MSI supports 1 message, 64 bit, vector masks > cap 11[70] = MSI-X supports 10 messages, enabled > Table in map 0x1c[0x0], PBA in map 0x1c[0x2000] > cap 10[a0] = PCI-Express 2 endpoint max data 128(512) FLR link x4(x4) > speed 2.5(2.5) ASPM disabled(L0s/L1) > ecap 0001[100] = AER 1 0 fatal 0 non-fatal 2 corrected > ecap 0003[140] = Serial 1 90e2baffff5eb48a > ecap 000e[150] = ARI 1 > ecap 0010[160] = SRIOV 1 > igb1@pci0:1:0:1: class=0x020000 card=0xa03c8086 chip=0x10c98086 > rev=0x01 hdr=0x00 > vendor = 'Intel Corporation' > device = '82576 Gigabit Network Connection' > class = network > subclass = ethernet > bar [10] = type Memory, range 32, base 0xf7c00000, size 131072, > enabled > bar [14] = type Memory, range 32, base 0xf7000000, size 4194304, > enabled > bar [18] = type I/O Port, range 32, base 0xe000, size 32, enabled > bar [1c] = type Memory, range 32, base 0xf7c40000, size 16384, enabled > cap 01[40] = powerspec 3 supports D0 D3 current D0 > cap 05[50] = MSI supports 1 message, 64 bit, vector masks > cap 11[70] = MSI-X supports 10 messages, enabled > Table in map 0x1c[0x0], PBA in map 0x1c[0x2000] > cap 10[a0] = PCI-Express 2 endpoint max data 128(512) FLR link x4(x4) > speed 2.5(2.5) ASPM disabled(L0s/L1) > ecap 0001[100] = AER 1 0 fatal 0 non-fatal 2 corrected > ecap 
0003[140] = Serial 1 90e2baffff5eb48a
> ecap 000e[150] = ARI 1
> ecap 0010[160] = SRIOV 1
>
>
> root@intel4gen-9:/usr/home/mdtancsa # netstat -m
> 6141/6489/12630 mbufs in use (current/cache/total)
> 6139/5871/12010/487416 mbuf clusters in use (current/cache/total/max)
> 6139/5861 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/5/5/243708 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/72209 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/40618 16k jumbo clusters in use (current/cache/total/max)
> 13813K/13384K/27197K bytes allocated to network (current/cache/total)
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> root@intel4gen-9:/usr/home/mdtancsa #
>
>
> --
> -------------------
> Mike Tancsa, tel +1 519 651 3400
> Sentex Communications, mike@sentex.net
> Providing Internet services since 1994 www.sentex.net
> Cambridge, Ontario Canada http://www.tancsa.com/
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

Just a guess, but this might be happening before the driver.
Anything interesting in "netstat -s" for ip and udp?

There is some inbound traffic on igb0. Are these ICMP UDP port-unreachable
messages?

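A rough way to put a number on the gap, using only the counters quoted
above: igb1 reports mac_stats.good_pkts_recvd = 85868886 and igb0 reports
mac_stats.good_pkts_txd = 71045307. The C sketch below simply subtracts the
two and expresses the shortfall in basis points. The helper names are made
up for illustration, and because these sysctl counters are cumulative, the
result covers everything since the driver attached, not only the 20-second
netblast run.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of any driver): packets counted on the
 * ingress port that were never counted as transmitted on the egress port.
 */
static uint64_t
pkt_shortfall(uint64_t rx_in, uint64_t tx_out)
{
	return (rx_in > tx_out ? rx_in - tx_out : 0);
}

/*
 * The same shortfall as basis points (1/100 of a percent) of the ingress
 * count, using integer math that rounds down. Returns 0 if nothing was
 * received, to avoid dividing by zero.
 */
static uint64_t
shortfall_bp(uint64_t rx_in, uint64_t tx_out)
{
	if (rx_in == 0)
		return (0);
	return (pkt_shortfall(rx_in, tx_out) * 10000u / rx_in);
}
```

With the two quoted values, pkt_shortfall() gives 14823579 packets and
shortfall_bp() gives 1726, i.e. roughly 17% of what igb1 received. It says
nothing about where the packets went.
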
--Nikolay

From owner-freebsd-net@FreeBSD.ORG Sun Feb 2 16:15:39 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CF2CD9A4 for ; Sun, 2 Feb 2014 16:15:39 +0000 (UTC) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 73D3017E5 for ; Sun, 2 Feb 2014 16:15:39 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: X-IronPort-AV: E=Sophos;i="4.95,766,1384318800"; d="scan'208";a="93007239" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 02 Feb 2014 11:15:31 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id DE79DB3F62; Sun, 2 Feb 2014 11:15:30 -0500 (EST) Date: Sun, 2 Feb 2014 11:15:30 -0500 (EST) From: Rick Macklem To: Daniel Braniss Message-ID: <906704123.1485103.1391357730899.JavaMail.root@uoguelph.ca> In-Reply-To: Subject: Re: Terrible NFS performance under 9.2-RELEASE? MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: Pyun YongHyeon , FreeBSD Net , Adam McDougall , Jack Vogel X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 02 Feb 2014 16:15:39 -0000

Daniel Braniss wrote:
> hi Rick, et.all.
>
> tried your patch but it didn’t help,the server is stuck.
Oh well. I was hoping that was going to make TSO work reliably.
Just to confirm it, this server works reliably when TSO is disabled?

Thanks for doing the testing, rick

> just for fun, I tried a different client/host, this one has a
> broadcom NetXtreme II that was
> MFC’ed lately, and the results are worse than the Intel (5hs instead
> of 4hs) but faster without TSO
>
> with TSO enabled and bs=32k:
> 5.09hs    18325.62 real 1109.23 user 4591.60 sys
>
> without TSO:
> 4.75hs    17120.40 real 1114.08 user 3537.61 sys
>
> So what is the advantage of using TSO? (no complaint here, just
> curious)
>
> I’ll try to see if as a server it has the same TSO related issues.
>
> cheers,
>     danny
>
> On Jan 28, 2014, at 3:51 AM, Rick Macklem wrote:
>
> > Jack Vogel wrote:
> >> That header file is for the VF driver :) which I don't believe is being
> >> used in this case.
> >> The driver is capable of handling 256K but it's limited by the stack to 64K
> >> (look in ixgbe.h), so it's not a few bytes off due to the vlan header.
> >>
> >> The scatter size is not an arbitrary one, it's due to hardware limitations
> >> in Niantic (82599). Turning off TSO in the 10G environment is not practical,
> >> you will have trouble getting good performance.
> >>
> >> Jack
> >>
> > Well, if you look at this thread, Daniel got much better performance
> > by turning off TSO. However, I agree that this is not an ideal solution.
> > http://docs.FreeBSD.org/cgi/mid.cgi?2C287272-7B57-4AAD-B22F-6A65D9F8677B
> >
> > rick
> >
> >>
> >>
> >> On Mon, Jan 27, 2014 at 4:58 PM, Yonghyeon PYUN wrote:
> >>
> >>> On Mon, Jan 27, 2014 at 06:27:19PM -0500, Rick Macklem wrote:
> >>>> pyunyh@gmail.com wrote:
> >>>>> On Sun, Jan 26, 2014 at 09:16:54PM -0500, Rick Macklem wrote:
> >>>>>> Adam McDougall wrote:
> >>>>>>> Also try rsize=32768,wsize=32768 in your mount options, made a huge
> >>>>>>> difference for me.
> >>>>>>> I've noticed slow file transfers on NFS in 9 and
> >>>>>>> finally did some searching a couple months ago, someone suggested it and
> >>>>>>> they were on to something.
> >>>>>>>
> >>>>>> I have a "hunch" that might explain why 64K NFS reads/writes perform
> >>>>>> poorly for some network environments.
> >>>>>> A 64K NFS read reply/write request consists of a list of 34 mbufs when
> >>>>>> passed to TCP via sosend() and a total data length of around 65680 bytes.
> >>>>>> Looking at a couple of drivers (virtio and ixgbe), they seem to expect
> >>>>>> no more than 32-33 mbufs in a list for a 65535 byte TSO xmit. I think
> >>>>>> (I don't have anything that does TSO to confirm this) that NFS will pass
> >>>>>> a list that is longer (34 plus a TCP/IP header).
> >>>>>> At a glance, it appears that the drivers call m_defrag() or m_collapse()
> >>>>>> when the mbuf list won't fit in their scatter table (32 or 33 elements)
> >>>>>> and if this fails, just silently drop the data without sending it.
> >>>>>> If I'm right, there would be considerable overhead from
> >>>>>> m_defrag()/m_collapse()
> >>>>>> and near disaster if they fail to fix the problem and the data is silently
> >>>>>> dropped instead of xmited.
> >>>>>>
> >>>>>
> >>>>> I think the actual number of DMA segments allocated for the mbuf
> >>>>> chain is determined by bus_dma(9). bus_dma(9) will coalesce
> >>>>> current segment with previous segment if possible.
> >>>>>
> >>>> Ok, I'll have to take a look, but I thought that an array sized
> >>>> by "num_segs" is passed in as an argument. (And num_segs is set to
> >>>> either IXGBE_82598_SCATTER (100) or IXGBE_82599_SCATTER (32).)
> >>>> It looked to me that the ixgbe driver called itself ix, so it isn't
> >>>> obvious to me which we are talking about. (I know that Daniel Braniss
> >>>> had an ix0 and ix1, which were fixed for NFS by disabling TSO.)
> >>>>
> >>>
> >>> It's ix(4). ixbge(4) is a different driver.
> >>>
> >>>> I'll admit I mostly looked at virtio's network driver, since that
> >>>> was the one being used by J David.
> >>>>
> >>>> Problems w.r.t. TSO enabled for NFS using 64K rsize/wsize have been
> >>>> cropping up for quite a while, and I am just trying to find out why.
> >>>> (I have no hardware/software that exhibits the problem, so I can
> >>>> only look at the sources and ask others to try testing stuff.)
> >>>>
> >>>>> I'm not sure whether you're referring to ixgbe(4) or ix(4) but I
> >>>>> see the total length of all segment size of ix(4) is 65535 so
> >>>>> it has no room for ethernet/VLAN header of the mbuf chain. The
> >>>>> driver should be fixed to transmit a 64KB datagram.
> >>>> Well, if_hw_tsomax is set to 65535 by the generic code (the driver
> >>>> doesn't set it) and the code in tcp_output() seems to subtract the
> >>>> size of a tcp/ip header from that before passing data to the driver,
> >>>> so I think the mbuf chain passed to the driver will fit in one
> >>>> ip datagram. (I'd assume all sorts of stuff would break for TSO
> >>>> enabled drivers if that wasn't the case?)
> >>>
> >>> I believe the generic code is doing right. I'm under the
> >>> impression the non-working TSO indicates a bug in driver. Some
> >>> drivers didn't account for additional ethernet/VLAN header so the
> >>> total size of DMA segments exceeded 65535. I've attached a diff
> >>> for ix(4). It wasn't tested at all as I don't have hardware to
> >>> test.
> >>>
> >>>>
> >>>>> I think the use of m_defrag(9) in TSO is suboptimal.
> >>>>> All TSO capable controllers are able to handle multiple TX buffers so it
> >>>>> should have used m_collapse(9) rather than copying entire chain
> >>>>> with m_defrag(9).
> >>>>>
> >>>> I haven't looked at these closely yet (plan on doing so to-day), but
> >>>> even m_collapse() looked like it copied data between mbufs and that
> >>>> is certainly suboptimal, imho. I don't see why a driver can't split
> >>>> the mbuf list, if there are too many entries for the scatter/gather
> >>>> and do it in two iterations (much like tcp_output() does already,
> >>>> since the data length exceeds 65535 - tcp/ip header size).
> >>>>
> >>>
> >>> It can split the mbuf list if controllers support increased number
> >>> of TX buffers. Because controller shall consume the same number of
> >>> DMA descriptors for the mbuf list, drivers tend to impose a limit
> >>> on the number of TX buffers to save resources.
> >>>
> >>>> However, at this point, I just want to find out if the long chain
> >>>> of mbufs is why TSO is problematic for these drivers, since I'll
> >>>> admit I'm getting tired of telling people to disable TSO (and I
> >>>> suspect some don't believe me and never try it).
> >>>>
> >>>
> >>> TSO capable controllers tend to have various limitations (the first
> >>> TX buffer should have complete ethernet/IP/TCP header, ip_len of IP
> >>> header should be reset to 0, TCP pseudo checksum should be
> >>> recomputed etc) and cheap controllers need more assistance from
> >>> driver to let its firmware know various IP/TCP header offset
> >>> location in the mbuf. Because this requires IP/TCP header
> >>> parsing, it's error prone and very complex.
> >>>
> >>>>>> Anyhow, I have attached a patch that makes NFS use MJUMPAGESIZE clusters,
> >>>>>> so the mbuf count drops from 34 to 18.
> >>>>>>
> >>>>>
> >>>>> Could we make it conditional on size?
> >>>>>
> >>>> Not sure what you mean? If you mean "the size of the read/write",
> >>>> that would be possible for NFSv3, but less so for NFSv4. (The read/write
> >>>> is just one Op. in the compound for NFSv4 and there is no way to
> >>>> predict how much more data is going to be generated by subsequent Ops.)
> >>>>
> >>>
> >>> Sorry, I should have been clearer. You already answered my
> >>> question. Thanks.
> >>>
> >>>> If by "size" you mean amount of memory in the machine then, yes, it
> >>>> certainly could be conditional on that. (I plan to try and look at
> >>>> the allocator to-day as well, but if others know of disadvantages with
> >>>> using MJUMPAGESIZE instead of MCLBYTES, please speak up.)
> >>>>
> >>>> Garrett Wollman already alluded to the MCLBYTES case being pre-allocated,
> >>>> but I'll admit I have no idea what the implications of that are at this
> >>>> time.
> >>>>
> >>>>>> If anyone has a TSO scatter/gather enabled net interface and can test this
> >>>>>> patch on it with NFS I/O (default of 64K rsize/wsize) when TSO is enabled
> >>>>>> and see what effect it has, that would be appreciated.
> >>>>>>
> >>>>>> Btw, thanks go to Garrett Wollman for suggesting the change to
> >>>>>> MJUMPAGESIZE clusters.
> >>>>>>
> >>>>>> rick
> >>>>>> ps: If the attachment doesn't make it through and you want the patch, just
> >>>>>> email me and I'll send you a copy.
> >>>>>>=20 > >>>=20 > >>> _______________________________________________ > >>> freebsd-net@freebsd.org mailing list > >>> http://lists.freebsd.org/mailman/listinfo/freebsd-net > >>> To unsubscribe, send any mail to > >>> "freebsd-net-unsubscribe@freebsd.org" > >>>=20 > >> _______________________________________________ > >> freebsd-net@freebsd.org mailing list > >> http://lists.freebsd.org/mailman/listinfo/freebsd-net > >> To unsubscribe, send any mail to > >> "freebsd-net-unsubscribe@freebsd.org" >=20 > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to > "freebsd-net-unsubscribe@freebsd.org" >=20 From owner-freebsd-net@FreeBSD.ORG Sun Feb 2 19:19:43 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id ACB0283F for ; Sun, 2 Feb 2014 19:19:43 +0000 (UTC) Received: from mail-ea0-x236.google.com (mail-ea0-x236.google.com [IPv6:2a00:1450:4013:c01::236]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 4529215F0 for ; Sun, 2 Feb 2014 19:19:43 +0000 (UTC) Received: by mail-ea0-f182.google.com with SMTP id r15so3357611ead.27 for ; Sun, 02 Feb 2014 11:19:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=YqBd/XTQtNHDT0FD6vTSB1163K5yQiKb5sqlc7bHy00=; b=JIaGJW4hxaBt+/013r881S5f2MVIZMPxFX0+biCOl+dnLxZ7BbbcApwl+/ptXvf7R3 f4Iveu4pKCe93SxzXkbvHE5yztJo9tOHhKGVpUKBcQPdgoXt9xBBnS+OYuAf2v72xdS7 5xfZycq1rVbKqHOf/X+j2kN/03V7BaxCJ5eopB16iWTXFMKbmbg5YsQ0svpxmd80odPY 
i7ueiGihsBXDnXDPvDtyMYvGB9bwhEWEmmjLa8VlQDTbIEI8Wep06Jc3AyA0xfDytCSW sQW9QD7JYoljFO4rqgcPj/dAdQpi1Flv9PheWM+fjBgTVoSCEi+AZ6Css4pfFIaDdyF0 yweQ== MIME-Version: 1.0 X-Received: by 10.15.43.141 with SMTP id x13mr38293485eev.35.1391368781619; Sun, 02 Feb 2014 11:19:41 -0800 (PST) Received: by 10.14.65.4 with HTTP; Sun, 2 Feb 2014 11:19:41 -0800 (PST) In-Reply-To: References: Date: Sun, 2 Feb 2014 11:19:41 -0800 Message-ID: Subject: Re: Errors using span interface on if_bridge(4) From: hiren panchasara To: "freebsd-net@freebsd.org" Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 02 Feb 2014 19:19:43 -0000

On Sat, Feb 1, 2014 at 5:16 PM, hiren panchasara wrote:
>
> -bash-4.2$ sysctl -a | grep mac_stats.checksum_errs
> dev.ix.0.mac_stats.checksum_errs: 0
> dev.ix.1.mac_stats.checksum_errs: 0
> dev.ix.2.mac_stats.checksum_errs: 0
> dev.ix.3.mac_stats.checksum_errs: 3371119 <-- most of them fail checksum

In ixgbe.c :
ixgbe_add_hw_stats() has
    SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "checksum_errs",
        CTLFLAG_RD, &stats->xec,
        "Checksum Errors");

which is updated in ixgbe_update_stats_counters()
    adapter->stats.xec += IXGBE_READ_REG(hw, IXGBE_XEC);

ixgbe_type.h defines
#define IXGBE_XEC 0x04120

So, this is something that firmware of the card updates? We just read
that register from drivers.

How do I know why this number is increasing?

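For what it is worth, the accumulate-on-read pattern quoted from
ixgbe_update_stats_counters() can be sketched in isolation. This is a toy
model, not the driver: struct fake_hw, fake_read_xec() and update_stats()
are hypothetical names, and it assumes the XEC register is clear-on-read,
which the "+=" in the driver suggests but which should be checked against
the 82599 datasheet.

```c
#include <stdint.h>

/* Hypothetical stand-in for a NIC with one clear-on-read statistics register. */
struct fake_hw {
	uint32_t xec;	/* errors the hardware has counted since the last read */
};

/* Software-side 64-bit counter, the kind of value a sysctl would export. */
struct soft_stats {
	uint64_t xec;
};

/* Models a clear-on-read register access: return the value, then zero it. */
static uint32_t
fake_read_xec(struct fake_hw *hw)
{
	uint32_t v = hw->xec;

	hw->xec = 0;
	return (v);
}

/*
 * Periodic update: fold whatever the hardware counted since the previous
 * poll into the running software total, as the quoted "+=" line does.
 */
static void
update_stats(struct fake_hw *hw, struct soft_stats *st)
{
	st->xec += fake_read_xec(hw);
}
```

In this model the software side only sums what the hardware reports. Why
the hardware counted an error is not visible at this level, so the answer
would have to come from the register's description in the datasheet or
spec update.
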
cheers, Hiren From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 03:32:05 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 18D923AE; Mon, 3 Feb 2014 03:32:05 +0000 (UTC) Received: from mail-ea0-x22f.google.com (mail-ea0-x22f.google.com [IPv6:2a00:1450:4013:c01::22f]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 7ECC7160A; Mon, 3 Feb 2014 03:32:04 +0000 (UTC) Received: by mail-ea0-f175.google.com with SMTP id z10so3449625ead.34 for ; Sun, 02 Feb 2014 19:32:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=ACDnZMr2ZQo8IHXdiwSYRmq400xW2P42s/sQWSw+CMs=; b=EIFhWH+kLPsdiVFvEZI3HvJciIzioIcCnMsgwFK6kAXvGUbCoA5qDq7WA4pxDoADwc XfrjiN6NVg1X8njwa/t7wgcqpvuPE7ab+SXagPrhFzotquoTrkWqn1vpdN/jXzpvhui9 IlxewYFJfak9ixIMsIppxMFmVTtFiDcODbwwuXa3jZN02rhGNjNEBPWk4bL4PLfNYVN6 HMiADdcmGpsH/2v7gbd8asenEjVWhBGitARH4zTysc9vxLbFJ1jeWz7lFRfDi1d7uLSy TKN/sa41U3iq0KUImt2IDVeTZDI2lGnc5X5CALmFCusgsoOEKA/dy+F6qLWiiMcN+jAg WmkQ== MIME-Version: 1.0 X-Received: by 10.14.220.193 with SMTP id o41mr40866205eep.22.1391398322780; Sun, 02 Feb 2014 19:32:02 -0800 (PST) Received: by 10.14.65.4 with HTTP; Sun, 2 Feb 2014 19:32:02 -0800 (PST) In-Reply-To: References: Date: Sun, 2 Feb 2014 19:32:02 -0800 Message-ID: Subject: Re: Errors using span interface on if_bridge(4) From: hiren panchasara To: "freebsd-net@freebsd.org" , Jack F Vogel Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 
03 Feb 2014 03:32:05 -0000

On Sun, Feb 2, 2014 at 11:19 AM, hiren panchasara wrote:
> On Sat, Feb 1, 2014 at 5:16 PM, hiren panchasara wrote:
>
>>
>> -bash-4.2$ sysctl -a | grep mac_stats.checksum_errs
>> dev.ix.0.mac_stats.checksum_errs: 0
>> dev.ix.1.mac_stats.checksum_errs: 0
>> dev.ix.2.mac_stats.checksum_errs: 0
>> dev.ix.3.mac_stats.checksum_errs: 3371119 <-- most of them fail checksum
>
> In ixgbe.c :
> ixgbe_add_hw_stats() has
> SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "checksum_errs",
> CTLFLAG_RD, &stats->xec,
> "Checksum Errors");
>
> which is updated in ixgbe_update_stats_counters()
> adapter->stats.xec += IXGBE_READ_REG(hw, IXGBE_XEC);
>
> ixgbe_type.h defines
> #define IXGBE_XEC 0x04120
>
> So, this is something that firmware of the card updates? We just read
> that register from drivers.

ix3@pci0:66:0:1: class=0x020000 card=0x7a118086 chip=0x10fb8086 rev=0x01 hdr=0x00
    vendor = 'Intel Corporation'
    device = '82599EB 10-Gigabit SFI/SFP+ Network Connection'
    class = network
    subclass = ethernet

I found the following in the 82599 controller spec update:
http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/82599-10-gbe-controller-spec-update.pdf#G2.1474004
http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/82599-10-gbe-controller-spec-update.pdf#G2.2301787

Though here I am sending unicast TCP traffic with iperf3.

Adding Jeff to get some more insights if possible.

cheers, Hiren From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 03:34:19 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 83351470; Mon, 3 Feb 2014 03:34:19 +0000 (UTC) Received: from mail-ea0-x22f.google.com (mail-ea0-x22f.google.com [IPv6:2a00:1450:4013:c01::22f]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id E6D6B1628; Mon, 3 Feb 2014 03:34:18 +0000 (UTC) Received: by mail-ea0-f175.google.com with SMTP id z10so3408826ead.20 for ; Sun, 02 Feb 2014 19:34:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=vPXqM36Rw1GjupXZ9s0Z86bK9S5+ylVSa1YBx8s4jvg=; b=hQQpfbVWOk1RHO0mWpwtUrcny1qZ2Mkk23DGr7s1CGZyTRjuVaHp9ZLgVabARJJBUX vA9OC00ejCBHGZDM5KC+eL62DzfpUJ4DCogfh5YDI2kHivjIGCr1A1MbkY801tvNz3Bh dH7iJA7BcMR+tRP3N+j3/HGh9nmEOlfnCNK2NdpoMhIR1MPS6reQTkFIy/5/qJfUpcsw hvZ7bzPn6EN+3qLc4Xj5i8JLV0ls/gH68g/wqU36FwEEoPzZezZLCBXJrXnzHrRJdRdR Zqj+QRPiRn2VihLXVx2PhbNyMI+35dYYWVLd64sN/TkbfavXe2kzGTvPtQR9rT/iacbu BLnQ== MIME-Version: 1.0 X-Received: by 10.15.33.193 with SMTP id c41mr37898eev.79.1391398457389; Sun, 02 Feb 2014 19:34:17 -0800 (PST) Received: by 10.14.65.4 with HTTP; Sun, 2 Feb 2014 19:34:17 -0800 (PST) In-Reply-To: References: Date: Sun, 2 Feb 2014 19:34:17 -0800 Message-ID: Subject: Re: Errors using span interface on if_bridge(4) From: hiren panchasara To: "freebsd-net@freebsd.org" , Jack F Vogel Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Mon, 03 Feb 2014 03:34:19 -0000 On Sun, Feb 2, 2014 at 7:32 PM, hiren panchasara wrote: > On Sun, Feb 2, 2014 at 11:19 AM, hiren panchasara > wrote: >> On Sat, Feb 1, 2014 at 5:16 PM, hiren panchasara >> wrote: >> >>> >>> -bash-4.2$ sysctl -a | grep mac_stats.checksum_errs >>> dev.ix.0.mac_stats.checksum_errs: 0 >>> dev.ix.1.mac_stats.checksum_errs: 0 >>> dev.ix.2.mac_stats.checksum_errs: 0 >>> dev.ix.3.mac_stats.checksum_errs: 3371119 <-- most of them fail checksum >> >> In ixgbe.c : >> ixgbe_add_hw_stats() has >> SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "checksum_errs", >> CTLFLAG_RD, &stats->xec, >> "Checksum Errors"); >> >> which is updated in ixgbe_update_stats_counters() >> adapter->stats.xec += IXGBE_READ_REG(hw, IXGBE_XEC); >> >> ixgbe_type.h defines >> #define IXGBE_XEC 0x04120 >> >> So, this is something that firmware of the card updates? We just read >> that register from drivers. > > ix3@pci0:66:0:1: class=0x020000 card=0x7a118086 chip=0x10fb8086 > rev=0x01 hdr=0x00 > vendor = 'Intel Corporation' > device = '82599EB 10-Gigabit SFI/SFP+ Network Connection' > class = network > subclass = ethernet > > I found following in 82599 controller spec update: > http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/82599-10-gbe-controller-spec-update.pdf#G2.1474004 > http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/82599-10-gbe-controller-spec-update.pdf#G2.2301787 > > Thought here I am sending unicast tcp traffic with iperf3. > > Adding Jeff to get some more insights if possible. Bah. 
Jack and not Jeff :-) cheers, Hiren From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 04:21:34 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 20132217; Mon, 3 Feb 2014 04:21:34 +0000 (UTC) Received: from mail-ig0-x234.google.com (mail-ig0-x234.google.com [IPv6:2607:f8b0:4001:c05::234]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id D7B85192A; Mon, 3 Feb 2014 04:21:33 +0000 (UTC) Received: by mail-ig0-f180.google.com with SMTP id m12so3511698iga.1 for ; Sun, 02 Feb 2014 20:21:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=ShsrJyn6VlwVLfMhvTmr+aoDikT3wuRRxnEtUDN55xY=; b=d99IvbnYagtWJ9EiTUItd3RD4IwW2W3ag1KkWs3I/aPBZeHOVBOtYtxsQ+2HVSu92+ 2pI38w0CHGWyeusu+I6y8cH+WSykocrTXwn4bUwq7ydu1UdeOvUU5keQwvBeuoo+rgNU 9UEMTq8E357pMqXasncBodzgcmn8JwJ0m/Rdyr6RuIKTRKLODgSb07I5vPqqD3vo67r0 Yl+oE9X+6fhM3KeDoS1WKbhoSDKIrYY/xSYfxDxF6QhvpGNA83Lb0aA9i4pewQWwLO/H 7I64OEY/sFw8fSuag+meei0ZmxTxJH1US5WR/EUKORdi2njsxoiPteL+eC7dUL0yasF0 R2wg== MIME-Version: 1.0 X-Received: by 10.50.13.9 with SMTP id d9mr10143156igc.25.1391401292575; Sun, 02 Feb 2014 20:21:32 -0800 (PST) Sender: jdavidlists@gmail.com Received: by 10.42.170.8 with HTTP; Sun, 2 Feb 2014 20:21:32 -0800 (PST) In-Reply-To: <251642279.1400386.1391313228315.JavaMail.root@uoguelph.ca> References: <251642279.1400386.1391313228315.JavaMail.root@uoguelph.ca> Date: Sun, 2 Feb 2014 23:21:32 -0500 X-Google-Sender-Auth: WxYqNDZcRMCqRDAMGF4PjslCpCM Message-ID: Subject: Re: Terrible NFS performance under 9.2-RELEASE? 
From: J David To: Rick Macklem Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-net@freebsd.org, Garrett Wollman X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 04:21:34 -0000 On Sat, Feb 1, 2014 at 10:53 PM, Rick Macklem wrote: > Btw, if you do want to test with O_DIRECT ("-I"), you should enable > direct io in the client. > sysctl vfs.nfs.nfs_directio_enable=1 > > I just noticed that it is disabled by default. This means that your > "-I" was essentially being ignored by the FreeBSD client. Ouch. Yes, that appears to be correct. > It also explains why Linux isn't doing a read before write, since > that wouldn't happen for direct I/O. You should test Linux without "-I" > and see if it still doesn't do the read before write, including a "-r 2k" > to avoid the "just happens to be a page size" case. With O_DIRECT, the Linux client reads only during the read tests. Without O_DIRECT, the Linux client does *no reads at all*, not even for the read tests. It caches the whole file and returns commensurately irrelevant/silly performance numbers (e.g. 7.2GiB/sec random reads). Setting the sysctl on the FreeBSD client does stop it from doing the excess reads. Ironically this actually makes O_DIRECT improve performance for all the workloads punished by that behavior. It also creates a fairly consistent pattern indicating performance being bottlenecked by the FreeBSD NFS server. 
Here is a sample of test results, showing both throughput and IOPS achieved:

https://imageshack.com/i/4jiljhp

In this chart, the 64k test is run 4 times, once with FreeBSD as both client and server (64k), once with Linux as the client and FreeBSD as the server (L/F 64k), once with FreeBSD as the client and Linux as the server (F/L 64k, hands down the best NFS combo), and once with Linux as the client and server (L/L 64k). For reference, the native performance of the md0 filesystem is also included.

The TLDR version of this chart is that the FreeBSD NFS server is the primary bottleneck; it is not being held back by the network or the underlying disk. Ideally, it would be nice to see the 64k column for FreeBSD client / FreeBSD server as high as or higher than the column for FreeBSD client / Linux Server. (Also, the bottleneck of the F/L 64k test appears to be CPU on the FreeBSD client.)

The more detailed findings are:

1) The Linux client runs at about half the IOPS of the FreeBSD client regardless of server type. The gut-level suspicion is that it must be doing twice as many NFS operations per write. (Possibly commit?)

2) The FreeBSD NFS server seems capped at around 6300 IOPS. This is neither a limit of the network (at least 28k IOPS) nor the filesystem (about 40k IOPS).

3) When O_DIRECT is not used (not shown), the excess read operations pull from the same 6300 IOPS bucket, and that's what kills small writes.

4) It's possible that the sharp drop off visible at the 64k/64k test is a result of doubling the number of packets traversing the TSO-capable network.
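
As a rough cross-check of findings 2) and 3), the arithmetic can be sketched in a few lines of Python. This is only an illustration, not something that was run as part of the tests above: the 6300 IOPS figure is the measured server ceiling, while the function names and the assumption that every small write costs exactly one extra read are hypothetical simplifications.

```python
# Back-of-the-envelope model of the server-side ceiling described above.
# Assumptions (not measured): the cap is a flat per-operation limit, and
# without O_DIRECT each small write is preceded by exactly one read.

NFSD_IOPS_CAP = 6300  # approximate ceiling observed on the FreeBSD NFS server


def throughput_mib_per_s(iops, block_bytes):
    """Throughput in MiB/s for a given operation rate and block size."""
    return iops * block_bytes / (1024 * 1024)


def write_iops_with_read_before_write(total_iops, reads_per_write=1):
    """Write rate left over when reads and writes share one IOPS budget."""
    return total_iops / (1 + reads_per_write)


if __name__ == "__main__":
    for kib in (4, 64):
        rate = throughput_mib_per_s(NFSD_IOPS_CAP, kib * 1024)
        print(f"{kib:>2}k blocks at {NFSD_IOPS_CAP} IOPS: {rate:.1f} MiB/s")
    halved = write_iops_with_read_before_write(NFSD_IOPS_CAP)
    print(f"write IOPS with one read per write: {halved:.0f}")
```

Under those assumptions a flat per-operation cap hurts small blocks far more than large ones, and a read before every write halves the write rate available from the same budget.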
Here's a representative top from the server while the test is running, showing all the nfsd kernel threads being utilized, and spare RAM and CPU:

last pid: 14996;  load averages:  0.58,  0.17,  0.10    up 5+01:42:13  04:02:24
255 processes: 4 running, 223 sleeping, 28 waiting
CPU:  0.0% user,  0.0% nice, 55.9% system,  9.2% interrupt, 34.9% idle
Mem: 2063M Active, 109M Inact, 1247M Wired, 1111M Buf, 4492M Free
ARC: 930K Total, 41K MFU, 702K MRU, 16K Anon, 27K Header, 143K Other
Swap: 8192M Total, 8192M Free

  PID USERNAME  PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
   11 root      155 ki31     0K    32K RUN    1 120.8H 65.87% idle{idle: cpu1}
   11 root      155 ki31     0K    32K RUN    0 120.6H 48.63% idle{idle: cpu0}
   12 root      -92    -     0K   448K WAIT   0  13:41 14.55% intr{irq268: virtio_p}
 1001 root       -8    -     0K    16K mdwait 0  19:37 12.99% md0
   13 root       -8    -     0K    48K -      0  10:53  6.05% geom{g_down}
   12 root      -92    -     0K   448K WAIT   1   3:13  4.64% intr{irq269: virtio_p}
  859 root       -4    0  9912K  1824K ufs    1   2:00  3.08% nfsd{nfsd: service}
  859 root       -8    0  9912K  1824K rpcsvc 0   2:11  2.83% nfsd{nfsd: service}
  859 root       -8    0  9912K  1824K rpcsvc 0   2:04  2.64% nfsd{nfsd: service}
  859 root       -4    0  9912K  1824K ufs    1   2:00  2.29% nfsd{nfsd: service}
  859 root       -8    0  9912K  1824K rpcsvc 1   6:08  2.20% nfsd{nfsd: service}
  859 root       -4    0  9912K  1824K ufs    1   5:40  2.20% nfsd{nfsd: master}
  859 root       -4    0  9912K  1824K ufs    1   2:00  1.95% nfsd{nfsd: service}
  859 root       -8    0  9912K  1824K rpcsvc 0   2:50  1.90% nfsd{nfsd: service}
  859 root       -4    0  9912K  1824K ufs    1   2:47  1.66% nfsd{nfsd: service}
  859 root       -8    0  9912K  1824K RUN    0   2:13  1.66% nfsd{nfsd: service}
   13 root       -8    -     0K    48K -      0   1:55  1.46% geom{g_up}
  859 root       -8    0  9912K  1824K rpcsvc 0   2:39  1.42% nfsd{nfsd: service}
  859 root       -8    0  9912K  1824K rpcsvc 0   5:18  1.32% nfsd{nfsd: service}
  859 root       -8    0  9912K  1824K rpcsvc 0   1:55  1.12% nfsd{nfsd: service}
  859 root       -8    0  9912K  1824K rpcsvc 0   2:00  0.98% nfsd{nfsd: service}
  859 root       -4    0  9912K  1824K ufs    1   2:01  0.73% nfsd{nfsd: service}
  859 root       -4    0  9912K  1824K ufs    1   5:56  0.49% nfsd{nfsd: service}

All of this
tends to exonerate the client. So what would be the next step to track down the cause of poor performance on the server-side? Thanks! From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 05:56:46 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 97222E1; Mon, 3 Feb 2014 05:56:46 +0000 (UTC) Received: from mail-ee0-x22d.google.com (mail-ee0-x22d.google.com [IPv6:2a00:1450:4013:c00::22d]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 08F2610BB; Mon, 3 Feb 2014 05:56:45 +0000 (UTC) Received: by mail-ee0-f45.google.com with SMTP id b15so3397875eek.4 for ; Sun, 02 Feb 2014 21:56:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=UhurzXXKFQEQxYbbi34FpdyS8WOou1E/cVuvL+uVVO8=; b=ICpK0CXeeaZ/iscIx2/Uz+sxF29n5IDDAhjsRiSIyQBa8oanTgq0Xw+g9eUTIR6v5H di4XdeX0MzJ3F5S2lzOV7epViD7h/UR58uxgcxYTUUB8Qrmy+c21SPSZs3ORovdKi42Z mWueGhTUtCFRqo98hWgxGNDKv86GJ3+oQWJoysf3uyvgcxlTlA97Ap17R7hntHYd5O2r uIZnyBYGHHHbJjyL7pTb3P3OwFodHlD2XuX8PTMQl3+VNn/Kgh9KfFvuuoLP00XiKDz4 MK4aJTTyrgBsrt5CT+wYCieGr5bxD5DaTCgLbwC0a3u/Lx/mwHbkyFGH6FpHyyfm+6iS GOBQ== MIME-Version: 1.0 X-Received: by 10.14.32.132 with SMTP id o4mr41551178eea.14.1391407004286; Sun, 02 Feb 2014 21:56:44 -0800 (PST) Received: by 10.14.65.4 with HTTP; Sun, 2 Feb 2014 21:56:44 -0800 (PST) In-Reply-To: References: Date: Sun, 2 Feb 2014 21:56:44 -0800 Message-ID: Subject: Re: Errors using span interface on if_bridge(4) From: hiren panchasara To: "freebsd-net@freebsd.org" , Jack F Vogel Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP 
with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 05:56:46 -0000 On Sun, Feb 2, 2014 at 7:32 PM, hiren panchasara wrote: > On Sun, Feb 2, 2014 at 11:19 AM, hiren panchasara > wrote: >> On Sat, Feb 1, 2014 at 5:16 PM, hiren panchasara >> wrote: >> >>> >>> -bash-4.2$ sysctl -a | grep mac_stats.checksum_errs >>> dev.ix.0.mac_stats.checksum_errs: 0 >>> dev.ix.1.mac_stats.checksum_errs: 0 >>> dev.ix.2.mac_stats.checksum_errs: 0 >>> dev.ix.3.mac_stats.checksum_errs: 3371119 <-- most of them fail checksum >> >> In ixgbe.c : >> ixgbe_add_hw_stats() has >> SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "checksum_errs", >> CTLFLAG_RD, &stats->xec, >> "Checksum Errors"); >> >> which is updated in ixgbe_update_stats_counters() >> adapter->stats.xec += IXGBE_READ_REG(hw, IXGBE_XEC); >> >> ixgbe_type.h defines >> #define IXGBE_XEC 0x04120 >> >> So, this is something that firmware of the card updates? We just read >> that register from drivers. > > ix3@pci0:66:0:1: class=0x020000 card=0x7a118086 chip=0x10fb8086 > rev=0x01 hdr=0x00 > vendor = 'Intel Corporation' > device = '82599EB 10-Gigabit SFI/SFP+ Network Connection' > class = network > subclass = ethernet > > I found following in 82599 controller spec update: > http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/82599-10-gbe-controller-spec-update.pdf#G2.1474004 > http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/82599-10-gbe-controller-spec-update.pdf#G2.2301787 > > Thought here I am sending unicast tcp traffic with iperf3. I tried to use netperf to generate smaller size packets (64 bytes) and with that I saw only 0.004% packets reporting checksum errors. 
$ netperf -H 10.73.149.91 -t TCP_STREAM -- -m 64 -D
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.73.149.91 () port 0 AF_INET : nodelay : histogram : interval : dirty data : demo
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 65536  65536     64    10.00      56.92

Anything larger showed 93% of packets with checksum errors (as I've been reporting in this thread).

cheers,
Hiren

From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 07:06:36 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6564D9B6 for ; Mon, 3 Feb 2014 07:06:36 +0000 (UTC) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.116.12]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id DAB18159D for ; Mon, 3 Feb 2014 07:06:35 +0000 (UTC) Received: from th-04.cs.huji.ac.il ([132.65.80.125]) by kabab.cs.huji.ac.il with esmtp id 1WADbw-000PvC-NU; Mon, 03 Feb 2014 09:06:28 +0200 Mime-Version: 1.0 (Mac OS X Mail 7.1 \(1827\)) Subject: Re: Terrible NFS performance under 9.2-RELEASE?
From: Daniel Braniss In-Reply-To: <906704123.1485103.1391357730899.JavaMail.root@uoguelph.ca> Date: Mon, 3 Feb 2014 09:04:24 +0200 Message-Id: <4AA2405B-8C52-49E1-AC33-F92762156152@cs.huji.ac.il> References: <906704123.1485103.1391357730899.JavaMail.root@uoguelph.ca> To: Rick Macklem X-Mailer: Apple Mail (2.1827) Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.17 Cc: Pyun YongHyeon , FreeBSD Net , Adam McDougall , Jack Vogel X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 07:06:36 -0000

On Feb 2, 2014, at 6:15 PM, Rick Macklem wrote:

> Daniel Braniss wrote:
>> hi Rick, et.all.
>>
>> tried your patch but it didn’t help,the server is stuck.
> Oh well. I was hoping that was going to make TSO work reliably.
> Just to comfirm it, this server works reliably when TSO is disabled?
>
absolutely, with TSO disabled there is no problem, and it’s slightly faster, the host is ‘server class’, a PC might be different.

cheers,
danny

> Thanks for doing the testing, rick
>
>> just for fun, I tried a different client/host, this one has a
>> broadcom NextXtreme II that was
>> MFC’ed lately, and the results are worse than the Intel (5hs instead
>> of 4hs) but faster without TSO
>>
>> with TSO enabled and bs=32k:
>> 5.09hs 18325.62 real 1109.23 user 4591.60 sys
>>
>> without TSO:
>> 4.75hs 17120.40 real 1114.08 user 3537.61 sys
>>
>> So what is the advantage of using TSO? (no complain here, just
>> curious)
>>
>> I’ll try to see if as a server it has the same TSO related issues.
>>
>> cheers,
>> danny
>>
>> On Jan 28, 2014, at 3:51 AM, Rick Macklem
>> wrote:
>>
>>> Jack Vogel wrote:
>>>> That header file is for the VF driver :) which I don't believe is
>>>> being
>>>> used in this case.
>>>> The driver is capable of handling 256K but its limited by the
>>>> stack
>>>> to 64K
>>>> (look in
>>>> ixgbe.h), so its not a few bytes off due to the vlan header.
>>>>
>>>> The scatter size is not an arbitrary one, its due to hardware
>>>> limitations
>>>> in Niantic
>>>> (82599). Turning off TSO in the 10G environment is not practical,
>>>> you will
>>>> have
>>>> trouble getting good performance.
>>>>
>>>> Jack
>>>>
>>> Well, if you look at this thread, Daniel got much better
>>> performance
>>> by turning off TSO. However, I agree that this is not an ideal
>>> solution.
>>> http://docs.FreeBSD.org/cgi/mid.cgi?2C287272-7B57-4AAD-B22F-6A65D9F8677B
>>>
>>> rick
>>>
>>>>
>>>>
>>>> On Mon, Jan 27, 2014 at 4:58 PM, Yonghyeon PYUN
>>>> wrote:
>>>>
>>>>> On Mon, Jan 27, 2014 at 06:27:19PM -0500, Rick Macklem wrote:
>>>>>> pyunyh@gmail.com wrote:
>>>>>>> On Sun, Jan 26, 2014 at 09:16:54PM -0500, Rick Macklem wrote:
>>>>>>>> Adam McDougall wrote:
>>>>>>>>> Also try rsize=32768,wsize=32768 in your mount options,
>>>>>>>>> made a
>>>>>>>>> huge
>>>>>>>>> difference for me. I've noticed slow file transfers on NFS
>>>>>>>>> in 9
>>>>>>>>> and
>>>>>>>>> finally did some searching a couple months ago, someone
>>>>>>>>> suggested
>>>>>>>>> it
>>>>>>>>> and
>>>>>>>>> they were on to something.
>>>>>>>>>
>>>>>>>> I have a "hunch" that might explain why 64K NFS reads/writes
>>>>>>>> perform
>>>>>>>> poorly for some network environments.
>>>>>>>> A 64K NFS read reply/write request consists of a list of 34
>>>>>>>> mbufs
>>>>>>>> when
>>>>>>>> passed to TCP via sosend() and a total data length of around
>>>>>>>> 65680bytes.
>>>>>>>> Looking at a couple of drivers (virtio and ixgbe), they seem
>>>>>>>> to
>>>>>>>> expect
>>>>>>>> no more than 32-33 mbufs in a list for a 65535 byte TSO xmit.
>>>>>>>> I
>>>>>>>> think
>>>>>>>> (I don't have anything that does TSO to confirm this) that
>>>>>>>> NFS will
>>>>>>>> pass
>>>>>>>> a list that is longer (34 plus a TCP/IP header).
>>>>>>>> At a glance, it appears that the drivers call m_defrag() or
>>>>>>>> m_collapse()
>>>>>>>> when the mbuf list won't fit in their scatter table (32 or 33
>>>>>>>> elements)
>>>>>>>> and if this fails, just silently drop the data without
>>>>>>>> sending it.
>>>>>>>> If I'm right, there would considerable overhead from
>>>>>>>> m_defrag()/m_collapse()
>>>>>>>> and near disaster if they fail to fix the problem and the
>>>>>>>> data is
>>>>>>>> silently
>>>>>>>> dropped instead of xmited.
>>>>>>>>
>>>>>>>
>>>>>>> I think the actual number of DMA segments allocated for the
>>>>>>> mbuf
>>>>>>> chain is determined by bus_dma(9). bus_dma(9) will coalesce
>>>>>>> current segment with previous segment if possible.
>>>>>>>
>>>>>> Ok, I'll have to take a look, but I thought that an array of
>>>>>> sized
>>>>>> by "num_segs" is passed in as an argument. (And num_segs is set
>>>>>> to
>>>>>> either IXGBE_82598_SCATTER (100) or IXGBE_82599_SCATTER (32).)
>>>>>> It looked to me that the ixgbe driver called itself ix, so it
>>>>>> isn't
>>>>>> obvious to me which we are talking about. (I know that Daniel
>>>>>> Braniss
>>>>>> had an ix0 and ix1, which were fixed for NFS by disabling TSO.)
>>>>>>
>>>>>
>>>>> It's ix(4). ixbge(4) is a different driver.
>>>>>
>>>>>> I'll admit I mostly looked at virtio's network driver, since
>>>>>> that
>>>>>> was the one being used by J David.
>>>>>>
>>>>>> Problems w.r.t. TSO enabled for NFS using 64K rsize/wsize have
>>>>>> been
>>>>>> cropping up for quite a while, and I am just trying to find out
>>>>>> why.
>>>>>> (I have no hardware/software that exhibits the problem, so I can
>>>>>> only look at the sources and ask others to try testing stuff.)
>>>>>>
>>>>>>> I'm not sure whether you're referring to ixgbe(4) or ix(4) but
>>>>>>> I
>>>>>>> see the total length of all segment size of ix(4) is 65535 so
>>>>>>> it has no room for ethernet/VLAN header of the mbuf chain. The
>>>>>>> driver should be fixed to transmit a 64KB datagram.
>>>>>> Well, if_hw_tsomax is set to 65535 by the generic code (the
>>>>>> driver
>>>>>> doesn't set it) and the code in tcp_output() seems to subtract
>>>>>> the
>>>>>> size of an tcp/ip header from that before passing data to the
>>>>>> driver,
>>>>>> so I think the mbuf chain passed to the driver will fit in one
>>>>>> ip datagram. (I'd assume all sorts of stuff would break for TSO
>>>>>> enabled drivers if that wasn't the case?)
>>>>>
>>>>> I believe the generic code is doing right. I'm under the
>>>>> impression the non-working TSO indicates a bug in driver. Some
>>>>> drivers didn't account for additional ethernet/VLAN header so the
>>>>> total size of DMA segments exceeded 65535. I've attached a diff
>>>>> for ix(4). It wasn't tested at all as I don't have hardware to
>>>>> test.
>>>>>
>>>>>>
>>>>>>> I think the use of m_defrag(9) in TSO is suboptimal. All TSO
>>>>>>> capable controllers are able to handle multiple TX buffers so
>>>>>>> it
>>>>>>> should have used m_collapse(9) rather than copying entire chain
>>>>>>> with m_defrag(9).
>>>>>>>
>>>>>> I haven't looked at these closely yet (plan on doing so to-day),
>>>>>> but
>>>>>> even m_collapse() looked like it copied data between mbufs and
>>>>>> that
>>>>>> is certainly suboptimal, imho. I don't see why a driver can't
>>>>>> split
>>>>>> the mbuf list, if there are too many entries for the
>>>>>> scatter/gather
>>>>>> and do it in two iterations (much like tcp_output() does
>>>>>> already,
>>>>>> since the data length exceeds 65535 - tcp/ip header size).
>>>>>>
>>>>>
>>>>> It can split the mbuf list if controllers supports increased
>>>>> number
>>>>> of TX buffers. Because controller shall consume the same number
>>>>> of
>>>>> DMA descriptors for the mbuf list, drivers tend to impose a limit
>>>>> on the number of TX buffers to save resources.
>>>>>
>>>>>> However, at this point, I just want to find out if the long
>>>>>> chain
>>>>>> of mbufs is why TSO is problematic for these drivers, since I'll
>>>>>> admit I'm getting tired of telling people to disable TSO (and I
>>>>>> suspect some don't believe me and never try it).
>>>>>>
>>>>>
>>>>> TSO capable controllers tend to have various limitations(the
>>>>> first
>>>>> TX buffer should have complete ethernet/IP/TCP header, ip_len of
>>>>> IP
>>>>> header should be reset to 0, TCP pseudo checksum should be
>>>>> recomputed etc) and cheap controllers need more assistance from
>>>>> driver to let its firmware know various IP/TCP header offset
>>>>> location in the mbuf. Because this requires a IP/TCP header
>>>>> parsing, it's error prone and very complex.
>>>>>
>>>>>>>> Anyhow, I have attached a patch that makes NFS use
>>>>>>>> MJUMPAGESIZE
>>>>>>>> clusters,
>>>>>>>> so the mbuf count drops from 34 to 18.
>>>>>>>>
>>>>>>>
>>>>>>> Could we make it conditional on size?
>>>>>>>
>>>>>> Not sure what you mean? If you mean "the size of the
>>>>>> read/write",
>>>>>> that would be possible for NFSv3, but less so for NFSv4. (The
>>>>>> read/write
>>>>>> is just one Op. in the compound for NFSv4 and there is no way to
>>>>>> predict how much more data is going to be generated by
>>>>>> subsequent
>>>>>> Ops.)
>>>>>>
>>>>>
>>>>> Sorry, I should have been more clearer. You already answered my
>>>>> question. Thanks.
>>>>>
>>>>>> If by "size" you mean amount of memory in the machine then, yes,
>>>>>> it
>>>>>> certainly could be conditional on that. (I plan to try and look
>>>>>> at
>>>>>> the allocator to-day as well, but if others know of
>>>>>> disadvantages
>>>>>> with
>>>>>> using MJUMPAGESIZE instead of MCLBYTES, please speak up.)
>>>>>>
>>>>>> Garrett Wollman already alluded to the MCLBYTES case being
>>>>>> pre-allocated,
>>>>>> but I'll admit I have no idea what the implications of that are
>>>>>> at this
>>>>>> time.
>>>>>>
>>>>>>>> If anyone has a TSO scatter/gather enabled net interface and
>>>>>>>> can
>>>>>>>> test this
>>>>>>>> patch on it with NFS I/O (default of 64K rsize/wsize) when
>>>>>>>> TSO is
>>>>>>>> enabled
>>>>>>>> and see what effect it has, that would be appreciated.
>>>>>>>>
>>>>>>>> Btw, thanks go to Garrett Wollman for suggesting the change
>>>>>>>> to
>>>>>>>> MJUMPAGESIZE
>>>>>>>> clusters.
>>>>>>>>
>>>>>>>> rick
>>>>>>>> ps: If the attachment doesn't make it through and you want
>>>>>>>> the
>>>>>>>> patch, just
>>>>>>>> email me and I'll send you a copy.
>>>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> freebsd-net@freebsd.org mailing list
>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>>> To unsubscribe, send any mail to
>>>>> "freebsd-net-unsubscribe@freebsd.org"
>>>>>
>>>> _______________________________________________
>>>> freebsd-net@freebsd.org mailing list
>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>> To unsubscribe, send any mail to
>>>> "freebsd-net-unsubscribe@freebsd.org"
>>
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to
>> "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 07:25:03 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by
hub.freebsd.org (Postfix) with ESMTPS id EBA1DBE1; Mon, 3 Feb 2014 07:25:03 +0000 (UTC) Received: from mail-ee0-x233.google.com (mail-ee0-x233.google.com [IPv6:2a00:1450:4013:c00::233]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 5DED816B6; Mon, 3 Feb 2014 07:25:03 +0000 (UTC) Received: by mail-ee0-f51.google.com with SMTP id b57so3461711eek.10 for ; Sun, 02 Feb 2014 23:25:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=SDb/XulB9WCVRuWy4y51DrK8TGTA+OOOuy9B3XtMIgM=; b=sShcawe+quWD6RH9UjAIwFxwUvdlcpqPufteDBtPS3jzTluBOH98wqOl5KLt1plxPs 74eCL63i5Ulk7CPfjbyHmgcENnIiKlVTsDnq7aVJaSk09FMhfRbrau3fgwCKnrO5aDfO VAzGX4B/ajPly2cYn4oaFVGUsgtGad3yjxJettQ9j2NHwuzdH5VERGalcTkJJO8o4N6h rl15KbsdzbIQlQbkk6Tgi4i77Qcq1Br4MPiDJN23j42kK/XO+8dUEchDrEQFq1Y5r8Hl L9BKv4bHSHP0OdzEX8Ai0DkTd5UYC4bZTBKHYeQCJ8T51Un8PhUmDUQ88xQ13D/2wO3/ BIaw== MIME-Version: 1.0 X-Received: by 10.14.39.3 with SMTP id c3mr6654199eeb.42.1391412300906; Sun, 02 Feb 2014 23:25:00 -0800 (PST) Received: by 10.14.65.4 with HTTP; Sun, 2 Feb 2014 23:25:00 -0800 (PST) In-Reply-To: References: Date: Sun, 2 Feb 2014 23:25:00 -0800 Message-ID: Subject: Re: Errors using span interface on if_bridge(4) From: hiren panchasara To: "freebsd-net@freebsd.org" , Jack F Vogel Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 07:25:04 -0000

Alright. I am having a "mind blown" moment right now.

While reporting this checksum error issue, I always had ix3 (the culprit interface) being monitored via tcpdump in another tmux session.
Something got into me and I stopped monitoring it, and everything was kosher after that. Not a single checksum error!

Just to be sure, I kldunloaded/loaded if_ixgbe (as I've built that as a module) to reset all the stat counters and ran my tests again using both iperf3 and netperf, and saw the same behavior.

If I start tcpdump on the interface again, the dev.ix.3.mac_stats.checksum_errs counter starts going up. Stopping tcpdump stops the counter from incrementing.

Why is that happening? I have no clue. How/why is tcpdump affecting this interface's traffic stats in such a way?

Just as a recap, bridge0 has ix1 as a member and ix2 as a span interface. ix2 and ix3 are connected back to back so that I can monitor traffic coming to bridge0 on ix3 (as ix2 will forward it all, being a span interface).

cheers,
Hiren

From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 08:18:24 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CD7BEDEC for ; Mon, 3 Feb 2014 08:18:24 +0000 (UTC) Received: from mail.niessen.ch (btx02.niessen.ch [85.10.192.239]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 842001A59 for ; Mon, 3 Feb 2014 08:18:24 +0000 (UTC) Received: from mail.niessen.ch (mail.niessen.ch [127.0.10.3]) by mail.niessen.ch (Postfix) with ESMTP id D484F1029DA for ; Mon, 3 Feb 2014 09:18:15 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=niessen.ch; h=message-id :date:from:mime-version:to:subject:content-type :content-transfer-encoding; s=dkim-2012; bh=YPkmEep+WMMU6Nn/a0Tq uansoLf4AqK5hFG4QKBKX54=; b=LPJuHJwoYnWh5SVYZqOYkHV2QElFxjYjWqFZ gYjmvwbGwLhlnQ4sxC0T+BCAcntXoFTkaRFbOVnxzA6DQ4jYmkAF0qksnMF/bBs7 CBv4z0PgKo1h6ZYmRnNoIxIQhr2V2bQHIzl38qzvjiN9EShVsaSyFuev8oTYrKI0 suJmxts= DomainKey-Signature:
a=rsa-sha1; c=nofws; d=niessen.ch; h=message-id :date:from:mime-version:to:subject:content-type :content-transfer-encoding; q=dns; s=dkim-2012; b=TE7FzH8pVChVwZ M6w5fcB/B0bga+TZB0Z9gckhlJpdCWO1KSt2YnxEC1m7jZGqQMUemvrluWG6ezrT TTIk1syewqTju0+sYFW8VvtBYZyB/3sJfApb+F7iGewYA0lPlXgBh8TFint3eX8y +HchpBI9HtMiVcjYqxOLxofMTyngI= Received: from [172.20.10.3] (unknown [178.197.236.128]) by mail.niessen.ch (Postfix) with ESMTPSA id 059C11029D9 for ; Mon, 3 Feb 2014 09:18:14 +0100 (CET) Message-ID: <52EF50A7.1050205@niessen.ch> Date: Mon, 03 Feb 2014 09:17:43 +0100 From: Ben User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 08:18:24 -0000 Hi, I upgraded from FreeBSD 9.2-RELEASE to 10.0-RELEASE. FreeBSD 9.2 was configured to use LACP with two igb devices. Now it stopped working after the upgrade. This is a screenshot of ifconfig -a after the upgrade to FreeBSD 10.0-RELEASE: http://tinypic.com/view.php?pic=28jvgpw&s=5#.Uu9PXT1dVPM A PR is currently open: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/185967 It is set to low, but I would like somebody to have a look into it as it obviously has a great influence on our infrastructure. The only way to "solve" it is currently switching back to FreeBSD 9.2. The suggested fix "use failover" seems not to work. Thank you for your help. 
Best regards Ben From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 08:31:39 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 05CB5136 for ; Mon, 3 Feb 2014 08:31:39 +0000 (UTC) Received: from nm3-vm2.bullet.mail.ne1.yahoo.com (nm3-vm2.bullet.mail.ne1.yahoo.com [98.138.91.19]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id BEEDF1BAA for ; Mon, 3 Feb 2014 08:31:38 +0000 (UTC) Received: from [98.138.101.129] by nm3.bullet.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 08:31:31 -0000 Received: from [98.138.84.46] by tm17.bullet.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 08:31:31 -0000 Received: from [127.0.0.1] by smtp114.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 08:31:31 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1391416291; bh=+XiT4AqJKrKWkmrqt3h8Afl+oki6weS7AMeFW7nEc6Y=; h=X-Yahoo-Newman-Id:X-Yahoo-Newman-Property:X-YMail-OSG:X-Yahoo-SMTP:X-Rocket-Received:Content-Type:Mime-Version:Subject:From:In-Reply-To:Date:Cc:Content-Transfer-Encoding:Message-Id:References:To:X-Mailer; b=nLJWWjR25CUIzXwNbPHRIYon8ied/nHGaGTJI6CoLfmtWON85rHrp6Z8O75KYcDN4YZgYyOD97nGoCfGg53y09tWSctAj+ZekKaFuu5HNb3YFwRFL+DCK4RVamioy42lP3XQ2TOZMFFBthTDunRZCBuknaM+kpj0pU80izUAEEk= X-Yahoo-Newman-Id: 227577.3780.bm@smtp114.mail.ne1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: g1JRfewVM1lDXkhxEjBmYY_Mkv47uAmBEtw0dc3WAMQMF7_ YA076GWk5gaQDeTw0rdFPBohCWg0_6FenYZZ8p4ALgBU9GWdoxI0L8OfWzAR YZXJLNaOzfsukR37b.FG1JV_ATWRbboR23EbwcGdhyG4mXlVBZuaraUH.3G0 Yf9L4AIxixelRYpqfYOSMXFpe0ncnThFvl9yvhNDgoolhEY7Gq0M9o3llzCN l0gN8a5M4CH5krZ8sItAhhTkIHAZKxMdsFPAhVq5g0lc93hkkMSVA.8FX9Lr SX73.FKvRRAWKIMk0ttsUNILV7BMbXNXNgUVPRrtXXHpmOyssjmwgYE5i_f6 
P9f4.Ysw_RYmJSRfounTOwExZfK1grBl2baZIWFGpf_2w9pGtysbt7CCjWov qqVCI74jWfr1JtxqF2EzDl9vduo9urRlGigdaKDlrH065B5OFcioN0Qz2iRe ClrSyd5AYGXfTf4JLgbM58pWJe1xQmU.36DHLn6AKJNHqd.69bpAPGQVeqL5 7RX4nuCdYP7uJplBm0UfvH4xuSinhjhBStjWrLVnNB.TOcGfMGXRH.3EYcbX c8YnxspxuDAGYZ2pMO78pNUJnNIu1HZVvsdQyx5lJyiVnPKfGNpC6fNwTiuE XheUmz8MhW2cR13ZuzkYqQ_YWbTUjRYQcoWizMuok X-Yahoo-SMTP: clhABp.swBB7fs.LwIJpv3jkWgo2NU8- X-Rocket-Received: from phobos.samsco.home (scott4long@168.103.85.57 with plain [63.250.193.228]) by smtp114.mail.ne1.yahoo.com with SMTP; 03 Feb 2014 00:31:31 -0800 PST Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 7.1 \(1827\)) Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 From: Scott Long In-Reply-To: <52EF50A7.1050205@niessen.ch> Date: Mon, 3 Feb 2014 01:31:29 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> References: <52EF50A7.1050205@niessen.ch> To: Ben X-Mailer: Apple Mail (2.1827) Cc: FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 08:31:39 -0000 Hi, You’re probably running into the consequences of r253687. Check to see the value of ‘sysctl net.link.lagg.0.lacp.lacp_strict_mode’. If it’s ‘1’ then set it to 0. My original intention was for this to default to 0, but apparently that didn’t happen. However, the fact that strict mode doesn’t seem to work at all for you might hint that your switch either isn’t configured correctly for LACP, or doesn’t actually support LACP at all. You might want to investigate that. Scott On Feb 3, 2014, at 1:17 AM, Ben wrote: > Hi, > > I upgraded from FreeBSD 9.2-RELEASE to 10.0-RELEASE. FreeBSD 9.2 was configured to use LACP with two igb devices.
> > Now it stopped working after the upgrade. > > This is a screenshot of ifconfig -a after the upgrade to FreeBSD 10.0-RELEASE: http://tinypic.com/view.php?pic=28jvgpw&s=5#.Uu9PXT1dVPM > > A PR is currently open: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/185967 > > It is set to low, but I would like somebody to have a look into it as it obviously has a great influence on our infrastructure. The only way to "solve" it is currently switching back to FreeBSD 9.2. > > The suggested fix "use failover" seems not to work. > > Thank you for your help. > > Best regards > Ben > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 08:40:55 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id ACC8634C for ; Mon, 3 Feb 2014 08:40:55 +0000 (UTC) Received: from mail.niessen.ch (btx02.niessen.ch [85.10.192.239]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 34D701C41 for ; Mon, 3 Feb 2014 08:40:54 +0000 (UTC) Received: from mail.niessen.ch (mail.niessen.ch [127.0.10.3]) by mail.niessen.ch (Postfix) with ESMTP id 296EF102AA4 for ; Mon, 3 Feb 2014 09:40:53 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=niessen.ch; h=message-id :date:from:mime-version:to:subject:references:in-reply-to :content-type:content-transfer-encoding; s=dkim-2012; bh=qqTLKiA hbwR5Hgm18Av6Xin2egFXy5t9hKafm2hS/UE=; b=DvW1RZkuRrE85/SRP2wSULa 4Ci4DvspkrK42X2MdlCQ4n5Ys6eqbcd7F/ec8fKgIL7384C9oRtPTfbZCRQCDjjA
+7r3k3LwsebZ79m5n5lNXS3RizhoiKNc9351iTeTIMuXlqWudnWawjYOORxq7PVG a1YeyFXaK5xnIEJExBBw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=niessen.ch; h=message-id :date:from:mime-version:to:subject:references:in-reply-to :content-type:content-transfer-encoding; q=dns; s=dkim-2012; b=f Dch9NVZ6Rb+ENG6DN615GM9/mrESO3Bpy1yYyNqzzehkvYaUp0LytqLCHzUPLv7q +Y5SPKhis4IYgTjNNCyCN6/9vz1Zj6u9KS9CpflYF1Y/B+lOZL44cx0Pb9aXDejW xIwnlsMLRaXPsbtI1HQWC/2a+TGbkIorh+6h0N/SB0= Received: from [172.20.10.3] (unknown [178.197.236.128]) by mail.niessen.ch (Postfix) with ESMTPSA id F313D102AA3 for ; Mon, 3 Feb 2014 09:40:52 +0100 (CET) Message-ID: <52EF55FE.8030901@niessen.ch> Date: Mon, 03 Feb 2014 09:40:30 +0100 From: Ben User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 References: <52EF50A7.1050205@niessen.ch> <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> In-Reply-To: <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 08:40:55 -0000 Hi Scott, I had tried to set it in /etc/sysctl.conf but it seems it didn't work. But I will try again and report back. The settings of the switch have not been changed and are set to LACP. It worked before so I guess the switch should not be the problem. Maybe some incompatibility between FreeBSD + igb-driver + switch (Juniper EX3300-48T). I will update you after setting the sysctl setting. It seems to be "dynamic", I guess 0 reflects the index of LACP lagg devices. Can I switch off the strict mode globally in /etc/sysctl.conf? Thanks for your help.
Regards Ben On 03.02.2014 09:31, Scott Long wrote: > Hi, > > You’re probably running into the consequences of r253687. Check to see the value of ‘sysctl net.link.lagg.0.lacp.lacp_strict_mode’. If it’s ‘1’ then set it to 0. My original intention was for this to default to 0, but apparently that didn’t happen. However, the fact that strict mode doesn’t seem to work at all for you might hint that your switch either isn’t configured correctly for LACP, or doesn’t actually support LACP at all. You might want to investigate that. > > Scott > > On Feb 3, 2014, at 1:17 AM, Ben wrote: > >> Hi, >> >> I upgraded from FreeBSD 9.2-RELEASE to 10.0-RELEASE. FreeBSD 9.2 was configured to use LACP with two igb devices. >> >> Now it stopped working after the upgrade. >> >> This is a screenshot of ifconfig -a after the upgrade to FreeBSD 10.0-RELEASE: http://tinypic.com/view.php?pic=28jvgpw&s=5#.Uu9PXT1dVPM >> >> A PR is currently open: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/185967 >> >> It is set to low, but I would like somebody to have a look into it as it obviously has a great influence on our infrastructure. The only way to "solve" it is currently switching back to FreeBSD 9.2. >> >> The suggested fix "use failover" seems not to work. >> >> Thank you for your help. >> >> Best regards >> Ben >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > !DSPAM:1,52ef53f7888821422342440!
> > From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 09:06:20 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 41CE479F for ; Mon, 3 Feb 2014 09:06:20 +0000 (UTC) Received: from nm11-vm6.bullet.mail.ne1.yahoo.com (nm11-vm6.bullet.mail.ne1.yahoo.com [98.138.91.104]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id EC2621DC0 for ; Mon, 3 Feb 2014 09:06:18 +0000 (UTC) Received: from [98.138.100.115] by nm11.bullet.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 09:03:35 -0000 Received: from [98.138.226.127] by tm106.bullet.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 09:03:35 -0000 Received: from [127.0.0.1] by smtp206.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 09:03:35 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1391418215; bh=Mw1qqMv75/tBoqcWWCEunxxuh/M/0bcBydojGgM8XEA=; h=X-Yahoo-Newman-Id:X-Yahoo-Newman-Property:X-YMail-OSG:X-Yahoo-SMTP:X-Rocket-Received:Content-Type:Mime-Version:Subject:From:In-Reply-To:Date:Cc:Content-Transfer-Encoding:Message-Id:References:To:X-Mailer; b=2Ip4jLuBfGyQKSb+KPhiaPE8yF88xNE3oKdw6QxVrY89iKN12Rur3i4gWGnO8IVCt4PLlL5SKOEorwEZgsyMtzpbXwxERV1gF7U+dNnqq8UJqGR7xfOp0JT6fiCsGqA49b+tjXSlr78Djtj/dvx17w7AygVYoK98nXKjq6tIPP4= X-Yahoo-Newman-Id: 653174.39248.bm@smtp206.mail.ne1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: 5Zv9riYVM1nu37_CJTlWikfFL1Pwrx1IVoVltUapCf4JFMi jy1f6Vr2RgZu7TbUeIsVgPe3ywBfZ1s4ZxZCF60e63pVbciJhhKCNJ6AJ01R xCKVh0k7G.BKIQAVg51GnsJyjyX6n1_Zm3S8OmeuvJO6ppYrV7dsxBUZ4QwY LtMA.5mW5yPxYgOwfu0QbSiV_dLXXYbcTMS5rOVTsHfn_uhjOY4DMuNoBumu ..0_vXWbft8_IzTGlyPZk0TcZ3u4KgWLvQ_58FMN3mbuW6lacD923vi1xd7. 
2EJ5wI1pLnTOpGF_F2DbESxQEHT1l3Dsj99DxQCtikqqARMgcWgY_L30L_iV 2RoPCLmwYnj_t8yEQOv9d9oZSQHdLy4X1AdaThIJoJJTJHQB9z19AypOgdth St.1vfRc0qSGuDdW3mSJKSwIYQzopOUdGEYkwzhgIcOd.sNPuTIbrtwUbqsM 4fpzLbyvz1tfz.96XH1YhIvOnPxdlFevWxKEyQWjw7sg40w2da0ZwnCmfapY 6JO04pS5Uscip.JQNtu1WNpuvfGdlGbWg5bEFwwbUqWFSua8eZfYgP291uYe xyjrosAKxJM818JD2tnOpRtOl6iI5nQmsFTb_EQhM4y05.iX2t6JnjAMov_7 1ZM7a00RFOPSHEzECajJDOMox3btnlDqjVVJjcd_w_DfvfWRVxEg- X-Yahoo-SMTP: clhABp.swBB7fs.LwIJpv3jkWgo2NU8- X-Rocket-Received: from [10.64.24.117] (scott4long@69.53.236.251 with plain [63.250.193.228]) by smtp206.mail.ne1.yahoo.com with SMTP; 03 Feb 2014 01:03:35 -0800 PST Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 7.1 \(1827\)) Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 From: Scott Long In-Reply-To: <52EF55FE.8030901@niessen.ch> Date: Mon, 3 Feb 2014 02:03:33 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: <1798FE17-5718-4125-8B00-1B00DC44B828@yahoo.com> References: <52EF50A7.1050205@niessen.ch> <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> <52EF55FE.8030901@niessen.ch> To: Ben X-Mailer: Apple Mail (2.1827) Cc: FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 09:06:20 -0000 Hi, Unfortunately, you can’t control the strict mode globally. My apologies for this mess, I’ll make sure that it’s fixed for FreeBSD 10.1. If the sysctl doesn’t help then maybe consider compiling a custom kernel with it defaulted to 0. You’ll need to open /sys/net/ieee8023ad_lacp.c and look for the function lacp_attach(). You’ll see the strict_mode assign underneath that. I’ll also send you a patch in a few minutes. Until then, try enabling net.link.lagg.lacp.debug=1 and see if you’re receiving heartbeat PDU’s from your switch.
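The checks described above can be sketched as a small sh helper. This is only a sketch under assumptions: the aggregate is taken to be lagg0 (hence the 0 in the sysctl name), the debug output is assumed to reach the kernel message buffer read by dmesg, and the function name lacpdu_count is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): count LACPDU debug lines for
# one direction ("transmit" or "receive") in the text read from stdin.
lacpdu_count() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps that case from aborting callers that use "set -e".
    grep -c "lacpdu $1" || true
}

# On the FreeBSD host itself (shown as comments, not executed here):
#   sysctl net.link.lagg.0.lacp.lacp_strict_mode      # inspect the current value
#   sysctl net.link.lagg.0.lacp.lacp_strict_mode=0    # relax strict mode on lagg0
#   sysctl net.link.lagg.lacp.debug=1                 # log LACPDU traffic
#   dmesg | lacpdu_count transmit
#   dmesg | lacpdu_count receive
```

A non-zero transmit count next to a receive count of 0 would suggest that the host is sending LACPDUs but hearing none back from the switch.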
Scott On Feb 3, 2014, at 1:40 AM, Ben wrote: > Hi Scott, > > I had tried to set it in /etc/sysctl.conf but it seems it didn't work. But I will try again and report back. > > The settings of the switch have not been changed and are set to LACP. It worked before so I guess the switch should not be the problem. Maybe some incompatibility between FreeBSD + igb-driver + switch (Juniper EX3300-48T). > > I will update you after setting the sysctl setting. It seems to be "dynamic", I guess 0 reflects the index of LACP lagg devices. Can I switch off the strict mode globally in /etc/sysctl.conf? > > Thanks for your help. > > Regards > Ben > > On 03.02.2014 09:31, Scott Long wrote: >> Hi, >> >> You’re probably running into the consequences of r253687. Check to see the value of ‘sysctl net.link.lagg.0.lacp.lacp_strict_mode’. If it’s ‘1’ then set it to 0. My original intention was for this to default to 0, but apparently that didn’t happen. However, the fact that strict mode doesn’t seem to work at all for you might hint that your switch either isn’t configured correctly for LACP, or doesn’t actually support LACP at all. You might want to investigate that. >> >> Scott >> >> On Feb 3, 2014, at 1:17 AM, Ben wrote: >> >>> Hi, >>> >>> I upgraded from FreeBSD 9.2-RELEASE to 10.0-RELEASE. FreeBSD 9.2 was configured to use LACP with two igb devices. >>> >>> Now it stopped working after the upgrade. >>> >>> This is a screenshot of ifconfig -a after the upgrade to FreeBSD 10.0-RELEASE: http://tinypic.com/view.php?pic=28jvgpw&s=5#.Uu9PXT1dVPM >>> >>> A PR is currently open: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/185967 >>> >>> It is set to low, but I would like somebody to have a look into it as it obviously has a great influence on our infrastructure. The only way to "solve" it is currently switching back to FreeBSD 9.2.
>>> >>> The suggested fix "use failover" seems not to work. >>> >>> Thank you for your help. >>> >>> Best regards >>> Ben >>> _______________________________________________ >>> freebsd-net@freebsd.org mailing list >>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> >> !DSPAM:1,52ef53f7888821422342440! >> >> > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 09:11:19 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 21150924 for ; Mon, 3 Feb 2014 09:11:19 +0000 (UTC) Received: from mail.niessen.ch (btx02.niessen.ch [85.10.192.239]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 9B7451E4B for ; Mon, 3 Feb 2014 09:11:18 +0000 (UTC) Received: from mail.niessen.ch (mail.niessen.ch [127.0.10.3]) by mail.niessen.ch (Postfix) with ESMTP id EB60C102B85 for ; Mon, 3 Feb 2014 10:11:16 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=niessen.ch; h=message-id :date:from:mime-version:to:subject:references:in-reply-to :content-type:content-transfer-encoding; s=dkim-2012; bh=/ZRdm4B kZlJH+4/xmadhIigQt3Czmzk8ErruHF0Y7Ok=; b=AE+ZXC+Hgp5WjF7iP7BDm3S 98dlTqMpo8rDI+/xpErOC98KgxCoCLXihpBExaUgEpXwzTdNtJXipLoyMN9qyN2U
R/xQbi/4Jh5cVeboTFv9n/e3E/9RxpnWXAXI2hHVSyk31n4ggRARaUA2oWI1UwRo X1uRIQZ9AaDBUe1BFn0A= DomainKey-Signature: a=rsa-sha1; c=nofws; d=niessen.ch; h=message-id :date:from:mime-version:to:subject:references:in-reply-to :content-type:content-transfer-encoding; q=dns; s=dkim-2012; b=b KsYip8p+jEBCN0D1faTB874PNRUJ5W/ExuhrU1hB5Jc1DEsceSEITSokXkbyoU9N m1BKgk5K46hp+SdCUY0grsWwCIgMSr4BPrRF+6BMhpEZviNhELhh0axSUO9vXdQe FAAjvx17Ka6678JsMAnwL1iv36C/tgHlJzOzo3n2hM= Received: from [172.20.10.3] (unknown [178.197.236.128]) by mail.niessen.ch (Postfix) with ESMTPSA id B31E3102B84 for ; Mon, 3 Feb 2014 10:11:16 +0100 (CET) Message-ID: <52EF5D1E.2000306@niessen.ch> Date: Mon, 03 Feb 2014 10:10:54 +0100 From: Ben User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 References: <52EF50A7.1050205@niessen.ch> <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> <52EF55FE.8030901@niessen.ch> <1798FE17-5718-4125-8B00-1B00DC44B828@yahoo.com> In-Reply-To: <1798FE17-5718-4125-8B00-1B00DC44B828@yahoo.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 09:11:19 -0000 Hi, I set strict mode to 0 but no use. I do receive PDU messages. igb0: lacpdu transmit actor=(...) actor.state=4d partner=(...) partner.state=0 maxdelay=0 Thanks Ben On 03.02.2014 10:03, Scott Long wrote: > Hi, > > Unfortunately, you can’t control the strict mode globally. My apologies for this mess, I’ll make sure that it’s fixed for FreeBSD 10.1. If the sysctl doesn’t help then maybe consider compiling a custom kernel with it defaulted to 0.
You’ll need to open /sys/net/ieee8023ad_lacp.c and look for the function lacp_attach(). You’ll see the strict_mode assign underneath that. I’ll also send you a patch in a few minutes. Until then, try enabling net.link.lagg.lacp.debug=1 and see if you’re receiving heartbeat PDU’s from your switch. > > Scott > > On Feb 3, 2014, at 1:40 AM, Ben wrote: > >> Hi Scott, >> >> I had tried to set it in /etc/sysctl.conf but it seems it didn't work. But I will try again and report back. >> >> The settings of the switch have not been changed and are set to LACP. It worked before so I guess the switch should not be the problem. Maybe some incompatibility between FreeBSD + igb-driver + switch (Juniper EX3300-48T). >> >> I will update you after setting the sysctl setting. It seems to be "dynamic", I guess 0 reflects the index of LACP lagg devices. Can I switch off the strict mode globally in /etc/sysctl.conf? >> >> Thanks for your help. >> >> Regards >> Ben >> >> On 03.02.2014 09:31, Scott Long wrote: >>> Hi, >>> >>> You’re probably running into the consequences of r253687. Check to see the value of ‘sysctl net.link.lagg.0.lacp.lacp_strict_mode’. If it’s ‘1’ then set it to 0. My original intention was for this to default to 0, but apparently that didn’t happen. However, the fact that strict mode doesn’t seem to work at all for you might hint that your switch either isn’t configured correctly for LACP, or doesn’t actually support LACP at all. You might want to investigate that. >>> >>> Scott >>> >>> On Feb 3, 2014, at 1:17 AM, Ben wrote: >>> >>>> Hi, >>>> >>>> I upgraded from FreeBSD 9.2-RELEASE to 10.0-RELEASE. FreeBSD 9.2 was configured to use LACP with two igb devices. >>>> >>>> Now it stopped working after the upgrade.
>>>> >>>> This is a screenshot of ifconfig -a after the upgrade to FreeBSD 10.0-RELEASE: http://tinypic.com/view.php?pic=28jvgpw&s=5#.Uu9PXT1dVPM >>>> >>>> A PR is currently open: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/185967 >>>> >>>> It is set to low, but I would like somebody to have a look into it as it obviously has a great influence on our infrastructure. The only way to "solve" it is currently switching back to FreeBSD 9.2. >>>> >>>> The suggested fix "use failover" seems not to work. >>>> >>>> Thank you for your help. >>>> >>>> Best regards >>>> Ben >>>> _______________________________________________ >>>> freebsd-net@freebsd.org mailing list >>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >>> _______________________________________________ >>> freebsd-net@freebsd.org mailing list >>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >>> >>> >>> >>> >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > !DSPAM:1,52ef5c19888823082815771!
> > From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 09:25:16 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6E7D0FC9 for ; Mon, 3 Feb 2014 09:25:16 +0000 (UTC) Received: from nm13.bullet.mail.ne1.yahoo.com (nm13.bullet.mail.ne1.yahoo.com [98.138.90.76]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 1F42B10DB for ; Mon, 3 Feb 2014 09:25:15 +0000 (UTC) Received: from [98.138.100.117] by nm13.bullet.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 09:25:09 -0000 Received: from [98.138.226.56] by tm108.bullet.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 09:25:09 -0000 Received: from [127.0.0.1] by smtp207.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 09:25:09 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1391419509; bh=8+1u7DVz81MFZYdczfqv2n5Co8wbDIj/G2D2oXy8MoA=; h=X-Yahoo-Newman-Id:X-Yahoo-Newman-Property:X-YMail-OSG:X-Yahoo-SMTP:X-Rocket-Received:Content-Type:Mime-Version:Subject:From:In-Reply-To:Date:Cc:Content-Transfer-Encoding:Message-Id:References:To:X-Mailer; b=1hfzr2rKhjAPhbRdTOFb/ev8cjh0wXknj2QIy1duNwXmeIBA/bb9SIYinSBEoBlcIlzk/x0CET2umTJCvqh+E0g+1hhZcy6QDYEZcdOm2IMRPvV9RogVNMB+qW+J0d4qLpKrJrsyDNLRxaTNXkExRQUuQo/vAAV4hW17E5NZ2QQ= X-Yahoo-Newman-Id: 616816.82018.bm@smtp207.mail.ne1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: nh1qtLcVM1muDQCx3vnWjqwkdchMlIH_E30pjq1ELr1L5iV o2ssr7ovuZmc0loHstlcAlrDCXdjWMaUkJ3ytNp_g_uTmIYnqlH3QGH5TvlY GzJVpkdgyY3kKc_molBEXjh5tttMT4VVFU9b_ob5fmO5zP7zlnPBgSkZ603X lQOwSxNZBoSfyO80gcot9M2PmBWJ4LvXthi4Zlo3NsIbMpIlrg.8MJLb08ti nJk1W7iJZB5Xw7EEfcrQg5e2Hjxhe06f_f4A28ECXFyomo3rwHMhGaHnUrLa D4AEXEFp2L8D8GPHD3m5CoktkUufqYpn9t9Y04kM1lLaQA7NHM3BlpCytKGD FRyET0lW5vqtZvFaR5SxUipjfxl6bf0eJYLz1GIMiXXtNypqwE8sWD2Tk1sa 
OLsk8h18OS6FxsC8HYRYK.FsabNgkiCSBOoMlBB6x4D.aIlv8ZNXNG9WTmBk tX4KfunxwtR0g1DUgzus1oeKCWl35K.jLPP1cXQfxDTU3UZv0qV62VrU4hGC HBB9U7LqNVTKswXrjSHjudrenmSOtC__GIgtI.GM.FqoLbMP59B1.GhjvdxI 3n9jzs6Jfb8uwfbRzRBubekO.ZY57bkYXmm4Cj1wfCLaXYj4ugVpKpStNcd7 xH0QSW_IIISrXJwTiIwN40KTP_nbLn1Tw19MNqlsPQt273jLUNf4- X-Yahoo-SMTP: clhABp.swBB7fs.LwIJpv3jkWgo2NU8- X-Rocket-Received: from [10.64.24.117] (scott4long@69.53.236.251 with plain [98.138.105.21]) by smtp207.mail.ne1.yahoo.com with SMTP; 03 Feb 2014 01:25:09 -0800 PST Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 7.1 \(1827\)) Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 From: Scott Long In-Reply-To: <52EF5D1E.2000306@niessen.ch> Date: Mon, 3 Feb 2014 02:25:06 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: References: <52EF50A7.1050205@niessen.ch> <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> <52EF55FE.8030901@niessen.ch> <1798FE17-5718-4125-8B00-1B00DC44B828@yahoo.com> <52EF5D1E.2000306@niessen.ch> To: Ben X-Mailer: Apple Mail (2.1827) Cc: FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 09:25:16 -0000 Did you set it to 0 via the sysctl? You might need to wait for several minutes if you set it after setting up the links. Also, the message that you’re seeing is from your machine transmitting PDU packets. Are you seeing any "lacpdu receive" messages on the console? Thanks, Scott On Feb 3, 2014, at 2:10 AM, Ben wrote: > Hi, > > I set strict mode to 0 but no use. I do receive PDU messages. > > igb0: lacpdu transmit > actor=(...) > actor.state=4d > partner=(...)
> partner.state=0 > maxdelay=0 > > Thanks > Ben > > On 03.02.2014 10:03, Scott Long wrote: >> Hi, >> >> Unfortunately, you can’t control the strict mode globally. My apologies for this mess, I’ll make sure that it’s fixed for FreeBSD 10.1. If the sysctl doesn’t help then maybe consider compiling a custom kernel with it defaulted to 0. You’ll need to open /sys/net/ieee8023ad_lacp.c and look for the function lacp_attach(). You’ll see the strict_mode assign underneath that. I’ll also send you a patch in a few minutes. Until then, try enabling net.link.lagg.lacp.debug=1 and see if you’re receiving heartbeat PDU’s from your switch. >> >> Scott >> >> On Feb 3, 2014, at 1:40 AM, Ben wrote: >> >>> Hi Scott, >>> >>> I had tried to set it in /etc/sysctl.conf but it seems it didn't work. But I will try again and report back. >>> >>> The settings of the switch have not been changed and are set to LACP. It worked before so I guess the switch should not be the problem. Maybe some incompatibility between FreeBSD + igb-driver + switch (Juniper EX3300-48T). >>> >>> I will update you after setting the sysctl setting. It seems to be "dynamic", I guess 0 reflects the index of LACP lagg devices. Can I switch off the strict mode globally in /etc/sysctl.conf? >>> >>> Thanks for your help. >>> >>> Regards >>> Ben >>> >>> On 03.02.2014 09:31, Scott Long wrote: >>>> Hi, >>>> >>>> You’re probably running into the consequences of r253687. Check to see the value of ‘sysctl net.link.lagg.0.lacp.lacp_strict_mode’. If it’s ‘1’ then set it to 0. My original intention was for this to default to 0, but apparently that didn’t happen. However, the fact that strict mode doesn’t seem to work at all for you might hint that your switch either isn’t configured correctly for LACP, or doesn’t actually support LACP at all. You might want to investigate that.
>>>> >>>> Scott >>>> >>>> On Feb 3, 2014, at 1:17 AM, Ben wrote: >>>> >>>>> Hi, >>>>> >>>>> I upgraded from FreeBSD 9.2-RELEASE to 10.0-RELEASE. FreeBSD 9.2 was configured to use LACP with two igb devices. >>>>> >>>>> Now it stopped working after the upgrade. >>>>> >>>>> This is a screenshot of ifconfig -a after the upgrade to FreeBSD 10.0-RELEASE: http://tinypic.com/view.php?pic=28jvgpw&s=5#.Uu9PXT1dVPM >>>>> >>>>> A PR is currently open: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/185967 >>>>> >>>>> It is set to low, but I would like somebody to have a look into it as it obviously has a great influence on our infrastructure. The only way to "solve" it is currently switching back to FreeBSD 9.2. >>>>> >>>>> The suggested fix "use failover" seems not to work. >>>>> >>>>> Thank you for your help. >>>>> >>>>> Best regards >>>>> Ben >>>>> _______________________________________________ >>>>> freebsd-net@freebsd.org mailing list >>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >>>> _______________________________________________ >>>> freebsd-net@freebsd.org mailing list >>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >>>> >>>> >>>> >>>> >>> _______________________________________________ >>> freebsd-net@freebsd.org mailing list >>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >> >> !DSPAM:1,52ef5c19888823082815771!
>> >> > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 09:30:37 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7CCCA22B for ; Mon, 3 Feb 2014 09:30:37 +0000 (UTC) Received: from mail.niessen.ch (btx02.niessen.ch [85.10.192.239]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id E730A1184 for ; Mon, 3 Feb 2014 09:30:36 +0000 (UTC) Received: from mail.niessen.ch (mail.niessen.ch [127.0.10.3]) by mail.niessen.ch (Postfix) with ESMTP id 4AD14102BF3; Mon, 3 Feb 2014 10:30:19 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=niessen.ch; h=message-id :date:from:mime-version:to:cc:subject:references:in-reply-to :content-type:content-transfer-encoding; s=dkim-2012; bh=DflFgmy OME2MIPQdyLtOhcJ2MpqzDNt+4lI1uqtlUxA=; b=EiDn69Fsm+O9cQ+N0U75T+C ytxS63m9OpTGv8/4ikGjtLkDlKyYNYZIHrwJsjA+Tpd1/iTD7F4ISx8V8OkG+zOH 2UhTeONsRDUOP8pO9LPyubzEmDoFDmho1PAmepPmnbfByYaKe6y0DD8hYlOM2biB zdv3AgqOE+BvPkZ6EeE0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=niessen.ch; h=message-id :date:from:mime-version:to:cc:subject:references:in-reply-to :content-type:content-transfer-encoding; q=dns; s=dkim-2012; b=p cK9QiHZn1ke+p/jHLgpplrkKHSPWZ1sbDY5XL+xDPOM2P2EKGb26/hgH88lfkUO8 8Z69CmRlk0/HlC7SQF215fyVAYOrsQirPXRjqA8E+g0Z+oIAsqThGLPKgiANgD43 IbMpPOOqz3prK1Kq6zMDg3vs631HDh7O4p/WKyHi5s= Received: from [172.20.10.3] (unknown [178.197.236.128]) by mail.niessen.ch (Postfix) with ESMTPSA id EC3F3102BF2; Mon, 3 Feb 2014 10:30:18 +0100 (CET) Message-ID:
<52EF6194.5060305@niessen.ch> Date: Mon, 03 Feb 2014 10:29:56 +0100 From: Ben User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: Scott Long Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 References: <52EF50A7.1050205@niessen.ch> <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> <52EF55FE.8030901@niessen.ch> <1798FE17-5718-4125-8B00-1B00DC44B828@yahoo.com> <52EF5D1E.2000306@niessen.ch> In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: quoted-printable Cc: FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 09:30:37 -0000

Yes, via sysctl and /etc/sysctl.conf

I waited now roughly 20 minutes without touching it but no difference.

No, I only see these transmit messages, no receive.

Thanks
Ben

On 03.02.2014 10:25, Scott Long wrote:
> Did you set it to 0 via the sysctl? You might need to wait for several minutes if you set it after setting up the links.
>
> Also, the message that you're seeing is from your machine transmitting PDU packets. Are you seeing any "lacpdu receive" messages on the console?
>
> Thanks,
> Scott
>
> On Feb 3, 2014, at 2:10 AM, Ben wrote:
>
>> Hi,
>>
>> I set strict mode to 0 but no use. I do receive PDU messages.
>>
>> igb0: lacpdu transmit
>> actor=(...)
>> actor.state=4d
>> partner=(...)
>> partner.state=0
>> maxdelay=0
>>
>> Thanks
>> Ben
>>
>> On 03.02.2014 10:03, Scott Long wrote:
>>> Hi,
>>>
>>> Unfortunately, you can't control the strict mode globally. My apologies for this mess, I'll make sure that it's fixed for FreeBSD 10.1. If the sysctl doesn't help then maybe consider compiling a custom kernel with it defaulted to 0.
You=92ll need to open /sys/net/ieee802ad_lac= p.c and look for the function lacp_attach(). You=92ll see the strict_mod= e assign underneath that. I=92ll also send you a patch in a few minutes.= Until then, try enabling net.link.lagg.lacp.debug=3D1 and see if you=92= re receiving heartbeat PDU=92s from your switch. >>> >>> Scott >>> >>> On Feb 3, 2014, at 1:40 AM, Ben wrote: >>> >>>> Hi Scott, >>>> >>>> I had tried to set it in /etc/sysctl.conf but seems it didnt work. B= ut will I try again and report back. >>>> >>>> The settings of the switch have not been changed and are set to LACP= . It worked before so I guess the switch should not be the problem. Maybe= some incompatibility between FreeBSD + igb-driver + switch (Juniper EX33= 00-48T). >>>> >>>> I will update you after setting the sysctl setting. It seems to be "= dynamic", I guess 0 reflects the index of LACP lagg devices. Can I switch= off the strict mode globally in /etc/sysctl.conf? >>>> >>>> Thanks for your help. >>>> >>>> Regards >>>> Ben >>>> >>>> On 03.02.2014 09:31, Scott Long wrote: >>>>> Hi, >>>>> >>>>> You=92re probably running into the consequences of r253687. Check = to see the value of =91sysctl net.link.lagg.0.lacp.lacp_strict_mode=92. I= f it=92s =911=92 then set it to 0. My original intention was for this to= default to 0, but apparently that didn=92t happen. However, the fact th= at strict mode doesn=92t seem to work at all for you might hint that your= switch either isn=92t configured correctly for LACP, or doesn=92t actual= ly support LACP at all. You might want to investigate that. >>>>> >>>>> Scott >>>>> >>>>> On Feb 3, 2014, at 1:17 AM, Ben wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I upgraded from FreeBSD 9.2-RELEASE to 10.0-RELEASE. FreeBSD 9.2 w= as configured to use LACP with two igb devices. >>>>>> >>>>>> Now it stopped working after the upgrade. 
>>>>>> >>>>>> This is a screenshot of ifconfig -a after the upgrade to FreeBSD 1= 0.0-RELEASE: http://tinypic.com/view.php?pic=3D28jvgpw&s=3D5#.Uu9PXT1dVPM >>>>>> >>>>>> A PR is currently open: http://www.freebsd.org/cgi/query-pr.cgi?pr= =3Dkern/185967 >>>>>> >>>>>> It is set to low, but I would like somebody to have a look into it= as it obviously has a great influence on our infrastructure. The only wa= y to "solve" it is currently switching back to FreeBSD 9.2. >>>>>> >>>>>> The suggested fix "use failover" seems not to work. >>>>>> >>>>>> Thank you for your help. >>>>>> >>>>>> Best regards >>>>>> Ben >>>>>> _______________________________________________ >>>>>> freebsd-net@freebsd.org mailing list >>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.= org" >>>>> _______________________________________________ >>>>> freebsd-net@freebsd.org mailing list >>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.o= rg" >>>>> >>>>> >>>>> >>>>> >>>> _______________________________________________ >>>> freebsd-net@freebsd.org mailing list >>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.or= g" >>> _______________________________________________ >>> freebsd-net@freebsd.org mailing list >>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org= " >>> >>> >>> >>> >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > !DSPAM:1,52ef6078888821231914487! 
> > From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 09:32:02 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 99E2031A for ; Mon, 3 Feb 2014 09:32:02 +0000 (UTC) Received: from nm8-vm0.bullet.mail.ne1.yahoo.com (nm8-vm0.bullet.mail.ne1.yahoo.com [98.138.91.23]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 4E24F11A9 for ; Mon, 3 Feb 2014 09:32:02 +0000 (UTC) Received: from [98.138.226.180] by nm8.bullet.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 09:31:55 -0000 Received: from [98.138.226.130] by tm15.bullet.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 09:31:55 -0000 Received: from [127.0.0.1] by smtp217.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 09:31:55 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1391419915; bh=1l5rGoC7CjCjYb65Siq8ewOopckYkPYwL6ynRrSvEqE=; h=X-Yahoo-Newman-Id:X-Yahoo-Newman-Property:X-YMail-OSG:X-Yahoo-SMTP:X-Rocket-Received:Content-Type:Mime-Version:Subject:From:In-Reply-To:Date:Cc:Content-Transfer-Encoding:Message-Id:References:To:X-Mailer; b=Nomenv/Y7OnNRfPgg7eTpaMWooy/WM5jww6kG7OVnG5EeAVhNte35VjYZ8fzvtSHAFSfTP+OUI+K917p+N9XeNThJsbkG9CttkWA2NTheFkecmgCXKmbYGKUQ1dF72BEk5Gb2aORlXlF7t1mdTLa5nMNrZEw166H7YCx2FOlgKo= X-Yahoo-Newman-Id: 252341.20794.bm@smtp217.mail.ne1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: gKWNpGkVM1nBSyXJmx0nC1uaogLGt3sHp_1LTbkyUqzD2Xa nLdx6_SshDCgHWoey9c49AGRjhj_NMIuJTQYoazM.nERieHZ53PR.yTylu_y gBZXkBovA1bNygRlXGA6t2WC6CJDk_W0EGUyhsO8.Ttyt7NLFpHu3AS4DnNR xO9pa6ea0n.iq3CCfEtmnda7SnQ8FndYYfLlDdD.AGH7ocYmpvQcpfqcSykL vwrhyB6hoobq76VuXpoKfONInCA4lbGHz03aYvsTDJAx34UbMQdrI_a5i3mM w_m3nEBtvQoQJU5rKueog45dkInlhmD8rvVOPgqO4YNmnu7S1hmYSCCjsA_G 
6PdnmdWFcruFYVu9KViyDjR1yub5z35APEHu_vMap13LrMrbWs4kLcg71GoR ZSsPOlqoSIMeQ.fc7qLG1vyKYA_18IGHoPkYZx3glrdBB2AGyrURYNqHNK3y DJ_VltMcn24pFSPQ2_NrH6..momGr8HsmQqHOOdjERqO.qkyMNoikjN5kVJs lPnlVDKVSvsvTrNr53vi2Y4FhYozEImZ0vAbOmmFmViHynXJ6_kZaWRiK_fD PbTxbf4TMr17sySRCVFrOu66ivTDb7W7G58vUI2C.lEnnW0tYTcWOh7_wviF Kf52JrcmnVY08MDff4QTBsuzCi4punE_VAZLk.PpDjjEWLvtyH4Y- X-Yahoo-SMTP: clhABp.swBB7fs.LwIJpv3jkWgo2NU8- X-Rocket-Received: from [10.64.24.117] (scott4long@69.53.236.251 with plain [98.138.105.21]) by smtp217.mail.ne1.yahoo.com with SMTP; 03 Feb 2014 09:31:55 +0000 UTC Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 7.1 \(1827\)) Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 From: Scott Long In-Reply-To: Date: Mon, 3 Feb 2014 02:31:52 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: References: <52EF50A7.1050205@niessen.ch> <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> <52EF55FE.8030901@niessen.ch> <1798FE17-5718-4125-8B00-1B00DC44B828@yahoo.com> <52EF5D1E.2000306@niessen.ch> To: Ben X-Mailer: Apple Mail (2.1827) Cc: FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 09:32:02 -0000

Please try the following patch:

Index: ieee8023ad_lacp.c
===================================================================
--- ieee8023ad_lacp.c (revision 261432)
+++ ieee8023ad_lacp.c (working copy)
@@ -192,6 +192,11 @@
 SYSCTL_INT(_net_link_lagg_lacp, OID_AUTO, debug, CTLFLAG_RW | CTLFLAG_TUN,
     &lacp_debug, 0, "Enable LACP debug logging (1=debug, 2=trace)");
 TUNABLE_INT("net.link.lagg.lacp.debug", &lacp_debug);
+static int lacp_strict = 0;
+SYSCTL_INT(_net_link_lagg_lacp, OID_AUTO, lacp_strict_mode,
+    CTLFLAG_RW | CTLFLAG_TUN, &lacp_strict, 0,
+    "Enable LACP strict protocol compliance");
+TUNABLE_INT("net.link.lagg.lacp.lacp_strict_mode", &lacp_strict);
 
 #define LACP_DPRINTF(a) if (lacp_debug & 0x01) { lacp_dprintf a ; }
 #define LACP_TRACE(a) if (lacp_debug & 0x02) { lacp_dprintf(a,"%s\n",__func__); }
@@ -791,7 +796,7 @@
 
 	lsc->lsc_hashkey = arc4random();
 	lsc->lsc_active_aggregator = NULL;
-	lsc->lsc_strict_mode = 1;
+	lsc->lsc_strict_mode = lacp_strict;
 	LACP_LOCK_INIT(lsc);
 	TAILQ_INIT(&lsc->lsc_aggregators);
 	LIST_INIT(&lsc->lsc_ports);

On Feb 3, 2014, at 2:25 AM, Scott Long wrote:

> Did you set it to 0 via the sysctl? You might need to wait for several minutes if you set it after setting up the links.
>
> Also, the message that you're seeing is from your machine transmitting PDU packets. Are you seeing any "lacpdu receive" messages on the console?
>
> Thanks,
> Scott
>
> On Feb 3, 2014, at 2:10 AM, Ben wrote:
>
>> Hi,
>>
>> I set strict mode to 0 but no use. I do receive PDU messages.
>>
>> igb0: lacpdu transmit
>> actor=(...)
>> actor.state=4d
>> partner=(...)
>> partner.state=0
>> maxdelay=0
>>
>> Thanks
>> Ben
>>
>> On 03.02.2014 10:03, Scott Long wrote:
>>> Hi,
>>>
>>> Unfortunately, you can't control the strict mode globally. My apologies for this mess, I'll make sure that it's fixed for FreeBSD 10.1. If the sysctl doesn't help then maybe consider compiling a custom kernel with it defaulted to 0. You'll need to open /sys/net/ieee802ad_lacp.c and look for the function lacp_attach(). You'll see the strict_mode assign underneath that. I'll also send you a patch in a few minutes. Until then, try enabling net.link.lagg.lacp.debug=1 and see if you're receiving heartbeat PDU's from your switch.
>>>=20 >>>=20 >>=20 >> _______________________________________________ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to = "freebsd-net-unsubscribe@freebsd.org" >=20 > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 09:45:24 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 69690BB6 for ; Mon, 3 Feb 2014 09:45:24 +0000 (UTC) Received: from nm13-vm5.bullet.mail.ne1.yahoo.com (nm13-vm5.bullet.mail.ne1.yahoo.com [98.138.91.235]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 296F912BC for ; Mon, 3 Feb 2014 09:45:23 +0000 (UTC) Received: from [98.138.100.114] by nm13.bullet.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 09:45:17 -0000 Received: from [98.138.226.128] by tm105.bullet.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 09:45:17 -0000 Received: from [127.0.0.1] by smtp215.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 09:45:17 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1391420717; bh=DnnYJ9F06W3nRbQyNUapAvSg3jVEzLjXVD0yJO9xZuI=; h=X-Yahoo-Newman-Id:X-Yahoo-Newman-Property:X-YMail-OSG:X-Yahoo-SMTP:X-Rocket-Received:Content-Type:Mime-Version:Subject:From:In-Reply-To:Date:Cc:Content-Transfer-Encoding:Message-Id:References:To:X-Mailer; b=tWq+ydi7Hzl9rzWuwE7UBQ3T1nRkC5zHfcRWTP7ZmuJ57d3ii6/cAnIrrDS+NJ7sZoeaNWMzmHZn2hx8lQ1adxnoGJFleVh15Juwei+5PDkmctSOKJ8RPqeHD8OCxbXbdhWZvvJIs9NlSWHYbntdAOAWCbA1PvmL5AmaEYASrB0= X-Yahoo-Newman-Id: 
110000.62549.bm@smtp215.mail.ne1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: nFu00PUVM1nF5.hwjQZPljWuc7DMLzpZqv_s_K7SbUX3DBL lZ0H5lrPB4zTPquy7H.iMccOUWpACcSifWBcGihSxPv22yG4r2rHNQUXmG1p rpbqf831QJlmNWWj5n2WRq64AiQl.58Myv80zMO0Uvez3ReTNaHd4qQwi3EE WCnHR6oA1yh_3yoINFDIgh79kPZOmvFY4_N9uuoFSn3.k0HithoueUZsY8Tt 7UVxE7c2saci1U6HREBcz39gz8rhHU.2I7pbBEx.HTi2L.BGnKdquQWjJvQL UKfaBWDNytAt4KxJjDsrriXWOezCOM910W5v4iTfN77IEYvolgWOkSRybDMU 2J39YqxI5M3eMdUun1BrFk013RGY53KZJQ5T0LA9xFuMcdomxxbqEsYieIW6 V_rtkJaxubb.Vfnf5iO3BGXS6fg9M0wdJIhVKoIbYZiaglg0UIp0j8iUp0qR lw_oJ99doDAx2RgU8AsUdchugSDbVjJNgFBLY_OarupGOs6ybaNrrtdV02M4 AIoRNAGbnbGfn.97qe9Yxg.Qw7UIoa6EOyLCkiv.lNQozyIIQUIDg6NejUjz hdfZAF4s3ujIR5rzMa4xqVpkET23D27mnYj8dAtQgjVRZLfXysF_5SpUdVWR z6DzEcDP_PdmhIZG4x23quKL6Axv9rFdsL7_Sx3.yJnnSLIegcLvuKMuQ X-Yahoo-SMTP: clhABp.swBB7fs.LwIJpv3jkWgo2NU8- X-Rocket-Received: from [10.64.24.117] (scott4long@69.53.236.251 with plain [98.139.211.125]) by smtp215.mail.ne1.yahoo.com with SMTP; 03 Feb 2014 01:45:17 -0800 PST Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 7.1 \(1827\)) Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 From: Scott Long In-Reply-To: <52EF6194.5060305@niessen.ch> Date: Mon, 3 Feb 2014 02:45:13 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: <8585EA2E-116E-45A6-877D-DC8D4460C965@yahoo.com> References: <52EF50A7.1050205@niessen.ch> <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> <52EF55FE.8030901@niessen.ch> <1798FE17-5718-4125-8B00-1B00DC44B828@yahoo.com> <52EF5D1E.2000306@niessen.ch> <52EF6194.5060305@niessen.ch> To: Ben X-Mailer: Apple Mail (2.1827) Cc: FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 09:45:24 -0000 Ok, please try the patch I emailed earlier. 
Since you're not seeing any receive messages, it means that your switch isn't generating any LACP heartbeats. The difference between FreeBSD 9.x and 10 is that in 9.x, it ran in "optimistic" mode, meaning that it didn't rely on getting receive messages from the switch, and only took a channel down if the link state went down. In strict mode, it looks for the receive messages and only transitions to a full operational state if it gets them. So while I know it's easy to point at the problem being FreeBSD 10, seeing as FreeBSD 9 worked for you, please check to make sure that your switch is set up correctly.

I authored the original change that went into FreeBSD 10, and I tried to make it so that strict_mode=0 would keep everything working as it did in 9. I guess that since you're getting no receive messages from the switch at all that we need to disable strict mode on setup, not afterwards. Apply the patch and everything should work as it did in FreeBSD 9.

Scott

On Feb 3, 2014, at 2:29 AM, Ben wrote:

> Yes, via sysctl and /etc/sysctl.conf
>
> I waited now roughly 20 minutes without touching it but no difference.
>
> No, I only see these transmit messages, no receive.
>
> Thanks
> Ben
>
> On 03.02.2014 10:25, Scott Long wrote:
>> Did you set it to 0 via the sysctl? You might need to wait for several minutes if you set it after setting up the links.
>>
>> Also, the message that you're seeing is from your machine transmitting PDU packets. Are you seeing any "lacpdu receive" messages on the console?
>>
>> Thanks,
>> Scott
>>
>> On Feb 3, 2014, at 2:10 AM, Ben wrote:
>>
>>> Hi,
>>>
>>> I set strict mode to 0 but no use. I do receive PDU messages.
>>>
>>> igb0: lacpdu transmit
>>> actor=(...)
>>> actor.state=4d
>>> partner=(...)
>>> partner.state=3D0 >>> maxdelay=3D0 >>>=20 >>> Thanks >>> Ben >>>=20 >>> On 03.02.2014 10:03, Scott Long wrote: >>>> Hi, >>>>=20 >>>> Unfortunately, you can=92t control the strict mode globally. My = apologies for this mess, I=92ll make sure that it=92s fixed for FreeBSD = 10.1. If the sysctl doesn=92t help then maybe consider compiling a = custom kernel with it defaulted to 0. You=92ll need to open = /sys/net/ieee802ad_lacp.c and look for the function lacp_attach(). = You=92ll see the strict_mode assign underneath that. I=92ll also send = you a patch in a few minutes. Until then, try enabling = net.link.lagg.lacp.debug=3D1 and see if you=92re receiving heartbeat = PDU=92s from your switch. >>>>=20 >>>> Scott >>>>=20 >>>> On Feb 3, 2014, at 1:40 AM, Ben wrote: >>>>=20 >>>>> Hi Scott, >>>>>=20 >>>>> I had tried to set it in /etc/sysctl.conf but seems it didnt work. = But will I try again and report back. >>>>>=20 >>>>> The settings of the switch have not been changed and are set to = LACP. It worked before so I guess the switch should not be the problem. = Maybe some incompatibility between FreeBSD + igb-driver + switch = (Juniper EX3300-48T). >>>>>=20 >>>>> I will update you after setting the sysctl setting. It seems to be = "dynamic", I guess 0 reflects the index of LACP lagg devices. Can I = switch off the strict mode globally in /etc/sysctl.conf? >>>>>=20 >>>>> Thanks for your help. >>>>>=20 >>>>> Regards >>>>> Ben >>>>>=20 >>>>> On 03.02.2014 09:31, Scott Long wrote: >>>>>> Hi, >>>>>>=20 >>>>>> You=92re probably running into the consequences of r253687. = Check to see the value of =91sysctl = net.link.lagg.0.lacp.lacp_strict_mode=92. If it=92s =911=92 then set it = to 0. My original intention was for this to default to 0, but = apparently that didn=92t happen. However, the fact that strict mode = doesn=92t seem to work at all for you might hint that your switch either = isn=92t configured correctly for LACP, or doesn=92t actually support = LACP at all. 
You might want to investigate that. >>>>>>=20 >>>>>> Scott >>>>>>=20 >>>>>> On Feb 3, 2014, at 1:17 AM, Ben wrote: >>>>>>=20 >>>>>>> Hi, >>>>>>>=20 >>>>>>> I upgraded from FreeBSD 9.2-RELEASE to 10.0-RELEASE. FreeBSD 9.2 = was configured to use LACP with two igb devices. >>>>>>>=20 >>>>>>> Now it stopped working after the upgrade. >>>>>>>=20 >>>>>>> This is a screenshot of ifconfig -a after the upgrade to FreeBSD = 10.0-RELEASE: http://tinypic.com/view.php?pic=3D28jvgpw&s=3D5#.Uu9PXT1dVPM= >>>>>>>=20 >>>>>>> A PR is currently open: = http://www.freebsd.org/cgi/query-pr.cgi?pr=3Dkern/185967 >>>>>>>=20 >>>>>>> It is set to low, but I would like somebody to have a look into = it as it obviously has a great influence on our infrastructure. The only = way to "solve" it is currently switching back to FreeBSD 9.2. >>>>>>>=20 >>>>>>> The suggested fix "use failover" seems not to work. >>>>>>>=20 >>>>>>> Thank you for your help. >>>>>>>=20 >>>>>>> Best regards >>>>>>> Ben >>>>>>> _______________________________________________ >>>>>>> freebsd-net@freebsd.org mailing list >>>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>>>>> To unsubscribe, send any mail to = "freebsd-net-unsubscribe@freebsd.org" >>>>>> _______________________________________________ >>>>>> freebsd-net@freebsd.org mailing list >>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>>>> To unsubscribe, send any mail to = "freebsd-net-unsubscribe@freebsd.org" >>>>>>=20 >>>>>>=20 >>>>>>=20 >>>>>>=20 >>>>> _______________________________________________ >>>>> freebsd-net@freebsd.org mailing list >>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>>> To unsubscribe, send any mail to = "freebsd-net-unsubscribe@freebsd.org" >>>> _______________________________________________ >>>> freebsd-net@freebsd.org mailing list >>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>> To unsubscribe, send any mail to = "freebsd-net-unsubscribe@freebsd.org" >>>>=20 >>>>=20 >>>>=20 
>>>>=20 >>> _______________________________________________ >>> freebsd-net@freebsd.org mailing list >>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>> To unsubscribe, send any mail to = "freebsd-net-unsubscribe@freebsd.org" >>=20 >> !DSPAM:1,52ef6078888821231914487! From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 09:51:38 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4AAF2F4A for ; Mon, 3 Feb 2014 09:51:38 +0000 (UTC) Received: from mail.niessen.ch (btx02.niessen.ch [85.10.192.239]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id B5AAB1360 for ; Mon, 3 Feb 2014 09:51:37 +0000 (UTC) Received: from mail.niessen.ch (mail.niessen.ch [127.0.10.3]) by mail.niessen.ch (Postfix) with ESMTP id D8185102D01 for ; Mon, 3 Feb 2014 10:51:35 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=niessen.ch; h=message-id :date:from:mime-version:to:subject:references:in-reply-to :content-type:content-transfer-encoding; s=dkim-2012; bh=UOW6Sb4 UCLPcJiuZubd80Y4xgHNc+VTRZiZC3Vh/lUw=; b=WatekeGn0S5WjbGj9QX3s1C KJE/TUJGM+9sY5r4AmxL4WvwzbW/Xt60OO82yPXpCiRoYVXZEc0ma1ji0tVSWeS0 WQUj6GIntiMbiixnke0HEMfP3GuPe0Ysl66pT5Zzexd+3QHYCJEQJIigtwLKKfyU gpsHNmPtOs8mFoyEtHlM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=niessen.ch; h=message-id :date:from:mime-version:to:subject:references:in-reply-to :content-type:content-transfer-encoding; q=dns; s=dkim-2012; b=l TWXnaWTmCyb/tZucfsA/qwoUy143owpwl8ZCqzBKfWsbar605O/5TbsYSNd4uD64 UPXkefS3df/jZ5s86BVGiPCGMqrfB7jxri4rGVOZo7Nnw/ti0ZmGPHzXwqhP1bRB jrqXLAmOceQvYU6isJjKl1O170kPX33qHZ9s37LASU= Received: from [172.20.10.3] (unknown [178.197.236.128]) by mail.niessen.ch (Postfix) with ESMTPSA id 8C154102D00 for ; Mon, 3 Feb 2014 10:51:35 +0100 (CET) 
Message-ID: <52EF6690.3010509@niessen.ch> Date: Mon, 03 Feb 2014 10:51:12 +0100 From: Ben User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 References: <52EF50A7.1050205@niessen.ch> <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> <52EF55FE.8030901@niessen.ch> <1798FE17-5718-4125-8B00-1B00DC44B828@yahoo.com> <52EF5D1E.2000306@niessen.ch> <52EF6194.5060305@niessen.ch> <8585EA2E-116E-45A6-877D-DC8D4460C965@yahoo.com> In-Reply-To: <8585EA2E-116E-45A6-877D-DC8D4460C965@yahoo.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 09:51:38 -0000

Thank you for your detailed explanation.

If I understand correctly the switch is probably not set up correctly, right?

I will try to have it configured correctly first.

Thanks a lot for your help!

Regards
Ben

On 03.02.2014 10:45, Scott Long wrote:
> Ok, please try the patch I emailed earlier. Since you're not seeing any receive messages, it means that your switch isn't generating any LACP heartbeats. The difference between FreeBSD 9.x and 10 is that in 9.x, it ran in "optimistic" mode, meaning that it didn't rely on getting receive messages from the switch, and only took a channel down if the link state went down. In strict mode, it looks for the receive messages and only transitions to a full operational state if it gets them. So while I know it's easy to point at the problem being FreeBSD 10, seeing as FreeBSD 9 worked for you, please check to make sure that your switch is set up correctly.
> > I authored the original change that went into FreeBSD 10, and I tried t= o make it so that strict_mode=3D0 would keep everything working as it did= in 9. I guess that since you=92re getting no receive messages from the = switch at all that we need to disable strict mode on setup, not afterward= s. Apply the patch and everything should work as it did in FreeBSD 9. > > Scott > > On Feb 3, 2014, at 2:29 AM, Ben wrote: > >> Yes, via sysctl and /etc/sysctl.conf >> >> I waited now roughly 20 minutes without touching it but no difference. >> >> No, I only see these transmit messages, no receive. >> >> Thanks >> Ben >> >> On 03.02.2014 10:25, Scott Long wrote: >>> Did you set it to 0 via the sysctl? You might need to wait for sever= al minutes if you set it after setting up the links. >>> >>> Also, the message that you=92re seeing is from your machine transmitt= ing PDU packets. Are you seeing any "lacpdu receive=94 messages on the c= onsole? >>> >>> Thanks, >>> Scott >>> >>> On Feb 3, 2014, at 2:10 AM, Ben wrote: >>> >>>> Hi, >>>> >>>> I set strict mode to 0 but no use. I do receive PDU messages. >>>> >>>> igb0: lacpdu transmit >>>> actor=3D(...) >>>> actor.state=3D4d >>>> partner=3D(...) >>>> partner.state=3D0 >>>> maxdelay=3D0 >>>> >>>> Thanks >>>> Ben >>>> >>>> On 03.02.2014 10:03, Scott Long wrote: >>>>> Hi, >>>>> >>>>> Unfortunately, you can=92t control the strict mode globally. My ap= ologies for this mess, I=92ll make sure that it=92s fixed for FreeBSD 10.= 1. If the sysctl doesn=92t help then maybe consider compiling a custom ke= rnel with it defaulted to 0. You=92ll need to open /sys/net/ieee802ad_la= cp.c and look for the function lacp_attach(). You=92ll see the strict_mo= de assign underneath that. I=92ll also send you a patch in a few minutes= . Until then, try enabling net.link.lagg.lacp.debug=3D1 and see if you=92= re receiving heartbeat PDU=92s from your switch. 
>>>>> >>>>> Scott >>>>> >>>>> On Feb 3, 2014, at 1:40 AM, Ben wrote: >>>>> >>>>>> Hi Scott, >>>>>> >>>>>> I had tried to set it in /etc/sysctl.conf but seems it didnt work.= But will I try again and report back. >>>>>> >>>>>> The settings of the switch have not been changed and are set to LA= CP. It worked before so I guess the switch should not be the problem. May= be some incompatibility between FreeBSD + igb-driver + switch (Juniper EX= 3300-48T). >>>>>> >>>>>> I will update you after setting the sysctl setting. It seems to be= "dynamic", I guess 0 reflects the index of LACP lagg devices. Can I swit= ch off the strict mode globally in /etc/sysctl.conf? >>>>>> >>>>>> Thanks for your help. >>>>>> >>>>>> Regards >>>>>> Ben >>>>>> >>>>>> On 03.02.2014 09:31, Scott Long wrote: >>>>>>> Hi, >>>>>>> >>>>>>> You=92re probably running into the consequences of r253687. Chec= k to see the value of =91sysctl net.link.lagg.0.lacp.lacp_strict_mode=92.= If it=92s =911=92 then set it to 0. My original intention was for this = to default to 0, but apparently that didn=92t happen. However, the fact = that strict mode doesn=92t seem to work at all for you might hint that yo= ur switch either isn=92t configured correctly for LACP, or doesn=92t actu= ally support LACP at all. You might want to investigate that. >>>>>>> >>>>>>> Scott >>>>>>> >>>>>>> On Feb 3, 2014, at 1:17 AM, Ben wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I upgraded from FreeBSD 9.2-RELEASE to 10.0-RELEASE. FreeBSD 9.2= was configured to use LACP with two igb devices. >>>>>>>> >>>>>>>> Now it stopped working after the upgrade. 
>>>>>>>> >>>>>>>> This is a screenshot of ifconfig -a after the upgrade to FreeBSD= 10..0-RELEASE: http://tinypic.com/view.php?pic=3D28jvgpw&s=3D5#.Uu9PXT1d= VPM >>>>>>>> >>>>>>>> A PR is currently open: http://www.freebsd.org/cgi/query-pr.cgi?= pr=3Dkern/185967 >>>>>>>> >>>>>>>> It is set to low, but I would like somebody to have a look into = it as it obviously has a great influence on our infrastructure. The only = way to "solve" it is currently switching back to FreeBSD 9.2. >>>>>>>> >>>>>>>> The suggested fix "use failover" seems not to work. >>>>>>>> >>>>>>>> Thank you for your help. >>>>>>>> >>>>>>>> Best regards >>>>>>>> Ben >>>>>>>> _______________________________________________ >>>>>>>> freebsd-net@freebsd.org mailing list >>>>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebs= d.org" >>>>>>> _______________________________________________ >>>>>>> freebsd-net@freebsd.org mailing list >>>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd= .org" >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> _______________________________________________ >>>>>> freebsd-net@freebsd.org mailing list >>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.= org" >>>>> _______________________________________________ >>>>> freebsd-net@freebsd.org mailing list >>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.o= rg" >>>>> >>>>> >>>>> >>>>> >>>> _______________________________________________ >>>> freebsd-net@freebsd.org mailing list >>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.or= g" >>> > _______________________________________________ > freebsd-net@freebsd.org mailing list > 
http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > !DSPAM:1,52ef6540888822133843295! > > From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 09:56:42 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3384B25F for ; Mon, 3 Feb 2014 09:56:42 +0000 (UTC) Received: from mail-pb0-x22a.google.com (mail-pb0-x22a.google.com [IPv6:2607:f8b0:400e:c01::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 014C8139E for ; Mon, 3 Feb 2014 09:56:41 +0000 (UTC) Received: by mail-pb0-f42.google.com with SMTP id jt11so6910514pbb.1 for ; Mon, 03 Feb 2014 01:56:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=ZRCuJeUmTVPyiLm59YpoJfdlrewVPw0p14ZTKipSs+o=; b=fzZtec7feMcyiQS0B8Y4/5MpsSwxLvq5QrW8POZyDAVyVcJx8dOwpzGqAjFBb/FFL0 1+y1XUQ19m8ZGgRDAEnOPgZETiU9yDVJAqtGeDOa6ZXsnBR9LSCw+oysIZIZNDuhXDOq e+fPoTxUoz9p+vZpcRiI0eW+H0T0/A9y81/41t+q7MZr929jZQt8v0+bG0eBxIWjQv1L NEnuFijCQ/1E+z/kGwhliNWvlB+9eoTR14ursk3qYLmMkqpZgbUBnSd9FSVL7mUdxzWu MrgaFCV2t2mKQbHjDZcsJH7UqL+1sJ/sVR17N309GhgXacH+4jbmWqgqtWXffhQCZcp4 MDAQ== MIME-Version: 1.0 X-Received: by 10.66.174.165 with SMTP id bt5mr2185434pac.151.1391421401395; Mon, 03 Feb 2014 01:56:41 -0800 (PST) Received: by 10.70.127.142 with HTTP; Mon, 3 Feb 2014 01:56:41 -0800 (PST) Received: by 10.70.127.142 with HTTP; Mon, 3 Feb 2014 01:56:41 -0800 (PST) In-Reply-To: <52EF6690.3010509@niessen.ch> References: <52EF50A7.1050205@niessen.ch> <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> <52EF55FE.8030901@niessen.ch> <1798FE17-5718-4125-8B00-1B00DC44B828@yahoo.com> 
<52EF5D1E.2000306@niessen.ch> <52EF6194.5060305@niessen.ch> <8585EA2E-116E-45A6-877D-DC8D4460C965@yahoo.com> <52EF6690.3010509@niessen.ch> Date: Mon, 3 Feb 2014 11:56:41 +0200 Message-ID: Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 From: Sami Halabi To: Ben Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.17 Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 09:56:42 -0000

Hi,
Changing /etc/sysctl.conf isn't enough unless you restart the machine.
Run this in a shell:
sysctl net.link.lagg.0.lacp.lacp_strict_mode=0

Sami
On 3 Feb 2014 11:51, "Ben" wrote:

> Thank you for your detailed explanation.
>
> If I understand correctly the switch is probably not set up correctly,
> right?
>
> I will try to have it configured correctly first.
>
> Thanks a lot for your help!
>
> Regards
> Ben
>
> On 03.02.2014 10:45, Scott Long wrote:
>
>> Ok, please try the patch I emailed earlier. Since you’re not seeing any
>> receive messages, it means that your switch isn’t generating any LACP
>> heartbeats. The difference between FreeBSD 9.x and 10 is that in 9.x, it
>> ran in “optimistic” mode, meaning that it didn’t rely on getting receive
>> messages from the switch, and only took a channel down if the link state
>> went down. In strict mode, it looks for the receive messages and only
>> transitions to a full operational state if it gets them. So while I know
>> it’s easy to point at the problem being FreeBSD 10, seeing as FreeBSD 9
>> worked for you, please check to make sure that your switch is set up
>> correctly.
>>
>> I authored the original change that went into FreeBSD 10, and I tried to
>> make it so that strict_mode=0 would keep everything working as it did in 9.
>> I guess that since you’re getting no receive messages from the switch at
>> all that we need to disable strict mode on setup, not afterwards. Apply
>> the patch and everything should work as it did in FreeBSD 9.
>>
>> Scott
>>

From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 09:57:39 2014 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id BE7183E2 for ; Mon, 3 Feb 2014 09:57:39 +0000 (UTC) Received: from mail.made4.biz (mail.made4.biz [IPv6:2001:41d0:2:c018::1:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 79D6613B1 for ; Mon, 3 Feb 2014 09:57:39 +0000 (UTC) Received: from [2001:1b48:10b:cafe:225:64ff:febe:589f] (helo=viking.yzserv.com) by mail.made4.biz with esmtpsa (TLSv1:DHE-RSA-CAMELLIA256-SHA:256) (Exim 4.82 (FreeBSD)) (envelope-from ) id 1WAGHX-000GMa-Rh; Mon, 03 Feb 2014 10:57:37 +0100 Message-ID: <52EF67EF.1000803@FreeBSD.org> Date: Mon, 03 Feb 2014 10:57:03 +0100 From: Jean-Sébastien Pédron User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: freebsd-net@FreeBSD.org Subject: Loosing TCP/IPv4 connections with jails+pf on 10.0-RELEASE X-Enigmail-Version: 1.6 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="wMeVt8JFm3k9gCEdwxul8KBTVPOt7GbIe" Cc: Christopher Faulet X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 09:57:39 -0000

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--wMeVt8JFm3k9gCEdwxul8KBTVPOt7GbIe
Content-Type: multipart/mixed; boundary="------------050604020808060504070202"

This is a multi-part message in MIME format.
--------------050604020808060504070202
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Hello!

We have one server with multiple jails, each jail runs a service (mail,
web, etc.). sysutils/ezjail is used to set up and start the jails.

Besides the public IP address, IPv4 and IPv6 aliases are added to the
main NIC (em0), one per jail. The server has a second NIC (em1) which is
unused.

As we only have one public IPv4 address, pf is used to
  o redirect connections to jails
  o NAT connections from jails

With 8.3-RELEASE on another server, this setup was working without
problems. Now that we have switched to a new server and 10.0-RELEASE (we
skipped 9.x), we see that TCP connections to jails over IPv4 are having
trouble:

  o After around 10 days of uptime, connections from an IRC client
    on the host (not a jail) to an IRC server in a jail are getting
    dropped during the night (maybe because of no activity on the
    IRC channel).

    It seems that packets from the host (or a remote computer) to the
    jail are fine. However, packets from the jail never reach the peer.

    This was tested with nc(1) on both sides, so the uptime of the IRC
    client or server isn't related.

  o As time passes, connections are dropped faster and faster:
    even during the day, when there's activity on the IRC channel.

  o At some point, connections only live for a few seconds and this
    affects short-lived connections to the SMTP/IMAP and web jails.

A reboot solves the problem, until it comes back a week or more later.
Trouble started to appear again this weekend.

IPv6 connections are NOT affected: they work perfectly.

This is stock FreeBSD 10.0-RELEASE amd64 with the GENERIC kernel.

You'll find attached the output of ifconfig(8), our pf rules and one
jail configuration in ezjail (other jails have a similar setup). Note
that the pf rules we used on FreeBSD 8.3 are commented out at the end of
pf.conf; we simplified them by using port lists.

Do you see something wrong with this setup?

PS: I'm not subscribed to the list, please CC me.

-- 
Jean-Sébastien Pédron

--------------050604020808060504070202
Content-Type: text/plain; charset=UTF-8; name="ifconfig.txt"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="ifconfig.txt"

em0: flags=8843 metric 0 mtu 1500
        options=4219b
        ether 00:25:90:21:04:2c
        inet $PUBLIC_IP netmask 0xffffff00 broadcast $PUBLIC_BROADCAST
        inet6 fe80::225:90ff:fe21:42c%em0 prefixlen 64 scopeid 0x1
        inet6 $PUBLIC_IPV6::1 prefixlen 56
        inet 10.0.0.1 netmask 0xffffffff broadcast 10.0.0.1
        inet 10.0.0.3 netmask 0xffffffff broadcast 10.0.0.3
        inet6 $PUBLIC_IPV6::1:3 prefixlen 64
        inet 10.0.0.4 netmask 0xffffffff broadcast 10.0.0.4
        inet6 $PUBLIC_IPV6::1:4 prefixlen 64
        inet 10.0.0.2 netmask 0xffffffff broadcast 10.0.0.2
        inet6 $PUBLIC_IPV6::1:2 prefixlen 64
        nd6 options=21
        media: Ethernet autoselect (1000baseT )
        status: active
em1: flags=8c02 metric 0 mtu 1500
        options=4219b
        ether 00:25:90:21:04:2d
        nd6 options=29
        media: Ethernet autoselect
        status: no carrier
lo0: flags=8049 metric 0 mtu 16384
        options=600003
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet 127.0.0.1 netmask 0xff000000
        nd6 options=21

--------------050604020808060504070202
Content-Type: text/plain; charset=UTF-8; name="pf.conf"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="pf.conf"

# Interface declarations
ext_if="em0"
int_if="lo0"
all_if="{em0, lo0}"

# Internal network subnet
jail_net="10.0.0.0/24"

# Name and IP of our webserver
MYSQL="10.0.0.1"
HTTPD="10.0.0.2"
VEXIM="10.0.0.3"
IRCD="10.0.0.4"

PUBLIC_IP="..."

#scrub in all

nat pass on $ext_if inet from $jail_net to any -> $PUBLIC_IP

rdr pass on $all_if inet proto tcp from any to $ext_if port {6667,6668,7000} -> $IRCD
rdr pass on $all_if inet proto tcp from any to $ext_if port {80,443,8140} -> $HTTPD
rdr pass on $all_if inet proto tcp from any to $ext_if port {25,143,465,993,995} -> $VEXIM
rdr pass on $int_if inet proto tcp from any to $int_if port 25 -> $VEXIM

### OLD RULES (FreeBSD 8.3) ###
#rdr on $all_if inet proto tcp from any to $ext_if port 80 -> $HTTPD port 80
#rdr on $all_if inet proto tcp from any to $ext_if port 443 -> $HTTPD port 443
#rdr on $all_if inet proto tcp from any to $ext_if port 8140 -> $HTTPD port 8140
#rdr on $all_if inet proto tcp from any to $ext_if port 995 -> $VEXIM port 995
#rdr on $all_if inet proto tcp from any to $ext_if port 993 -> $VEXIM port 993
#rdr on $all_if inet proto tcp from any to $ext_if port 143 -> $VEXIM port 143
#rdr on $all_if inet proto tcp from any to $ext_if port 25 -> $VEXIM port 25
#rdr on $all_if inet proto tcp from any to $ext_if port 465 -> $VEXIM port 465
#rdr on $all_if inet proto tcp from any to $int_if port 25 -> $VEXIM port 25
#rdr on $all_if inet proto tcp from any to $ext_if port 7000 -> $IRCD port 7000
#rdr on $all_if inet proto tcp from any to $ext_if port 6667 -> $IRCD port 6667
#rdr on $all_if inet proto tcp from any to $ext_if port 6668 -> $IRCD port 6668
#nat on $ext_if inet from $MYSQL to any -> $PUBLIC_IP
#nat on $ext_if inet from $HTTPD to any -> $PUBLIC_IP
#nat on $ext_if inet from $VEXIM to any -> $PUBLIC_IP
#nat on $ext_if inet from $IRCD to any -> $PUBLIC_IP

--------------050604020808060504070202
Content-Type: text/plain; charset=UTF-8; name="ezjail.conf"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
filename="ezjail.conf"

export jail_ircd_hostname="ircd"
export jail_ircd_ip="em0|10.0.0.4,em0|$PUBLIC_IPV6::1:4"
...
export jail_ircd_parameters="allow.raw_sockets=1"

--------------050604020808060504070202--

--wMeVt8JFm3k9gCEdwxul8KBTVPOt7GbIe
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (FreeBSD)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlLvaA8ACgkQa+xGJsFYOlMmAQCZARoq/RVaaJz7owyaUap6rf89
Zb0Anjuo1uSG9dJ8RSny+gC9J1DFYwQ2
=aAk+
-----END PGP SIGNATURE-----

--wMeVt8JFm3k9gCEdwxul8KBTVPOt7GbIe--

From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 10:01:27 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id BD6C75BE for ; Mon, 3 Feb 2014 10:01:27 +0000 (UTC) Received: from mail.niessen.ch (btx02.niessen.ch [85.10.192.239]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 339EC14A4 for ; Mon, 3 Feb 2014 10:01:26 +0000 (UTC) Received: from mail.niessen.ch (mail.niessen.ch [127.0.10.3]) by mail.niessen.ch (Postfix) with ESMTP id CC137102DBC for ; Mon, 3 Feb 2014 11:01:09 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=niessen.ch; h=message-id :date:from:mime-version:to:subject:references:in-reply-to :content-type:content-transfer-encoding; s=dkim-2012; bh=r/U+wqh xrQ33LlCRmsH3YmPAiTg5tKx3P1yeLMrR1kE=; b=rVGVM4ZuD4pfuDP+L3nGy8A JNbd7NXRQ1iqcsdoquyQlcOsEAS5EfDthblNPC4Qk9QtgKhGxyQxSiT5+8FjYakX 63MDY9hUa3eNSzGrVOyC6q3Mny50+MDUhBOplZFJhrZlFh1QTeRFNDILDzkL6J5o byOtSWEVs1pgJKvhdhLg= DomainKey-Signature: a=rsa-sha1; c=nofws; d=niessen.ch; h=message-id
:date:from:mime-version:to:subject:references:in-reply-to :content-type:content-transfer-encoding; q=dns; s=dkim-2012; b=N Al44TKxXLfNXS0Zr5xVw7Gae0C/IciK247EG56gQCsdJu+8ylkVN0F/ovPIQ2wNT tA6AY53zGqKdxjdYeHESSdtLQeeCgB2N784s769MEJb/zxR6g0AHzTVYpf7bUinv Plw2q+8x1gU7jdArK3bhZlMtDzU+iJNoFaa02ru11s= Received: from [172.20.10.3] (unknown [178.197.236.128]) by mail.niessen.ch (Postfix) with ESMTPSA id 6C08F102DBB for ; Mon, 3 Feb 2014 11:01:09 +0100 (CET) Message-ID: <52EF68CD.8020501@niessen.ch> Date: Mon, 03 Feb 2014 11:00:45 +0100 From: Ben User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 References: <52EF50A7.1050205@niessen.ch> <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> <52EF55FE.8030901@niessen.ch> <1798FE17-5718-4125-8B00-1B00DC44B828@yahoo.com> <52EF5D1E.2000306@niessen.ch> <52EF6194.5060305@niessen.ch> <8585EA2E-116E-45A6-877D-DC8D4460C965@yahoo.com> <52EF6690.3010509@niessen.ch> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 10:01:27 -0000

Hi,

Sorry for not making it clear. Yes, I restarted the machine completely
and also set it at runtime. Neither made a difference.

Regards
Ben

On 03.02.2014 10:56, Sami Halabi wrote:
> Hi,
> Changing /etc/sysctl.conf isn't enough unless you restart the machine.
> Run this in a shell:
> sysctl net.link.lagg.0.lacp.lacp_strict_mode=0
>
> Sami
> On 3 Feb 2014 11:51, "Ben" wrote:
>
>> Thank you for your detailed explanation.
>>
>> If I understand correctly the switch is probably not set up correctly,
>> right?
>>
>> I will try to have it configured correctly first.
>>
>> Thanks a lot for your help!
>>
>> Regards
>> Ben
>>

From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 10:01:52 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7F774668 for ; Mon, 3 Feb 2014 10:01:52 +0000 (UTC) Received: from nm7-vm6.bullet.mail.ne1.yahoo.com (nm7-vm6.bullet.mail.ne1.yahoo.com [98.138.91.100]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 3066A14AF for ; Mon, 3 Feb 2014 10:01:51 +0000 (UTC) Received: from [98.138.101.132] by nm7.bullet.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 09:58:29 -0000 Received: from [98.138.226.58] by tm20.bullet.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 09:58:29 -0000 Received: from [127.0.0.1] by smtp209.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 09:58:29 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1391421509; bh=E93OgI/imELzsRZY6yqpolQ18HmQi1S7FrrA96/8UoQ=; h=X-Yahoo-Newman-Id:X-Yahoo-Newman-Property:X-YMail-OSG:X-Yahoo-SMTP:X-Rocket-Received:Content-Type:Mime-Version:Subject:From:In-Reply-To:Date:Cc:Content-Transfer-Encoding:Message-Id:References:To:X-Mailer; b=MIJCUIFNfhyhrgN/BCcpGrIc/SUU/2h/rsLDNTKlbIYl2aO6etKg/QYoaepXNvWsdR5OajrT9usJ8xFj0QYTcynPys5/T11RMELW41Vkdvdg7Er4FST4OvXnjAYLxSAq7silbQf2EL/HHeFj+f5GZIcofmokym4KD42iKFNRFyo= X-Yahoo-Newman-Id: 531205.5257.bm@smtp209.mail.ne1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: FtRZJe0VM1lWRH.V3d.Eey9Jgs0CxPrXHKHpoBMyHahOL5g IjO8rwjHkTPTxKqice0lx_SjBTw9qLVTLS5BXlvNc9j2iKYJtdaH602iZ.yS WJaeKSbeZhlgiJwsKnQW5vFKYx3fLa0rbxusbY0vImaJ6MH4BLkqquoWggO3
.qnbXEkouiPk.bjxIGNsmSw84eJ.ZqX1Djmj0weOzsc1CFIu4XShyVCZVTQP 3p35ONMarT.luUL7iPtHIoJf1_QFuVL1VJcIGq21PvP9U1zKevRJDFba8Eru Gxa8B6CHkIOdT3OooqRHyaeoEs15biGfbFHVVN3PdXCYW_TZ_2e0PnlkJ_Rn IzM0pIPvZBWsopO4O5Ksb2c91niLgKV3iuChONCQYSFnw4HvLJQz6yBJmjIR pc6OF4z.Nnx2vmcjYScGtSUKIE.u0EkU3pGBpRpCGgShWosPGVsCSoL6mZU6 6aQYqDKp4sVeuopgGvkN2I9juw7FBYBsqM2wEsMO174ewx4r.nxw_2xb1xnk CYOl3S9_yizYyRbHZpouKS894DEqNYrnRZ9jvD8V3Zn_TMdnRiauuYd3NhPX aPbBWLwlRxV8LhDvylgX5o5dCeYmE6UUQ92QGfrjvSMR9kgkJ5a1HoKD59rX m2xua_bp0r_z.WZc9XDop1QCUIYyD18OqS_u0kKJKPAJAOAnVmyI- X-Yahoo-SMTP: clhABp.swBB7fs.LwIJpv3jkWgo2NU8- X-Rocket-Received: from [10.64.24.117] (scott4long@69.53.236.251 with plain [98.139.211.125]) by smtp209.mail.ne1.yahoo.com with SMTP; 03 Feb 2014 09:58:29 +0000 UTC Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 7.1 \(1827\)) Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 From: Scott Long In-Reply-To: <52EF6690.3010509@niessen.ch> Date: Mon, 3 Feb 2014 02:58:26 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: <202BD17C-E68A-4B27-B7EF-E5D84AA89176@yahoo.com> References: <52EF50A7.1050205@niessen.ch> <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> <52EF55FE.8030901@niessen.ch> <1798FE17-5718-4125-8B00-1B00DC44B828@yahoo.com> <52EF5D1E.2000306@niessen.ch> <52EF6194.5060305@niessen.ch> <8585EA2E-116E-45A6-877D-DC8D4460C965@yahoo.com> <52EF6690.3010509@niessen.ch> To: Ben X-Mailer: Apple Mail (2.1827) Cc: FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 10:01:52 -0000

Hi,

If you can, please test the patch I sent and let me know the results. I’ll check it into FreeBSD 11 and 10 if it works for you.

Thanks,
Scott

On Feb 3, 2014, at 2:51 AM, Ben wrote:

> Thank you for your detailed explanation.
>
> If I understand correctly the switch is probably not set up correctly, right?
>
> I will try to have it configured correctly first.
>
> Thanks a lot for your help!
>
> Regards
> Ben
>
> On 03.02.2014 10:45, Scott Long wrote:
>> Ok, please try the patch I emailed earlier. Since you’re not seeing any receive messages, it means that your switch isn’t generating any LACP heartbeats. The difference between FreeBSD 9.x and 10 is that in 9.x, it ran in “optimistic” mode, meaning that it didn’t rely on getting receive messages from the switch, and only took a channel down if the link state went down. In strict mode, it looks for the receive messages and only transitions to a full operational state if it gets them. So while I know it’s easy to point at the problem being FreeBSD 10, seeing as FreeBSD 9 worked for you, please check to make sure that your switch is set up correctly.
>>
>> I authored the original change that went into FreeBSD 10, and I tried to make it so that strict_mode=0 would keep everything working as it did in 9. I guess that since you’re getting no receive messages from the switch at all that we need to disable strict mode on setup, not afterwards. Apply the patch and everything should work as it did in FreeBSD 9.
>>
>> Scott
>>
>> On Feb 3, 2014, at 2:29 AM, Ben wrote:
>>
>>> Yes, via sysctl and /etc/sysctl.conf
>>>
>>> I waited now roughly 20 minutes without touching it but no difference.
>>>
>>> No, I only see these transmit messages, no receive.
>>>
>>> Thanks
>>> Ben
>>>
>>> On 03.02.2014 10:25, Scott Long wrote:
>>>> Did you set it to 0 via the sysctl? You might need to wait for several minutes if you set it after setting up the links.
>>>>
>>>> Also, the message that you’re seeing is from your machine transmitting PDU packets. Are you seeing any "lacpdu receive" messages on the console?
>>>>
>>>> Thanks,
>>>> Scott
>>>>
>>>> On Feb 3, 2014, at 2:10 AM, Ben wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I set strict mode to 0 but no use. I do receive PDU messages.
>>>>>
>>>>> igb0: lacpdu transmit
>>>>> actor=(...)
>>>>> actor.state=4d
>>>>> partner=(...)
>>>>> partner.state=0
>>>>> maxdelay=0
>>>>>
>>>>> Thanks
>>>>> Ben
>>>>>
>>>>> On 03.02.2014 10:03, Scott Long wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Unfortunately, you can’t control the strict mode globally. My apologies for this mess, I’ll make sure that it’s fixed for FreeBSD 10.1. If the sysctl doesn’t help then maybe consider compiling a custom kernel with it defaulted to 0. You’ll need to open /sys/net/ieee8023ad_lacp.c and look for the function lacp_attach(). You’ll see the strict_mode assignment underneath that. I’ll also send you a patch in a few minutes. Until then, try enabling net.link.lagg.lacp.debug=1 and see if you’re receiving heartbeat PDUs from your switch.
>>>>>>
>>>>>> Scott
>>>>>>
>>>>>> On Feb 3, 2014, at 1:40 AM, Ben wrote:
>>>>>>
>>>>>>> Hi Scott,
>>>>>>>
>>>>>>> I had tried to set it in /etc/sysctl.conf but it seems it didn't work. But I will try again and report back.
>>>>>>>
>>>>>>> The settings of the switch have not been changed and are set to LACP. It worked before so I guess the switch should not be the problem. Maybe some incompatibility between FreeBSD + igb-driver + switch (Juniper EX3300-48T).
>>>>>>>
>>>>>>> I will update you after setting the sysctl setting. It seems to be "dynamic", I guess 0 reflects the index of LACP lagg devices. Can I switch off the strict mode globally in /etc/sysctl.conf?
>>>>>>>
>>>>>>> Thanks for your help.
>>>>>>>
>>>>>>> Regards
>>>>>>> Ben
>>>>>>>
>>>>>>> On 03.02.2014 09:31, Scott Long wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> You’re probably running into the consequences of r253687. Check to see the value of ‘sysctl net.link.lagg.0.lacp.lacp_strict_mode’. If it’s ‘1’ then set it to 0. My original intention was for this to default to 0, but apparently that didn’t happen. However, the fact that strict mode doesn’t seem to work at all for you might hint that your switch either isn’t configured correctly for LACP, or doesn’t actually support LACP at all. You might want to investigate that.
>>>>>>>>
>>>>>>>> Scott
>>>>>>>>
>>>>>>>> On Feb 3, 2014, at 1:17 AM, Ben wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I upgraded from FreeBSD 9.2-RELEASE to 10.0-RELEASE. FreeBSD 9.2 was configured to use LACP with two igb devices.
>>>>>>>>>
>>>>>>>>> Now it stopped working after the upgrade.
>>>>>>>>>
>>>>>>>>> This is a screenshot of ifconfig -a after the upgrade to FreeBSD 10.0-RELEASE: http://tinypic.com/view.php?pic=28jvgpw&s=5#.Uu9PXT1dVPM
>>>>>>>>>
>>>>>>>>> A PR is currently open: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/185967
>>>>>>>>>
>>>>>>>>> It is set to low, but I would like somebody to have a look into it as it obviously has a great influence on our infrastructure. The only way to "solve" it is currently switching back to FreeBSD 9.2.
>>>>>>>>>
>>>>>>>>> The suggested fix "use failover" seems not to work.
>>>>>>>>>
>>>>>>>>> Thank you for your help.
>>>>>>>>>
>>>>>>>>> Best regards
>>>>>>>>> Ben
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 10:19:08 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 45577C8A for ; Mon, 3 Feb 2014 10:19:08 +0000 (UTC) Received: from mail-ig0-x22f.google.com (mail-ig0-x22f.google.com [IPv6:2607:f8b0:4001:c05::22f]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id EFC2115C2 for ; Mon, 3 Feb 2014 10:19:07 +0000 (UTC) Received: by mail-ig0-f175.google.com with SMTP id uq10so3863578igb.2 for ; Mon, 03 Feb 2014 02:19:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dataix.net; s=rsa; h=references:mime-version:in-reply-to:content-type :content-transfer-encoding:message-id:cc:from:subject:date:to; bh=6KdOhtPj1IdNkjMkOGCKH2WV4Uq6xWbZRcSEB/WXy3Q=; b=DpJNikQxtAT0qmn4QQ3+/q2yePsY4IAMxKF4c+CVdHxTiH+gFH/GPLbI2YjT606YJ4 9F+qRn6HbS7euiUD0b0oZT1zGfoDcXa7GFt5ypQy4YL7sNKoVUVbQ/JJOMRkIhc6TgN1 ymjie/Rf0IF69RyzPikNRaORCEt7LND+8oFaM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:references:mime-version:in-reply-to:content-type :content-transfer-encoding:message-id:cc:from:subject:date:to; bh=6KdOhtPj1IdNkjMkOGCKH2WV4Uq6xWbZRcSEB/WXy3Q=; b=OnkPCnJeRH2CdVYS9zmbTC28h0cejheq7eLYElxRFBKF3rOrZDRKPrrJ9tMMDi/IPr ZFutkqSYP2infltgQq48T0Z3Y0eehD3SRjT9/9Sd4KSjCove5necIo6UljbUCYahkpVu wZsKCnE4XO6vVr00GfU1AdCYGvFvt7yWqoVJK++XgolD1RknAH2+O3AtOGx5LePse+Us Yw1klbd4znyh4oiUfd58S4kLt3rrhQv/U+RjOl8BfMMpT7Vrz8cq2i03NIJCmX5BXmht
vSWkxZ/AAubO242ipviJ3L2UsGEvs1cKXvzVFgWEkIKkBM34DGUWJ5yeGrsLYFii0HUM M8QA== X-Gm-Message-State: ALoCoQm1DzhDzo2pel0NzM+Or90mu2G/0Qq8xDJbnqcXp/rtnqP47b4thrfk21HMEbNR5dxBoIK5 X-Received: by 10.50.194.130 with SMTP id hw2mr8262098igc.15.1391422747303; Mon, 03 Feb 2014 02:19:07 -0800 (PST) Received: from [172.31.35.2] (75-128-101-59.dhcp.sgnw.mi.charter.com. [75.128.101.59]) by mx.google.com with ESMTPSA id kz4sm29212683igb.4.2014.02.03.02.19.05 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Mon, 03 Feb 2014 02:19:05 -0800 (PST) References: <52EF50A7.1050205@niessen.ch> <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> <52EF55FE.8030901@niessen.ch> <1798FE17-5718-4125-8B00-1B00DC44B828@yahoo.com> <52EF5D1E.2000306@niessen.ch> <52EF6194.5060305@niessen.ch> <8585EA2E-116E-45A6-877D-DC8D4460C965@yahoo.com> Mime-Version: 1.0 (1.0) In-Reply-To: <8585EA2E-116E-45A6-877D-DC8D4460C965@yahoo.com> Content-Type: multipart/signed; micalg=sha1; boundary=Apple-Mail-A7136F4D-2CDB-4270-8AEB-610F01220791; protocol="application/pkcs7-signature" Content-Transfer-Encoding: 7bit Message-Id: <15315BED-B868-4054-9D6B-DBBD453D89D6@dataix.net> X-Mailer: iPhone Mail (11B554a) From: Jason Hellenthal Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 Date: Mon, 3 Feb 2014 05:19:03 -0500 To: Scott Long X-Content-Filtered-By: Mailman/MimeDel 2.1.17 Cc: FreeBSD Net , Ben X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 10:19:08 -0000

--Apple-Mail-A7136F4D-2CDB-4270-8AEB-610F01220791
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

Just wanted to add here that I've got a LACP setup on 10-STABLE with three uplinks that just won't quit. It negotiates quickly and I have not had a problem with it whatsoever.

-- 
Jason Hellenthal
Voice: 95.30.17.6/616
JJH48-ARIN

> On Feb 3, 2014, at 4:45, Scott Long wrote:
>
> Ok, please try the patch I emailed earlier. Since you’re not seeing any receive messages, it means that your switch isn’t generating any LACP heartbeats. The difference between FreeBSD 9.x and 10 is that in 9.x, it ran in “optimistic” mode, meaning that it didn’t rely on getting receive messages from the switch, and only took a channel down if the link state went down. In strict mode, it looks for the receive messages and only transitions to a full operational state if it gets them. So while I know it’s easy to point at the problem being FreeBSD 10, seeing as FreeBSD 9 worked for you, please check to make sure that your switch is set up correctly.
>
> I authored the original change that went into FreeBSD 10, and I tried to make it so that strict_mode=0 would keep everything working as it did in 9. I guess that since you’re getting no receive messages from the switch at all that we need to disable strict mode on setup, not afterwards. Apply the patch and everything should work as it did in FreeBSD 9.
>
> Scott
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

--Apple-Mail-A7136F4D-2CDB-4270-8AEB-610F01220791--

From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 10:20:43 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6064CD5C for ; Mon, 3 Feb 2014 10:20:43 +0000 (UTC) Received: from mail.niessen.ch (btx02.niessen.ch [85.10.192.239]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 9ECFB162E for ; Mon, 3 Feb 2014 10:20:42 +0000 (UTC) Received: from mail.niessen.ch (mail.niessen.ch [127.0.10.3]) by mail.niessen.ch (Postfix) with ESMTP id ACCE8102EB5 for ; Mon, 3 Feb 2014 11:20:40 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=niessen.ch; h=message-id :date:from:mime-version:to:subject:references:in-reply-to :content-type:content-transfer-encoding; s=dkim-2012; bh=Oa/oN0t
ZkrzvB1ppm2Tqd35rKSBZ1IWRhvFFhcorrP8=; b=cntnqh0DJizZvKdN2E6uC3R W3WC5Ph5qnfo9DwW8m5lGheyhLGf8qiDc6rJ5ZCEGMvfYjhxH57PPOoC6bfSJXVK Nya6dxvbQzEwwDPHlnxBXoVL2eo35T4Ck/qj97wpRQKliv6mp17GeV0aQhq5mabu rgeg690hQYIR8fWEjRXU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=niessen.ch; h=message-id :date:from:mime-version:to:subject:references:in-reply-to :content-type:content-transfer-encoding; q=dns; s=dkim-2012; b=j ModRVqZFjloXWUNTBxrwX9UWXd/XjnzPE6mSx8ODpGv7AVA9l3vNd8Vk6jUU8TkD nLuxi6UloUgJJrAUJfuCSaMeLsJrLcxDJi9Ubb9goP8dThYAwrngbNQvlzYlpNsi JfbYhu2l6gO6nRe+eoeF57huaHYV2aBTt7p3uaa9uI= Received: from [172.20.10.3] (unknown [178.197.236.128]) by mail.niessen.ch (Postfix) with ESMTPSA id 64073102EB4 for ; Mon, 3 Feb 2014 11:20:40 +0100 (CET) Message-ID: <52EF6D61.7010505@niessen.ch> Date: Mon, 03 Feb 2014 11:20:17 +0100 From: Ben User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 References: <52EF50A7.1050205@niessen.ch> <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> <52EF55FE.8030901@niessen.ch> <1798FE17-5718-4125-8B00-1B00DC44B828@yahoo.com> <52EF5D1E.2000306@niessen.ch> <52EF6194.5060305@niessen.ch> <8585EA2E-116E-45A6-877D-DC8D4460C965@yahoo.com> <52EF6690.3010509@niessen.ch> <202BD17C-E68A-4B27-B7EF-E5D84AA89176@yahoo.com> In-Reply-To: <202BD17C-E68A-4B27-B7EF-E5D84AA89176@yahoo.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 10:20:43 -0000

Hi,

It was Juniper's active/passive mode regarding LACP.
It was set to passive and worked as you described without sending any packets.
Now it was set to active and works perfectly again.

I couldn't try your patch easily as I didn't have the sources installed (and obviously no network connection).
If the time allows I will try your patch anyway.

Thanks for your help!

Regards
Ben

On 03.02.2014 10:58, Scott Long wrote:
> Hi,
>
> If you can, please test the patch I sent and let me know the results. I’ll check it into FreeBSD 11 and 10 if it works for you.
>
> Thanks,
> Scott
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send
any mail to "freebsd-net-unsubscribe@freebsd.org" > > !DSPAM:1,52ef691c888821141715696! > > From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 10:29:20 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8E1B5EB3 for ; Mon, 3 Feb 2014 10:29:20 +0000 (UTC) Received: from mail-pb0-x22e.google.com (mail-pb0-x22e.google.com [IPv6:2607:f8b0:400e:c01::22e]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 5AB331683 for ; Mon, 3 Feb 2014 10:29:20 +0000 (UTC) Received: by mail-pb0-f46.google.com with SMTP id um1so6858707pbc.19 for ; Mon, 03 Feb 2014 02:29:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:date:from:reply-to:user-agent:mime-version:to:subject :references:in-reply-to:content-type:content-transfer-encoding; bh=AY528EE7qIgZpFwDctJKk0PYzxaT2u3hLRqLFeDbUI0=; b=w0412r9vWVK0U9TkKDNuFTrlpTT6kb43cwyQHMWaJRkjH+h9bRT0OYWm/NPaR8R0y0 PnodcOsI8LrcirHUL+sCH8AC2N5yH5JqOgYO4Sw7uKkeUWrQWHvsQ6TM2Wuy8rtf0HsN 28ZlTpcqsqmCBoXhiUfhQFWCeQEyCNweljjPOLktE8kwTfLoBDETwSaKb6t5fyagvr6o BLV71fcfGOEy7t5eYcWrjNfCiwf7WB1RuquXoPUd52FaThK6/f+gWVDWF6PKtbSgLLaP 178UCITsQ1JgFRxcH5ZS97N5RVcgSmsT6pHx1coIG8iMuZx2bljMceOGqOXOMdE6VUpB D4Cw== X-Received: by 10.68.130.169 with SMTP id of9mr36480366pbb.79.1391423359857; Mon, 03 Feb 2014 02:29:19 -0800 (PST) Received: from [192.168.1.7] (ppp59-167-128-11.static.internode.on.net. 
[59.167.128.11]) by mx.google.com with ESMTPSA id ns7sm54501645pbc.32.2014.02.03.02.29.16 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 03 Feb 2014 02:29:19 -0800 (PST) Message-ID: <52EF6F7A.40101@FreeBSD.org> Date: Mon, 03 Feb 2014 21:29:14 +1100 From: Kubilay Kocak User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Thunderbird/27.0 MIME-Version: 1.0 To: Ben , freebsd-net@freebsd.org Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 References: <52EF50A7.1050205@niessen.ch> <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> <52EF55FE.8030901@niessen.ch> <1798FE17-5718-4125-8B00-1B00DC44B828@yahoo.com> <52EF5D1E.2000306@niessen.ch> <52EF6194.5060305@niessen.ch> <8585EA2E-116E-45A6-877D-DC8D4460C965@yahoo.com> <52EF6690.3010509@niessen.ch> <202BD17C-E68A-4B27-B7EF-E5D84AA89176@yahoo.com> <52EF6D61.7010505@niessen.ch> In-Reply-To: <52EF6D61.7010505@niessen.ch> X-Enigmail-Version: 1.6 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 8bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list Reply-To: koobs@FreeBSD.org List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 10:29:20 -0000 On 3/02/2014 9:20 PM, Ben wrote: > Hi, > > It was Juniper's active/passive mode regarding LACP. > > It was set to passive and worked as you described without sending any > packages. Now it was set to active and works perfectly again. > > I couldn't try your patch easily as I didn't have the sources installed > (and obviously no network connection). > > If the time allows I will try your patch anyway. It would be *great* if you could do that Ben :) Having a successful real-world test case will provide Scott the confidence to land a commit and merge it back to stable/10 so that everyone can benefit as soon as possible. > Thanks for your help! 
> > Regards > Ben > > On 03.02.2014 10:58, Scott Long wrote: >> Hi, >> >> If you can, please test the patch I sent and let me know the results. >> I’ll check it into FreeBSD 11 and 10 if it works for you. >> >> Thanks, >> Scott >> >> On Feb 3, 2014, at 2:51 AM, Ben wrote: >> >>> Thank you for your detailed explanation. >>> >>> If I understand correctly the switch is probably not set up >>> correctly, right? >>> >>> I will try to have it configured correctly first. >>> >>> Thanks a lot for your help! >>> >>> Regards >>> Ben >>> >>> On 03.02.2014 10:45, Scott Long wrote: >>>> Ok, please try the patch I emailed earlier. Since you’re not seeing >>>> any receive messages, it means that your switch isn’t generating any >>>> LACP heartbeats. The difference between FreeBSD 9.x and 10 is that >>>> in 9.x, it ran in “optimistic” mode, meaning that it didn’t rely on >>>> getting receive messages from the switch, and only took a channel >>>> down if the link state went down. In strict mode, it looks for the >>>> receive messages and only transitions to a full operational state if >>>> it gets them. So while I know it’s easy to point at the problem >>>> being FreeBSD 10, seeing as FreeBSD 9 worked for you, please check >>>> to make sure that your switch is set up correctly. >>>> >>>> I authored the original change that went into FreeBSD 10, and I >>>> tried to make it so that strict_mode=0 would keep everything working >>>> as it did in 9. I guess that since you’re getting no receive >>>> messages from the switch at all that we need to disable strict mode >>>> on setup, not afterwards. Apply the patch and everything should >>>> work as it did in FreeBSD 9. >>>> >>>> Scott >>>> >>>> On Feb 3, 2014, at 2:29 AM, Ben wrote: >>>> >>>>> Yes, via sysctl and /etc/sysctl.conf >>>>> >>>>> I waited now roughly 20 minutes without touching it but no difference. >>>>> >>>>> No, I only see these transmit messages, no receive. 
>>>>> >>>>> Thanks >>>>> Ben >>>>> >>>>> On 03.02.2014 10:25, Scott Long wrote: >>>>>> Did you set it to 0 via the sysctl? You might need to wait for >>>>>> several minutes if you set it after setting up the links. >>>>>> >>>>>> Also, the message that you’re seeing is from your machine >>>>>> transmitting PDU packets. Are you seeing any "lacpdu receive” >>>>>> messages on the console? >>>>>> >>>>>> Thanks, >>>>>> Scott >>>>>> >>>>>> On Feb 3, 2014, at 2:10 AM, Ben wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I set strict mode to 0 but no use. I do receive PDU messages. >>>>>>> >>>>>>> igb0: lacpdu transmit >>>>>>> actor=(...) >>>>>>> actor.state=4d >>>>>>> partner=(...) >>>>>>> partner.state=0 >>>>>>> maxdelay=0 >>>>>>> >>>>>>> Thanks >>>>>>> Ben >>>>>>> >>>>>>> On 03.02.2014 10:03, Scott Long wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> Unfortunately, you can’t control the strict mode globally. My >>>>>>>> apologies for this mess, I’ll make sure that it’s fixed for >>>>>>>> FreeBSD 10.1. If the sysctl doesn’t help then maybe consider >>>>>>>> compiling a custom kernel with it defaulted to 0. You’ll need >>>>>>>> to open /sys/net/ieee802ad_lacp.c and look for the function >>>>>>>> lacp_attach(). You’ll see the strict_mode assign underneath >>>>>>>> that. I’ll also send you a patch in a few minutes. Until then, >>>>>>>> try enabling net.link.lagg.lacp.debug=1 and see if you’re >>>>>>>> receiving heartbeat PDU’s from your switch. >>>>>>>> >>>>>>>> Scott >>>>>>>> >>>>>>>> On Feb 3, 2014, at 1:40 AM, Ben wrote: >>>>>>>> >>>>>>>>> Hi Scott, >>>>>>>>> >>>>>>>>> I had tried to set it in /etc/sysctl.conf but seems it didnt >>>>>>>>> work. But will I try again and report back. >>>>>>>>> >>>>>>>>> The settings of the switch have not been changed and are set to >>>>>>>>> LACP. It worked before so I guess the switch should not be the >>>>>>>>> problem. Maybe some incompatibility between FreeBSD + >>>>>>>>> igb-driver + switch (Juniper EX3300-48T). 
>>>>>>>>> >>>>>>>>> I will update you after setting the sysctl setting. It seems to >>>>>>>>> be "dynamic", I guess 0 reflects the index of LACP lagg >>>>>>>>> devices. Can I switch off the strict mode globally in >>>>>>>>> /etc/sysctl.conf? >>>>>>>>> >>>>>>>>> Thanks for your help. >>>>>>>>> >>>>>>>>> Regards >>>>>>>>> Ben >>>>>>>>> >>>>>>>>> On 03.02.2014 09:31, Scott Long wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> You’re probably running into the consequences of r253687. >>>>>>>>>> Check to see the value of ‘sysctl >>>>>>>>>> net.link.lagg.0.lacp.lacp_strict_mode’. If it’s ‘1’ then set >>>>>>>>>> it to 0. My original intention was for this to default to 0, >>>>>>>>>> but apparently that didn’t happen. However, the fact that >>>>>>>>>> strict mode doesn’t seem to work at all for you might hint >>>>>>>>>> that your switch either isn’t configured correctly for LACP, >>>>>>>>>> or doesn’t actually support LACP at all. You might want to >>>>>>>>>> investigate that. >>>>>>>>>> >>>>>>>>>> Scott >>>>>>>>>> >>>>>>>>>> On Feb 3, 2014, at 1:17 AM, Ben wrote: >>>>>>>>>> >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> I upgraded from FreeBSD 9.2-RELEASE to 10.0-RELEASE. FreeBSD >>>>>>>>>>> 9.2 was configured to use LACP with two igb devices. >>>>>>>>>>> >>>>>>>>>>> Now it stopped working after the upgrade. >>>>>>>>>>> >>>>>>>>>>> This is a screenshot of ifconfig -a after the upgrade to >>>>>>>>>>> FreeBSD 10..0-RELEASE: >>>>>>>>>>> http://tinypic.com/view.php?pic=28jvgpw&s=5#.Uu9PXT1dVPM >>>>>>>>>>> >>>>>>>>>>> A PR is currently open: >>>>>>>>>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/185967 >>>>>>>>>>> >>>>>>>>>>> It is set to low, but I would like somebody to have a look >>>>>>>>>>> into it as it obviously has a great influence on our >>>>>>>>>>> infrastructure. The only way to "solve" it is currently >>>>>>>>>>> switching back to FreeBSD 9.2. >>>>>>>>>>> >>>>>>>>>>> The suggested fix "use failover" seems not to work. >>>>>>>>>>> >>>>>>>>>>> Thank you for your help. 
>>>>>>>>>>> >>>>>>>>>>> Best regards >>>>>>>>>>> Ben >>>>>>>>>>> _______________________________________________ -- Koobs From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 11:06:50 2014 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 82E2A118 for ; Mon, 3 Feb 2014 11:06:50 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 6DB501A4F for ; Mon, 3 Feb 2014 11:06:50 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id s13B6ogx022702 for ; Mon, 3 Feb 2014 11:06:50 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.8/8.14.8/Submit) id s13B6oP5022700 for freebsd-net@FreeBSD.org; Mon, 3 Feb 2014 11:06:50 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 3 Feb 2014 11:06:50 GMT Message-Id: <201402031106.s13B6oP5022700@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-net@FreeBSD.org Subject: Current problem reports assigned to freebsd-net@FreeBSD.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 11:06:50 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. 
These represent problem reports covering all versions including experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/185496  net        [re] RTL8169 doesn't receive unicast ethernet packets
o kern/185427  net        [igb] freebsd 8.4, 9.1 and 9.2 panic Double-Fault with
o kern/185023  net        [tun] Closing tun interface deconfigures IP address
o kern/185022  net        [tun] ls /dev/tun creates tun interface
o kern/184311  net        [bge] [panic] kernel panic with bge(4) on SunFire X210
o kern/184084  net        [ral] kernel crash by ral (RT3090)
o bin/183687   net        [patch] route(8): route add -net 172.20 add wrong host
p kern/183659  net        [tcp] ]TCP stack lock contention with short-lived conn
o conf/183407  net        [rc.d] [patch] Routing restart returns non-zero exitco
o kern/183391  net        [oce] 10gigabit networking problems with Emulex OCE 11
o kern/183390  net        [ixgbe] 10gigabit networking problems
o kern/182917  net        [igb] strange out traffic with igb interfaces
o kern/182847  net        [netinet6] [patch] Remove dead code
o kern/182665  net        [wlan] Kernel panic when creating second wlandev.
o kern/182382  net        [tcp] sysctl to set TCP CC method on BIG ENDIAN system
o kern/182297  net        [cm] ArcNet driver fails to detect the link address -
o kern/182212  net        [patch] [ng_mppc] ng_mppc(4) blocks on network errors
o kern/181970  net        [re] LAN Realtek® 8111G is not supported by re driver
o kern/181931  net        [vlan] [lagg] vlan over lagg over mlxen crashes the ke
o kern/181823  net        [ip6] [patch] make ipv6 mroute return same errror code
o kern/181741  net        [kernel] [patch] Packet loss when 'control' messages a
o kern/181703  net        [re] [patch] Fix Realtek 8111G Ethernet controller not
o kern/181657  net        [bpf] [patch] BPF_COP/BPF_COPX instruction reservation
o kern/181257  net        [bge] bge link status change
o kern/181236  net        [igb] igb driver unstable work
o kern/181135  net        [netmap] [patch] sys/dev/netmap patch for Linux compat
o kern/181131  net        [netmap] [patch] sys/dev/netmap memory allocation impr
o kern/181006  net        [run] [patch] mbuf leak in run(4) driver
o kern/180893  net        [if_ethersubr] [patch] Packets received with own LLADD
o kern/180844  net        [panic] [re] Intermittent panic (re driver?)
o kern/180775  net        [bxe] if_bxe driver broken with Broadcom BCM57711 card
o kern/180722  net        [bluetooth] bluetooth takes 30-50 attempts to pair to
s kern/180468  net        [request] LOCAL_PEERCRED support for PF_INET
o kern/180065  net        [netinet6] [patch] Multicast loopback to own host brok
o kern/179926  net        [lacp] [patch] active aggregator selection bug
o kern/179824  net        [ixgbe] System (9.1-p4) hangs on heavy ixgbe network t
o kern/179733  net        [lagg] [patch] interface loses capabilities when proto
o kern/179429  net        [tap] STP enabled tap bridge
a kern/179264  net        [vimage] [pf] Core dump with Packet filter and VIMAGE
o kern/178947  net        [arp] arp rejecting not working
o kern/178782  net        [ixgbe] 82599EB SFP does not work with passthrough und
o kern/178612  net        [run] kernel panic due the problems with run driver
o kern/178472  net        [ip6] [patch] make return code consistent with IPv4 co
o kern/178079  net        [tcp] Switching TCP CC algorithm panics on sparc64 wit
s kern/178071  net        FreeBSD unable to recongize Kontron (Industrial Comput
o kern/177905  net        [xl] [panic] ifmedia_set when pluging CardBus LAN card
o kern/177618  net        [bridge] Problem with bridge firewall with trunk ports
o kern/177402  net        [igb] [pf] problem with ethernet driver igb + pf / alt
o kern/177400  net        [jme] JMC25x 1000baseT establishment issues
o kern/177366  net        [ieee80211] negative malloc(9) statistics for 80211nod
f kern/177362  net        [netinet] [patch] Wrong control used to return TOS
o kern/177194  net        [netgraph] Unnamed netgraph nodes for vlan interfaces
o kern/177184  net        [bge] [patch] enable wake on lan
o kern/177139  net        [igb] igb drops ethernet ports 2 and 3
o kern/176884  net        [re] re0 flapping up/down
o kern/176671  net        [epair] MAC address for epair device not unique
o kern/176484  net        [ipsec] [enc] [patch] panic: IPsec + enc(4); device na
o kern/176420  net        [kernel] [patch] incorrect errno for LOCAL_PEERCRED
o kern/176419  net        [kernel] [patch] socketpair support for LOCAL_PEERCRED
o kern/176401  net        [netgraph] page fault in netgraph
o kern/176167  net        [ipsec][lagg] using lagg and ipsec causes immediate pa
o kern/176027  net        [em] [patch] flow control systcl consistency for em dr
o kern/176026  net        [tcp] [patch] TCP wrappers caused quite a lot of warni
o kern/175864  net        [re] Intel MB D510MO, onboard ethernet not working aft
o kern/175852  net        [amd64] [patch] in_cksum_hdr() behaves differently on
o kern/175734  net        no ethernet detected on system with EG20T PCH chipset
o kern/175267  net        [pf] [tap] pf + tap keep state problem
o kern/175236  net        [epair] [gif] epair and gif Devices On Bridge
o kern/175182  net        [panic] kernel panic on RADIX_MPATH when deleting rout
o kern/175153  net        [tcp] will there miss a FIN when do TSO?
o kern/174959  net        [net] [patch] rnh_walktree_from visits spurious nodes
o kern/174958  net        [net] [patch] rnh_walktree_from makes unreasonable ass
o kern/174897  net        [route] Interface routes are broken
o kern/174851  net        [bxe] [patch] UDP checksum offload is wrong in bxe dri
o kern/174850  net        [bxe] [patch] bxe driver does not receive multicasts
o kern/174849  net        [bxe] [patch] bxe driver can hang kernel when reset
o kern/174822  net        [tcp] Page fault in tcp_discardcb under high traffic
o kern/174602  net        [gif] [ipsec] traceroute issue on gif tunnel with ipse
o kern/174535  net        [tcp] TCP fast retransmit feature works strange
o kern/173871  net        [gif] process of 'ifconfig gif0 create hangs' when if_
o kern/173475  net        [tun] tun(4) stays opened by PID after process is term
o kern/173201  net        [ixgbe] [patch] Missing / broken ixgbe sysctl's and tu
o kern/173137  net        [em] em(4) unable to run at gigabit with 9.1-RC2
o kern/173002  net        [patch] data type size problem in if_spppsubr.c
o kern/172895  net        [ixgb] [ixgbe] do not properly determine link-state
o kern/172683  net        [ip6] Duplicate IPv6 Link Local Addresses
o kern/172675  net        [netinet] [patch] sysctl_tcp_hc_list (net.inet.tcp.hos
p kern/172113  net        [panic] [e1000] [patch] 9.1-RC1/amd64 panices in igb(4
o kern/171840  net        [ip6] IPv6 packets transmitting only on queue 0
o kern/171739  net        [bce] [panic] bce related kernel panic
o kern/171711  net        [dummynet] [panic] Kernel panic in dummynet
o kern/171532  net        [ndis] ndis(4) driver includes 'pccard'-specific code,
o kern/171531  net        [ndis] undocumented dependency for ndis(4)
o kern/171524  net        [ipmi] ipmi driver crashes kernel by reboot or shutdow
s kern/171508  net        [epair] [request] Add the ability to name epair device
o kern/171228  net        [re] [patch] if_re - eeprom write issues
o kern/170701  net        [ppp] killl ppp or reboot with active ppp connection c
o kern/170267  net        [ixgbe] IXGBE_LE32_TO_CPUS is probably an unintentiona
o kern/170081  net        [fxp] pf/nat/jails not working if checksum offloading
o kern/169898  net        ifconfig(8) fails to set MTU on multiple interfaces.
o kern/169676  net        [bge] [hang] system hangs, fully or partially after re
o kern/169620  net        [ng] [pf] ng_l2tp incoming packet bypass pf firewall
o kern/169459  net        [ppp] umodem/ppp/3g stopped working after update from
p kern/168294  net        [ixgbe] [patch] ixgbe driver compiled in kernel has no
o kern/168246  net        [em] Multiple em(4) not working with qemu
o kern/168245  net        [arp] [regression] Permanent ARP entry not deleted on
o kern/168244  net        [arp] [regression] Unable to manually remove permanent
o kern/168183  net        [bce] bce driver hang system
o kern/167603  net        [ip] IP fragment reassembly's broken: file transfer ov
o kern/167500  net        [em] [panic] Kernel panics in em driver
o kern/167325  net        [netinet] [patch] sosend sometimes return EINVAL with
o kern/167202  net        [igmp]: Sending multiple IGMP packets crashes kernel
o kern/166462  net        [gre] gre(4) when using a tunnel source address from c
o kern/166285  net        [arp] FreeBSD v8.1 REL p8 arp: unknown hardware addres
o kern/166255  net        [net] [patch] It should be possible to disable "promis
p kern/165903  net        mbuf leak
o kern/165622  net        [ndis][panic][patch] Unregistered use of FPU in kernel
s kern/165562  net        [request] add support for Intel i350 in FreeBSD 7.4
o kern/165526  net        [bxe] UDP packets checksum calculation whithin if_bxe
o kern/165488  net        [ppp] [panic] Fatal trap 12 jails and ppp , kernel wit
o kern/165305  net        [ip6] [request] Feature parity between IP_TOS and IPV6
o kern/165296  net        [vlan] [patch] Fix EVL_APPLY_VLID, update EVL_APPLY_PR
o kern/165181  net        [igb] igb freezes after about 2 weeks of uptime
o kern/165174  net        [patch] [tap] allow tap(4) to keep its address on clos
o kern/165152  net        [ip6] Does not work through the issue of ipv6 addresse
o kern/164495  net        [igb] connect double head igb to switch cause system t
o kern/164490  net        [pfil] Incorrect IP checksum on pfil pass from ip_outp
o kern/164475  net        [gre] gre misses RUNNING flag after a reboot
o kern/164265  net        [netinet] [patch] tcp_lro_rx computes wrong checksum i
o kern/163903  net        [igb] "igb0:tx(0)","bpf interface lock" v2.2.5 9-STABL
o kern/163481  net        freebsd do not add itself to ping route packet
o kern/162927  net        [tun] Modem-PPP error ppp[1538]: tun0: Phase: Clearing
o kern/162558  net        [dummynet] [panic] seldom dummynet panics
o kern/162153  net        [em] intel em driver 7.2.4 don't compile
o kern/162110  net        [igb] [panic] RELENG_9 panics on boot in IGB driver -
o kern/162028  net        [ixgbe] [patch] misplaced #endif in ixgbe.c
o kern/161277  net        [em] [patch] BMC cannot receive IPMI traffic after loa
o kern/160873  net        [igb] igb(4) from HEAD fails to build on 7-STABLE
o kern/160750  net        Intel PRO/1000 connection breaks under load until rebo
o kern/160693  net        [gif] [em] Multicast packet are not passed from GIF0 t
o kern/160293  net        [ieee80211] ppanic] kernel panic during network setup
o kern/160206  net        [gif] gifX stops working after a while (IPv6 tunnel)
o kern/159817  net        [udp] write UDPv4: No buffer space available (code=55)
o kern/159629  net        [ipsec] [panic] kernel panic with IPsec in transport m
o kern/159621  net        [tcp] [panic] panic: soabort: so_count
o kern/159603  net        [netinet] [patch] in_ifscrubprefix() - network route c
o kern/159601  net        [netinet] [patch] in_scrubprefix() - loopback route re
o kern/159294  net        [em] em watchdog timeouts
o kern/159203  net        [wpi] Intel 3945ABG Wireless LAN not support IBSS
o kern/158930  net        [bpf] BPF element leak in ifp->bpf_if->bif_dlist
o kern/158726  net        [ip6] [patch] ICMPv6 Router Announcement flooding limi
o kern/158694  net        [ix] [lagg] ix0 is not working within lagg(4)
o kern/158665  net        [ip6] [panic] kernel pagefault in in6_setscope()
o kern/158635  net        [em] TSO breaks BPF packet captures with em driver
f kern/157802  net        [dummynet] [panic] kernel panic in dummynet
o kern/157785  net        amd64 + jail + ipfw + natd = very slow outbound traffi
o kern/157418  net        [em] em driver lockup during boot on Supermicro X9SCM-
o kern/157410  net        [ip6] IPv6 Router Advertisements Cause Excessive CPU U
o kern/157287  net        [re] [panic] INVARIANTS panic (Memory modified after f
o kern/157200  net        [network.subr] [patch] stf(4) can not communicate betw
o kern/157182  net        [lagg] lagg interface not working together with epair
o kern/156877  net        [dummynet] [panic] dummynet move_pkt() null ptr derefe
o kern/156667  net        [em] em0 fails to init on CURRENT after March 17
o kern/156408  net        [vlan] Routing failure when using VLANs vs. Physical e
o kern/156328  net        [icmp]: host can ping other subnet but no have IP from
o kern/156317  net        [ip6] Wrong order of IPv6 NS DAD/MLD Report
o kern/156279  net        [if_bridge][divert][ipfw] unable to correctly re-injec
o kern/156226  net        [lagg]: failover does not announce the failover to swi
o kern/156030  net        [ip6] [panic] Crash in nd6_dad_start() due to null ptr
o kern/155680  net        [multicast] problems with multicast
s kern/155642  net        [new driver] [request] Add driver for Realtek RTL8191S
o kern/155597  net        [panic] Kernel panics with "sbdrop" message
o kern/155420  net        [vlan] adding vlan break existent vlan
o kern/155177  net        [route] [panic] Panic when inject routes in kernel
o kern/155010  net        [msk] ntfs-3g via iscsi using msk driver cause kernel
o kern/154943  net        [gif] ifconfig gifX create on existing gifX clears IP
s kern/154851  net        [new driver] [request]: Port brcm80211 driver from Lin
o kern/154850  net        [netgraph] [patch] ng_ether fails to name nodes when t
o kern/154679  net        [em] Fatal trap 12: "em1 taskq" only at startup (8.1-R
o kern/154600  net        [tcp] [panic] Random kernel panics on tcp_output
o kern/154557  net        [tcp] Freeze tcp-session of the clients, if in the gat
o kern/154443  net        [if_bridge] Kernel module bridgestp.ko missing after u
o kern/154286  net        [netgraph] [panic] 8.2-PRERELEASE panic in netgraph
o kern/154255  net        [nfs] NFS not responding
o kern/154214  net        [stf] [panic] Panic when creating stf interface
o kern/154185  net        race condition in mb_dupcl
p kern/154169  net        [multicast] [ip6] Node Information Query multicast add
o kern/154134  net        [ip6] stuck kernel state in LISTEN on ipv6 daemon whic
o kern/154091  net        [netgraph] [panic] netgraph, unaligned mbuf?
o conf/154062  net        [vlan] [patch] change to way of auto-generatation of v
o kern/153937  net        [ral] ralink panics the system (amd64 freeBSDD 8.X) wh
o kern/153936  net        [ixgbe] [patch] MPRC workaround incorrectly applied to
o kern/153816  net        [ixgbe] ixgbe doesn't work properly with the Intel 10g
o kern/153772  net        [ixgbe] [patch] sysctls reference wrong XON/XOFF varia
o kern/153497  net        [netgraph] netgraph panic due to race conditions
o kern/153454  net        [patch] [wlan] [urtw] Support ad-hoc and hostap modes
o kern/153308  net        [em] em interface use 100% cpu
o kern/153244  net        [em] em(4) fails to send UDP to port 0xffff
o kern/152893  net        [netgraph] [panic] 8.2-PRERELEASE panic in netgraph
o kern/152853  net        [em] tftpd (and likely other udp traffic) fails over e
o kern/152828  net        [em] poor performance on 8.1, 8.2-PRE
o kern/152569  net        [net]: Multiple ppp connections and routing table prob
o kern/152235  net        [arp] Permanent local ARP entries are not properly upd
o kern/152141  net        [vlan] [patch] encapsulate vlan in ng_ether before out
o kern/152036  net        [libc] getifaddrs(3) returns truncated sockaddrs for n
o kern/151690  net        [ep] network connectivity won't work until dhclient is
o kern/151681  net        [nfs] NFS mount via IPv6 leads to hang on client with
o kern/151593  net        [igb] [panic] Kernel panic when bringing up igb networ
o kern/150920  net        [ixgbe][igb] Panic when packets are dropped with heade
o kern/150557  net        [igb] igb0: Watchdog timeout -- resetting
o kern/150251  net        [patch] [ixgbe] Late cable insertion broken
o kern/150249  net        [ixgbe] Media type detection broken
o bin/150224   net        ppp(8) does not reassign static IP after kill -KILL co
f kern/149969  net        [wlan] [ral] ralink rt2661 fails to maintain connectio
o kern/149643  net        [rum] device not sending proper beacon frames in ap mo
o kern/149609  net        [panic] reboot after adding second default route
o kern/149117  net        [inet] [patch] in_pcbbind: redundant test
o kern/149086  net        [multicast] Generic multicast join failure in 8.1
o kern/148018  net        [flowtable] flowtable crashes on ia64
o kern/147912  net        [boot] FreeBSD 8 Beta won't boot on Thinkpad i1300 11
o kern/147894  net        [ipsec] IPv6-in-IPv4 does not work inside an ESP-only
o kern/147155  net        [ip6] setfb not work with ipv6
o kern/146845  net        [libc] close(2) returns error 54 (connection reset by
f kern/146792  net        [flowtable] flowcleaner 100% cpu's core load
o kern/146719  net        [pf] [panic] PF or dumynet kernel panic
o kern/146534  net        [icmp6] wrong source address in echo reply
o kern/146427  net        [mwl] Additional virtual access points don't work on m
f kern/146394  net        [vlan] IP source address for outgoing connections
o bin/146377   net        [ppp] [tun] Interface doesn't clear addresses when PPP
o kern/146358  net        [vlan] wrong destination MAC address
o kern/146165  net        [wlan] [panic] Setting bssid in adhoc mode causes pani
o kern/146037  net        [panic] mpd + CoA = kernel panic
o kern/145825  net        [panic] panic: soabort: so_count
o kern/145728  net        [lagg] Stops working lagg between two servers.
p kern/145600  net        TCP/ECN behaves different to CE/CWR than ns2 reference
f kern/144917  net        [flowtable] [panic] flowtable crashes system [regressi
o kern/144882  net        MacBookPro =>4.1 does not connect to BSD in hostap wit
o kern/144874  net        [if_bridge] [patch] if_bridge frees mbuf after pfil ho
o conf/144700  net        [rc.d] async dhclient breaks stuff for too many people
o kern/144616  net        [nat] [panic] ip_nat panic FreeBSD 7.2
f kern/144315  net        [ipfw] [panic] freebsd 8-stable reboot after add ipfw
o kern/144231  net        bind/connect/sendto too strict about sockaddr length
o kern/143846  net        [gif] bringing gif3 tunnel down causes gif0 tunnel to
s kern/143673  net        [stf] [request] there should be a way to support multi
o kern/143622  net        [pfil] [patch] unlock pfil lock while calling firewall
o kern/143593  net        [ipsec] When using IPSec, tcpdump doesn't show outgoin
o kern/143591  net        [ral] RT2561C-based DLink card (DWL-510) fails to work
o kern/143208  net        [ipsec] [gif] IPSec over gif interface not working
o kern/143034  net        [panic] system reboots itself in tcp code [regression]
o kern/142877  net        [hang] network-related repeatable 8.0-STABLE hard hang
o kern/142774  net        Problem with outgoing connections on interface with mu
o kern/142772  net        [libc] lla_lookup: new lle malloc failed
f kern/142518  net        [em] [lagg] Problem on 8.0-STABLE with em and lagg
o kern/142018  net        [iwi] [patch] Possibly wrong interpretation of beacon-
o kern/141861  net        [wi] data garbled with WEP and wi(4) with Prism 2.5
f kern/141741  net        Etherlink III NIC won't work after upgrade to FBSD 8,
o kern/140742  net        rum(4) Two asus-WL167G adapters cannot talk to each ot
o kern/140682  net        [netgraph] [panic] random panic in netgraph
f kern/140634  net        [vlan] destroying if_lagg interface with if_vlan membe
o kern/140619  net        [ifnet] [patch] refine obsolete if_var.h comments desc
o kern/140346  net        [wlan] High bandwidth use causes loss of wlan connecti
o kern/140142  net        [ip6] [panic] FreeBSD 7.2-amd64 panic w/IPv6
o kern/140066  net        [bwi] install report for 8.0 RC 2 (multiple problems)
o kern/139387  net        [ipsec] Wrong lenth of PF_KEY messages in promiscuous
o bin/139346   net        [patch] arp(8) add option to remove static entries lis
o kern/139268  net        [if_bridge] [patch] allow if_bridge to forward just VL
p kern/139204  net        [arp] DHCP server replies rejected, ARP entry lost bef
o kern/139117  net        [lagg] + wlan boot timing (EBUSY)
o kern/138850  net        [dummynet] dummynet doesn't work correctly on a bridge
o kern/138782  net        [panic] sbflush_internal: cc 0 || mb 0xffffff004127b00
o kern/138688  net        [rum] possibly broken on 8 Beta 4 amd64: able to wpa a
o kern/138678  net        [lo] FreeBSD does not assign linklocal address to loop
o kern/138407  net        [gre] gre(4) interface does not come up after reboot
o kern/138332  net        [tun] [lor] ifconfig tun0 destroy causes LOR if_adata/
o kern/138266  net        [panic] kernel panic when udp benchmark test used as r
f kern/138029  net        [bpf] [panic] periodically kernel panic and reboot
o kern/137881  net        [netgraph] [panic] ng_pppoe fatal trap 12
p bin/137841   net        [patch] wpa_supplicant(8) cannot verify SHA256 signed
p kern/137776  net        [rum] panic in rum(4) driver on 8.0-BETA2
o bin/137641   net        ifconfig(8): various problems with "vlan_device.vlan_i
o kern/137392  net        [ip] [panic] crash in ip_nat.c line 2577
o kern/137372  net        [ral] FreeBSD doesn't support wireless interface from
o kern/137089  net        [lagg] lagg falsely triggers IPv6 duplicate address de
o kern/136911  net        [netgraph] [panic] system panic on kldload ng_bpf.ko t
o kern/136618  net        [pf][stf] panic on cloning interface without unit numb
o kern/135502  net        [periodic] Warning message raised by rtfree function i
o kern/134583  net        [hang] Machine with jail freezes after random amount o
o kern/134531  net        [route] [panic] kernel crash related to routes/zebra
o kern/134157  net        [dummynet] dummynet loads cpu for 100% and make a syst
o kern/133969  net        [dummynet] [panic] Fatal trap 12: page fault while in
o kern/133968  net        [dummynet] [panic] dummynet kernel panic
o kern/133736  net        [udp] ip_id not protected ...
o kern/133595 net [panic] Kernel Panic at pcpu.h:195 o kern/133572 net [ppp] [hang] incoming PPTP connection hangs the system o kern/133490 net [bpf] [panic] 'kmem_map too small' panic on Dell r900 o kern/133235 net [netinet] [patch] Process SIOCDLIFADDR command incorre f kern/133213 net arp and sshd errors on 7.1-PRERELEASE o kern/133060 net [ipsec] [pfsync] [panic] Kernel panic with ipsec + pfs o kern/132889 net [ndis] [panic] NDIS kernel crash on load BCM4321 AGN d o conf/132851 net [patch] rc.conf(5): allow to setfib(1) for service run o kern/132734 net [ifmib] [panic] panic in net/if_mib.c o kern/132705 net [libwrap] [patch] libwrap - infinite loop if hosts.all o kern/132672 net [ndis] [panic] ndis with rt2860.sys causes kernel pani o kern/132354 net [nat] Getting some packages to ipnat(8) causes crash o kern/131781 net [ndis] ndis keeps dropping the link o kern/131776 net [wi] driver fails to init o kern/131753 net [altq] [panic] kernel panic in hfsc_dequeue o bin/131365 net route(8): route add changes interpretation of network f kern/130820 net [ndis] wpa_supplicant(8) returns 'no space on device' o kern/130628 net [nfs] NFS / rpc.lockd deadlock on 7.1-R o kern/130525 net [ndis] [panic] 64 bit ar5008 ndisgen-erated driver cau o kern/130311 net [wlan_xauth] [panic] hostapd restart causing kernel pa o kern/130109 net [ipfw] Can not set fib for packets originated from loc f kern/130059 net [panic] Leaking 50k mbufs/hour f kern/129719 net [nfs] [panic] Panic during shutdown, tcp_ctloutput: in o kern/129517 net [ipsec] [panic] double fault / stack overflow f kern/129508 net [carp] [panic] Kernel panic with EtherIP (may be relat o kern/129219 net [ppp] Kernel panic when using kernel mode ppp o kern/129197 net [panic] 7.0 IP stack related panic o kern/129036 net [ipfw] 'ipfw fwd' does not change outgoing interface n o bin/128954 net ifconfig(8) deletes valid routes o bin/128602 net [an] wpa_supplicant(8) crashes with an(4) o kern/128448 net [nfs] 6.4-RC1 Boot Fails 
if NFS Hostname cannot be res o bin/128295 net [patch] ifconfig(8) does not print TOE4 or TOE6 capabi o bin/128001 net wpa_supplicant(8), wlan(4), and wi(4) issues o kern/127826 net [iwi] iwi0 driver has reduced performance and connecti o kern/127815 net [gif] [patch] if_gif does not set vlan attributes from o kern/127724 net [rtalloc] rtfree: 0xc5a8f870 has 1 refs f bin/127719 net [arp] arp: Segmentation fault (core dumped) f kern/127528 net [icmp]: icmp socket receives icmp replies not owned by p kern/127360 net [socket] TOE socket options missing from sosetopt() o bin/127192 net routed(8) removes the secondary alias IP of interface f kern/127145 net [wi]: prism (wi) driver crash at bigger traffic o kern/126895 net [patch] [ral] Add antenna selection (marked as TBD) o kern/126874 net [vlan]: Zebra problem if ifconfig vlanX destroy o kern/126695 net rtfree messages and network disruption upon use of if_ o kern/126339 net [ipw] ipw driver drops the connection o kern/126075 net [inet] [patch] internet control accesses beyond end of o bin/125922 net [patch] Deadlock in arp(8) o kern/125920 net [arp] Kernel Routing Table loses Ethernet Link status o kern/125845 net [netinet] [patch] tcp_lro_rx() should make use of hard o kern/125258 net [socket] socket's SO_REUSEADDR option does not work o kern/125239 net [gre] kernel crash when using gre o kern/124341 net [ral] promiscuous mode for wireless device ral0 looses o kern/124225 net [ndis] [patch] ndis network driver sometimes loses net o kern/124160 net [libc] connect(2) function loops indefinitely o kern/124021 net [ip6] [panic] page fault in nd6_output() o kern/123968 net [rum] [panic] rum driver causes kernel panic with WPA. 
o kern/123892 net [tap] [patch] No buffer space available o kern/123890 net [ppp] [panic] crash & reboot on work with PPP low-spee o kern/123858 net [stf] [patch] stf not usable behind a NAT o kern/123758 net [panic] panic while restarting net/freenet6 o bin/123633 net ifconfig(8) doesn't set inet and ether address in one o kern/123559 net [iwi] iwi periodically disassociates/associates [regre o bin/123465 net [ip6] route(8): route add -inet6 -interfac o kern/123463 net [ipsec] [panic] repeatable crash related to ipsec-tool o conf/123330 net [nsswitch.conf] Enabling samba wins in nsswitch.conf c o kern/123160 net [ip] Panic and reboot at sysctl kern.polling.enable=0 o kern/122989 net [swi] [panic] 6.3 kernel panic in swi1: net o kern/122954 net [lagg] IPv6 EUI64 incorrectly chosen for lagg devices f kern/122780 net [lagg] tcpdump on lagg interface during high pps wedge o kern/122685 net It is not visible passing packets in tcpdump(1) o kern/122319 net [wi] imposible to enable ad-hoc demo mode with Orinoco o kern/122290 net [netgraph] [panic] Netgraph related "kmem_map too smal o kern/122252 net [ipmi] [bge] IPMI problem with BCM5704 (does not work o kern/122033 net [ral] [lor] Lock order reversal in ral0 at bootup ieee o bin/121895 net [patch] rtsol(8)/rtsold(8) doesn't handle managed netw s kern/121774 net [swi] [panic] 6.3 kernel panic in swi1: net o kern/121555 net [panic] Fatal trap 12: current process = 12 (swi1: net o kern/121534 net [ipl] [nat] FreeBSD Release 6.3 Kernel Trap 12: o kern/121443 net [gif] [lor] icmp6_input/nd6_lookup o kern/121437 net [vlan] Routing to layer-2 address does not work on VLA o bin/121359 net [patch] [security] ppp(8): fix local stack overflow in o kern/121257 net [tcp] TSO + natd -> slow outgoing tcp traffic o kern/121181 net [panic] Fatal trap 3: breakpoint instruction fault whi o kern/120966 net [rum] kernel panic with if_rum and WPA encryption o kern/120566 net [request]: ifconfig(8) make order of arguments more fr o 
kern/120304 net [netgraph] [patch] netgraph source assumes 32-bit time o kern/120266 net [udp] [panic] gnugk causes kernel panic when closing U o bin/120060 net routed(8) deletes link-level routes in the presence of o kern/119945 net [rum] [panic] rum device in hostap mode, cause kernel o kern/119791 net [nfs] UDP NFS mount of aliased IP addresses from a Sol o kern/119617 net [nfs] nfs error on wpa network when reseting/shutdown f kern/119516 net [ip6] [panic] _mtx_lock_sleep: recursed on non-recursi o kern/119432 net [arp] route add -host -iface causes arp e o kern/119225 net [wi] 7.0-RC1 no carrier with Prism 2.5 wifi card [regr o kern/118727 net [netgraph] [patch] [request] add new ng_pf module o kern/117423 net [vlan] Duplicate IP on different interfaces o bin/117339 net [patch] route(8): loading routing management commands o bin/116643 net [patch] [request] fstat(1): add INET/INET6 socket deta o kern/116185 net [iwi] if_iwi driver leads system to reboot o kern/115239 net [ipnat] panic with 'kmem_map too small' using ipnat o kern/115019 net [netgraph] ng_ether upper hook packet flow stops on ad o kern/115002 net [wi] if_wi timeout. failed allocation (busy bit). 
ifco o kern/114915 net [patch] [pcn] pcn (sys/pci/if_pcn.c) ethernet driver f o kern/113432 net [ucom] WARNING: attempt to net_add_domain(netgraph) af o kern/112722 net [ipsec] [udp] IP v4 udp fragmented packet reject o kern/112686 net [patm] patm driver freezes System (FreeBSD 6.2-p4) i38 o bin/112557 net [patch] ppp(8) lock file should not use symlink name o kern/112528 net [nfs] NFS over TCP under load hangs with "impossible p o kern/111537 net [inet6] [patch] ip6_input() treats mbuf cluster wrong o kern/111457 net [ral] ral(4) freeze o kern/110284 net [if_ethersubr] Invalid Assumption in SIOCSIFADDR in et o kern/110249 net [kernel] [regression] [patch] setsockopt() error regre o kern/109470 net [wi] Orinoco Classic Gold PC Card Can't Channel Hop o bin/108895 net pppd(8): PPPoE dead connections on 6.2 [regression] f kern/108197 net [panic] [gif] [ip6] if_delmulti reference counting pan o kern/107944 net [wi] [patch] Forget to unlock mutex-locks o conf/107035 net [patch] bridge(8): bridge interface given in rc.conf n o kern/106444 net [netgraph] [panic] Kernel Panic on Binding to an ip to o kern/106316 net [dummynet] dummynet with multipass ipfw drops packets o kern/105945 net Address can disappear from network interface s kern/105943 net Network stack may modify read-only mbuf chain copies o bin/105925 net problems with ifconfig(8) and vlan(4) [regression] o kern/104851 net [inet6] [patch] On link routes not configured when usi o kern/104751 net [netgraph] kernel panic, when getting info about my tr o kern/104738 net [inet] [patch] Reentrant problem with inet_ntoa in the o kern/103191 net Unpredictable reboot o kern/103135 net [ipsec] ipsec with ipfw divert (not NAT) encodes a pac o kern/102540 net [netgraph] [patch] supporting vlan(4) by ng_fec(4) o conf/102502 net [netgraph] [patch] ifconfig name does't rename netgrap o kern/102035 net [plip] plip networking disables parallel port printing o kern/100709 net [libc] getaddrinfo(3) should return TTL info o 
kern/100519 net [netisr] suggestion to fix suboptimal network polling o kern/98597 net [inet6] Bug in FreeBSD 6.1 IPv6 link-local DAD procedu o bin/98218 net wpa_supplicant(8) blacklist not working o kern/97306 net [netgraph] NG_L2TP locks after connection with failed o conf/97014 net [gif] gifconfig_gif? in rc.conf does not recognize IPv f kern/96268 net [socket] TCP socket performance drops by 3000% if pack o kern/95519 net [ral] ral0 could not map mbuf o kern/95288 net [pppd] [tty] [panic] if_ppp panic in sys/kern/tty_subr o kern/95277 net [netinet] [patch] IP Encapsulation mask_match() return o kern/95267 net packet drops periodically appear f kern/93378 net [tcp] Slow data transfer in Postfix and Cyrus IMAP (wo o kern/93019 net [ppp] ppp and tunX problems: no traffic after restarti o kern/92880 net [libc] [patch] almost rewritten inet_network(3) functi s kern/92279 net [dc] Core faults everytime I reboot, possible NIC issu o kern/91859 net [ndis] if_ndis does not work with Asus WL-138 o kern/91364 net [ral] [wep] WF-511 RT2500 Card PCI and WEP o kern/91311 net [aue] aue interface hanging o kern/87421 net [netgraph] [panic]: ng_ether + ng_eiface + if_bridge o kern/86871 net [tcp] [patch] allocation logic for PCBs in TIME_WAIT s o kern/86427 net [lor] Deadlock with FASTIPSEC and nat o kern/85780 net 'panic: bogus refcnt 0' in routing/ipv6 o bin/85445 net ifconfig(8): deprecated keyword to ifconfig inoperativ o bin/82975 net route change does not parse classfull network as given o kern/82881 net [netgraph] [panic] ng_fec(4) causes kernel panic after o kern/82468 net Using 64MB tcp send/recv buffers, trafficflow stops, i o bin/82185 net [patch] ndp(8) can delete the incorrect entry o kern/81095 net IPsec connection stops working if associated network i o kern/78968 net FreeBSD freezes on mbufs exhaustion (network interface o kern/78090 net [ipf] ipf filtering on bridged packets doesn't work if o kern/77341 net [ip6] problems with IPV6 implementation o kern/75873 
net Usability problem with non-RFC-compliant IP spoof prot s kern/75407 net [an] an(4): no carrier after short time a kern/71474 net [route] route lookup does not skip interfaces marked d o kern/71469 net default route to internet magically disappears with mu o kern/68889 net [panic] m_copym, length > size of mbuf chain o kern/66225 net [netgraph] [patch] extend ng_eiface(4) control message o kern/65616 net IPSEC can't detunnel GRE packets after real ESP encryp s kern/60293 net [patch] FreeBSD arp poison patch a kern/56233 net IPsec tunnel (ESP) over IPv6: MTU computation is wrong s bin/41647 net ifconfig(8) doesn't accept lladdr along with inet addr o kern/39937 net ipstealth issue a kern/38554 net [patch] changing interface ipaddress doesn't seem to w o kern/31940 net ip queue length too short for >500kpps o kern/31647 net [libc] socket calls can return undocumented EINVAL o kern/30186 net [libc] getaddrinfo(3) does not handle incorrect servna f kern/24959 net [patch] proper TCP_NOPUSH/TCP_CORK compatibility o conf/23063 net [arp] [patch] for static ARP tables in rc.network o kern/21998 net [socket] [patch] ident only for outgoing connections o kern/5877 net [socket] sb_cc counts control data as well as data dat 472 problems total. 
From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 14:02:40 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 96BDFFE0 for ; Mon, 3 Feb 2014 14:02:40 +0000 (UTC) Received: from mail.niessen.ch (btx02.niessen.ch [85.10.192.239]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 0C1BF1B8B for ; Mon, 3 Feb 2014 14:02:39 +0000 (UTC) Received: from mail.niessen.ch (mail.niessen.ch [127.0.10.3]) by mail.niessen.ch (Postfix) with ESMTP id DD4DB139351 for ; Mon, 3 Feb 2014 15:02:21 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=niessen.ch; h=message-id :date:from:mime-version:to:subject:references:in-reply-to :content-type:content-transfer-encoding; s=dkim-2012; bh=W9o/MNr Nx86wsYv1AN/Pm1UjlwPxBNSPrlB0M/CBOBI=; b=pJ0QbSHHIt3HQ4Oz8QMKOYT plYamMU446l/HI3EzQG5LwfH8Gkz5CDIIhKPpThSSmC7R3lR+SkhXnDUuCMz12r6 yn1Y0po4rVifKAeCEwScNL+v+g11JWxTN+pqw2JfPYcyS6tK27Ucz3IE7Fvne+zH Uu+xfI5xveF5bZNNfs6M= DomainKey-Signature: a=rsa-sha1; c=nofws; d=niessen.ch; h=message-id :date:from:mime-version:to:subject:references:in-reply-to :content-type:content-transfer-encoding; q=dns; s=dkim-2012; b=F lk1DW2WhX35CCZDu4/3DLKDu5bZChyVKK/lJgTFn9dvYZqABHjLk7oM1+HlSIdy7 oeSCz3FuRKukd4lzjm1WC2Op2PVeeErOruSZAQ4/jQhOvmwHoJnFSEOpmsRRl1+d pxNK7qugquAkSb1/GwG0zfVU6M4/B+W4+t+GYbP+q8= Received: from [172.20.10.3] (unknown [178.197.236.128]) by mail.niessen.ch (Postfix) with ESMTPSA id 8BAA1139350 for ; Mon, 3 Feb 2014 15:02:21 +0100 (CET) Message-ID: <52EFA157.9080007@niessen.ch> Date: Mon, 03 Feb 2014 15:01:59 +0100 From: Ben User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: kern/185967: Link Aggregation 
LAGG: LACP not working in 10.0 References: <52EF50A7.1050205@niessen.ch> <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> <52EF55FE.8030901@niessen.ch> <1798FE17-5718-4125-8B00-1B00DC44B828@yahoo.com> <52EF5D1E.2000306@niessen.ch> <52EF6194.5060305@niessen.ch> <8585EA2E-116E-45A6-877D-DC8D4460C965@yahoo.com> <52EF6690.3010509@niessen.ch> <202BD17C-E68A-4B27-B7EF-E5D84AA89176@yahoo.com> In-Reply-To: <202BD17C-E68A-4B27-B7EF-E5D84AA89176@yahoo.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 14:02:40 -0000

On 03.02.2014 10:58, Scott Long wrote:
> Hi,
>
> If you can, please test the patch I sent and let me know the results. I’ll check it into FreeBSD 11 and 10 if it works for you.
>
> Thanks,
> Scott
>
> On Feb 3, 2014, at 2:51 AM, Ben wrote:
>
>> Thank you for your detailed explanation.
>>
>> If I understand correctly the switch is probably not set up correctly, right?
>>
>> I will try to have it configured correctly first.
>>
>> Thanks a lot for your help!
>>
>> Regards
>> Ben
>>
>> On 03.02.2014 10:45, Scott Long wrote:
>>> Ok, please try the patch I emailed earlier. Since you’re not seeing any receive messages, it means that your switch isn’t generating any LACP heartbeats. The difference between FreeBSD 9.x and 10 is that in 9.x, it ran in “optimistic” mode, meaning that it didn’t rely on getting receive messages from the switch, and only took a channel down if the link state went down. In strict mode, it looks for the receive messages and only transitions to a full operational state if it gets them.
>>> So while I know it’s easy to point at the problem being FreeBSD 10, seeing as FreeBSD 9 worked for you, please check to make sure that your switch is set up correctly.
>>>
>>> I authored the original change that went into FreeBSD 10, and I tried to make it so that strict_mode=0 would keep everything working as it did in 9. I guess that since you’re getting no receive messages from the switch at all that we need to disable strict mode on setup, not afterwards. Apply the patch and everything should work as it did in FreeBSD 9.
>>>
>>> Scott
>>>
>>> On Feb 3, 2014, at 2:29 AM, Ben wrote:
>>>
>>>> Yes, via sysctl and /etc/sysctl.conf
>>>>
>>>> I waited now roughly 20 minutes without touching it but no difference.
>>>>
>>>> No, I only see these transmit messages, no receive.
>>>>
>>>> Thanks
>>>> Ben
>>>>
>>>> On 03.02.2014 10:25, Scott Long wrote:
>>>>> Did you set it to 0 via the sysctl? You might need to wait for several minutes if you set it after setting up the links.
>>>>>
>>>>> Also, the message that you’re seeing is from your machine transmitting PDU packets. Are you seeing any “lacpdu receive” messages on the console?
>>>>>
>>>>> Thanks,
>>>>> Scott
>>>>>
>>>>> On Feb 3, 2014, at 2:10 AM, Ben wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I set strict mode to 0 but no use. I do receive PDU messages.
>>>>>>
>>>>>> igb0: lacpdu transmit
>>>>>> actor=(...)
>>>>>> actor.state=4d
>>>>>> partner=(...)
>>>>>> partner.state=0
>>>>>> maxdelay=0
>>>>>>
>>>>>> Thanks
>>>>>> Ben
>>>>>>
>>>>>> On 03.02.2014 10:03, Scott Long wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Unfortunately, you can’t control the strict mode globally. My apologies for this mess, I’ll make sure that it’s fixed for FreeBSD 10.1. If the sysctl doesn’t help then maybe consider compiling a custom kernel with it defaulted to 0. You’ll need to open /sys/net/ieee8023ad_lacp.c and look for the function lacp_attach(). You’ll see the strict_mode assignment underneath that.
>>>>>>> I’ll also send you a patch in a few minutes. Until then, try enabling net.link.lagg.lacp.debug=1 and see if you’re receiving heartbeat PDUs from your switch.
>>>>>>>
>>>>>>> Scott
>>>>>>>
>>>>>>> On Feb 3, 2014, at 1:40 AM, Ben wrote:
>>>>>>>
>>>>>>>> Hi Scott,
>>>>>>>>
>>>>>>>> I had tried to set it in /etc/sysctl.conf but it seems it didn’t work. But I will try again and report back.
>>>>>>>>
>>>>>>>> The settings of the switch have not been changed and are set to LACP. It worked before so I guess the switch should not be the problem. Maybe some incompatibility between FreeBSD + igb-driver + switch (Juniper EX3300-48T).
>>>>>>>>
>>>>>>>> I will update you after setting the sysctl setting. It seems to be "dynamic", I guess 0 reflects the index of LACP lagg devices. Can I switch off the strict mode globally in /etc/sysctl.conf?
>>>>>>>>
>>>>>>>> Thanks for your help.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Ben
>>>>>>>>
>>>>>>>> On 03.02.2014 09:31, Scott Long wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> You’re probably running into the consequences of r253687. Check to see the value of ‘sysctl net.link.lagg.0.lacp.lacp_strict_mode’. If it’s ‘1’ then set it to 0. My original intention was for this to default to 0, but apparently that didn’t happen. However, the fact that strict mode doesn’t seem to work at all for you might hint that your switch either isn’t configured correctly for LACP, or doesn’t actually support LACP at all. You might want to investigate that.
>>>>>>>>>
>>>>>>>>> Scott
>>>>>>>>>
>>>>>>>>> On Feb 3, 2014, at 1:17 AM, Ben wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I upgraded from FreeBSD 9.2-RELEASE to 10.0-RELEASE. FreeBSD 9.2 was configured to use LACP with two igb devices.
>>>>>>>>>>
>>>>>>>>>> Now it stopped working after the upgrade.
>>>>>>>>>>
>>>>>>>>>> This is a screenshot of ifconfig -a after the upgrade to FreeBSD 10.0-RELEASE: http://tinypic.com/view.php?pic=28jvgpw&s=5#.Uu9PXT1dVPM
>>>>>>>>>>
>>>>>>>>>> A PR is currently open: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/185967
>>>>>>>>>>
>>>>>>>>>> It is set to low, but I would like somebody to have a look into it as it obviously has a great influence on our infrastructure. The only way to "solve" it is currently switching back to FreeBSD 9.2.
>>>>>>>>>>
>>>>>>>>>> The suggested fix "use failover" seems not to work.
>>>>>>>>>>
>>>>>>>>>> Thank you for your help.
>>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>> Ben
>>>>>>>>>> _______________________________________________
>>>>>>>>>> freebsd-net@freebsd.org mailing list
>>>>>>>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>>>>>>>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

Hi,

I tried your patch and it works. The strict mode is now set to 0.

One thing I saw:
There is a message at the login prompt: igb0: Interface stopped DISTRIBUTING, possible flapping
igb0 and igb1 are used for the lagg device.

I still get the following messages when I restart netif:
can't re-use a leaf (lacp_strict_mode)!
can't re-use a leaf (rx_test)!
can't re-use a leaf (tx_test)!

sysctl says strict_mode is off at that time.

I hope this helps.

Best regards Ben From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 17:52:11 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 86BE31DB for ; Mon, 3 Feb 2014 17:52:11 +0000 (UTC) Received: from nm4-vm0.bullet.mail.ne1.yahoo.com (nm4-vm0.bullet.mail.ne1.yahoo.com [98.138.90.253]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 0723F1175 for ; Mon, 3 Feb 2014 17:52:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=gcom1024; t=1391449924; bh=hSO17HzdLGwiAoNtUjGjC4YUXIO7I+804vq2GAtMroI=; h=Received:Received:Received:DKIM-Signature:X-Yahoo-Newman-Id:X-Yahoo-Newman-Property:X-YMail-OSG:X-Yahoo-SMTP:X-Rocket-Received:Content-Type:Mime-Version:Subject:From:In-Reply-To:Date:Cc:Content-Transfer-Encoding:Message-Id:References:To:X-Mailer; b=CesVbXtOCw2z5MRJ8MzNrE3Pb6Fg8yEnmFLIJdKlUBG18ltoqv3qhYz0nV3gl91OtcEYuNzyxD8vX3O2hyGcdH6aOCX82lMJNwGzbzz9Z9CYGf4JW/E0uuA1a+dKp4oDnPn8+z8TzXIDCrC13MtVIILuQMWUOp9f0EeiRsBZKMI= DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=gcom1024; d=yahoo.com; b=D8Uytz6i0YQgUA8t8pYrGpAIy/9Tfuir8u14aPSsQUM/f+gqtmUqDEXiGxY0vLdsxT1Icc5qJ+OI1o0VEjWbXix+wpsuMYV8VwUQNuZEY81yRPDtBpTorCD0NhnBjnmwwDhMVqUr8mtIcMdZkHIr6EQbvmpe8Hd3aBbnSP+NxSw=; Received: from [98.138.101.130] by nm4.bullet.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 17:52:04 -0000 Received: from [98.138.226.132] by tm18.bullet.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 17:52:04 -0000 Received: from [127.0.0.1] by smtp219.mail.ne1.yahoo.com with NNFMP; 03 Feb 2014 17:52:04 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1391449924; bh=hSO17HzdLGwiAoNtUjGjC4YUXIO7I+804vq2GAtMroI=; 
h=X-Yahoo-Newman-Id:X-Yahoo-Newman-Property:X-YMail-OSG:X-Yahoo-SMTP:X-Rocket-Received:Content-Type:Mime-Version:Subject:From:In-Reply-To:Date:Cc:Content-Transfer-Encoding:Message-Id:References:To:X-Mailer; b=rLDAcIGHx0VeeM+KpNx+Qoa9yWS4se70UB/AtMC7oIHOb3lxOHSgO/WEF7oBZ36UQzcZwE1kcWDfpmm+yOBOzeMqATo4RElZjDOX2NGFGKPXXE5f3u6ArMJT/xeUjZq5eRnWHeJimbIMuq0YfGRULqPFz9zzgpi8YqjABDA2IEw= X-Yahoo-Newman-Id: 706465.93046.bm@smtp219.mail.ne1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: 711JWcUVM1n2qbdtkTbAFoBHIQHuLDI1q.sr6SB_J6tgWa_ nXlN4QVMYKNK6EISZAynZAXSVuDKjZK.YxSbel8od151iQE6gSjQKqZIPhEZ LYFo.l1lFTleIBmBefJ03.2e8wJtxtOzf2lak1kUM32HsqfwFmPnggJ2tYn9 iaUQzCZeNVgo3M.gdhMvWq1TRNgZj2p6NFaXGcvxWApwdh2JZDjhyRl_4H6B beqsralkGOj_hTpEnX6mxpbOD11RexnTrJHS4hKMTaZ7DTFAVRLKbnJyhRiW uvWzNjUE.86W0.9aTS5ti_kW3zANPsLyeb_lOC9xoGnUVqlkTPokg3dd.3m5 6DjONNDgRxR7pM4JkqupxFUMtphSdy488WUDtWk.ES68jo4PgkETxL1bGe1u HDWUsW88Ra1YZP2jcIMy2aSLaJQFJw2V3eCN8kjCCRtSrmdd9rioTSXYRotM bTRGf0dckwCATg0ka9NBEEHCB.uLdwwqTeE9BdQLnacROuKJEYAAAiKOnLjD j4DCKUKC.piUeuzxkt7QMCc8iTrcTBfJYtZTgJgGGf.LUiDrndLLamzDe6Uq f_af0JAh_ X-Yahoo-SMTP: clhABp.swBB7fs.LwIJpv3jkWgo2NU8- X-Rocket-Received: from [10.64.24.117] (scott4long@69.53.236.251 with plain [98.139.211.125]) by smtp219.mail.ne1.yahoo.com with SMTP; 03 Feb 2014 09:52:04 -0800 PST Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 7.1 \(1827\)) Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 From: Scott Long In-Reply-To: <52EFA157.9080007@niessen.ch> Date: Mon, 3 Feb 2014 10:52:01 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: <073B89E9-C0AB-4251-9504-E79059F15B7F@yahoo.com> References: <52EF50A7.1050205@niessen.ch> <1C608452-6F29-486D-BC0F-CCC7853665C7@yahoo.com> <52EF55FE.8030901@niessen.ch> <1798FE17-5718-4125-8B00-1B00DC44B828@yahoo.com> <52EF5D1E.2000306@niessen.ch> <52EF6194.5060305@niessen.ch> <8585EA2E-116E-45A6-877D-DC8D4460C965@yahoo.com> <52EF6690.3010509@niessen.ch> 
<202BD17C-E68A-4B27-B7EF-E5D84AA89176@yahoo.com> <52EFA157.9080007@niessen.ch> To: Ben X-Mailer: Apple Mail (2.1827) Cc: FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 17:52:11 -0000

On Feb 3, 2014, at 7:01 AM, Ben wrote:
>
> Hi,
>
> I tried your patch and it works. The strict mode is now set to 0.
>
> One thing I saw:
> There is a message at the login prompt: igb0: Interface stopped DISTRIBUTING, possible flapping
> igb0 and igb1 are used for the lagg device.
>
> I still get the following messages when I restart netif:
> can't re-use a leaf (lacp_strict_mode)!
> can't re-use a leaf (rx_test)!
> can't re-use a leaf (tx_test)!
>
> sysctl says strict_mode is off at that time.
>
> I hope this helps.
>

Thanks a lot for testing this. The ‘flapping’ message is intentional; it points out that something is wrong with the heartbeat exchange with the switch. The “can’t re-use a leaf” messages point to bugs that I’ll fix shortly.

Scott From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 19:11:20 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4553F2EC for ; Mon, 3 Feb 2014 19:11:20 +0000 (UTC) Received: from mail-qc0-x22a.google.com (mail-qc0-x22a.google.com [IPv6:2607:f8b0:400d:c01::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id EF0D219EC for ; Mon, 3 Feb 2014 19:11:19 +0000 (UTC) Received: by mail-qc0-f170.google.com with SMTP id e9so11960877qcy.15 for ; Mon, 03 Feb 2014 11:11:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=eitanadler.com; s=0xdeadbeef; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc:content-type; bh=ZR3U/b2kHiAbMY5vCv+VQeY6L3nK+Q6h4thzkeS0LpI=; b=Jq+Fac9B9xGNrYRpVtb4zxoQEKuoYk6WavcuEUtbXkscJlvwRLYJp9SLyZM2+t+YLk pa89ZKMgcd3LAptFx9n6wn8UmWyrmDV2gRqAlH0tdgO5EwjHgSR8HulHreaClmScYE6T gW3kLf2Sm1XzdgzJ/BjP4ufaldViDRly1OSNg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc:content-type; bh=ZR3U/b2kHiAbMY5vCv+VQeY6L3nK+Q6h4thzkeS0LpI=; b=LB31KIG4dUWhirXODo4mT105b3Hg41gBsvBCNU1BSOq8BrPg+RhU7r5G2SeglikQaa UFTOwl09FMlGLX/j2sXXJbKDk4utd3himzGJs+RpFMdcAvM9Xca4y7/THNMLFAXEWaqO Mo8mWcbMEbVD0qtDK8jMN+Fe7FMUjBbaYpWBujizHFI1Faks9Rz2Wiz7Ev5DFQX6m/6F 9V08d3Mpes/1olhEH7ppGfkL5uZiKe6O3clGvYoW3rB0f7Zxr8hOBXbTRIwF4Vh2Fnh3 PfA81pKpp/hDk80sd4N77fNRQT5cBKiFL2HTS6rpfnrNP6z8zZ2yxyWW+C7bPyXw+OIq VERg== X-Gm-Message-State: ALoCoQmVb8Bb2NN7E2CgXBE5QdXzM+P5MjMHN7WLefhgRC4t7tAKCFpvIpcwCXCIY5IL5rslbxXf X-Received: by 10.229.213.194 with SMTP id gx2mr58741732qcb.16.1391454679113; Mon, 03 Feb 2014 11:11:19 
-0800 (PST) MIME-Version: 1.0 Sender: lists@eitanadler.com Received: by 10.96.30.229 with HTTP; Mon, 3 Feb 2014 11:10:49 -0800 (PST) In-Reply-To: <417C5F59-DD48-4BEC-BD28-BFF87B8F61E3@adaranetworks.com> References: <52CB3AE9.3030107@wemm.org> <52CC5F2E.5030201@wemm.org> <52CC8246.7080609@wemm.org> <52CC903C.5090706@sentex.net> <52CCC0DF.1020007@wemm.org> <52CDA490.5060002@wemm.org> <417C5F59-DD48-4BEC-BD28-BFF87B8F61E3@adaranetworks.com> From: Eitan Adler Date: Mon, 3 Feb 2014 14:10:49 -0500 X-Google-Sender-Auth: QA68qrxu449Jcc_nZs3sEbvFj5E Message-ID: Subject: Re: TCP question: Is this simultaneous close handling broken? To: Randall Stewart Content-Type: text/plain; charset=UTF-8 Cc: freebsd-net , Peter Wemm , Mike Tancsa X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 19:11:20 -0000 On Wed, Jan 15, 2014 at 9:01 AM, Randall Stewart wrote: ... Hi all, This email thread occurred when I was on vacation. Since then there were a variety of commits to the relevant code (all the way back to stable/4!). Is this still an issue? 
-- Eitan Adler Source, Ports, Doc committer Bugmeister, Ports Security teams From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 19:57:01 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4953C2F0 for ; Mon, 3 Feb 2014 19:57:01 +0000 (UTC) Received: from mail-qc0-x22b.google.com (mail-qc0-x22b.google.com [IPv6:2607:f8b0:400d:c01::22b]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 05D011D7C for ; Mon, 3 Feb 2014 19:57:00 +0000 (UTC) Received: by mail-qc0-f171.google.com with SMTP id n7so11984844qcx.16 for ; Mon, 03 Feb 2014 11:57:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=eitanadler.com; s=0xdeadbeef; h=mime-version:from:date:message-id:subject:to:content-type; bh=+mmWKlhgLagHgqk5ZfMTPddbaMmKAJnPF8DFFFICdtw=; b=W5aYO6eSSN47zXZ/P7nt5+i8TfOhaTXW67yXTbwchDyz4EWFq6t5PKKieVVsJKDX4t jytg3GUkLHxtmt2qp5EJVdE+JIUDaZrNl8xsbNWkkXdUJAtk2ciISM9Zjw+kQhQZtjO+ GRYaddguAHo/Y6clo9yTisw8Rxk8dc4KHY2/I= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:from:date:message-id:subject:to :content-type; bh=+mmWKlhgLagHgqk5ZfMTPddbaMmKAJnPF8DFFFICdtw=; b=WXF+iZ5mip/u/+JO3yxPED5G5n0XdbLi6/k7AdswiIp+u2OJ412rCMaBa6TspN6bWj L6sVwCMmoI9hJ79fxNXpscdxkS51rPFA7FZSi/Q/ZKF9fibBC5wVTy2jpfM82Rlotzkc z+W/l+dXAlEsNGrdfliHzFkc0SNuQFAHZ2dNPyrLo6UqiK6tnhy9D62l5XQbNcbjkawn rksZHZazGPCVuaVAg2oUV10HQg48pyuqn8NLffWoYA5S+HNdLA9A3EmMQrldDr6hpu5G HjavGW0JgCmwOcSkSW819cZpqpmVP4icn7Sqlqs5U4PIAIdlmFlfmJegD5/Xmae0s/gV Dq0Q== X-Gm-Message-State: ALoCoQldcVjeoh6d07/YVZJvamCbnwVLmbiyMUb/tvLannQ1vmdorr8R5gY8bHMQjmIG168CqqDL X-Received: by 10.224.2.194 with SMTP id 2mr59920361qak.44.1391457385637; Mon, 03 Feb 2014 11:56:25 -0800 (PST) 
MIME-Version: 1.0 Received: by 10.96.30.229 with HTTP; Mon, 3 Feb 2014 11:55:55 -0800 (PST) From: Eitan Adler Date: Mon, 3 Feb 2014 14:55:55 -0500 Message-ID: Subject: ip6opt.c To: "freebsd-net@freebsd.org" , swildner@dragonflybsd.org Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 19:57:01 -0000 Hi all, DragonFly recently committed the following change and it seems that it applies to us as well. http://gitweb.dragonflybsd.org/dragonfly.git/blobdiff/5764e12516158974fac10d50dbd2df76ce1ab007..98651c6e0e1c3b7a6b8650b55b473fcc745a22b7:/lib/libc/net/ip6opt.c Should I commit it? Index: ip6opt.c =================================================================== --- ip6opt.c (revision 261405) +++ ip6opt.c (working copy) @@ -381,11 +381,8 @@ inet6_opt_init(void *extbuf, socklen_t extlen) { struct ip6_ext *ext = (struct ip6_ext *)extbuf; - if (extlen < 0 || (extlen % 8)) - return(-1); - if (ext) { - if (extlen == 0) + if (extlen == 0 || (extlen % 8)) return(-1); ext->ip6e_len = (extlen >> 3) - 1; } -- Eitan Adler From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 20:27:19 2014 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 688D7777 for ; Mon, 3 Feb 2014 20:27:19 +0000 (UTC) Received: from mail.allbsd.org (gatekeeper.allbsd.org [IPv6:2001:2f0:104:e001::32]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 5670A10CC for ; Mon, 3 Feb 2014 20:27:18 +0000 (UTC) Received: from alph.d.allbsd.org (p2106-ipbf2009funabasi.chiba.ocn.ne.jp [114.146.169.106]) 
(authenticated bits=128) by mail.allbsd.org (8.14.5/8.14.5) with ESMTP id s13KQrcM015281 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 4 Feb 2014 05:27:04 +0900 (JST) (envelope-from hrs@FreeBSD.org) Received: from localhost (localhost [IPv6:::1]) (authenticated bits=0) by alph.d.allbsd.org (8.14.7/8.14.7) with ESMTP id s13KQo8U040436; Tue, 4 Feb 2014 05:26:51 +0900 (JST) (envelope-from hrs@FreeBSD.org) Date: Tue, 04 Feb 2014 05:26:25 +0900 (JST) Message-Id: <20140204.052625.1192023326694116318.hrs@allbsd.org> To: lists@eitanadler.com Subject: Re: ip6opt.c From: Hiroki Sato In-Reply-To: References: X-PGPkey-fingerprint: BDB3 443F A5DD B3D0 A530 FFD7 4F2C D3D8 2793 CF2D X-Mailer: Mew version 6.5 on Emacs 24.3 / Mule 6.0 (HANACHIRUSATO) Mime-Version: 1.0 Content-Type: Multipart/Signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="--Security_Multipart(Tue_Feb__4_05_26_25_2014_679)--" Content-Transfer-Encoding: 7bit X-Virus-Scanned: clamav-milter 0.97.4 at gatekeeper.allbsd.org X-Virus-Status: Clean X-Greylist: Sender DNS name whitelisted, not delayed by milter-greylist-4.2.7 (mail.allbsd.org [133.31.130.32]); Tue, 04 Feb 2014 05:27:05 +0900 (JST) X-Spam-Status: No, score=-94.3 required=13.0 tests=CONTENT_TYPE_PRESENT, RCVD_IN_PBL,RCVD_IN_RP_RNBL,SPF_SOFTFAIL,USER_IN_WHITELIST autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on gatekeeper.allbsd.org Cc: freebsd-net@FreeBSD.org, swildner@dragonflybsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 20:27:19 -0000 ----Security_Multipart(Tue_Feb__4_05_26_25_2014_679)-- Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Eitan Adler wrote in : li> Hi all, li> li> DragonFly recently committed the following change and it seems that it 
li> applies to us as well.
li>
li> http://gitweb.dragonflybsd.org/dragonfly.git/blobdiff/5764e12516158974fac10d50dbd2df76ce1ab007..98651c6e0e1c3b7a6b8650b55b473fcc745a22b7:/lib/libc/net/ip6opt.c
li>
li> Should I commit it?

 Just out of curiosity, what is the problem with returning -1 when
 (extbuf == NULL) && (extlen % 8) != 0?

li>
li> Index: ip6opt.c
li> ===================================================================
li> --- ip6opt.c (revision 261405)
li> +++ ip6opt.c (working copy)
li> @@ -381,11 +381,8 @@ inet6_opt_init(void *extbuf, socklen_t extlen)
li>  {
li>  	struct ip6_ext *ext = (struct ip6_ext *)extbuf;
li>
li> -	if (extlen < 0 || (extlen % 8))
li> -		return(-1);
li> -
li>  	if (ext) {
li> -		if (extlen == 0)
li> +		if (extlen == 0 || (extlen % 8))
li>  			return(-1);
li>  		ext->ip6e_len = (extlen >> 3) - 1;
li>  	}
li>
li>
li> --
li> Eitan Adler
li> _______________________________________________
li> freebsd-net@freebsd.org mailing list
li> http://lists.freebsd.org/mailman/listinfo/freebsd-net
li> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
li>

----Security_Multipart(Tue_Feb__4_05_26_25_2014_679)-- Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iEYEABECAAYFAlLv+3EACgkQTyzT2CeTzy1WWgCffNk5/VrDoxdhFGWZIChMNEr2 33kAn0d/yAonutiC8D2Q/eSwKnpAfYIK =wIrZ -----END PGP SIGNATURE----- ----Security_Multipart(Tue_Feb__4_05_26_25_2014_679)---- From owner-freebsd-net@FreeBSD.ORG Mon Feb 3 23:21:29 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 53FF8624; Mon, 3 Feb 2014 23:21:29 +0000 (UTC) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id DE01412DA; Mon, 3 Feb 2014 23:21:28 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: X-IronPort-AV: E=Sophos;i="4.95,775,1384318800"; d="scan'208";a="93327299" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 03 Feb 2014 18:21:21 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 0ED02B4167; Mon, 3 Feb 2014 18:21:21 -0500 (EST) Date: Mon, 3 Feb 2014 18:21:21 -0500 (EST) From: Rick Macklem To: J David Message-ID: <709957141.2296744.1391469681036.JavaMail.root@uoguelph.ca> In-Reply-To: Subject: Re: Terrible NFS performance under 9.2-RELEASE? MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-net@freebsd.org, Garrett Wollman X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Feb 2014 23:21:29 -0000 J David wrote: > On Sat, Feb 1, 2014 at 10:53 PM, Rick Macklem > wrote: > > Btw, if you do want to test with O_DIRECT ("-I"), you should enable > > direct io in the client. > > sysctl vfs.nfs.nfs_directio_enable=1 > > > > I just noticed that it is disabled by default. This means that your > > "-I" was essentially being ignored by the FreeBSD client. > > Ouch. Yes, that appears to be correct. > > > It also explains why Linux isn't doing a read before write, since > > that wouldn't happen for direct I/O. You should test Linux without > > "-I" > > and see if it still doesn't do the read before write, including a > > "-r 2k" > > to avoid the "just happens to be a page size" case. > > With O_DIRECT, the Linux client reads only during the read tests. 
> Without O_DIRECT, the Linux client does *no reads at all*, not even > for the read tests. It caches the whole file and returns > commensurately irrelevant/silly performance numbers (e.g. 7.2GiB/sec > random reads). > > Setting the sysctl on the FreeBSD client does stop it from doing the > excess reads. Ironically this actually makes O_DIRECT improve > performance for all the workloads punished by that behavior. > > It also creates a fairly consistent pattern indicating performance > being bottlenecked by the FreeBSD NFS server. > > Here is a sample of test results, showing both throughput and IOPS > achieved: > > https://imageshack.com/i/4jiljhp > > In this chart, the 64k test is run 4 times, once with FreeBSD as both > client and server (64k), once with Linux as the client and FreeBSD as > the server (L/F 64k), once with FreeBSD as the client and Linux as > the > server (F/L 64k, hands down the best NFS combo), and once with Linux > as the client and server (L/L 64k). For reference, the native > performance of the md0 filesystem is also included. > > The TLDR version of this chart is that the FreeBSD NFS server is the > primary bottleneck; it is not being held back by the network or the > underlying disk. Ideally, it would be nice to see the 64k column for > FreeBSD client / FreeBSD server as high as or higher than the column > for FreeBSD client / Linux Server. (Also, the bottleneck of the F/L > 64k test appears to be CPU on the FreeBSD client.) > > The more detailed findings are: > > 1) The Linux client runs at about half the IOPS of the FreeBSD client > regardless of server type. The gut-level suspicion is that it must > be > doing twice as many NFS operations per write. (Possibly commit?) > > 2) The FreeBSD NFS server seems capped at around 6300 IOPS. This is > neither a limit of the network (at least 28k IOPS) nor the filesystem > (about 40k IOPs). 
> > 3) When O_DIRECT is not used (not shown), the excess read operations > pull from the same 6300 IOPS bucket, and that's what kills small > writes. > > 4) It's possible that the sharp drop off visible at the 64k/64k test > is a result of doubling the number of packets traversing the > TSO-capable network. > > Here's a representative top from the server while the test is > running, > showing all the nfsd kernel threads being utilized, and spare RAM and > CPU: > > last pid: 14996; load averages: 0.58, 0.17, 0.10 > up 5+01:42:13 04:02:24 > > 255 processes: 4 running, 223 sleeping, 28 waiting > > CPU: 0.0% user, 0.0% nice, 55.9% system, 9.2% interrupt, 34.9% > idle > > Mem: 2063M Active, 109M Inact, 1247M Wired, 1111M Buf, 4492M Free > > ARC: 930K Total, 41K MFU, 702K MRU, 16K Anon, 27K Header, 143K Other > > Swap: 8192M Total, 8192M Free > > > PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU > COMMAND > > 11 root 155 ki31 0K 32K RUN 1 120.8H 65.87% > idle{idle: cpu1} > > 11 root 155 ki31 0K 32K RUN 0 120.6H 48.63% > idle{idle: cpu0} > > 12 root -92 - 0K 448K WAIT 0 13:41 14.55% > intr{irq268: virtio_p} > > 1001 root -8 - 0K 16K mdwait 0 19:37 12.99% md0 > > 13 root -8 - 0K 48K - 0 10:53 6.05% > geom{g_down} > > 12 root -92 - 0K 448K WAIT 1 3:13 4.64% > intr{irq269: virtio_p} > > 859 root -4 0 9912K 1824K ufs 1 2:00 3.08% > nfsd{nfsd: service} > > 859 root -8 0 9912K 1824K rpcsvc 0 2:11 2.83% > nfsd{nfsd: service} > > 859 root -8 0 9912K 1824K rpcsvc 0 2:04 2.64% > nfsd{nfsd: service} > > 859 root -4 0 9912K 1824K ufs 1 2:00 2.29% > nfsd{nfsd: service} > > 859 root -8 0 9912K 1824K rpcsvc 1 6:08 2.20% > nfsd{nfsd: service} > > 859 root -4 0 9912K 1824K ufs 1 5:40 2.20% > nfsd{nfsd: master} > > 859 root -4 0 9912K 1824K ufs 1 2:00 1.95% > nfsd{nfsd: service} > > 859 root -8 0 9912K 1824K rpcsvc 0 2:50 1.90% > nfsd{nfsd: service} > > 859 root -4 0 9912K 1824K ufs 1 2:47 1.66% > nfsd{nfsd: service} > > 859 root -8 0 9912K 1824K RUN 0 2:13 1.66% > nfsd{nfsd: service} > > 
13 root -8 - 0K 48K - 0 1:55 1.46% > geom{g_up} > > 859 root -8 0 9912K 1824K rpcsvc 0 2:39 1.42% > nfsd{nfsd: service} > > 859 root -8 0 9912K 1824K rpcsvc 0 5:18 1.32% > nfsd{nfsd: service} > > 859 root -8 0 9912K 1824K rpcsvc 0 1:55 1.12% > nfsd{nfsd: service} > > 859 root -8 0 9912K 1824K rpcsvc 0 2:00 0.98% > nfsd{nfsd: service} > > 859 root -4 0 9912K 1824K ufs 1 2:01 0.73% > nfsd{nfsd: service} > > 859 root -4 0 9912K 1824K ufs 1 5:56 0.49% > nfsd{nfsd: service} > > > All of this tends to exonerate the client. So what would be the next > step to track down the cause of poor performance on the server-side? > Ok, I'm going to assume that you either applied my patch to the server (so it is using 4K clusters and isn't exceeding the 33 limit) or updated your virtio network driver with the fixes Bryan recently committed to head, so it should work better with the 34 packet read replies. Did you also set these sysctl values in the server? vfs.nfsd.tcphighwater=100000 vfs.nfsd.tcpcache_timeout=600 (or whatever it is called, just do a "sysctl -a" and look for it in the vfs.nfsd section.) If you have done all of the above and the server still appears slow, you need to test with a recent kernel that has mav@'s changes in it. He has done a bunch of recent performance work on it, using the SpecNFS benchmark, I believe. rick (I believe he has now MFC'd these to stable/9, if using a recent head kernel isn't practical.) > Thanks! 
> _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to > "freebsd-net-unsubscribe@freebsd.org" > From owner-freebsd-net@FreeBSD.ORG Tue Feb 4 00:03:26 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 0F35371; Tue, 4 Feb 2014 00:03:26 +0000 (UTC) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id A22CB16B5; Tue, 4 Feb 2014 00:03:25 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqQEAMst8FKDaFve/2dsb2JhbABZg0RXgwG6Qk+BI3SCJQEBAQMBAQEBICsgCwUWGAICDRkCKQEJJgYIBwQBHASHXAgNrE+hTheBKY0IBgEBGzQHgm+BSQSJSYwOhAWQb4NLHjF8CBci X-IronPort-AV: E=Sophos;i="4.95,775,1384318800"; d="scan'208";a="92774089" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-annu.net.uoguelph.ca with ESMTP; 03 Feb 2014 19:03:17 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id DF47EB4026; Mon, 3 Feb 2014 19:03:17 -0500 (EST) Date: Mon, 3 Feb 2014 19:03:17 -0500 (EST) From: Rick Macklem To: J David Message-ID: <320778540.2326389.1391472197902.JavaMail.root@uoguelph.ca> In-Reply-To: Subject: Re: Terrible NFS performance under 9.2-RELEASE? 
MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-net@freebsd.org, Garrett Wollman X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Feb 2014 00:03:26 -0000 J David wrote: > On Sat, Feb 1, 2014 at 10:53 PM, Rick Macklem > wrote: > > Btw, if you do want to test with O_DIRECT ("-I"), you should enable > > direct io in the client. > > sysctl vfs.nfs.nfs_directio_enable=1 > > > > I just noticed that it is disabled by default. This means that your > > "-I" was essentially being ignored by the FreeBSD client. > > Ouch. Yes, that appears to be correct. > > > It also explains why Linux isn't doing a read before write, since > > that wouldn't happen for direct I/O. You should test Linux without > > "-I" > > and see if it still doesn't do the read before write, including a > > "-r 2k" > > to avoid the "just happens to be a page size" case. > > With O_DIRECT, the Linux client reads only during the read tests. > Without O_DIRECT, the Linux client does *no reads at all*, not even > for the read tests. It caches the whole file and returns > commensurately irrelevant/silly performance numbers (e.g. 7.2GiB/sec > random reads). > It looks like the "-U" option can be used to get iozone to unmount/remount the file system and avoid hits on the buffer cache. (Alternately, a single run after doing a manual dismount/mount might help.) > Setting the sysctl on the FreeBSD client does stop it from doing the > excess reads. Ironically this actually makes O_DIRECT improve > performance for all the workloads punished by that behavior. > > It also creates a fairly consistent pattern indicating performance > being bottlenecked by the FreeBSD NFS server. 
> > Here is a sample of test results, showing both throughput and IOPS > achieved: > > https://imageshack.com/i/4jiljhp > > In this chart, the 64k test is run 4 times, once with FreeBSD as both > client and server (64k), once with Linux as the client and FreeBSD as > the server (L/F 64k), once with FreeBSD as the client and Linux as > the > server (F/L 64k, hands down the best NFS combo), and once with Linux > as the client and server (L/L 64k). For reference, the native > performance of the md0 filesystem is also included. > > The TLDR version of this chart is that the FreeBSD NFS server is the > primary bottleneck; it is not being held back by the network or the > underlying disk. Ideally, it would be nice to see the 64k column for > FreeBSD client / FreeBSD server as high as or higher than the column > for FreeBSD client / Linux Server. (Also, the bottleneck of the F/L > 64k test appears to be CPU on the FreeBSD client.) > > The more detailed findings are: > > 1) The Linux client runs at about half the IOPS of the FreeBSD client > regardless of server type. The gut-level suspicion is that it must > be > doing twice as many NFS operations per write. (Possibly commit?) > > 2) The FreeBSD NFS server seems capped at around 6300 IOPS. This is > neither a limit of the network (at least 28k IOPS) nor the filesystem > (about 40k IOPs). > > 3) When O_DIRECT is not used (not shown), the excess read operations > pull from the same 6300 IOPS bucket, and that's what kills small > writes. > > 4) It's possible that the sharp drop off visible at the 64k/64k test > is a result of doubling the number of packets traversing the > TSO-capable network. 
> > Here's a representative top from the server while the test is > running, > showing all the nfsd kernel threads being utilized, and spare RAM and > CPU: > > last pid: 14996; load averages: 0.58, 0.17, 0.10 > up 5+01:42:13 04:02:24 > > 255 processes: 4 running, 223 sleeping, 28 waiting > > CPU: 0.0% user, 0.0% nice, 55.9% system, 9.2% interrupt, 34.9% > idle > > Mem: 2063M Active, 109M Inact, 1247M Wired, 1111M Buf, 4492M Free > > ARC: 930K Total, 41K MFU, 702K MRU, 16K Anon, 27K Header, 143K Other > > Swap: 8192M Total, 8192M Free > > > PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU > COMMAND > > 11 root 155 ki31 0K 32K RUN 1 120.8H 65.87% > idle{idle: cpu1} > > 11 root 155 ki31 0K 32K RUN 0 120.6H 48.63% > idle{idle: cpu0} > > 12 root -92 - 0K 448K WAIT 0 13:41 14.55% > intr{irq268: virtio_p} > > 1001 root -8 - 0K 16K mdwait 0 19:37 12.99% md0 > > 13 root -8 - 0K 48K - 0 10:53 6.05% > geom{g_down} > > 12 root -92 - 0K 448K WAIT 1 3:13 4.64% > intr{irq269: virtio_p} > > 859 root -4 0 9912K 1824K ufs 1 2:00 3.08% > nfsd{nfsd: service} > > 859 root -8 0 9912K 1824K rpcsvc 0 2:11 2.83% > nfsd{nfsd: service} > > 859 root -8 0 9912K 1824K rpcsvc 0 2:04 2.64% > nfsd{nfsd: service} > > 859 root -4 0 9912K 1824K ufs 1 2:00 2.29% > nfsd{nfsd: service} > > 859 root -8 0 9912K 1824K rpcsvc 1 6:08 2.20% > nfsd{nfsd: service} > > 859 root -4 0 9912K 1824K ufs 1 5:40 2.20% > nfsd{nfsd: master} > > 859 root -4 0 9912K 1824K ufs 1 2:00 1.95% > nfsd{nfsd: service} > > 859 root -8 0 9912K 1824K rpcsvc 0 2:50 1.90% > nfsd{nfsd: service} > > 859 root -4 0 9912K 1824K ufs 1 2:47 1.66% > nfsd{nfsd: service} > > 859 root -8 0 9912K 1824K RUN 0 2:13 1.66% > nfsd{nfsd: service} > > 13 root -8 - 0K 48K - 0 1:55 1.46% > geom{g_up} > > 859 root -8 0 9912K 1824K rpcsvc 0 2:39 1.42% > nfsd{nfsd: service} > > 859 root -8 0 9912K 1824K rpcsvc 0 5:18 1.32% > nfsd{nfsd: service} > > 859 root -8 0 9912K 1824K rpcsvc 0 1:55 1.12% > nfsd{nfsd: service} > > 859 root -8 0 9912K 1824K rpcsvc 0 2:00 
0.98% > nfsd{nfsd: service} > > 859 root -4 0 9912K 1824K ufs 1 2:01 0.73% > nfsd{nfsd: service} > > 859 root -4 0 9912K 1824K ufs 1 5:56 0.49% > nfsd{nfsd: service} > > > All of this tends to exonerate the client. So what would be the next > step to track down the cause of poor performance on the server-side? > Also, as an alternative to setting the 2 sysctls for the TCP DRC cache, you can simply disable it by setting the sysctl: vfs.nfsd.cachetcp=0 Many would argue that doing a DRC for TCP isn't necessary (it wasn't done in most NFS servers, including the old NFS server in FreeBSD). I have no idea if the Linux server does a DRC for TCP. And make sure you've increased your nfsd thread count. You can put a line like this in your /etc/rc.conf to do that: nfs_server_flags="-u -t -n 64" (sets it to 64, which should be plenty for a single client.) rick > Thanks! > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to > "freebsd-net-unsubscribe@freebsd.org" > From owner-freebsd-net@FreeBSD.ORG Tue Feb 4 00:05:24 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6EEA212C for ; Tue, 4 Feb 2014 00:05:24 +0000 (UTC) Received: from mail.ijs.si (mail.ijs.si [IPv6:2001:1470:ff80::25]) by mx1.freebsd.org (Postfix) with ESMTP id 2172316C6 for ; Tue, 4 Feb 2014 00:05:24 +0000 (UTC) Received: from amavis-proxy-ori.ijs.si (localhost [IPv6:::1]) by mail.ijs.si (Postfix) with ESMTP id 3fJ5l64SyTzGN5V for ; Tue, 4 Feb 2014 01:05:22 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=ijs.si; h= user-agent:message-id:references:in-reply-to:organization :subject:subject:from:from:date:date:content-transfer-encoding 
:content-type:content-type:mime-version:received:received :received:received; s=jakla2; t=1391472319; x=1394064320; bh=jY3 Cc+sKw5R/YPbsSwKXKWAbxM9MSO/ysPQ/75RKF8s=; b=DvDfGLk6VuECtwX//yC wikoLj1eSamhJEEekvg0n8zmkpwpm8fSTbtlI5ZxNRMTo7xZvN+5fvGXvMA8Ss+n dfQO8edaytJlCfrH/v97NSma5j6fKWMw2fnMQd7QXGtchRGbHBUTSOeLAHp9MAed u/RlSv3jPymRr9fVp9bqcywA= X-Virus-Scanned: amavisd-new at ijs.si Received: from mail.ijs.si ([IPv6:::1]) by amavis-proxy-ori.ijs.si (mail.ijs.si [IPv6:::1]) (amavisd-new, port 10012) with ESMTP id SUUYNzL5eyaA for ; Tue, 4 Feb 2014 01:05:19 +0100 (CET) Received: from mildred.ijs.si (mailbox.ijs.si [IPv6:2001:1470:ff80::143:1]) by mail.ijs.si (Postfix) with ESMTP for ; Tue, 4 Feb 2014 01:05:19 +0100 (CET) Received: from neli.ijs.si (neli.ijs.si [193.2.4.95]) by mildred.ijs.si (Postfix) with ESMTP id DB5A4DC3 for ; Tue, 4 Feb 2014 01:05:19 +0100 (CET) Received: from sleepy.ijs.si ([2001:1470:ff80:e001::1:1]) by neli.ijs.si with HTTP (HTTP/1.1 POST); Tue, 04 Feb 2014 01:05:19 +0100 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Date: Tue, 04 Feb 2014 01:05:19 +0100 From: Mark Martinec To: freebsd-net@freebsd.org Subject: Re: ip6opt.c Organization: J. Stefan Institute In-Reply-To: <20140204.052625.1192023326694116318.hrs@allbsd.org> References: <20140204.052625.1192023326694116318.hrs@allbsd.org> Message-ID: <758be52b7a245c492ec80a1d3a21d79e@mailbox.ijs.si> X-Sender: Mark.Martinec+freebsd@ijs.si User-Agent: Roundcube Webmail/0.9.5 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Feb 2014 00:05:24 -0000 > Eitan Adler wrote > li> DragonFly recently committed the following change and it seems that > it > li> applies to us as well. 
--- ip6opt.c (revision 261405)
+++ ip6opt.c (working copy)
@@ -381,11 +381,8 @@ inet6_opt_init(void *extbuf, socklen_t extlen)
 {
 	struct ip6_ext *ext = (struct ip6_ext *)extbuf;

-	if (extlen < 0 || (extlen % 8))
-		return(-1);
-
 	if (ext) {
-		if (extlen == 0)
+		if (extlen == 0 || (extlen % 8))
 			return(-1);
 		ext->ip6e_len = (extlen >> 3) - 1;
 	}

2014-02-03 21:26, Hiroki Sato wrote:
> Just out of curiosity, what is the problem with returning -1 when
> (extbuf == NULL) && (extlen % 8) != 0?

It is against the spec. RFC 3542 is clear on this:

  10.1. inet6_opt_init

    int inet6_opt_init(void *extbuf, socklen_t extlen);

  This function returns the number of bytes needed for the empty
  extension header i.e., without any options. If extbuf is not NULL it
  also initializes the extension header to have the correct length
  field. In that case if the extlen value is not a positive (i.e.,
  non-zero) multiple of 8 the function fails and returns -1.

li> Should I commit it?

I'd say yes.

Mark

From owner-freebsd-net@FreeBSD.ORG Tue Feb 4 01:46:30 2014 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id F05A59EC; Tue, 4 Feb 2014 01:46:30 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id C6B051E48; Tue, 4 Feb 2014 01:46:30 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id s141kUAE047703; Tue, 4 Feb 2014 01:46:30 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.8/8.14.8/Submit) id s141kUGO047702; Tue, 4 Feb 2014 01:46:30 GMT
(envelope-from linimon) Date: Tue, 4 Feb 2014 01:46:30 GMT Message-Id: <201402040146.s141kUGO047702@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-amd64@FreeBSD.org, freebsd-net@FreeBSD.org From: linimon@FreeBSD.org Subject: Re: kern/186401: [re] Problem with RTL8111/8168B initialization X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Feb 2014 01:46:31 -0000 Old Synopsis: Problem with RTL8111/8168B initialization New Synopsis: [re] Problem with RTL8111/8168B initialization Responsible-Changed-From-To: freebsd-amd64->freebsd-net Responsible-Changed-By: linimon Responsible-Changed-When: Tue Feb 4 01:46:14 UTC 2014 Responsible-Changed-Why: reclassify. http://www.freebsd.org/cgi/query-pr.cgi?pr=186401 From owner-freebsd-net@FreeBSD.ORG Tue Feb 4 02:29:47 2014 Return-Path: Delivered-To: freebsd-net@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 1488E7D for ; Tue, 4 Feb 2014 02:29:47 +0000 (UTC) Received: from mail.allbsd.org (gatekeeper.allbsd.org [IPv6:2001:2f0:104:e001::32]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 836141242 for ; Tue, 4 Feb 2014 02:29:46 +0000 (UTC) Received: from alph.d.allbsd.org (p2106-ipbf2009funabasi.chiba.ocn.ne.jp [114.146.169.106]) (authenticated bits=128) by mail.allbsd.org (8.14.5/8.14.5) with ESMTP id s142TPwj056282 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 4 Feb 2014 11:29:35 +0900 (JST) (envelope-from hrs@FreeBSD.org) Received: from localhost (localhost [IPv6:::1]) (authenticated bits=0) by alph.d.allbsd.org (8.14.7/8.14.7) with ESMTP id s142TMhX001779; Tue, 4 Feb 2014 
11:29:25 +0900 (JST) (envelope-from hrs@FreeBSD.org) Date: Tue, 04 Feb 2014 11:15:44 +0900 (JST) Message-Id: <20140204.111544.1874646668550166600.hrs@allbsd.org> To: Mark.Martinec+freebsd@ijs.si Subject: Re: ip6opt.c From: Hiroki Sato In-Reply-To: <758be52b7a245c492ec80a1d3a21d79e@mailbox.ijs.si> References: <20140204.052625.1192023326694116318.hrs@allbsd.org> <758be52b7a245c492ec80a1d3a21d79e@mailbox.ijs.si> X-PGPkey-fingerprint: BDB3 443F A5DD B3D0 A530 FFD7 4F2C D3D8 2793 CF2D X-Mailer: Mew version 6.5 on Emacs 24.3 / Mule 6.0 (HANACHIRUSATO) Mime-Version: 1.0 Content-Type: Multipart/Signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="--Security_Multipart(Tue_Feb__4_11_15_44_2014_566)--" Content-Transfer-Encoding: 7bit X-Virus-Scanned: clamav-milter 0.97.4 at gatekeeper.allbsd.org X-Virus-Status: Clean X-Greylist: Sender DNS name whitelisted, not delayed by milter-greylist-4.2.7 (mail.allbsd.org [133.31.130.32]); Tue, 04 Feb 2014 11:29:36 +0900 (JST) X-Spam-Status: No, score=-94.3 required=13.0 tests=CONTENT_TYPE_PRESENT, RCVD_IN_PBL,RCVD_IN_RP_RNBL,SPF_SOFTFAIL,USER_IN_WHITELIST autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on gatekeeper.allbsd.org Cc: freebsd-net@FreeBSD.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Feb 2014 02:29:47 -0000 ----Security_Multipart(Tue_Feb__4_11_15_44_2014_566)-- Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Mark Martinec wrote in <758be52b7a245c492ec80a1d3a21d79e@mailbox.ijs.si>: Ma> It is against the specs. Ma> Ma> The RFC 3542 is clear on this: Ma> Ma> Ma> 10.1. inet6_opt_init Ma> Ma> int inet6_opt_init(void *extbuf, socklen_t extlen); Ma> Ma> This function returns the number of bytes needed for the empty Ma> extension header i.e., without any options. 
If extbuf is not NULL it Ma> also initializes the extension header to have the correct length Ma> field. In that case if the extlen value is not a positive (i.e., Ma> non-zero) multiple of 8 the function fails and returns -1. Ah, I see. Thank you for the clarification. Ma> li> Should I commit it? Ma> Ma> I'd say yes. I think so, too. -- Hiroki ----Security_Multipart(Tue_Feb__4_11_15_44_2014_566)-- Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iEYEABECAAYFAlLwTVAACgkQTyzT2CeTzy07EQCgj0k80qpTFSzYa6lWagokxR+p pNcAoJYVxQAwmSRsMTDOfirmkqh8hSVC =bRF0 -----END PGP SIGNATURE----- ----Security_Multipart(Tue_Feb__4_11_15_44_2014_566)---- From owner-freebsd-net@FreeBSD.ORG Tue Feb 4 03:33:22 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id B22049FC for ; Tue, 4 Feb 2014 03:33:22 +0000 (UTC) Received: from mail-ob0-x236.google.com (mail-ob0-x236.google.com [IPv6:2607:f8b0:4003:c01::236]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 7C5011B2C for ; Tue, 4 Feb 2014 03:33:22 +0000 (UTC) Received: by mail-ob0-f182.google.com with SMTP id wm4so9012586obc.27 for ; Mon, 03 Feb 2014 19:33:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=gBjTlnodYWZ94Q5Z58ILEzjUMAjPVXn3zcX+C/QBjeg=; b=ZHgWSb/K0y1rrMktdS39S7nLM3MNMqGjfPewXUL3zp4f6DkosGL/IYkV+pqf1jU1Bk zaNyJdRJwatkexqmtwhJEMf+knZvsNWHOGbpVrvhTSfAGlS6Jc0YcnKWUsBdgt/YdK0F lgyM6365cf9S5pAqit0kcg4rAdKLtjYfm4M7KJoRD+C3sBX06mxudmoTqxieZY4cISjy XN9fBObAnJabUY1Uagm0gqdfsMSpqjhWPHdR5jNlYGyln7FcgI3ubzmZf+FeHK0Se0Bs 
7U2ecahWxA7NlfI/IwqBobeADhz4dxBtoPf4FjNYU58xP7xalNVrsSVdz4BUXOF7eSjp iAfg== MIME-Version: 1.0 X-Received: by 10.60.16.168 with SMTP id h8mr33620733oed.32.1391484801787; Mon, 03 Feb 2014 19:33:21 -0800 (PST) Received: by 10.182.74.4 with HTTP; Mon, 3 Feb 2014 19:33:21 -0800 (PST) Date: Mon, 3 Feb 2014 19:33:21 -0800 Message-ID: Subject: vnet deletion panic From: Vijay Singh To: "freebsd-net@freebsd.org" Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.17 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Feb 2014 03:33:22 -0000

I'm running into a crash on vnet deletion in the presence of routing
sockets. The root cause seems to originate from:

    if_detach_internal() -> if_down(ifp) -> if_unroute() -> rt_ifmsg() ->
    rt_dispatch()

In rt_dispatch() we have:

    #ifdef VIMAGE
        if (V_loif)
            m->m_pkthdr.rcvif = V_loif;
    #endif
        netisr_queue(NETISR_ROUTE, m);

Now since this would be processed async, and the ifp above is the
loopback of the vnet being deleted, we run into accessing a freed
pointer (ifp) when netisr picks up the mbuf. So I am wondering how to
fix this. I am thinking that we could do something like the following
in rt_dispatch():

    #ifdef VIMAGE
        if (V_loif) {
            if ((ifp == V_loif) && !IS_DEFAULT_VNET(curvnet)) {
                CURVNET_SET_QUIET(vnet0);
                m->m_pkthdr.rcvif = V_loif;
                CURVNET_RESTORE();
            } else
                m->m_pkthdr.rcvif = V_loif;
        }
    #endif

So basically switch to the default vnet for the mbuf with the routing
socket message. Thoughts?

-vijay From owner-freebsd-net@FreeBSD.ORG Tue Feb 4 04:53:11 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 27ECA6E2 for ; Tue, 4 Feb 2014 04:53:11 +0000 (UTC) Received: from mail.fer.hr (mail.fer.hr [161.53.72.233]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id ADB08117F for ; Tue, 4 Feb 2014 04:53:09 +0000 (UTC) Received: from x23.lan (141.138.17.195) by MAIL.fer.hr (161.53.72.233) with Microsoft SMTP Server (TLS) id 14.2.342.3; Tue, 4 Feb 2014 05:51:56 +0100 Date: Tue, 4 Feb 2014 05:52:29 +0100 From: Marko Zec To: Vijay Singh Subject: Re: vnet deletion panic Message-ID: <20140204055229.4a52ec15@x23.lan> In-Reply-To: References: Organization: FER X-Mailer: Claws Mail 3.9.2 (GTK+ 2.24.19; amd64-portbld-freebsd9.1) MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Originating-IP: [141.138.17.195] Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Feb 2014 04:53:11 -0000 On Mon, 3 Feb 2014 19:33:21 -0800 Vijay Singh wrote: > I'm running into a crash due on vnet deletion in the presence of > routing sockets. 
The root cause seems to originate from:
>
> if_detach_internal() -> if_down(ifp) -> if_unroute() -> rt_ifmsg() ->
> rt_dispatch()
>
> In rt_dispatch() we have:
>
> #ifdef VIMAGE
>     if (V_loif)
>         m->m_pkthdr.rcvif = V_loif;
> #endif
>     netisr_queue(NETISR_ROUTE, m);
>
> Now since this would be processed async, and the ifp above is the
> loopback of the vnet being deleted, we run into accessing a freed
> pointer (ifp) when netisr picks up the mbuf. So I am wondering how to
> fix this. I am thinking that we could do something like the following
> in rt_dispatch():
>
> #ifdef VIMAGE
>     if (V_loif) {
>         if ((ifp == V_loif) && !IS_DEFAULT_VNET(curvnet)) {
>             CURVNET_SET_QUIET(vnet0);
>             m->m_pkthdr.rcvif = V_loif;
>             CURVNET_RESTORE();
>         } else
>             m->m_pkthdr.rcvif = V_loif;
>     }
> #endif
>
> So basically switch to the default vnet for the mbuf with the routing
> socket message. Thoughts?

By design, the vnet teardown procedure should not commence before the
last socket attached to that vnet is closed, so I doubt that the
proposed approach would actually cure the panics you're observing.
Furthermore, it would certainly cause bogus routing messages to appear
in vnet0 and possibly confuse routing socket consumers running there.
Plus, rt_dispatch() has no ifp context to check against V_loif at all,
which your proposed patch relies on.

Perhaps it would be possible to walk through all the netisr queues just
before V_loif gets destroyed, and prune all queued mbufs which have
m->m_pkthdr.rcvif pointing to V_loif? Since the vnet teardown procedure
cannot be initiated before all (routing) sockets attached to that vnet
have been closed, after all other ifnets except V_loif have also been
destroyed it should not be possible for new mbufs to be queued with
rcvif pointing back to V_loif, so at least conceptually that approach
might work correctly.

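A minimal user-space sketch of the pruning idea above, assuming a plain singly linked queue in place of the real netisr work queues. The names struct ifnet_stub, struct pkt, struct pktq, pktq_enqueue() and pktq_prune_rcvif() are hypothetical and do not exist in the FreeBSD kernel; the real queues are per-CPU and a kernel version would also need the appropriate locking, which is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ifnet. */
struct ifnet_stub {
	int if_index;
};

/* Hypothetical stand-in for a queued mbuf. */
struct pkt {
	struct pkt *next;
	struct ifnet_stub *rcvif;	/* plays the role of m->m_pkthdr.rcvif */
};

struct pktq {
	struct pkt *head;
	struct pkt *tail;
	size_t len;
};

/* Append a packet to the tail of the queue. */
static void
pktq_enqueue(struct pktq *q, struct pkt *p)
{
	assert(q != NULL && p != NULL);
	p->next = NULL;
	if (q->tail != NULL)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	q->len++;
}

/*
 * Unlink and free every queued packet whose rcvif points at ifp, so that
 * nothing in the queue refers to the interface once it is destroyed.
 * Returns the number of packets dropped.
 */
static size_t
pktq_prune_rcvif(struct pktq *q, const struct ifnet_stub *ifp)
{
	struct pkt **pp = &q->head;	/* link that points at the current packet */
	struct pkt *last = NULL;	/* last packet that was kept */
	size_t dropped = 0;

	while (*pp != NULL) {
		struct pkt *p = *pp;

		if (p->rcvif == ifp) {
			*pp = p->next;	/* unlink before freeing */
			free(p);
			q->len--;
			dropped++;
		} else {
			last = p;
			pp = &p->next;
		}
	}
	q->tail = last;
	return (dropped);
}
```

The pointer-to-link walk avoids a special case for removing the head, and the tail is recomputed once at the end because any number of trailing entries may have been dropped.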
Marko From owner-freebsd-net@FreeBSD.ORG Tue Feb 4 05:05:29 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 770F890C for ; Tue, 4 Feb 2014 05:05:29 +0000 (UTC) Received: from mail-oa0-x22f.google.com (mail-oa0-x22f.google.com [IPv6:2607:f8b0:4003:c02::22f]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 3C04D121E for ; Tue, 4 Feb 2014 05:05:29 +0000 (UTC) Received: by mail-oa0-f47.google.com with SMTP id m1so9188241oag.20 for ; Mon, 03 Feb 2014 21:05:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=G1+Nmm05d6xTX2FLAhc6HZtaoIKGnQp9vS0i+/m9uhE=; b=JXmrs9C8nCkObQQLNgN80FSBWpc/TlOe8jFOpz2buymt8R0zIfNLGty3c3Ah/gppIA 3BydX/U9pIKHjN/BXJ50k9x1NDlNak6XBywiazExlI5ZMju5P8N4+c9mAT7b1PvgmzUn 8wlJg3bTuHN/7qW7c6JSUmZM4dFLHhPu497GP81ZuBEnWZt1ksh+cYYmFrkAzVgpyRQm LYQgpZ2fU0xey9MWcyLAbF1PuMvgdxrNDBTgtzarLjrzm6HrVtxuKKeJNZFZ/RyCLxXY QUSzlgvp1Osud8eivm7O8unChojUzaAL+ETu/ims35K7SolQVS7nFVUiVpTyW58LeE1f CKYQ== MIME-Version: 1.0 X-Received: by 10.182.40.201 with SMTP id z9mr5428468obk.45.1391490328383; Mon, 03 Feb 2014 21:05:28 -0800 (PST) Received: by 10.182.74.4 with HTTP; Mon, 3 Feb 2014 21:05:28 -0800 (PST) In-Reply-To: <20140204055229.4a52ec15@x23.lan> References: <20140204055229.4a52ec15@x23.lan> Date: Mon, 3 Feb 2014 21:05:28 -0800 Message-ID: Subject: Re: vnet deletion panic From: Vijay Singh To: Marko Zec Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.17 Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Feb 2014 05:05:29 -0000

Hi Marko, the code in rt_ifmsg() checks what seems like global state,
so it is not specific to routing sockets in the vnet being destroyed.

    rt_ifmsg(struct ifnet *ifp)
    {
        struct if_msghdr *ifm;
        struct mbuf *m;
        struct rt_addrinfo info;

        if (route_cb.any_count == 0)
            return;

You are right, there is no ifp context in rt_dispatch(). So perhaps we
should not call rt_ifmsg() from if_unroute() if (ifp == V_loif), since
that would end up using the soon-to-be-destroyed ifp in the mbuf. What
do you think?

On Mon, Feb 3, 2014 at 8:52 PM, Marko Zec wrote:

> On Mon, 3 Feb 2014 19:33:21 -0800
> Vijay Singh wrote:
>
> > I'm running into a crash on vnet deletion in the presence of
> > routing sockets. The root cause seems to originate from:
> >
> > if_detach_internal() -> if_down(ifp) -> if_unroute() -> rt_ifmsg() ->
> > rt_dispatch()
> >
> > In rt_dispatch() we have:
> >
> > #ifdef VIMAGE
> >     if (V_loif)
> >         m->m_pkthdr.rcvif = V_loif;
> > #endif
> >     netisr_queue(NETISR_ROUTE, m);
> >
> > Now since this would be processed async, and the ifp above is the
> > loopback of the vnet being deleted, we run into accessing a freed
> > pointer (ifp) when netisr picks up the mbuf. So I am wondering how to
> > fix this. I am thinking that we could do something like the following
> > in rt_dispatch():
> >
> > #ifdef VIMAGE
> >     if (V_loif) {
> >         if ((ifp == V_loif) && !IS_DEFAULT_VNET(curvnet)) {
> >             CURVNET_SET_QUIET(vnet0);
> >             m->m_pkthdr.rcvif = V_loif;
> >             CURVNET_RESTORE();
> >         } else
> >             m->m_pkthdr.rcvif = V_loif;
> >     }
> > #endif
> >
> > So basically switch to the default vnet for the mbuf with the routing
> > socket message. Thoughts?
>
> By design, the vnet teardown procedure should not commence before the
> last socket attached to that vnet is closed, so I'm suspicious whether
> the proposed approach could actually appease the panics you're
> observing.
Furthermore, it would certainly cause bogus routing messages > to appear in vnet0 and possibly confuse routing socket consumers > running there. Plus, in rt_dispatch() there's no ifp context to check > against V_loif at all, as you're proposing your patch? > > Perhaps it could be possible to walk through all the netisr queues just > before V_loif gets destroyed, and prune all queued mbufs which have > m->m_pkthdr.rcvif pointing to V_loif? Since the vnet teardown procedure > cannot be initiated before all (routing) sockets attached to that vnet > have been closed, after all other ifnets except V_loif have also been > destroyed it should not be possible for new mbufs to be queued with > rcvif pointing back to V_loif, so at least conceptually that approach > might work correctly. > > Marko > From owner-freebsd-net@FreeBSD.ORG Tue Feb 4 05:27:01 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3AEABE88 for ; Tue, 4 Feb 2014 05:27:01 +0000 (UTC) Received: from mail.fer.hr (mail.fer.hr [161.53.72.233]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 909C61366 for ; Tue, 4 Feb 2014 05:27:00 +0000 (UTC) Received: from x23.lan (141.136.230.70) by MAIL.fer.hr (161.53.72.233) with Microsoft SMTP Server (TLS) id 14.2.342.3; Tue, 4 Feb 2014 06:26:58 +0100 Date: Tue, 4 Feb 2014 06:27:31 +0100 From: Marko Zec To: Vijay Singh Subject: Re: vnet deletion panic Message-ID: <20140204062731.2e9d4bc0@x23.lan> In-Reply-To: References: <20140204055229.4a52ec15@x23.lan> Organization: FER X-Mailer: Claws Mail 3.9.2 (GTK+ 2.24.19; amd64-portbld-freebsd9.1) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="MP_/.h=lNEG+_Pdgcpx_45mRJX2" X-Originating-IP: [141.136.230.70] Cc: 
"freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Feb 2014 05:27:01 -0000 --MP_/.h=lNEG+_Pdgcpx_45mRJX2 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit Content-Disposition: inline On Mon, 3 Feb 2014 21:05:28 -0800 Vijay Singh wrote: > Hi Marco, the code in rt_ifmsg() checks what seems like global state, > so its not routing sockets in the vnet being destroyed. Huh then we should V_irtualize that global state - maybe the attached patch could help (only compile-tested, perhaps you could try it out)? Marko > rt_ifmsg(struct ifnet *ifp) > { > struct if_msghdr *ifm; > struct mbuf *m; > struct rt_addrinfo info; > > if (route_cb.any_count == 0) > return; > > You are right, there is no ifp context in rt_dispatch(). So perhaps we > should not call rt_ifmsg() from if_unroute() is (ifp == V_loif) since > that would end up using the soon to be destroyed ifp in the mbuf. > What do you think? > > > On Mon, Feb 3, 2014 at 8:52 PM, Marko Zec wrote: > > > On Mon, 3 Feb 2014 19:33:21 -0800 > > Vijay Singh wrote: > > > > > I'm running into a crash due on vnet deletion in the presence of > > > routing sockets. The root cause seems to originate from(): > > > > > > if_detach_internal() -> if_down(ifp) -> if_unroute() -> > > > rt_ifmsg() -> rt_dispatch() > > > > > > In rt_dispatch() we have: > > > > > > #ifdef VIMAGE > > > if (V_loif) > > > m->m_pkthdr.rcvif = V_loif; > > > #endif > > > netisr_queue(NETISR_ROUTE, m); > > > > > > Now since this would be processed async, and the ifp alove is the > > > loopback of the vnet being deleted, we run into accessing a freed > > > pointer (ifp) when netisr picks up the mbuf. So I am wondering > > > how to fix this. 
I am thinking that we could do something like > > > the following in rt_dispatch(): > > > > > > #ifdef VIMAGE > > > if (V_loif) { > > > if ((ifp == V_loif) && !IS_DEFAULT_VNET(curvnet)) { > > > CURVNET_SET_QUIET(vnet0); > > > m->m_pkthdr.rcvif = V_loif; > > > CURVNET_RESTORE(); > > > } else > > > m->m_pkthdr.rcvif = V_loif; > > > } > > > #endif > > > > > > So basically switch to the default vnet for the mbuf with the > > > routing socket message. Thoughts? > > > > By design, the vnet teardown procedure should not commence before > > the last socket attached to that vnet is closed, so I'm suspicious > > whether the proposed approach could actually appease the panics > > you're observing. Furthermore, it would certainly cause bogus > > routing messages to appear in vnet0 and possibly confuse routing > > socket consumers running there. Plus, in rt_dispatch() there's no > > ifp context to check against V_loif at all, as you're proposing > > your patch? > > > > Perhaps it could be possible to walk through all the netisr queues > > just before V_loif gets destroyed, and prune all queued mbufs which > > have m->m_pkthdr.rcvif pointing to V_loif? Since the vnet teardown > > procedure cannot be initiated before all (routing) sockets attached > > to that vnet have been closed, after all other ifnets except V_loif > > have also been destroyed it should not be possible for new mbufs to > > be queued with rcvif pointing back to V_loif, so at least > > conceptually that approach might work correctly. 
> >
> > Marko
> >

--MP_/.h=lNEG+_Pdgcpx_45mRJX2
Content-Type: text/x-patch
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="v_route_cb.diff"

Index: rtsock.c
===================================================================
--- rtsock.c	(revision 258393)
+++ rtsock.c	(working copy)
@@ -156,12 +156,14 @@
 #define	RTS_FILTER_FIB	M_PROTO8
 #define	RTS_ALLFIBS	-1
 
-static struct {
+typedef struct {
 	int	ip_count;	/* attached w/ AF_INET */
 	int	ip6_count;	/* attached w/ AF_INET6 */
 	int	ipx_count;	/* attached w/ AF_IPX */
 	int	any_count;	/* total attached */
-} route_cb;
+} route_cb_t;
+static VNET_DEFINE(route_cb_t, route_cb);
+#define	V_route_cb VNET(route_cb)
 
 struct mtx rtsock_mtx;
 MTX_SYSINIT(rtsock, &rtsock_mtx, "rtsock route_cb lock", MTX_DEF);
@@ -328,16 +330,16 @@
 	RTSOCK_LOCK();
 	switch(rp->rcb_proto.sp_protocol) {
 	case AF_INET:
-		route_cb.ip_count++;
+		V_route_cb.ip_count++;
 		break;
 	case AF_INET6:
-		route_cb.ip6_count++;
+		V_route_cb.ip6_count++;
 		break;
 	case AF_IPX:
-		route_cb.ipx_count++;
+		V_route_cb.ipx_count++;
 		break;
 	}
-	route_cb.any_count++;
+	V_route_cb.any_count++;
 	RTSOCK_UNLOCK();
 	soisconnected(so);
 	so->so_options |= SO_USELOOPBACK;
@@ -372,16 +374,16 @@
 	RTSOCK_LOCK();
 	switch(rp->rcb_proto.sp_protocol) {
 	case AF_INET:
-		route_cb.ip_count--;
+		V_route_cb.ip_count--;
 		break;
 	case AF_INET6:
-		route_cb.ip6_count--;
+		V_route_cb.ip6_count--;
 		break;
 	case AF_IPX:
-		route_cb.ipx_count--;
+		V_route_cb.ipx_count--;
 		break;
 	}
-	route_cb.any_count--;
+	V_route_cb.any_count--;
 	RTSOCK_UNLOCK();
 	raw_usrreqs.pru_detach(so);
 }
@@ -928,7 +930,7 @@
 		 * Check to see if we don't want our own messages.
 		 */
 		if ((so->so_options & SO_USELOOPBACK) == 0) {
-			if (route_cb.any_count <= 1) {
+			if (V_route_cb.any_count <= 1) {
 				if (rtm)
 					Free(rtm);
 				m_freem(m);
@@ -1217,7 +1219,7 @@
 	struct mbuf *m;
 	struct sockaddr *sa = rtinfo->rti_info[RTAX_DST];
 
-	if (route_cb.any_count == 0)
+	if (V_route_cb.any_count == 0)
 		return;
 	m = rt_msg1(type, rtinfo);
 	if (m == NULL)
@@ -1255,7 +1257,7 @@
 	struct mbuf *m;
 	struct rt_addrinfo info;
 
-	if (route_cb.any_count == 0)
+	if (V_route_cb.any_count == 0)
 		return;
 	bzero((caddr_t)&info, sizeof(info));
 	m = rt_msg1(RTM_IFINFO, &info);
@@ -1299,7 +1301,7 @@
 	sctp_addr_change(ifa, cmd);
 #endif /* SCTP */
 #endif
-	if (route_cb.any_count == 0)
+	if (V_route_cb.any_count == 0)
 		return;
 	for (pass = 1; pass < 3; pass++) {
 		bzero((caddr_t)&info, sizeof(info));
@@ -1368,7 +1370,7 @@
 	struct ifnet *ifp = ifma->ifma_ifp;
 	struct ifma_msghdr *ifmam;
 
-	if (route_cb.any_count == 0)
+	if (V_route_cb.any_count == 0)
 		return;
 
 	bzero((caddr_t)&info, sizeof(info));
@@ -1397,7 +1399,7 @@
 	struct if_announcemsghdr *ifan;
 	struct mbuf *m;
 
-	if (route_cb.any_count == 0)
+	if (V_route_cb.any_count == 0)
 		return NULL;
 	bzero((caddr_t)info, sizeof(*info));
 	m = rt_msg1(type, info);

--MP_/.h=lNEG+_Pdgcpx_45mRJX2--

From owner-freebsd-net@FreeBSD.ORG Tue Feb 4 07:28:00 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 669968E5; Tue, 4 Feb 2014 07:28:00 +0000 (UTC) Received: from cc-smtpout2.netcologne.de (cc-smtpout2.netcologne.de [IPv6:2001:4dd0:100:1062:25:2:0:2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 1E49E1B65; Tue, 4 Feb 2014 07:28:00 +0000 (UTC) Received: from cc-smtpin1.netcologne.de (cc-smtpin1.netcologne.de [89.1.8.201]) by cc-smtpout2.netcologne.de (Postfix) with 
ESMTP id 66DE6121AF; Tue, 4 Feb 2014 08:27:50 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by cc-smtpin1.netcologne.de (Postfix) with ESMTP id 61E9D11DBE; Tue, 4 Feb 2014 08:27:50 +0100 (CET) Received: from [109.44.3.238] (helo=cc-smtpin1.netcologne.de) by localhost with ESMTP (eXpurgate 4.0.2) (envelope-from ) id 52f09676-1380-7f0000012729-7f000001831d-1 for ; Tue, 04 Feb 2014 08:27:50 +0100 Received: from collider (ip-109-44-3-238.web.vodafone.de [109.44.3.238]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by cc-smtpin1.netcologne.de (Postfix) with ESMTPSA; Tue, 4 Feb 2014 08:27:40 +0100 (CET) Content-Type: text/plain; charset=iso-8859-15; format=flowed; delsp=yes To: lists@eitanadler.com, "Hiroki Sato" Subject: Re: ip6opt.c References: <20140204.052625.1192023326694116318.hrs@allbsd.org> Date: Tue, 04 Feb 2014 08:27:36 +0100 MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: "Sascha Wildner" Message-ID: In-Reply-To: <20140204.052625.1192023326694116318.hrs@allbsd.org> User-Agent: Opera Mail/1.0 (Win32) Cc: freebsd-net@freebsd.org, swildner@dragonflybsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Feb 2014 07:28:00 -0000 On Mon, 03 Feb 2014 21:26:25 +0100, Hiroki Sato wrote: > Eitan Adler wrote > in > : > > li> Hi all, > li> > li> DragonFly recently committed the following change and it seems that > it > li> applies to us as well. > li> > li> > http://gitweb.dragonflybsd.org/dragonfly.git/blobdiff/5764e12516158974fac10d50dbd2df76ce1ab007..98651c6e0e1c3b7a6b8650b55b473fcc745a22b7:/lib/libc/net/ip6opt.c > li> > li> Should I commit it? > > Just out of curiousity, what is the problem with returning -1 when > (extbuf == NULL) && (extlen % 8) != 0? 
I read the RFC as saying that the extlen check should be performed in the extbuf != NULL case, not when it is NULL. No real problem, really. Sascha From owner-freebsd-net@FreeBSD.ORG Tue Feb 4 09:38:39 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 63F26447 for ; Tue, 4 Feb 2014 09:38:39 +0000 (UTC) Received: from mx11.netapp.com (mx11.netapp.com [216.240.18.76]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 317771715 for ; Tue, 4 Feb 2014 09:38:38 +0000 (UTC) X-IronPort-AV: E=Sophos;i="4.95,778,1384329600"; d="asc'?scan'208";a="99927825" Received: from vmwexceht04-prd.hq.netapp.com ([10.106.77.34]) by mx11-out.netapp.com with ESMTP; 04 Feb 2014 01:38:30 -0800 Received: from SACEXCMBX01-PRD.hq.netapp.com ([169.254.2.211]) by vmwexceht04-prd.hq.netapp.com ([10.106.77.34]) with mapi id 14.03.0123.003; Tue, 4 Feb 2014 01:38:30 -0800 From: "Eggert, Lars" To: "freebsd-net@freebsd.org" Subject: Patches for RFC6937 and draft-ietf-tcpm-newcwv-00 Thread-Topic: Patches for RFC6937 and draft-ietf-tcpm-newcwv-00 Thread-Index: AQHPIYzc3HaTSw//+U6b0HMYTUDKSQ== Date: Tue, 4 Feb 2014 09:38:30 +0000 Message-ID: <259C9434-C6FE-42EA-823D-ECB024DBF3D7@netapp.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: yes X-MS-TNEF-Correlator: x-originating-ip: [10.106.53.51] Content-Type: multipart/signed; boundary="Apple-Mail=_5088A38A-DF43-450D-9E03-31C51BD176C7"; protocol="application/pgp-signature"; micalg=pgp-sha1 MIME-Version: 1.0 Cc: "varis81@hotmail.com" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Feb 2014 
09:38:39 -0000 --Apple-Mail=_5088A38A-DF43-450D-9E03-31C51BD176C7 Content-Type: multipart/mixed; boundary="Apple-Mail=_E3102E34-B41D-47A0-AB2F-5E03D718EBE4" --Apple-Mail=_E3102E34-B41D-47A0-AB2F-5E03D718EBE4 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii

Hi,

below are two patches that implement RFC6937 ("Proportional Rate Reduction for TCP") and draft-ietf-tcpm-newcwv-00 ("Updating TCP to support Rate-Limited Traffic"). They were done by Aris Angelogiannopoulos for his MS thesis, which is at https://eggert.org/students/angelogiannopoulos-thesis.pdf.

The patches should apply to -CURRENT as of Sep 17, 2013. (Sorry for the delay in sending them, we'd been trying to get some feedback from committers first, without luck.)

Please note that newcwv is still a work in progress in the IETF, and the patch has some limitations with regards to the "pipeACK Sampling Period" mentioned in the Internet-Draft. Aris says this in his thesis about what exactly he implemented:

"The second implementation choice, is in regards with the measurement of pipeACK. This variable is the most important introduced by the method and is used to compute the phase that the sender currently lies in. In order to compute pipeACK the approach suggested by the Internet Draft (ID) is followed [ncwv]. During initialization, pipeACK is set to the maximum possible value. A helper variable prevHighACK is introduced that is initialized to the initial sequence number (iss). prevHighACK holds the value of the highest acknowledged byte so far. pipeACK is measured once per RTT meaning that when an ACK covering prevHighACK is received, pipeACK becomes the difference between the current ACK and prevHighACK. This is called a pipeACK sample. A newer version of the draft suggests that multiple pipeACK samples can be used during the pipeACK sampling period."

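A minimal user-space sketch of the single-sample pipeACK measurement described in the quoted passage. It is not taken from the attached patches. The quoted text does not say how the once-per-RTT boundary is detected, so this sketch assumes a separate marker, sample_end, that records snd_nxt when a sample starts; a sample is then taken when a cumulative ACK reaches that marker. The names struct pipeack_state, pipeack_init() and pipeack_on_ack() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Modular TCP sequence comparison: true if a is at or after b. */
#define	SEQ_GEQ(a, b)	((int32_t)((uint32_t)(a) - (uint32_t)(b)) >= 0)

struct pipeack_state {
	uint32_t pipeack;	/* last sample in bytes; UINT32_MAX means none yet */
	uint32_t prev_high_ack;	/* highest ACK seen when the current sample began */
	uint32_t sample_end;	/* ASSUMPTION: snd_nxt when the current sample began */
};

/* pipeACK starts at the maximum value, prevHighACK at the iss. */
static void
pipeack_init(struct pipeack_state *ps, uint32_t iss)
{
	assert(ps != NULL);
	ps->pipeack = UINT32_MAX;
	ps->prev_high_ack = iss;
	ps->sample_end = iss;
}

/*
 * Feed one cumulative ACK. A sample is taken only when the ACK covers
 * everything that was outstanding when the previous sample was taken,
 * which is roughly once per RTT. Returns true if pipeack was updated.
 */
static bool
pipeack_on_ack(struct pipeack_state *ps, uint32_t th_ack, uint32_t snd_nxt)
{
	assert(ps != NULL);
	if (!SEQ_GEQ(th_ack, ps->sample_end))
		return (false);
	ps->pipeack = th_ack - ps->prev_high_ack;	/* modular, wrap-safe */
	ps->prev_high_ack = th_ack;
	ps->sample_end = snd_nxt;
	return (true);
}
```

For example, with iss 1000, an ACK for 2000 while snd_nxt is 6000 yields a first sample of 1000 bytes; a later ACK for 4000 is ignored because it does not reach 6000; an ACK for 6000 then yields a sample of 4000 bytes.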
Lars --Apple-Mail=_E3102E34-B41D-47A0-AB2F-5E03D718EBE4 Content-Disposition: attachment; filename=prr.patch Content-Type: application/octet-stream; name="prr.patch" Content-Transfer-Encoding: 7bit diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c index 75609fd..70c29a8 100644 --- a/sys/netinet/tcp_input.c +++ b/sys/netinet/tcp_input.c @@ -145,6 +145,18 @@ SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, drop_synfin, CTLFLAG_RW, &VNET_NAME(drop_synfin), 0, "Drop TCP packets with SYN+FIN set"); +VNET_DEFINE(int, tcp_do_prr_conservative) = 0; +#define V_tcp_do_prr_conservative VNET(tcp_do_prr_conservative) +SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, do_prr_conservative, CTLFLAG_RW, + &VNET_NAME(tcp_do_prr_conservative), 0, + "Do conservative PRR"); + +VNET_DEFINE(int, tcp_do_prr) = 0; +#define V_tcp_do_prr VNET(tcp_do_prr) +SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, do_prr, CTLFLAG_RW, + &VNET_NAME(tcp_do_prr), 0, + "Do the Proportional Rate Reduction Algorithm"); + VNET_DEFINE(int, tcp_do_rfc3042) = 1; #define V_tcp_do_rfc3042 VNET(tcp_do_rfc3042) SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, rfc3042, CTLFLAG_RW, @@ -229,6 +241,7 @@ static void tcp_pulloutofband(struct socket *, struct tcphdr *, struct mbuf *, int); static void tcp_xmit_timer(struct tcpcb *, int); static void tcp_newreno_partial_ack(struct tcpcb *, struct tcphdr *); +static void tcp_prr_partial_ack(struct tcpcb *, struct tcphdr *); static void inline tcp_fields_to_host(struct tcphdr *); #ifdef TCP_SIGNATURE static void inline tcp_fields_to_net(struct tcphdr *); @@ -2460,7 +2473,50 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th, struct socket *so, else if (++tp->t_dupacks > tcprexmtthresh || IN_FASTRECOVERY(tp->t_flags)) { cc_ack_received(tp, th, CC_DUPACK); - if ((tp->t_flags & TF_SACK_PERMIT) && + if (V_tcp_do_prr && + IN_FASTRECOVERY(tp->t_flags) && + (tp->t_flags & TF_SACK_PERMIT)) { + long snd_cnt = 0, limit = 0, del_data = 0, pipe = 0; + /* + *In a duplicate ACK del_data is only the + 
*diff_in_sack. If no SACK is used del_data will be 0. + *Pipe is the amount of data we estimate to be + *in the network. + */ + del_data = tp->diff_in_sack; + pipe = (tp->snd_nxt - tp->snd_fack) + + tp->sackhint.sack_bytes_rexmit; + tp->prr_delivered += del_data; + if (pipe > tp->snd_ssthresh) + snd_cnt = (tp->prr_delivered * tp->snd_ssthresh / + tp->recover_fs) + 1 - tp->prr_out; + else { + if (V_tcp_do_prr_conservative) + limit = tp->prr_delivered - tp->prr_out; + else + if ((tp->prr_delivered - tp->prr_out) > del_data) + limit = tp->prr_delivered - tp->prr_out + + tp->t_maxseg; + else + limit = del_data + tp->t_maxseg; + if ((tp->snd_ssthresh - pipe) < limit) + snd_cnt = tp->snd_ssthresh - pipe; + else + snd_cnt = limit; + } + snd_cnt = (snd_cnt / tp->t_maxseg); + if (snd_cnt < 0) + snd_cnt = 0; + /* + * Send snd_cnt new data into the network in + * response to this ack.If there is gonna be a + * SACK retransmission, adjust snd_cwnd + * accordingly. + */ + tp->snd_cwnd = tp->snd_nxt - tp->sack_newdata + + tp->sackhint.sack_bytes_rexmit + (snd_cnt*tp->t_maxseg); + } + else if ((tp->t_flags & TF_SACK_PERMIT) && IN_FASTRECOVERY(tp->t_flags)) { int awnd; @@ -2495,12 +2551,18 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th, struct socket *so, tcp_seq onxt = tp->snd_nxt; /* - * If we're doing sack, check to - * see if we're already in sack + * If we're doing sack or prr, check to + * see if we're already in * recovery. If we're not doing sack, * check to see if we're in newreno * recovery. */ + if (V_tcp_do_prr) { + if (IN_FASTRECOVERY(tp->t_flags)) { + tp->t_dupacks = 0; + break; + } + } if (tp->t_flags & TF_SACK_PERMIT) { if (IN_FASTRECOVERY(tp->t_flags)) { tp->t_dupacks = 0; @@ -2518,6 +2580,15 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th, struct socket *so, cc_ack_received(tp, th, CC_DUPACK); tcp_timer_activate(tp, TT_REXMT, 0); tp->t_rtttime = 0; + if (V_tcp_do_prr) { + /* + * snd_ssthresh is already updated by cc_cong_signal. 
+ */ + tp->prr_delivered = 0; + tp->prr_out = 0; + if(!(tp->recover_fs = tp->snd_nxt - tp->snd_una)) + tp->recover_fs = 1; + } if (tp->t_flags & TF_SACK_PERMIT) { TCPSTAT_INC( tcps_sack_recovery_episode); @@ -2614,7 +2685,9 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th, struct socket *so, */ if (IN_FASTRECOVERY(tp->t_flags)) { if (SEQ_LT(th->th_ack, tp->snd_recover)) { - if (tp->t_flags & TF_SACK_PERMIT) + if (V_tcp_do_prr && (tp->t_flags & TF_SACK_PERMIT)) + tcp_prr_partial_ack(tp, th); + else if (tp->t_flags & TF_SACK_PERMIT) tcp_sack_partialack(tp, th); else tcp_newreno_partial_ack(tp, th); @@ -3692,6 +3765,57 @@ tcp_mssopt(struct in_conninfo *inc) return (mss); } +static void +tcp_prr_partial_ack(struct tcpcb *tp, struct tcphdr *th) +{ + long snd_cnt = 0, limit = 0, del_data = 0, pipe = 0; + + INP_WLOCK_ASSERT(tp->t_inpcb); + + tcp_timer_activate(tp, TT_REXMT, 0); + tp->t_rtttime = 0; + /* + * Compute amount of data that this ACK is indicating (del_data) + * and an estimate of how many bytes are in the network. + */ + if (SEQ_GEQ(th->th_ack,tp->snd_una)) + del_data = BYTES_THIS_ACK(tp, th); + del_data += tp->diff_in_sack; + pipe = (tp->snd_nxt - tp->snd_fack) + tp->sackhint.sack_bytes_rexmit; + tp->prr_delivered += del_data; + /* + * Proportional Rate Reduction + */ + if (pipe > tp->snd_ssthresh) + snd_cnt = (tp->prr_delivered * tp->snd_ssthresh / tp->recover_fs) - + tp->prr_out; + else { + if (V_tcp_do_prr_conservative) + limit = tp->prr_delivered - tp->prr_out; + else + if ((tp->prr_delivered - tp->prr_out) > del_data) + limit = tp->prr_delivered - tp->prr_out + tp->t_maxseg; + else + limit = del_data + tp->t_maxseg; + if ((tp->snd_ssthresh - pipe) < limit) + snd_cnt = tp->snd_ssthresh - pipe; + else + snd_cnt = limit; + } + snd_cnt = (snd_cnt / tp->t_maxseg); + if (snd_cnt < 0) + snd_cnt = 0; + /* + * Send snd_cnt new data into the network + * in response to this ack. + * If there is gonna be a SACK retransmission, + * adjust snd_cwnd accordingly. 
+ */ + tp->snd_cwnd = tp->snd_nxt - tp->sack_newdata + + tp->sackhint.sack_bytes_rexmit + (snd_cnt * tp->t_maxseg); + tp->t_flags |= TF_ACKNOW; + (void) tcp_output(tp); +} /* * On a partial ack arrives, force the retransmission of the diff --git a/sys/netinet/tcp_output.c b/sys/netinet/tcp_output.c index 00d5415..7b4936d 100644 --- a/sys/netinet/tcp_output.c +++ b/sys/netinet/tcp_output.c @@ -1194,6 +1194,8 @@ send: ((so->so_options & SO_DONTROUTE) ? IP_ROUTETOIF : 0), NULL, NULL, tp->t_inpcb); + if (V_tcp_do_prr && IN_FASTRECOVERY(tp->t_flags)) + tp->prr_out += len; if (error == EMSGSIZE && ro.ro_rt != NULL) mtu = ro.ro_rt->rt_rmx.rmx_mtu; RO_RTFREE(&ro); @@ -1232,6 +1234,8 @@ send: ((so->so_options & SO_DONTROUTE) ? IP_ROUTETOIF : 0), 0, tp->t_inpcb); + if (V_tcp_do_prr && IN_FASTRECOVERY(tp->t_flags)) + tp->prr_out += len; if (error == EMSGSIZE && ro.ro_rt != NULL) mtu = ro.ro_rt->rt_rmx.rmx_mtu; RO_RTFREE(&ro); @@ -1323,6 +1327,8 @@ timer: * XXX: It is a POLA question whether calling tcp_drop right * away would be the really correct behavior instead. */ + if (V_tcp_do_prr && IN_FASTRECOVERY(tp->t_flags)) + tp->prr_out -= len; if (((tp->t_flags & TF_FORCEDATA) == 0 || !tcp_timer_active(tp, TT_PERSIST)) && ((flags & TH_SYN) == 0) && diff --git a/sys/netinet/tcp_sack.c b/sys/netinet/tcp_sack.c index 440bd64..800df2f 100644 --- a/sys/netinet/tcp_sack.c +++ b/sys/netinet/tcp_sack.c @@ -348,9 +348,10 @@ tcp_sackhole_remove(struct tcpcb *tp, struct sackhole *hole) void tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tcp_seq th_ack) { - struct sackhole *cur, *temp; + struct sackhole *cur, *temp, *temp1; struct sackblk sack, sack_blocks[TCP_MAX_SACK + 1], *sblkp; int i, j, num_sack_blks; + tcp_seq old = 0, new = 0; INP_WLOCK_ASSERT(tp->t_inpcb); @@ -382,13 +383,25 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tcp_seq th_ack) sack_blocks[num_sack_blks++] = sack; } } + if (TAILQ_EMPTY(&tp->snd_holes)) + /* + * Empty scoreboard. 
Need to initialize snd_fack (it may be + * uninitialized or have a bogus value). Scoreboard holes + * (from the sack blocks received) are created later below + * (in the logic that adds holes to the tail of the + * scoreboard). + */ + tp->snd_fack = SEQ_MAX(tp->snd_una, th_ack); /* * Return if SND.UNA is not advanced and no valid SACK block is - * received. + * received.If no new valid SACK block the scoreboard remains + * the same, i.e. the difference is 0. */ - if (num_sack_blks == 0) + if (num_sack_blks == 0){ + if (V_tcp_do_prr) + tp->diff_in_sack = 0; return; - + } /* * Sort the SACK blocks so we can update the scoreboard with just one * pass. The overhead of sorting upto 4+1 elements is less than @@ -403,15 +416,14 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tcp_seq th_ack) } } } - if (TAILQ_EMPTY(&tp->snd_holes)) - /* - * Empty scoreboard. Need to initialize snd_fack (it may be - * uninitialized or have a bogus value). Scoreboard holes - * (from the sack blocks received) are created later below - * (in the logic that adds holes to the tail of the - * scoreboard). - */ - tp->snd_fack = SEQ_MAX(tp->snd_una, th_ack); + if (V_tcp_do_prr) + if(!TAILQ_EMPTY(&tp->snd_holes)) + TAILQ_FOREACH(temp, &tp->snd_holes, scblink) { + if ((temp1 = TAILQ_NEXT(temp, scblink)) != NULL) + old += temp1->start - temp->end; + else if (SEQ_GT(tp->snd_fack, temp->end)) + old += tp->snd_fack - temp->end; + } /* * In the while-loop below, incoming SACK blocks (sack_blocks[]) and * SACK holes (snd_holes) are traversed from their tails with just @@ -540,6 +552,19 @@ tcp_sack_doack(struct tcpcb *tp, struct tcpopt *to, tcp_seq th_ack) else sblkp--; } + /* + * Calculate number of bytes in the scoreboard. 
+ */ + if (V_tcp_do_prr) + if (!TAILQ_EMPTY(&tp->snd_holes)) + TAILQ_FOREACH(temp, &tp->snd_holes, scblink) { + if ((temp1 = TAILQ_NEXT(temp, scblink)) != NULL) + new += temp1->start - temp->end; + else if (SEQ_GT(tp->snd_fack, temp->end)) + new += tp->snd_fack - temp->end; + } + /* Change in the scoreboard in # of bytes */ + tp->diff_in_sack = new - old; } /* diff --git a/sys/netinet/tcp_subr.c b/sys/netinet/tcp_subr.c index 5d37b50..089d8c6 100644 --- a/sys/netinet/tcp_subr.c +++ b/sys/netinet/tcp_subr.c @@ -801,6 +801,7 @@ tcp_newtcpcb(struct inpcb *inp) tp->t_rxtcur = TCPTV_RTOBASE; tp->snd_cwnd = TCP_MAXWIN << TCP_MAX_WINSHIFT; tp->snd_ssthresh = TCP_MAXWIN << TCP_MAX_WINSHIFT; + tp->diff_in_sack = 0; tp->t_rcvtime = ticks; /* * IPv4 TTL initialization is necessary for an IPv6 socket as well, diff --git a/sys/netinet/tcp_var.h b/sys/netinet/tcp_var.h index aaaa4a4..fe1507e 100644 --- a/sys/netinet/tcp_var.h +++ b/sys/netinet/tcp_var.h @@ -161,6 +161,11 @@ struct tcpcb { u_long t_rttupdated; /* number of times rtt sampled */ u_long max_sndwnd; /* largest window peer has offered */ + tcp_seq prr_delivered; /* Total bytes delivered during PRR recovery */ + tcp_seq prr_out; /* Total bytes sent during PRR recovery */ + tcp_seq recover_fs; /* FlightSize at the start of PRR recovery */ + tcp_seq diff_in_sack; /* (Signed) Difference of data in scoreboard due to the current ACK */ + int t_softerror; /* possible error not yet reported */ /* out-of-band data */ char t_oobflags; /* have some */ @@ -174,6 +179,7 @@ struct tcpcb { u_int32_t ts_offset; /* our timestamp offset */ tcp_seq last_ack_sent; + /* experimental */ u_long snd_cwnd_prev; /* cwnd prior to retransmit */ u_long snd_ssthresh_prev; /* ssthresh prior to retransmit */ @@ -627,8 +633,10 @@ VNET_DECLARE(int, tcp_abc_l_var); #define V_tcp_abc_l_var VNET(tcp_abc_l_var) VNET_DECLARE(int, tcp_do_sack); /* SACK enabled/disabled */ +VNET_DECLARE(int, tcp_do_prr); /* PRR enabled/disabled */ VNET_DECLARE(int, 
tcp_sc_rst_sock_fail); /* RST on sock alloc failure */ #define V_tcp_do_sack VNET(tcp_do_sack) +#define V_tcp_do_prr VNET(tcp_do_prr) #define V_tcp_sc_rst_sock_fail VNET(tcp_sc_rst_sock_fail) VNET_DECLARE(int, tcp_do_ecn); /* TCP ECN enabled/disabled */ --Apple-Mail=_E3102E34-B41D-47A0-AB2F-5E03D718EBE4 Content-Disposition: attachment; filename=newcwv.patch Content-Type: application/octet-stream; name="newcwv.patch" Content-Transfer-Encoding: 7bit diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c index 75609fd..0d11d9f 100644 --- a/sys/netinet/tcp_input.c +++ b/sys/netinet/tcp_input.c @@ -145,6 +145,12 @@ SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, drop_synfin, CTLFLAG_RW, &VNET_NAME(drop_synfin), 0, "Drop TCP packets with SYN+FIN set"); +VNET_DEFINE(int, tcp_do_ncwv) = 0; +#define V_tcp_do_ncwv VNET(tcp_do_ncwv) +SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, do_ncwv, CTLFLAG_RW, + &VNET_NAME(tcp_do_ncwv), 0, + "Do New-CWV targeted to rate-limited applications"); + VNET_DEFINE(int, tcp_do_rfc3042) = 1; #define V_tcp_do_rfc3042 VNET(tcp_do_rfc3042) SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, rfc3042, CTLFLAG_RW, @@ -228,6 +234,7 @@ static void tcp_dropwithreset(struct mbuf *, struct tcphdr *, static void tcp_pulloutofband(struct socket *, struct tcphdr *, struct mbuf *, int); static void tcp_xmit_timer(struct tcpcb *, int); +static void ncwv_check_phase(struct tcpcb *, struct tcphdr *); static void tcp_newreno_partial_ack(struct tcpcb *, struct tcphdr *); static void inline tcp_fields_to_host(struct tcphdr *); #ifdef TCP_SIGNATURE @@ -289,6 +296,7 @@ static void inline cc_ack_received(struct tcpcb *tp, struct tcphdr *th, uint16_t type) { INP_WLOCK_ASSERT(tp->t_inpcb); + int use_cc_algo=1; tp->ccv->bytes_this_ack = BYTES_THIS_ACK(tp, th); if (tp->snd_cwnd <= tp->snd_wnd) @@ -310,7 +318,12 @@ cc_ack_received(struct tcpcb *tp, struct tcphdr *th, uint16_t type) } } - if (CC_ALGO(tp)->ack_received != NULL) { + if (V_tcp_do_ncwv) { + ncwv_check_phase(tp,th); + if 
(!IN_VALPHASE(tp->t_flags)) + use_cc_algo = 0; + } + if (use_cc_algo && CC_ALGO(tp)->ack_received != NULL) { /* XXXLAS: Find a way to live without this */ tp->ccv->curack = th->th_ack; CC_ALGO(tp)->ack_received(tp->ccv, type); @@ -384,6 +397,10 @@ cc_conn_init(struct tcpcb *tp) tp->snd_cwnd = 4 * tp->t_maxseg; } + if (V_tcp_do_ncwv) { + tp->max_ack_prev = tp->iss; + tp->IW = tp->snd_cwnd; + } if (CC_ALGO(tp)->conn_init != NULL) CC_ALGO(tp)->conn_init(tp->ccv); } @@ -433,6 +450,28 @@ cc_cong_signal(struct tcpcb *tp, struct tcphdr *th, uint32_t type) break; } + /* + *Exit the NVP, that means, stop the 5 min counter. + *Flag on unset and stopped timer means that + *we have exited the NVP. Set pipeAck to max value. + *Record Flightsize so as to use it at the end + *of the congestion. + */ + if (V_tcp_do_ncwv && type != CC_DUPACK && type != CC_RTO_ERR) { + /* + *Exit the NVP, that means, stop the 5 min counter. + *Flag on unset and stopped timer means that + *we have exited the NVP. Set pipeAck to max value. + *Record Flightsize so as to use it at the end + *of the congestion. + */ + tp->lossflightsize = tp->snd_max - tp->snd_una; + tcp_timer_activate(tp, TT_NVP, 0); + EXIT_VALPHASE(tp->t_flags); + tp->pipeack = TCP_MAXWIN << TCP_MAX_WINSHIFT; + if(type != CC_RTO) + tp->snd_cwnd = min(tp->snd_cwnd/2,max(tp->pipeack,tp->lossflightsize)); + } if (CC_ALGO(tp)->cong_signal != NULL) { if (th != NULL) tp->ccv->curack = th->th_ack; @@ -451,6 +490,16 @@ cc_post_recovery(struct tcpcb *tp, struct tcphdr *th) tp->ccv->curack = th->th_ack; CC_ALGO(tp)->post_recovery(tp->ccv); } + if (V_tcp_do_ncwv) + /* + * Fast recovery will conclude after returning from this + * function. 
Reset the cwin to the value specified by the draft, + * cwnd = ((FlightSize - R)/2), if SACK is used, standard behaviour + * otherwise and also reset pipeACK + */ + tp->pipeack = TCP_MAXWIN << TCP_MAX_WINSHIFT; + if ((tp->t_flags & TF_SACK_PERMIT)) + tp->snd_cwnd = (tp->lossflightsize - tp->sackhint.sack_bytes_rexmit) / 2; /* XXXLAS: EXIT_RECOVERY ? */ tp->t_bytes_acked = 0; } @@ -3692,6 +3741,30 @@ tcp_mssopt(struct in_conninfo *inc) return (mss); } +/* + * Check whether the sender lies in the Validated or + * non-Validate period. + */ +static void +ncwv_check_phase(struct tcpcb *tp, struct tcphdr *th) +{ + INP_WLOCK_ASSERT(tp->t_inpcb); + + if (th->th_ack >= tp->max_ack_prev) { + tp->pipeack = tp->snd_max - tp->snd_una; + tp->max_ack_prev = th->th_ack; + tcp_timer_activate(tp, TT_PACK, max(3*tp->t_srtt,1000)); + } + if (tp->pipeack >= (tp->snd_cwnd / 2)){ + ENTER_VALPHASE(tp->t_flags); + if(tcp_timer_active(tp, TT_NVP)) + tcp_timer_activate(tp, TT_NVP, 0); + } else { + EXIT_VALPHASE(tp->t_flags); + if(!tcp_timer_active(tp, TT_NVP)) + tcp_timer_activate(tp, TT_NVP, 300000); + } +} /* * On a partial ack arrives, force the retransmission of the diff --git a/sys/netinet/tcp_output.c b/sys/netinet/tcp_output.c index 00d5415..fc37006 100644 --- a/sys/netinet/tcp_output.c +++ b/sys/netinet/tcp_output.c @@ -158,8 +158,10 @@ cc_after_idle(struct tcpcb *tp) { INP_WLOCK_ASSERT(tp->t_inpcb); - if (CC_ALGO(tp)->after_idle != NULL) + if (!V_tcp_do_ncwv && CC_ALGO(tp)->after_idle != NULL) CC_ALGO(tp)->after_idle(tp->ccv); + else if(V_tcp_do_ncwv) + tp->pipeack = 0; } /* diff --git a/sys/netinet/tcp_subr.c b/sys/netinet/tcp_subr.c index 5d37b50..bd4151e 100644 --- a/sys/netinet/tcp_subr.c +++ b/sys/netinet/tcp_subr.c @@ -783,6 +783,8 @@ tcp_newtcpcb(struct inpcb *inp) callout_init(&tp->t_timers->tt_keep, CALLOUT_MPSAFE); callout_init(&tp->t_timers->tt_2msl, CALLOUT_MPSAFE); callout_init(&tp->t_timers->tt_delack, CALLOUT_MPSAFE); + callout_init(&tp->t_timers->tt_nvp, 
CALLOUT_MPSAFE); + callout_init(&tp->t_timers->tt_pack, CALLOUT_MPSAFE); if (V_tcp_do_rfc1323) tp->t_flags = (TF_REQ_SCALE|TF_REQ_TSTMP); @@ -802,6 +804,10 @@ tcp_newtcpcb(struct inpcb *inp) tp->snd_cwnd = TCP_MAXWIN << TCP_MAX_WINSHIFT; tp->snd_ssthresh = TCP_MAXWIN << TCP_MAX_WINSHIFT; tp->t_rcvtime = ticks; + if (V_tcp_do_ncwv) { + tp->lossflightsize = 0; + tp->pipeack = TCP_MAXWIN << TCP_MAX_WINSHIFT; + } /* * IPv4 TTL initialization is necessary for an IPv6 socket as well, * because the socket may be bound to an IPv6 wildcard address, @@ -931,6 +937,8 @@ tcp_discardcb(struct tcpcb *tp) callout_stop(&tp->t_timers->tt_keep); callout_stop(&tp->t_timers->tt_2msl); callout_stop(&tp->t_timers->tt_delack); + callout_stop(&tp->t_timers->tt_nvp); + callout_stop(&tp->t_timers->tt_pack); /* * If we got enough samples through the srtt filter, diff --git a/sys/netinet/tcp_timer.c b/sys/netinet/tcp_timer.c index 7c27397..a4f211f 100644 --- a/sys/netinet/tcp_timer.c +++ b/sys/netinet/tcp_timer.c @@ -646,6 +646,54 @@ out: } void +tcp_timer_nvp(void * xtp) +{ + struct tcpcb *tp = xtp; + struct inpcb *inp; + CURVNET_SET(tp->t_vnet); + INP_INFO_WLOCK(&V_tcbinfo); + inp = tp->t_inpcb; + if (inp == NULL) { + tcp_timer_race++; + INP_INFO_WUNLOCK(&V_tcbinfo); + CURVNET_RESTORE(); + return; + } + INP_WLOCK(inp); + callout_deactivate(&tp->t_timers->tt_nvp); + tp->snd_cwnd = max(tp->snd_cwnd / 2 , tp->IW ); + tp->snd_ssthresh = max(tp->snd_ssthresh, 3 * tp->snd_cwnd /4); + if (tp != NULL) + INP_WUNLOCK(inp); + INP_INFO_WUNLOCK(&V_tcbinfo); + CURVNET_RESTORE(); +} + +void +tcp_timer_pack(void * xtp) +{ + struct tcpcb *tp = xtp; + struct inpcb *inp; + CURVNET_SET(tp->t_vnet); + INP_INFO_WLOCK(&V_tcbinfo); + inp = tp->t_inpcb; + if (inp == NULL) { + tcp_timer_race++; + INP_INFO_WUNLOCK(&V_tcbinfo); + CURVNET_RESTORE(); + return; + } + INP_WLOCK(inp); + callout_deactivate(&tp->t_timers->tt_pack); + tp->snd_cwnd = max(tp->snd_cwnd / 2, tp->IW ); + tp->snd_ssthresh = max(tp->snd_ssthresh, 3 
* tp->snd_cwnd /4); + if (tp != NULL) + INP_WUNLOCK(inp); + INP_INFO_WUNLOCK(&V_tcbinfo); + CURVNET_RESTORE(); +} + +void tcp_timer_activate(struct tcpcb *tp, int timer_type, u_int delta) { struct callout *t_callout; @@ -679,6 +727,14 @@ tcp_timer_activate(struct tcpcb *tp, int timer_type, u_int delta) t_callout = &tp->t_timers->tt_2msl; f_callout = tcp_timer_2msl; break; + case TT_NVP: + t_callout = &tp->t_timers->tt_nvp; + f_callout = tcp_timer_nvp; + break; + case TT_PACK: + t_callout = &tp->t_timers->tt_pack; + f_callout = tcp_timer_pack; + break; default: panic("bad timer_type"); } @@ -710,6 +766,12 @@ tcp_timer_active(struct tcpcb *tp, int timer_type) case TT_2MSL: t_callout = &tp->t_timers->tt_2msl; break; + case TT_NVP: + t_callout = &tp->t_timers->tt_nvp; + break; + case TT_PACK: + t_callout = &tp->t_timers->tt_pack; + break; default: panic("bad timer_type"); } @@ -738,5 +800,9 @@ tcp_timer_to_xtimer(struct tcpcb *tp, struct tcp_timer *timer, xtimer->tt_keep = (timer->tt_keep.c_time - now) / SBT_1MS; if (callout_active(&timer->tt_2msl)) xtimer->tt_2msl = (timer->tt_2msl.c_time - now) / SBT_1MS; + if (callout_active(&timer->tt_nvp)) + xtimer->tt_nvp = ticks_to_msecs(timer->tt_nvp.c_time - ticks); + if (callout_active(&timer->tt_pack)) + xtimer->tt_pack = ticks_to_msecs(timer->tt_pack.c_time - ticks); xtimer->t_rcvtime = ticks_to_msecs(ticks - tp->t_rcvtime); } diff --git a/sys/netinet/tcp_timer.h b/sys/netinet/tcp_timer.h index 3115fb3..7d038e8 100644 --- a/sys/netinet/tcp_timer.h +++ b/sys/netinet/tcp_timer.h @@ -146,12 +146,16 @@ struct tcp_timer { struct callout tt_keep; /* keepalive */ struct callout tt_2msl; /* 2*msl TIME_WAIT timer */ struct callout tt_delack; /* delayed ACK timer */ + struct callout tt_nvp; /* non validated timer period */ + struct callout tt_pack; /* timer for pipeack measurement */ }; #define TT_DELACK 0x01 #define TT_REXMT 0x02 #define TT_PERSIST 0x04 #define TT_KEEP 0x08 #define TT_2MSL 0x10 +#define TT_NVP 0x20 +#define TT_PACK 
0x40 #define TP_KEEPINIT(tp) ((tp)->t_keepinit ? (tp)->t_keepinit : tcp_keepinit) #define TP_KEEPIDLE(tp) ((tp)->t_keepidle ? (tp)->t_keepidle : tcp_keepidle) @@ -183,6 +187,8 @@ void tcp_timer_keep(void *xtp); void tcp_timer_persist(void *xtp); void tcp_timer_rexmt(void *xtp); void tcp_timer_delack(void *xtp); +void tcp_timer_nvp(void *xtp); +void tcp_timer_pack(void *xtp); void tcp_timer_to_xtimer(struct tcpcb *tp, struct tcp_timer *timer, struct xtcp_timer *xtimer); diff --git a/sys/netinet/tcp_var.h b/sys/netinet/tcp_var.h index aaaa4a4..c8c148f 100644 --- a/sys/netinet/tcp_var.h +++ b/sys/netinet/tcp_var.h @@ -137,7 +137,7 @@ struct tcpcb { * for slow start exponential to * linear switch */ - u_long snd_spare2; /* unused */ + u_long IW; /* initial cong window */ tcp_seq snd_recover; /* for use in NewReno Fast Recovery */ u_int t_maxopd; /* mss plus options */ @@ -147,8 +147,9 @@ struct tcpcb { u_int t_rtttime; /* RTT measurement start time */ tcp_seq t_rtseq; /* sequence number being timed */ - u_int t_bw_spare1; /* unused */ - tcp_seq t_bw_spare2; /* unused */ + u_long lossflightsize; /* flightsize at the beggining of current recovery event */ + u_long pipeack; /* amount of data acked per RTT */ + tcp_seq max_ack_prev; /* caching of previous value of snd_max when rtt was measured */ int t_rxtcur; /* current retransmit value (ticks) */ u_int t_maxseg; /* maximum segment size */ @@ -247,6 +248,11 @@ struct tcpcb { #define TF_ECN_SND_ECE 0x10000000 /* ECN ECE in queue */ #define TF_CONGRECOVERY 0x20000000 /* congestion recovery mode */ #define TF_WASCRECOVERY 0x40000000 /* was in congestion recovery */ +#define TF_RLIMPHASE 0x80000000 /* ncwv phase */ + +#define IN_VALPHASE(t_flags) (t_flags & TF_RLIMPHASE) +#define ENTER_VALPHASE(t_flags) t_flags |= TF_RLIMPHASE +#define EXIT_VALPHASE(t_flags) t_flags &= ~TF_RLIMPHASE #define IN_FASTRECOVERY(t_flags) (t_flags & TF_FASTRECOVERY) #define ENTER_FASTRECOVERY(t_flags) t_flags |= TF_FASTRECOVERY @@ -561,6 +567,8 @@ 
struct xtcp_timer { int tt_keep; /* keepalive */ int tt_2msl; /* 2*msl TIME_WAIT timer */ int tt_delack; /* delayed ACK timer */ + int tt_nvp; /* non-Validation period timer */ + int tt_pack; /* pipeack sample timer */ int t_rcvtime; /* Time since last packet received */ }; struct xtcpcb { @@ -627,8 +635,10 @@ VNET_DECLARE(int, tcp_abc_l_var); #define V_tcp_abc_l_var VNET(tcp_abc_l_var) VNET_DECLARE(int, tcp_do_sack); /* SACK enabled/disabled */ +VNET_DECLARE(int, tcp_do_ncwv); /* New-CWV enabled/disabled */ VNET_DECLARE(int, tcp_sc_rst_sock_fail); /* RST on sock alloc failure */ #define V_tcp_do_sack VNET(tcp_do_sack) +#define V_tcp_do_ncwv VNET(tcp_do_ncwv) #define V_tcp_sc_rst_sock_fail VNET(tcp_sc_rst_sock_fail) VNET_DECLARE(int, tcp_do_ecn); /* TCP ECN enabled/disabled */ --Apple-Mail=_E3102E34-B41D-47A0-AB2F-5E03D718EBE4 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii --Apple-Mail=_E3102E34-B41D-47A0-AB2F-5E03D718EBE4-- --Apple-Mail=_5088A38A-DF43-450D-9E03-31C51BD176C7 Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="signature.asc" Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Message signed with OpenPGP using GPGMail -----BEGIN PGP SIGNATURE----- iQCVAwUBUvC1ENZcnpRveo1xAQKL0AQApMvaMxZPDSPEuEkTD2YRg8Q0YSYiSS7I PT0PE/sOUZ8k9kx2K78APzb8uZ3rnBnvhi9sRskd1m0iWHwTROnbKbkxz6PVQSHQ L7OCcUbAZkHGI/t3NTpxAPS5b8MZs81OUpjKFezJvnU3qvXObsN81Oh5u/eUZB2A p259UbG6SaY= =pyap -----END PGP SIGNATURE----- --Apple-Mail=_5088A38A-DF43-450D-9E03-31C51BD176C7-- From owner-freebsd-net@FreeBSD.ORG Wed Feb 5 05:27:18 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8EAA5228; Wed, 5 Feb 2014 05:27:18 +0000 (UTC) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net 
[IPv6:2001:470:1f06:ccb::2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 476B71F60; Wed, 5 Feb 2014 05:27:18 +0000 (UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.7/8.14.7) with ESMTP id s155RGb4004806; Wed, 5 Feb 2014 00:27:16 -0500 (EST) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.7/8.14.4/Submit) id s155RGJE004803; Wed, 5 Feb 2014 00:27:16 -0500 (EST) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <21233.52147.912022.488615@hergotha.csail.mit.edu> Date: Wed, 5 Feb 2014 00:27:15 -0500 From: Garrett Wollman To: freebsd-net@freebsd.org Subject: ixgbe/NFS m_defrag() instrumentation X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (hergotha.csail.mit.edu [127.0.0.1]); Wed, 05 Feb 2014 00:27:16 -0500 (EST) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on hergotha.csail.mit.edu Cc: rmacklem@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Feb 2014 05:27:18 -0000 I instrumented calls to m_defrag() in ixgbe. As expected, it gets called *a lot* when NFS is running with the default read size of 64k. A simple benchmark (single-threaded sequential read of a 128 GB file which I didn't even run to completion) tells the tale: $ sysctl dev.ix.0.mbuf_defrag_attempted dev.ix.0.mbuf_defrag_attempted: 1737994 (There's already a similar counter for m_defrag() failures, which made it easy to add this counter. 
Unfortunately, there is no analogous instrumentation in cxgbe so I couldn't do likewise for that NIC.) -GAWollman From owner-freebsd-net@FreeBSD.ORG Wed Feb 5 06:15:04 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 78AFE8FD for ; Wed, 5 Feb 2014 06:15:04 +0000 (UTC) Received: from nm13-vm4.bullet.mail.ne1.yahoo.com (nm13-vm4.bullet.mail.ne1.yahoo.com [98.138.91.173]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 1695412C3 for ; Wed, 5 Feb 2014 06:15:03 +0000 (UTC) Received: from [98.138.100.111] by nm13.bullet.mail.ne1.yahoo.com with NNFMP; 05 Feb 2014 06:08:23 -0000 Received: from [98.138.226.58] by tm100.bullet.mail.ne1.yahoo.com with NNFMP; 05 Feb 2014 06:08:23 -0000 Received: from [127.0.0.1] by smtp209.mail.ne1.yahoo.com with NNFMP; 05 Feb 2014 06:08:23 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1391580503; bh=BmA+wqqUnjZ93RyhNgcdRR2UXq3aZEd7qiFs7uv6k1g=; h=X-Yahoo-Newman-Id:X-Yahoo-Newman-Property:X-YMail-OSG:X-Yahoo-SMTP:X-Rocket-Received:Content-Type:Mime-Version:Subject:From:In-Reply-To:Date:Cc:Content-Transfer-Encoding:Message-Id:References:To:X-Mailer; b=20kSi0yynt4/954Ukt893kG1lA827C/2ig/oALk3DBoRl4uTTZ6bUi7AkgREn0P9iHgT3COZ9+3XRJyXolqE3peoO3cJadzkqapROMYrtZGGoO1s1f7KZ7n1S2afDxjSEYezpFYSeYezhZfxvGU8v+2ueh5GEJSzeOD3jS9iu9s= X-Yahoo-Newman-Id: 359255.40074.bm@smtp209.mail.ne1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: .BwYLiMVM1mxsY2.wDKOyVt885qdTI04G57FxmvdI5h9Ysz 9LEKxtxSOCufEPDMPBmn8rXz9VbrUY0.mvD3qhY6BLGf0KWjhpb5ub93Xk2l R.JZMnBJHuzIQPRGpWUBXKjNDStjGPzPPKvS03DE4.QUsuep4zlnFHgTQkx7 IOI0xlshL4UH811C.IsSBm0YlOLt54X0Ic2YHM4KM82JB3g15P5G0nd8OZv. 
ictbEqlMa9o8kYGD0tqxUO6a1mMtDkRVUvJYfpH5yQYWoAjAXMJuAmYP2CE_ osPIyNZPC9aJDzpL3BYSQ3Ocfcv7pBs9ryS5q3LEBoBEa8CqhLowNQYjK84f slINn7PNCBt5X023lhLlqHC.CMukKN0.McK1U7lS3YKAc70jpNHPYXlDC80B ucgq0I7lUMe8GhVS5bIjRuZw9Qgg7ONPrq3shFG9wxl9l7oaNEVXDDbgalQd u2L7LpObRPvHQZAA0XPhAj4cvEVOW1cv3VXZd8Fu_j0197pz.Kzan.EoP5Mp 3RRWxX2rpIP9GSqj5E1r5lmN6pS_ANTgMhBJoyItVrS8z7ulAu.kFnvCNdq7 0oLA- X-Yahoo-SMTP: clhABp.swBB7fs.LwIJpv3jkWgo2NU8- X-Rocket-Received: from phobos.samsco.home (scott4long@168.103.85.57 with plain [98.139.211.125]) by smtp209.mail.ne1.yahoo.com with SMTP; 05 Feb 2014 06:08:23 +0000 UTC Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 7.1 \(1827\)) Subject: Re: ixgbe/NFS m_defrag() instrumentation From: Scott Long In-Reply-To: <21233.52147.912022.488615@hergotha.csail.mit.edu> Date: Tue, 4 Feb 2014 23:08:21 -0700 Content-Transfer-Encoding: 7bit Message-Id: References: <21233.52147.912022.488615@hergotha.csail.mit.edu> To: Garrett Wollman X-Mailer: Apple Mail (2.1827) Cc: FreeBSD Net , rmacklem@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Feb 2014 06:15:04 -0000 On Feb 4, 2014, at 10:27 PM, Garrett Wollman wrote: > I instrumented calls to m_defrag() in ixgbe. As expected, it gets > called *a lot* when NFS is running with the default read size of 64k. > A simple benchmark (single-threaded sequential read of a 128 GB file > which I didn't even run to completion) tells the tale: > > $ sysctl dev.ix.0.mbuf_defrag_attempted > dev.ix.0.mbuf_defrag_attempted: 1737994 > > (There's already a similar counter for m_defrag() failures, which made > it easy to add this counter. Unfortunately, there is no analogous > instrumentation in cxgbe so I couldn't do likewise for that NIC.) dtrace to the rescue? 
#!/usr/sbin/dtrace -s fbt::m_defrag:entry { @ = count(); } From owner-freebsd-net@FreeBSD.ORG Wed Feb 5 08:07:50 2014 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 0489C727; Wed, 5 Feb 2014 08:07:50 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id C957F1B05; Wed, 5 Feb 2014 08:07:49 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id s1587nL0015971; Wed, 5 Feb 2014 08:07:49 GMT (envelope-from yongari@freefall.freebsd.org) Received: (from yongari@localhost) by freefall.freebsd.org (8.14.8/8.14.8/Submit) id s1587nEd015970; Wed, 5 Feb 2014 08:07:49 GMT (envelope-from yongari) Date: Wed, 5 Feb 2014 08:07:49 GMT Message-Id: <201402050807.s1587nEd015970@freefall.freebsd.org> To: marcinkk@gmail.com, yongari@FreeBSD.org, freebsd-net@FreeBSD.org, yongari@FreeBSD.org From: yongari@FreeBSD.org Subject: Re: kern/186401: [re] Problem with RTL8111/8168B initialization X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Feb 2014 08:07:50 -0000 Synopsis: [re] Problem with RTL8111/8168B initialization State-Changed-From-To: open->feedback State-Changed-By: yongari State-Changed-When: Wed Feb 5 08:07:04 UTC 2014 State-Changed-Why: Would you show me dmesg output(re(4)/rgephy(4) only)? 
Responsible-Changed-From-To: freebsd-net->yongari Responsible-Changed-By: yongari Responsible-Changed-When: Wed Feb 5 08:07:04 UTC 2014 Responsible-Changed-Why: Grab. http://www.freebsd.org/cgi/query-pr.cgi?pr=186401 From owner-freebsd-net@FreeBSD.ORG Wed Feb 5 20:14:00 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4987B687 for ; Wed, 5 Feb 2014 20:14:00 +0000 (UTC) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 000CD1457 for ; Wed, 5 Feb 2014 20:13:59 +0000 (UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.7/8.14.7) with ESMTP id s15KDvRx014111; Wed, 5 Feb 2014 15:13:58 -0500 (EST) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.7/8.14.4/Submit) id s15KDvha014110; Wed, 5 Feb 2014 15:13:57 -0500 (EST) (envelope-from wollman) Date: Wed, 5 Feb 2014 15:13:57 -0500 (EST) From: Garrett Wollman Message-Id: <201402052013.s15KDvha014110@hergotha.csail.mit.edu> To: scott4long@yahoo.com Subject: Re: ixgbe/NFS m_defrag() instrumentation In-Reply-To: References: <21233.52147.912022.488615@hergotha.csail.mit.edu> Organization: none X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (hergotha.csail.mit.edu [127.0.0.1]); Wed, 05 Feb 2014 15:13:58 -0500 (EST) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on hergotha.csail.mit.edu Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list 
List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Feb 2014 20:14:00 -0000 In article , Scott Long writes: >dtrace to the rescue? > >#!/usr/sbin/dtrace -s >fbt::m_defrag:entry >{ > @ = count(); >} I don't know where else m_defrag() might be called from, so I would be more inclined to make that @[stack()] = count(); if I were going the dtrace route. -GAWollman From owner-freebsd-net@FreeBSD.ORG Wed Feb 5 20:23:43 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 90AE0D3C for ; Wed, 5 Feb 2014 20:23:43 +0000 (UTC) Received: from internet06.ebureau.com (internet06.ebureau.com [65.127.24.25]) by mx1.freebsd.org (Postfix) with ESMTP id 506E21583 for ; Wed, 5 Feb 2014 20:23:43 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by internet06.ebureau.com (Postfix) with ESMTP id 366119BA140 for ; Wed, 5 Feb 2014 14:14:41 -0600 (CST) X-Virus-Scanned: amavisd-new at ebureau.com Received: from internet06.ebureau.com ([127.0.0.1]) by localhost (internet06.ebureau.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ptI9u3hfigXa for ; Wed, 5 Feb 2014 14:14:40 -0600 (CST) Received: from nail.office.ebureau.com (nail.office.ebureau.com [10.10.20.23]) by internet06.ebureau.com (Postfix) with ESMTPSA id 73B069BA131 for ; Wed, 5 Feb 2014 14:14:40 -0600 (CST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 7.1 \(1827\)) Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 From: Joe Moog In-Reply-To: Date: Wed, 5 Feb 2014 14:14:39 -0600 Content-Transfer-Encoding: quoted-printable Message-Id: <6AEEC659-3788-4D2D-92A9-A1F6DD59A661@ebureau.com> References: To: freebsd-net@freebsd.org X-Mailer: Apple Mail (2.1827) X-BeenThere: 
freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Feb 2014 20:23:43 -0000

> Date: Mon, 03 Feb 2014 09:40:30 +0100
> From: Ben
> To: freebsd-net@freebsd.org
> Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in
> 10.0
> Message-ID: <52EF55FE.8030901@niessen.ch>
> Content-Type: text/plain; charset=windows-1252; format=flowed
>
> Hi Scott,
>
> I had tried to set it in /etc/sysctl.conf but seems it didnt work. But
> will I try again and report back.
>
> The settings of the switch have not been changed and are set to LACP. It
> worked before so I guess the switch should not be the problem. Maybe
> some incompatibility between FreeBSD + igb-driver + switch (Juniper
> EX3300-48T).
>
> I will update you after setting the sysctl setting. It seems to be
> "dynamic", I guess 0 reflects the index of LACP lagg devices. Can I
> switch off the strict mode globally in /etc/sysctl.conf?
>
> Thanks for your help.
>
> Regards
> Ben
>
> On 03.02.2014 09:31, Scott Long wrote:
>> Hi,
>>
>> You're probably running into the consequences of r253687. Check to see the value of "sysctl net.link.lagg.0.lacp.lacp_strict_mode". If it's "1" then set it to 0. My original intention was for this to default to 0, but apparently that didn't happen. However, the fact that strict mode doesn't seem to work at all for you might hint that your switch either isn't configured correctly for LACP, or doesn't actually support LACP at all. You might want to investigate that.
>>
>> Scott
>>
>> On Feb 3, 2014, at 1:17 AM, Ben wrote:
>>
>>> Hi,
>>>
>>> I upgraded from FreeBSD 9.2-RELEASE to 10.0-RELEASE. FreeBSD 9.2 was configured to use LACP with two igb devices.
>>>
>>> Now it stopped working after the upgrade.
>>>
>>> This is a screenshot of ifconfig -a after the upgrade to FreeBSD 10.0-RELEASE: http://tinypic.com/view.php?pic=28jvgpw&s=5#.Uu9PXT1dVPM
>>>
>>> A PR is currently open: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/185967
>>>
>>> It is set to low, but I would like somebody to have a look into it as it obviously has a great influence on our infrastructure. The only way to "solve" it is currently switching back to FreeBSD 9.2.
>>>
>>> The suggested fix "use failover" seems not to work.
>>>
>>> Thank you for your help.
>>>
>>> Best regards
>>> Ben
>>>

Our experience appears to differ. We have a 4-port LAGG configured on an Intel ethernet NIC (igb drivers), connected via LACP to 4 ports on a Cisco Cat4948, host initially configured with FreeBSD 9.2-RELEASE and upgraded to 10.0-RELEASE. Following the upgrade, everything works as expected without making any additional adjustments. (We did initially have to increase the mbuf_cluster allowance to get 4-port LAGG working with 9.2, but that may be immaterial to this conversation.)

As an outsider looking in, the issue seems to crop up in cases where switch configurations have not been set specifically to force (active) LACP, or it's something related to use with mixed ethernet drivers (e.g., bge mixed with igb, as in the case of the linked PR), or possibly with different switch manufacturers' handling of FreeBSD's LACP negotiation (in both this case and the PR, Juniper). Whether or not this needs to be addressed from within FreeBSD itself I will leave to the experts.
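
For anyone following along, here is a minimal sketch of the kind of /etc/rc.conf setup under discussion: a two-port LACP lagg on igb NICs, as in Ben's report. The interface names igb0 and igb1, the lagg0 unit, and the use of DHCP are assumptions for illustration and are not taken from either poster's machine.

```shell
# /etc/rc.conf fragment (sketch). igb0/igb1, lagg0 and DHCP are assumed
# names/choices for illustration only; substitute your own ports and address.
ifconfig_igb0="up"
ifconfig_igb1="up"
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport igb0 laggport igb1 DHCP"
```

With a lagg0 created this way, the check Scott suggested earlier in the thread would be "sysctl net.link.lagg.0.lacp.lacp_strict_mode", where the 0 is presumably the lagg unit number. The switch side still has to be set to (active) LACP on the matching ports.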
Joe From owner-freebsd-net@FreeBSD.ORG Wed Feb 5 20:53:21 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id B0C45C92 for ; Wed, 5 Feb 2014 20:53:21 +0000 (UTC) Received: from nm30-vm1.bullet.mail.ne1.yahoo.com (nm30-vm1.bullet.mail.ne1.yahoo.com [98.138.90.46]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 58BC4185A for ; Wed, 5 Feb 2014 20:53:16 +0000 (UTC) Received: from [98.138.100.112] by nm30.bullet.mail.ne1.yahoo.com with NNFMP; 05 Feb 2014 20:53:08 -0000 Received: from [98.138.84.47] by tm103.bullet.mail.ne1.yahoo.com with NNFMP; 05 Feb 2014 20:53:08 -0000 Received: from [127.0.0.1] by smtp115.mail.ne1.yahoo.com with NNFMP; 05 Feb 2014 20:53:08 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1391633588; bh=bmoWQdjlHSr2IEMME9/ZD89dx1b0WPNlSvBEoxP9+qw=; h=X-Yahoo-Newman-Id:X-Yahoo-Newman-Property:X-YMail-OSG:X-Yahoo-SMTP:X-Rocket-Received:Content-Type:Mime-Version:Subject:From:In-Reply-To:Date:Cc:Content-Transfer-Encoding:Message-Id:References:To:X-Mailer; b=DdF6vt+Lyf4siZMJ3xZz/B9CM1vWYfsKSBRTN9ddrvuO04TTb6uXUaKI0P8ci7dyzGC3WFvpCB+R4yim1ontIUuPSqNvteVpTZExJ3ml/RQEawZTiknL6j4mivu5MP2hiN7+smCQjyWabewWKwTl2gQv2VDKQZWB80RI59+ntRI= X-Yahoo-Newman-Id: 935762.26179.bm@smtp115.mail.ne1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: xlEth8EVM1mrFyrmVUhnVk9yRuNqEr.IH301kGwYFDMRK39 bwlczijf9w79kxPxr4214SLYowmUXRcbaPb5.ywQwpMmT9Wc.TtWDBnQ.DID r6cXJkmac0r32Out6cOO0fDvHC6XmbDv0xdM2Vn91S2ShgKc5LFUtwEZfPQR jwieyLsyMbKTYCjNxyy2CkDggMwrmt2HCPRoGtdWeuop8gt.pXj9ygzTY82X dxCce8UARn74Sk2xBBYdJ69uB1WCPxKri7OCwT9HgSw4t9iA8S3Ww.AQ_fxc 1QaM8iLQaZ84gQQiFXZC6xedsE5phNdwXPJE1anfGyfFvqEfOMzN0vuwWKBv 
xuT4VJatrb0xWv9e1HnJbjHi3XPqh8od5jnHEbKbgfO18x.L2KILHTGGErzy UIM5WMscyDgslI9R3wSXf697HZwyX36.yz1nhkStJV_uFadb4LmI33Oz6RQb lB5LwxfdGwI_u.84OkzysZTsTGmXaZ5gdbT.DdwSYoqph1zrF.ewY2vKXElj 61Y0Cy24kjXqrNTqNcqSDL3qsoTvwUFsSNLGOqPYHJPrP.oDi71V6BRma_aX tWfgFtyNIRtUIor9mvxVTMSm_2xND5JEuYkU41MKtVvlHh9NBbXJEsrlhcFG u_FgnvEYpuu5Uc7Qho3XjekCLPO34HsZw4uUhIkDjjIzHaIhvsA-- X-Yahoo-SMTP: clhABp.swBB7fs.LwIJpv3jkWgo2NU8- X-Rocket-Received: from [172.20.10.4] (scott4long@70.208.21.38 with plain [98.139.211.125]) by smtp115.mail.ne1.yahoo.com with SMTP; 05 Feb 2014 12:53:08 -0800 PST Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 7.1 \(1827\)) Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 From: Scott Long In-Reply-To: <6AEEC659-3788-4D2D-92A9-A1F6DD59A661@ebureau.com> Date: Wed, 5 Feb 2014 13:53:05 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: <922B15DE-C888-453F-AFA1-048BFA279DF1@yahoo.com> References: <6AEEC659-3788-4D2D-92A9-A1F6DD59A661@ebureau.com> To: Joe Moog X-Mailer: Apple Mail (2.1827) Cc: FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Feb 2014 20:53:21 -0000 On Feb 5, 2014, at 1:14 PM, Joe Moog wrote: >> Date: Mon, 03 Feb 2014 09:40:30 +0100 >> From: Ben >> To: freebsd-net@freebsd.org >> Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in >> 10.0 >> Message-ID: <52EF55FE.8030901@niessen.ch> >> Content-Type: text/plain; charset=3Dwindows-1252; format=3Dflowed >>=20 >> Hi Scott, >>=20 >> I had tried to set it in /etc/sysctl.conf but seems it didnt work. = But=20 >> will I try again and report back. >>=20 >> The settings of the switch have not been changed and are set to LACP. = It=20 >> worked before so I guess the switch should not be the problem. 
Maybe some incompatibility between FreeBSD + igb-driver + switch (Juniper EX3300-48T).
>>
>> I will update you after setting the sysctl setting. It seems to be "dynamic", I guess 0 reflects the index of LACP lagg devices. Can I switch off the strict mode globally in /etc/sysctl.conf?
>>
>> Thanks for your help.
>>
>> Regards
>> Ben
>>
>> On 03.02.2014 09:31, Scott Long wrote:
>>> Hi,
>>>
>>> You're probably running into the consequences of r253687. Check to see the value of "sysctl net.link.lagg.0.lacp.lacp_strict_mode". If it's "1" then set it to 0. My original intention was for this to default to 0, but apparently that didn't happen. However, the fact that strict mode doesn't seem to work at all for you might hint that your switch either isn't configured correctly for LACP, or doesn't actually support LACP at all. You might want to investigate that.
>>>
>>> Scott
>>>
>>> On Feb 3, 2014, at 1:17 AM, Ben wrote:
>>>
>>>> Hi,
>>>>
>>>> I upgraded from FreeBSD 9.2-RELEASE to 10.0-RELEASE. FreeBSD 9.2 was configured to use LACP with two igb devices.
>>>>
>>>> Now it stopped working after the upgrade.
>>>>
>>>> This is a screenshot of ifconfig -a after the upgrade to FreeBSD 10.0-RELEASE: http://tinypic.com/view.php?pic=28jvgpw&s=5#.Uu9PXT1dVPM
>>>>
>>>> A PR is currently open: http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/185967
>>>>
>>>> It is set to low, but I would like somebody to have a look into it as it obviously has a great influence on our infrastructure. The only way to "solve" it is currently switching back to FreeBSD 9.2.
>>>>
>>>> The suggested fix "use failover" seems not to work.
>>>>
>>>> Thank you for your help.
>>>>
>>>> Best regards
>>>> Ben
>>>>
>
> Our experience appears to differ.
We have a 4-port LAGG configured on an Intel ethernet NIC (igb drivers), connected via LACP to 4 ports on a Cisco Cat4948, host initially configured with FreeBSD 9.2-RELEASE and upgraded to 10.0-RELEASE. Following the upgrade, everything works as expected without making any additional adjustments. (We did initially have to increase the mbuf_cluster allowance to get 4-port LAGG working with 9.2, but that may be immaterial to this conversation.)
>
> As an outsider looking in, the issue seems to crop up in cases where switch configurations have not been set specifically to force (active) LACP, or it's something related to use with mixed ethernet drivers (e.g., bge mixed with igb, as in the case of the linked PR), or possibly with different switch manufacturers' handling of FreeBSD's LACP negotiation (in both this case and the PR, Juniper). Whether or not this needs to be addressed from within FreeBSD itself I will leave to the experts.

As a follow-up, Ben's problem was that his switch was set for passive mode and thus not sending out heartbeats. The FreeBSD LACP driver accidentally switched its default from permissive to strict mode, and in doing so required the reception of heartbeats in order to operate. Compounding the problem was that the sysctl to change the behavior was completely untested and useless because it set the state of the ports too late in the initialization process. These problems will be addressed for the 10.1 release. However, once Ben set his switch to active LACP mode, everything worked.
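
To see which lagg interfaces are currently in LACP strict mode, the per-interface sysctl named earlier in this thread can be read back. The following is a minimal sketch, assuming `sysctl net.link.lagg` prints one `name: value` pair per line, as in the output posted elsewhere in this thread. The function name `strict_lagg_indices` and the sample text are illustrative only and not part of any FreeBSD tool. It only reports the value; as noted above, changing the sysctl on 10.0 may have no effect.

```python
import re
from typing import List

# Matches lines such as "net.link.lagg.0.lacp.lacp_strict_mode: 1".
_STRICT = re.compile(
    r"^net\.link\.lagg\.(\d+)\.lacp\.lacp_strict_mode:\s*(\d+)\s*$"
)

def strict_lagg_indices(sysctl_output: str) -> List[int]:
    """Return the lagg indices whose lacp_strict_mode value is non-zero."""
    strict = []
    for line in sysctl_output.splitlines():
        match = _STRICT.match(line.strip())
        if match and int(match.group(2)) != 0:
            strict.append(int(match.group(1)))
    return sorted(strict)

# Hypothetical sample in the same "name: value" layout.
sample = """\
net.link.lagg.0.active: 2
net.link.lagg.0.lacp.lacp_strict_mode: 1
net.link.lagg.1.lacp.lacp_strict_mode: 0
"""
print(strict_lagg_indices(sample))
```

On a live host the input text could be captured with `subprocess.run(["sysctl", "net.link.lagg"], capture_output=True, text=True).stdout` instead of the sample string.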
Scott From owner-freebsd-net@FreeBSD.ORG Thu Feb 6 06:27:16 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id B49627E0 for ; Thu, 6 Feb 2014 06:27:16 +0000 (UTC) Received: from quix.smartspb.net (quix.smartspb.net [217.119.16.133]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 37C3D1A5A for ; Thu, 6 Feb 2014 06:27:15 +0000 (UTC) Received: from dyr.smartspb.net ([217.119.16.26] helo=[127.0.0.1]) by quix.smartspb.net with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.61 (FreeBSD)) (envelope-from ) id 1WBIQa-000On6-Ux for freebsd-net@freebsd.org; Thu, 06 Feb 2014 10:27:13 +0400 Message-ID: <52F32B38.2040909@smartspb.net> Date: Thu, 06 Feb 2014 10:27:04 +0400 From: Dennis Yusupoff User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0 References: <6AEEC659-3788-4D2D-92A9-A1F6DD59A661@ebureau.com> In-Reply-To: <6AEEC659-3788-4D2D-92A9-A1F6DD59A661@ebureau.com> X-Enigmail-Version: 1.6 X-Antivirus: avast! (VPS 140205-1, 05.02.2014), Outbound message X-Antivirus-Status: Clean Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.17 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Feb 2014 06:27:16 -0000 06.02.2014 0:14, Joe Moog ?????: > Our experience appears to differ. 
We have a 4-port LAGG configured on an
> Intel ethernet NIC (igb drivers), connected via LACP to 4 ports on a
> Cisco Cat4948, host initially configured with FreeBSD 9.2-RELEASE and
> upgraded to 10.0-RELEASE.

Same here. Fresh FreeBSD 10.0-RELEASE install, LACP settings copied from 9.0-STABLE, lagg with 2 x igb:
===
#ifconfig lagg0
lagg0: flags=8843 metric 0 mtu 1500
        options=400bb
        ether a0:36:9f:00:0e:d0
        inet 109.71.176.3 netmask 0xfffffffe broadcast 255.255.255.255
        inet6 fe80::a236:9fff:fe00:ed0%lagg0 prefixlen 64 scopeid 0x8
        nd6 options=29
        media: Ethernet autoselect
        status: active
        laggproto lacp lagghash l2,l3,l4
        laggport: igb1 flags=1c
        laggport: igb0 flags=1c

#sysctl net.link.lagg
net.link.lagg.failover_rx_all: 0
net.link.lagg.default_use_flowid: 1
net.link.lagg.lacp.debug: 0
net.link.lagg.0.use_flowid: 1
net.link.lagg.0.count: 2
net.link.lagg.0.active: 2
net.link.lagg.0.flapping: 0
net.link.lagg.0.lacp.lacp_strict_mode: 1
net.link.lagg.0.lacp.debug.rx_test: 0
net.link.lagg.0.lacp.debug.tx_test: 0
net.link.lagg.1.use_flowid: 1
net.link.lagg.1.count: 2
net.link.lagg.1.active: 2
net.link.lagg.1.flapping: 0
net.link.lagg.1.lacp.lacp_strict_mode: 1
net.link.lagg.1.lacp.debug.rx_test: 0
net.link.lagg.1.lacp.debug.tx_test: 0
===
with Juniper MX240:
===
dyr@rj39> show lacp statistics interfaces ae0
Aggregated interface: ae0
    LACP Statistics:       LACP Rx     LACP Tx   Unknown Rx   Illegal Rx
      ge-2/1/5            25686552      868165            0            0
      ge-2/1/6            25686544      868165            0            0

dyr@rj39> show lacp interfaces ae0
Aggregated interface: ae0
    LACP state:       Role   Exp   Def  Dist  Col  Syn  Aggr  Timeout  Activity
      ge-2/1/5       Actor    No    No   Yes  Yes  Yes   Yes     Fast    Active
      ge-2/1/5     Partner    No    No   Yes  Yes  Yes   Yes     Slow    Active
      ge-2/1/6       Actor    No    No   Yes  Yes  Yes   Yes     Fast    Active
      ge-2/1/6     Partner    No    No   Yes  Yes  Yes   Yes     Slow    Active
    LACP protocol:   Receive State   Transmit State   Mux State
      ge-2/1/5       Current         Slow periodic    Collecting distributing
      ge-2/1/6       Current         Slow periodic    Collecting distributing

dyr@rj39>
===
It has been working for 5 days with max. traffic ~1.2Gbit/sec.

--
Best regards,
Dennis Yusupoff,
network engineer of
Smart-Telecom ISP
Russia, Saint-Petersburg

From owner-freebsd-net@FreeBSD.ORG Thu Feb 6 07:14:56 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 498C8EC6 for ; Thu, 6 Feb 2014 07:14:56 +0000 (UTC) Received: from quix.smartspb.net (quix.smartspb.net [217.119.16.133]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id D814E1DAE for ; Thu, 6 Feb 2014 07:14:55 +0000 (UTC) Received: from dyr.smartspb.net ([217.119.16.26] helo=[127.0.0.1]) by quix.smartspb.net with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.61 (FreeBSD)) (envelope-from ) id 1WBJAk-00052n-8H for freebsd-net@freebsd.org; Thu, 06 Feb 2014 11:14:54 +0400 Message-ID: <52F3366D.3030202@smartspb.net> Date: Thu, 06 Feb 2014 11:14:53 +0400 From: Dennis Yusupoff User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: PF states degrade? X-Enigmail-Version: 1.6 X-Antivirus: avast! (VPS 140205-1, 05.02.2014), Outbound message X-Antivirus-Status: Clean Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.17 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Feb 2014 07:14:56 -0000

Good day.
We started testing FreeBSD 10.0 in production (pf NAT, ipfw pipes, ng_netflow) with settings (sysctl, pf.conf, ipfw.conf, etc.) taken from a similar, rock-solid 9.0-STABLE machine.
The server worked fine for ~5 days and then suddenly stopped forwarding traffic from clients. What was quite unexpected is how it happened. Traffic from customers disappears from the LAN interface (as seen in tcpdump) ~10 seconds after a _connection_ has been started (after the NAT translation state has been created?). With pf debug logging enabled ("set debug loud" in pf.conf), strange records appear in the log at that moment, like this:

10.53.80.224 NATed to 109.71.177.147, HTTP connection to 213.180.204.183:
---
Feb 5 20:41:21 nata2 kernel: pf: State failure on: 1 | 5
Feb 5 20:41:21 nata2 kernel: pf: BAD state: TCP out wire: 213.180.204.183:80
Feb 5 20:41:21 nata2 kernel: 109.71.177.147:50114 stack: 213.180.204.183:80 10.53.80.224:50114 [lo=1997798965 high=1997799354 win=2772 modulator=0]
Feb 5 20:41:21 nata2 kernel: [lo=864623348 high=864624718 win=389 modulator=0] 4:4 A seq=864739382 (864739382) ack=1997798965 len=1398 ackskew=0 pkts=3:2 dir=in,rev
---
Full log here: http://pastebin.com/CQ78JyJe

Disabling and re-enabling PF made no difference (except that, of course, NAT stopped working).

After all these attempts we ran "pfctl -d" and set up ipfw nat for that customer. Everything worked fine! So we believe this is a problem, unknown to us, related to PF and its state handling.
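
One way to read the "State failure on: 1 | 5" record above, offered as a sketch rather than a diagnosis. It assumes the digits name the sequence-window checks in pf's TCP state tracking (pf_tcp_track_full() in pf.c), that check 1 is SEQ_GEQ(seqhi, end), that check 5 is the same test relaxed by MAXACKWINDOW, and that MAXACKWINDOW is 0xffff + 1500. None of this is confirmed in this thread, so it should be verified against the pf.c of the release in question. The numbers come from the log; the second bracket is taken to be the sending peer's window because the packet's seq lies next to it. The helper names `seq_geq` and `window_checks` are illustrative.

```python
# Simplified re-computation of two pf sequence checks (assumptions: see lead-in).
MAXACKWINDOW = 0xFFFF + 1500  # assumed value of pf's constant

def seq_geq(a: int, b: int) -> bool:
    """Modular 32-bit comparison: True when sequence number a is at or after b."""
    return ((a - b) & 0xFFFFFFFF) < 0x80000000

def window_checks(seqhi: int, seq: int, length: int) -> dict:
    """Evaluate the assumed checks 1 and 5 for a data segment without SYN/FIN."""
    end = (seq + length) & 0xFFFFFFFF  # sequence number just past the segment
    return {
        "end": end,
        "check1_ok": seq_geq(seqhi, end),
        "check5_ok": seq_geq((seqhi + MAXACKWINDOW) & 0xFFFFFFFF, end),
        "overshoot": (end - seqhi) & 0xFFFFFFFF,  # bytes beyond the tracked high mark
    }

# Values from the log: high=864624718, seq=864739382, len=1398.
result = window_checks(seqhi=864624718, seq=864739382, length=1398)
print(result)
```

Under those assumptions the segment ends 116062 bytes beyond the highest sequence number pf had recorded for that peer, so both the strict and the relaxed check fail, which matches the "1 | 5" in the log. The arithmetic does not explain why the state fell that far behind the actual connection.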
PF rules and settings: --- ext_if="lagg0" int_if_1="vlan22" int_if_2="vlan21" dst_nat1="109.71.177.128/25" dst_nat2="109.71.177.0/25" table persist file "/etc/pf.src-nat" table const { 80.249.176.0/20, 93.92.192.0/21, 109.71.176.0/21, 217.119.16.0/20 } table persist { 10.52.249.24 } table persist { 84.204.97.154, 213.180.204.32, 195.95.218.31, 195.95.218.30 } set limit { states 1000000, frags 80000, src-nodes 100000, table-entries 500000} set state-policy if-bound set optimization aggressive set debug urgent set ruleset-optimization profile set timeout { frag 10, tcp.established 3600, src.track 30 } set block-policy drop set require-order no set skip on {lo0, em0, pfsync0} table persist pass in quick on $int_if_1 proto tcp from to any port smtp flags S/SAFR keep state pass in quick on $int_if_2 proto tcp from to any port smtp flags S/SAFR keep state pass in on $int_if_1 proto tcp from any to any port smtp flags S/SAFR keep state \ (max-src-conn 15, max-src-conn-rate 15/30, overload flush global) block return-icmp (host-prohib) log quick proto tcp from to any port smtp pass in on $int_if_2 proto tcp from any to any port smtp flags S/SAFR keep state \ (max-src-conn 15, max-src-conn-rate 15/30, overload flush global) block return-icmp (host-prohib) log quick proto tcp from to any port smtp pass in quick on $int_if_1 all no state allow-opts tag NAT1 label "$nr:NAT1" pass in quick on $int_if_2 all no state allow-opts tag NAT2 label "$nr:NAT2" binat-anchor "binat" load anchor "binat" from "/etc/pf.anchor.binat" nat-anchor "ftp-proxy/*" rdr-anchor "ftp-proxy/*" rdr pass on $int_if_1 proto tcp from to any port 21 -> 127.0.0.1 port 8021 rdr pass on $int_if_2 proto tcp from to any port 21 -> 127.0.0.1 port 8021 rdr pass on $ext_if proto udp from 109.71.176.3 to 109.71.176.2 port 4784 -> 10.78.76.2 port 4784 nat on $ext_if from to any tagged NAT1 -> $dst_nat1 static-port source-hash #sticky-address nat on $ext_if from to any tagged NAT2 -> $dst_nat2 static-port source-hash 
#sticky-address nat on $ext_if from any to -> $dst_nat1 static-port source-hash #sticky-address binat on $ext_if from 10.78.78.2 to any -> 93.92.199.252 nat on $ext_if from 10.78.76.0/24 to any -> 109.71.176.2 static-port source-hash nat on $ext_if from 10.78.77.0/24 to any -> 93.92.199.254 nat on $ext_if from 10.78.78.0/24 to any -> $dst_nat1 static-port source-hash anchor "ftp-proxy/*" pass out quick proto tcp from any to any port 21 no state pass quick on $ext_if proto gre all no state --- *P. S. Traffic start forwarding with pf only after server has been rebooted.* -- Best regards, Dennis Yusupoff, network engineer of Smart-Telecom ISP Russia, Saint-Petersburg From owner-freebsd-net@FreeBSD.ORG Thu Feb 6 13:20:01 2014 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 39D643B0 for ; Thu, 6 Feb 2014 13:20:01 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 1699613B0 for ; Thu, 6 Feb 2014 13:20:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id s16DK0dB064287 for ; Thu, 6 Feb 2014 13:20:00 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.8/8.14.8/Submit) id s16DK0Jv064286; Thu, 6 Feb 2014 13:20:00 GMT (envelope-from gnats) Date: Thu, 6 Feb 2014 13:20:00 GMT Message-Id: <201402061320.s16DK0Jv064286@freefall.freebsd.org> To: freebsd-net@FreeBSD.org Cc: From: dfilter@FreeBSD.ORG (dfilter service) Subject: Re: kern/181741: commit references a PR X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 
2.1.17 Precedence: list Reply-To: dfilter service List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Feb 2014 13:20:01 -0000 The following reply was made to PR kern/181741; it has been noted by GNATS. From: dfilter@FreeBSD.ORG (dfilter service) To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/181741: commit references a PR Date: Thu, 6 Feb 2014 13:18:18 +0000 (UTC) Author: glebius Date: Thu Feb 6 13:18:10 2014 New Revision: 261550 URL: http://svnweb.freebsd.org/changeset/base/261550 Log: Add test case for kern/181741. Right now test fails. PR: 181741 Sponsored by: Nginx, Inc. Modified: head/tools/regression/sockets/unix_passfd/unix_passfd.c Modified: head/tools/regression/sockets/unix_passfd/unix_passfd.c ============================================================================== --- head/tools/regression/sockets/unix_passfd/unix_passfd.c Thu Feb 6 12:43:06 2014 (r261549) +++ head/tools/regression/sockets/unix_passfd/unix_passfd.c Thu Feb 6 13:18:10 2014 (r261550) @@ -29,11 +29,14 @@ #include #include #include +#include +#include #include #include #include #include +#include #include #include @@ -106,11 +109,10 @@ samefile(const char *test, struct stat * } static void -sendfd(const char *test, int sockfd, int sendfd) +sendfd_payload(const char *test, int sockfd, int sendfd, + void *payload, size_t paylen) { struct iovec iovec; - char ch; - char message[CMSG_SPACE(sizeof(int))]; struct cmsghdr *cmsghdr; struct msghdr msghdr; @@ -118,13 +120,12 @@ sendfd(const char *test, int sockfd, int bzero(&msghdr, sizeof(msghdr)); bzero(&message, sizeof(message)); - ch = 0; msghdr.msg_control = message; msghdr.msg_controllen = sizeof(message); - iovec.iov_base = &ch; - iovec.iov_len = sizeof(ch); + iovec.iov_base = payload; + iovec.iov_len = paylen; msghdr.msg_iov = &iovec; msghdr.msg_iovlen = 1; @@ -138,33 +139,35 @@ sendfd(const char *test, int sockfd, int len = 
sendmsg(sockfd, &msghdr, 0); if (len < 0) err(-1, "%s: sendmsg", test); - if (len != sizeof(ch)) + if (len != paylen) errx(-1, "%s: sendmsg: %zd bytes sent", test, len); } static void -recvfd(const char *test, int sockfd, int *recvfd) +sendfd(const char *test, int sockfd, int sendfd) +{ + char ch; + + return (sendfd_payload(test, sockfd, sendfd, &ch, sizeof(ch))); +} + +static void +recvfd_payload(const char *test, int sockfd, int *recvfd, + void *buf, size_t buflen) { struct cmsghdr *cmsghdr; - char message[CMSG_SPACE(sizeof(int))]; + char message[CMSG_SPACE(SOCKCREDSIZE(CMGROUP_MAX)) + sizeof(int)]; struct msghdr msghdr; struct iovec iovec; ssize_t len; - char ch; bzero(&msghdr, sizeof(msghdr)); - ch = 0; msghdr.msg_control = message; msghdr.msg_controllen = sizeof(message); - iovec.iov_base = &ch; - iovec.iov_len = sizeof(ch); - - msghdr.msg_iov = &iovec; - msghdr.msg_iovlen = 1; - - iovec.iov_len = sizeof(ch); + iovec.iov_base = buf; + iovec.iov_len = buflen; msghdr.msg_iov = &iovec; msghdr.msg_iovlen = 1; @@ -172,19 +175,33 @@ recvfd(const char *test, int sockfd, int len = recvmsg(sockfd, &msghdr, 0); if (len < 0) err(-1, "%s: recvmsg", test); - if (len != sizeof(ch)) + if (len != buflen) errx(-1, "%s: recvmsg: %zd bytes received", test, len); + cmsghdr = CMSG_FIRSTHDR(&msghdr); if (cmsghdr == NULL) errx(-1, "%s: recvmsg: did not receive control message", test); - if (cmsghdr->cmsg_len != CMSG_LEN(sizeof(int)) || - cmsghdr->cmsg_level != SOL_SOCKET || - cmsghdr->cmsg_type != SCM_RIGHTS) + *recvfd = -1; + for (; cmsghdr != NULL; cmsghdr = CMSG_NXTHDR(&msghdr, cmsghdr)) { + if (cmsghdr->cmsg_level == SOL_SOCKET && + cmsghdr->cmsg_type == SCM_RIGHTS && + cmsghdr->cmsg_len == CMSG_LEN(sizeof(int))) { + *recvfd = *(int *)CMSG_DATA(cmsghdr); + if (*recvfd == -1) + errx(-1, "%s: recvmsg: received fd -1", test); + } + } + if (*recvfd == -1) errx(-1, "%s: recvmsg: did not receive single-fd message", test); - *recvfd = *(int *)CMSG_DATA(cmsghdr); - if (*recvfd == -1) - 
errx(-1, "%s: recvmsg: received fd -1", test); +} + +static void +recvfd(const char *test, int sockfd, int *recvfd) +{ + char ch; + + return (recvfd_payload(test, sockfd, recvfd, &ch, sizeof(ch))); } int @@ -330,6 +347,43 @@ main(int argc, char *argv[]) closesocketpair(fd); printf("%s passed\n", test); + + /* + * Test for PR 181741. Receiver sets LOCAL_CREDS, and kernel + * prepends a control message to the data. Sender sends large + * payload. Payload + SCM_RIGHTS + LOCAL_CREDS hit socket buffer + * limit, and receiver receives truncated data. + */ + test = "test8-rigths+creds+payload"; + printf("beginning %s\n", test); + + { + const int on = 1; + u_long sendspace; + size_t len; + void *buf; + + len = sizeof(sendspace); + if (sysctlbyname("net.local.stream.sendspace", &sendspace, + &len, NULL, 0) < 0) + err(-1, "%s: sysctlbyname(net.local.stream.sendspace)", + test); + + if ((buf = malloc(sendspace)) == NULL) + err(-1, "%s: malloc", test); + + domainsocketpair(test, fd); + if (setsockopt(fd[1], 0, LOCAL_CREDS, &on, sizeof(on)) < 0) + err(-1, "%s: setsockopt(LOCAL_CREDS)", test); + tempfile(test, &putfd_1); + sendfd_payload(test, fd[0], putfd_1, buf, sendspace); + recvfd_payload(test, fd[1], &getfd_1, buf, sendspace); + close(putfd_1); + close(getfd_1); + closesocketpair(fd); + } + + printf("%s passed\n", test); return (0); } _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" From owner-freebsd-net@FreeBSD.ORG Thu Feb 6 16:59:12 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2AE2898C for ; Thu, 6 Feb 2014 16:59:12 +0000 (UTC) Received: from mx1.shrew.net (mx1.shrew.net [38.97.5.131]) 
(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id E1C281BBA for ; Thu, 6 Feb 2014 16:59:11 +0000 (UTC) Received: from mail.shrew.net (mail.shrew.prv [10.24.10.20]) by mx1.shrew.net (8.14.7/8.14.7) with ESMTP id s16Gd7xq019198 for ; Thu, 6 Feb 2014 10:39:07 -0600 (CST) (envelope-from mgrooms@shrew.net) Received: from [127.0.0.1] (216-110-21-66.static.twtelecom.net [216.110.21.66]) by mail.shrew.net (Postfix) with ESMTPSA id 05FDA187F82 for ; Thu, 6 Feb 2014 10:39:02 -0600 (CST) Message-ID: <52F3BAB6.7090304@shrew.net> Date: Thu, 06 Feb 2014 10:39:18 -0600 From: Matthew Grooms User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: PF states degrade? References: <52F3366D.3030202@smartspb.net> In-Reply-To: <52F3366D.3030202@smartspb.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (mx1.shrew.net [10.24.10.10]); Thu, 06 Feb 2014 10:39:07 -0600 (CST) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Feb 2014 16:59:12 -0000 On 2/6/2014 1:14 AM, Dennis Yusupoff wrote: > Good day. > > We had started to testing FreeBSD 10.0 in production (pf nat, ipfw > pipes, ng_netflow) with setting (sysctl,pf.conf,ipfw.conf etc) from > similar rocksolid 9.0-STABLE. > Server has worked fine for a ~5 days and then suddenly stop forwarding > traffic from clients. What was a quite unexpecting is how it had > happening. Traffic from customers...dissappear (seen in tcpdump) from > LAN interface in ~10 seconds after _connection_ (NAT translation state > has been created?) 
has been started, with pf log (when set "log debug > loud" in pf.conf) strange record appears in that moment, like that: > > 10.53.80.224 nat'ed in 109.71.177.147, http connection to 213.180.204.183: > --- > Feb 5 20:41:21 nata2 kernel: pf: State failure on: 1 | 5 > Feb 5 20:41:21 nata2 kernel: pf: BAD state: TCP out wire: > 213.180.204.183:80 > Feb 5 20:41:21 nata2 kernel: 109.71.177.147:50114 stack: > 213.180.204.183:80 10.53.80.224:50114 [lo=1997798965 high=1997799354 > win=2772 modulator=0] > Feb 5 20:41:21 nata2 kernel: [lo=864623348 high=864624718 win=389 > modulator=0] 4:4 A seq=864739382 (864739382) ack=1997798965 len=1398 > ackskew=0 pkts=3:2 dir=in,rev > --- > Full log there: http://pastebin.com/CQ78JyJe > > Disabling/enabling PF - no difference (except, indeed, nat stop working). > > After all attempts we did "pfctl -d" and setup ipfw nat for that > customer. All has work fine! So we believe in uknown (for us) problem > related to PF and it state work. > > PF rules and settings: > > --- > ext_if="lagg0" > int_if_1="vlan22" > int_if_2="vlan21" > > dst_nat1="109.71.177.128/25" > dst_nat2="109.71.177.0/25" > > table persist file "/etc/pf.src-nat" > table const { 80.249.176.0/20, 93.92.192.0/21, > 109.71.176.0/21, 217.119.16.0/20 } > table persist { 10.52.249.24 } > > table persist { 84.204.97.154, 213.180.204.32, > 195.95.218.31, 195.95.218.30 } > > set limit { states 1000000, frags 80000, src-nodes 100000, table-entries > 500000} > set state-policy if-bound > set optimization aggressive > set debug urgent > set ruleset-optimization profile > set timeout { frag 10, tcp.established 3600, src.track 30 } > set block-policy drop > set require-order no > > > set skip on {lo0, em0, pfsync0} > > > table persist > pass in quick on $int_if_1 proto tcp from to any port > smtp flags S/SAFR keep state > pass in quick on $int_if_2 proto tcp from to any port > smtp flags S/SAFR keep state > pass in on $int_if_1 proto tcp from any to any port smtp flags S/SAFR > keep 
state \ > (max-src-conn 15, max-src-conn-rate 15/30, overload > flush global) > block return-icmp (host-prohib) log quick proto tcp from > to any port smtp > > pass in on $int_if_2 proto tcp from any to any port smtp flags S/SAFR > keep state \ > (max-src-conn 15, max-src-conn-rate 15/30, overload > flush global) > block return-icmp (host-prohib) log quick proto tcp from > to any port smtp > > > pass in quick on $int_if_1 all no state allow-opts tag NAT1 label "$nr:NAT1" > pass in quick on $int_if_2 all no state allow-opts tag NAT2 label "$nr:NAT2" > > binat-anchor "binat" > load anchor "binat" from "/etc/pf.anchor.binat" > nat-anchor "ftp-proxy/*" > rdr-anchor "ftp-proxy/*" > rdr pass on $int_if_1 proto tcp from to any port 21 -> > 127.0.0.1 port 8021 > rdr pass on $int_if_2 proto tcp from to any port 21 -> > 127.0.0.1 port 8021 > rdr pass on $ext_if proto udp from 109.71.176.3 to 109.71.176.2 port > 4784 -> 10.78.76.2 port 4784 > > nat on $ext_if from to any tagged NAT1 -> $dst_nat1 > static-port source-hash #sticky-address > nat on $ext_if from to any tagged NAT2 -> $dst_nat2 > static-port source-hash #sticky-address > nat on $ext_if from any to -> $dst_nat1 static-port > source-hash #sticky-address > > binat on $ext_if from 10.78.78.2 to any -> 93.92.199.252 > > nat on $ext_if from 10.78.76.0/24 to any -> 109.71.176.2 static-port > source-hash > nat on $ext_if from 10.78.77.0/24 to any -> 93.92.199.254 > nat on $ext_if from 10.78.78.0/24 to any -> $dst_nat1 static-port > source-hash > > anchor "ftp-proxy/*" > pass out quick proto tcp from any to any port 21 no state > > pass quick on $ext_if proto gre all no state > --- > > *P. S. Traffic start forwarding with pf only after server has been Dennis, Did you run out of pf state table entries? You can use pfctl to list the current limit and usage ... 
INFO: Status: Enabled for 14 days 19:48:29 Debug: Urgent State Table Total Rate current entries 4 searches 2030427 1.6/s inserts 64990 0.1/s removals 64986 0.1/s LIMITS: states hard limit 10000 src-nodes hard limit 10000 frags hard limit 5000 table-entries hard limit 200000 .. If that is the case, you can increase your state table size by inserting some configuration parameters at the top of your pf.conf file. For example ... set limit states 50000 set limit src-nodes 50000 set limit frags 25000 -Matthew From owner-freebsd-net@FreeBSD.ORG Thu Feb 6 19:02:48 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id F199F161; Thu, 6 Feb 2014 19:02:48 +0000 (UTC) Received: from mail-ee0-x22c.google.com (mail-ee0-x22c.google.com [IPv6:2a00:1450:4013:c00::22c]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 59A2F1977; Thu, 6 Feb 2014 19:02:48 +0000 (UTC) Received: by mail-ee0-f44.google.com with SMTP id c13so1089327eek.17 for ; Thu, 06 Feb 2014 11:02:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=NxhefCz82z65FdfiO5TqhMFUT8HcuIpiJ+Ror8Nq36k=; b=sP657Uw4rPINvRjFhgX1XvGRdjY4OsXbFnJelptx+2wqohwbEcet4v0h46W0aRPa9F 32gmg3CnO/rGLscAhSEB5Mr6793+2OuL7DzM9mpdhF0jffsDGgQcwTswPEisuZnMQBTn CqihgWTjZTsfUCWKUyLAuB+3xYOS9XkmieffFRDUFCcDWkLlNEB9moGdr95YCkdpiVMc nrDVsW1qp6jjURw7ezb8HfJdwDqD/+H4cAoR+v1J+pt+pBaS4xxWwl5ES4NaFM25BuMZ hli3Y0mRTLgM/YvCumr4vNOjf5l7/oBLiAz2feWqXi851Rn8zNd4CGpx+WZEECuy7lo6 LV2A== MIME-Version: 1.0 X-Received: by 10.14.127.200 with SMTP id d48mr10976478eei.9.1391713366788; Thu, 06 Feb 2014 11:02:46 -0800 (PST) Received: by 10.14.65.4 with HTTP; Thu, 6 
Feb 2014 11:02:46 -0800 (PST) In-Reply-To: References: Date: Thu, 6 Feb 2014 11:02:46 -0800 Message-ID: Subject: Re: Errors using span interface on if_bridge(4) From: hiren panchasara To: "freebsd-net@freebsd.org" , Jack F Vogel Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Feb 2014 19:02:49 -0000

On Sun, Feb 2, 2014 at 11:25 PM, hiren panchasara wrote:
> If I start tcpdump on
> interface again, the dev.ix.3.mac_stats.checksum_errs starts going up.
> Stopping tcpdump would stop the counter from incrementing.
>
> Why is that happening? I have no clue. How/why is tcpdump affecting
> this interface traffic stats in such a way?

Found the reason:
http://www.wireshark.org/faq.html#q11.1
http://sandilands.info/sgordon/segmentation-offloading-with-wireshark-and-ethtool

Putting an end to arguably the longest monologue on -net :-)

cheers,
Hiren

From owner-freebsd-net@FreeBSD.ORG Thu Feb 6 20:55:03 2014 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D50715DB; Thu, 6 Feb 2014 20:55:03 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id A2DDE14C9; Thu, 6 Feb 2014 20:55:03 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id s16Kt3aN071927; Thu, 6 Feb 2014 20:55:03 GMT (envelope-from brueffer@freefall.freebsd.org) Received: (from brueffer@localhost) by freefall.freebsd.org
(8.14.8/8.14.8/Submit) id s16Kt3YO071926; Thu, 6 Feb 2014 21:55:03 +0100 (CET) (envelope-from brueffer) Date: Thu, 6 Feb 2014 21:55:03 +0100 (CET) Message-Id: <201402062055.s16Kt3YO071926@freefall.freebsd.org> To: aboyer@averesystems.com, brueffer@FreeBSD.org, freebsd-net@FreeBSD.org From: brueffer@FreeBSD.org Subject: Re: kern/153772: [ixgbe] [patch] sysctls reference wrong XON/XOFF variables X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Feb 2014 20:55:03 -0000

Synopsis: [ixgbe] [patch] sysctls reference wrong XON/XOFF variables

State-Changed-From-To: open->closed
State-Changed-By: brueffer
State-Changed-When: Thu Feb 6 21:53:56 CET 2014
State-Changed-Why:
This was fixed in r217127 three years ago, might as well close this PR now :-)
Thanks for the patch!

http://www.freebsd.org/cgi/query-pr.cgi?pr=153772

From owner-freebsd-net@FreeBSD.ORG Thu Feb 6 22:21:22 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 127B1FAB for ; Thu, 6 Feb 2014 22:21:22 +0000 (UTC) Received: from smtp.novso.com (smtp1.novso.com [193.189.104.85]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id CE0B71B76 for ; Thu, 6 Feb 2014 22:21:21 +0000 (UTC) Message-ID: <1391725273.22934.16.camel@fr-wks3.corp.novso.com> Subject: IPsec filtertunnel broken on FreeBSD 10 From: Nicolas DEFFAYET To: freebsd-net@freebsd.org Date: Thu, 06 Feb 2014 23:21:13 +0100 Organization: DEFFAYET.COM Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.4.4-3 Mime-Version: 1.0 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Feb 2014 22:21:22 -0000

Hello,

The IPsec filtertunnel is broken on FreeBSD 10: decapsulated incoming packets are not passed to the firewall or to the enc pseudo-interface.

This issue affects 10.0-RELEASE and 10.0-STABLE. 9.1-RELEASE and 9.2-RELEASE are not affected.

Of course, sysctl shows that filtertunnel is enabled:
net.inet.ipsec.filtertunnel=1
net.inet6.ipsec.filtertunnel=1

This issue is serious: it is not possible to use a firewall (ipfw/pf) to secure a gre/gif/l2tp IPsec tunnel, because the decapsulated incoming packets are not seen by the firewall.

Many people have reported the issue on forums.freebsd.org, and a bug report has been opened:
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/185876

To try to provide a fix, I ran a diff of the kernel source in the net, netinet, netinet6 and netipsec directories between 9.2-RELEASE and 10.0-RELEASE, but I have not found which change could break IPsec filtertunnel.

Could an expert or anyone who knows the code help us, please?

Many thanks!
--
Nicolas DEFFAYET

From owner-freebsd-net@FreeBSD.ORG Thu Feb 6 22:44:42 2014 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id B446A8B6; Thu, 6 Feb 2014 22:44:42 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 88A361D56; Thu, 6 Feb 2014 22:44:42 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id s16MigGH097459; Thu, 6 Feb 2014 22:44:42 GMT (envelope-from brueffer@freefall.freebsd.org) Received: (from brueffer@localhost) by freefall.freebsd.org (8.14.8/8.14.8/Submit) id s16MigBO097458; Thu, 6 Feb 2014 23:44:42 +0100 (CET) (envelope-from brueffer) Date: Thu, 6 Feb 2014 23:44:42 +0100 (CET) Message-Id: <201402062244.s16MigBO097458@freefall.freebsd.org> To: jcnc@dhis.org, brueffer@FreeBSD.org, freebsd-net@FreeBSD.org From: brueffer@FreeBSD.org Subject: Re: kern/181006: [run] [patch] mbuf leak in run(4) driver X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Feb 2014 22:44:42 -0000

Synopsis: [run] [patch] mbuf leak in run(4) driver

State-Changed-From-To: open->closed
State-Changed-By: brueffer
State-Changed-When: Thu Feb 6 23:42:02 CET 2014
State-Changed-Why:
This was fixed in HEAD with r257435 and merged back to 9-STABLE in r259457.
Thanks for the report!
http://www.freebsd.org/cgi/query-pr.cgi?pr=181006 From owner-freebsd-net@FreeBSD.ORG Fri Feb 7 01:58:45 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id DABC9E8A; Fri, 7 Feb 2014 01:58:44 +0000 (UTC) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id C14141D37; Fri, 7 Feb 2014 01:58:43 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: X-IronPort-AV: E=Sophos;i="4.95,797,1384318800"; d="scan'208";a="94432143" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 06 Feb 2014 20:58:35 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 099F0B403D; Thu, 6 Feb 2014 20:58:35 -0500 (EST) Date: Thu, 6 Feb 2014 20:58:35 -0500 (EST) From: Rick Macklem To: Garrett Wollman Message-ID: <1261040377.1982994.1391738315028.JavaMail.root@uoguelph.ca> In-Reply-To: Subject: Re: Terrible NFS performance under 9.2-RELEASE? 
MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_1982991_1248016500.1391738315025" X-Originating-IP: [172.17.91.209] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-net@freebsd.org, Alexander Motin X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Feb 2014 01:58:45 -0000

------=_Part_1982991_1248016500.1391738315025
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

Garrett Wollman wrote:
> The real big improvement, which I have not tried to implement, would
> be to use physical pages (via sfbufs) by sharing the inner loop of
> sendfile(2). Since I use ZFS as my backing filesystem, I'm not sure
> this would have any benefit for me, but it should be a measurable
> improvement for UFS-backed NFS servers.
(I didn't have this email handy, so I just cut/pasted this paragraph.)

Well, I'm far from sure this is a good idea at this point, but I've attached a patch that seems to work for a short test of an export of a UFS volume. (If there is no v_object for ZFS vnodes, it will print out a message with the error# for EINVAL.)

It would need some serious review and testing before it would be anywhere near ready for head. (For example, since I hold a LK_SHARED locked vnode which was not VI_DOOMED when it was locked, I don't know if I need to acquire a reference count on the vm object or if I need to check for OBJ_DEAD?) The patch checks for OBJ_DEAD, but does not acquire a reference count. (The reference count part is #ifdef notnow.)

Like most of these things, I don't see a measurable difference on my old single core i386 hardware with 100Mbps networking, but that only indicates it might not be a serious regression.
The patch doesn't have yours applied to it, but it should be easy to integrate the two, since it just adds a call to nfsrv_file_mbuf() at the beginning of nfsvno_read(). Good luck with the testing, rick ps: It would be nice to know if it works for ZFS? pss: If the patch doesn't make it through on the list and you want a copy, just email me. ------=_Part_1982991_1248016500.1391738315025 Content-Type: text/x-patch; name=sendfile-nfs.patch Content-Disposition: attachment; filename=sendfile-nfs.patch Content-Transfer-Encoding: base64 LS0tIGtlcm4vdWlwY19zeXNjYWxscy5jLnNhdjMJMjAxNC0wMi0wNCAxOToyMjo1Mi4wMDAwMDAw MDAgLTA1MDAKKysrIGtlcm4vdWlwY19zeXNjYWxscy5jCTIwMTQtMDItMDYgMTg6MjM6NDcuMDAw MDAwMDAwIC0wNTAwCkBAIC0xOTUwLDYgKzE5NTAsMTc2IEBAIGZyZWVic2Q0X3NlbmRmaWxlKHN0 cnVjdCB0aHJlYWQgKnRkLCBzdHIKIH0KICNlbmRpZiAvKiBDT01QQVRfRlJFRUJTRDQgKi8KIAor LyoKKyAqIFRoZSBpbm5lciBsb29wIG9mIGtlcm5fc2VuZGZpbGUoKSBleHRyYWN0ZWQgb3V0IHNv IHRoYXQgaXQgY2FuIGJlCisgKiB1c2VkIGJ5IHRoZSBORlMgc2VydmVyIGFzIHdlbGwuCisgKi8K K2ludAora2Vybl9zZW5kZmlsZV9tYnVmKHN0cnVjdCB2bm9kZSAqdnAsIHN0cnVjdCB2bV9vYmpl Y3QgKm9iaiwgc3RydWN0IG1idWYgKiptcCwKKyAgICBzdHJ1Y3QgbWJ1ZiAqKm10YWlscCwgb2Zm X3Qgb2ZmLCBvZmZfdCAqcmVtcCwgaW50IGJzaXplLCBpbnQgKmRvbmVwLAorICAgIGludCBzcGFj ZSwgaW50IG1udywgaW50IHdhaXRfZm9yYWxsLCBpbnQgZmxhZ3MsIGludCBpb2ZsYWdzLCB2b2lk ICpzZnNwLAorICAgIHN0cnVjdCB1Y3JlZCAqYWNyZWQsIHN0cnVjdCB0aHJlYWQgKnRkKQorewor CW9mZl90IHhmc2l6ZSwgc3RhcnRyZW07CisJdm1fcGluZGV4X3QgcGluZGV4OworCXZtX29mZnNl dF90IHBnb2ZmOworCXN0cnVjdCB2bV9wYWdlICpwZzsKKwlzdHJ1Y3QgbWJ1ZiAqbTAsICptLCAq bXRhaWw7CisJc3RydWN0IHNmX2J1ZiAqc2Y7CisJaW50IGVycm9yID0gMCwgbG9vcGJ5dGVzOwor CXNzaXplX3QgcmVzaWQ7CisJaW50IHJlYWRhaGVhZCA9IHNmcmVhZGFoZWFkICogTUFYQlNJWkU7 CisJc3RydWN0IHNlbmRmaWxlX3N5bmMgKnNmcyA9IChzdHJ1Y3Qgc2VuZGZpbGVfc3luYyAqKXNm c3A7CisKKwltID0gKm1wOworCW10YWlsID0gKm10YWlscDsKKwlzdGFydHJlbSA9ICpyZW1wOwor CWxvb3BieXRlcyA9IDA7CisJLyoKKwkgKiBMb29wIGFuZCBjb25zdHJ1Y3QgbWF4aW11bSBzaXpl ZCBtYnVmIGNoYWluIHRvIGJlIGJ1bGsKKwkgKiBkdW1wZWQgaW50byBzb2NrZXQgYnVmZmVyLgor 
CSAqLworCXdoaWxlIChzcGFjZSA+IGxvb3BieXRlcykgeworCQkvKgorCQkgKiBDYWxjdWxhdGUg dGhlIGFtb3VudCB0byB0cmFuc2Zlci4KKwkJICogTm90IHRvIGV4Y2VlZCBhIHBhZ2UsIHRoZSBF T0YsCisJCSAqIG9yIHRoZSBwYXNzZWQgaW4gbmJ5dGVzLgorCQkgKi8KKwkJcGdvZmYgPSAodm1f b2Zmc2V0X3QpKG9mZiAmIFBBR0VfTUFTSyk7CisJCSpyZW1wID0gc3RhcnRyZW0gLSBsb29wYnl0 ZXM7CisJCXhmc2l6ZSA9IG9taW4oUEFHRV9TSVpFIC0gcGdvZmYsICpyZW1wKTsKKwkJeGZzaXpl ID0gb21pbihzcGFjZSAtIGxvb3BieXRlcywgeGZzaXplKTsKKwkJaWYgKHhmc2l6ZSA8PSAwKSB7 CisJCQkqZG9uZXAgPSAxOwkJLyogYWxsIGRhdGEgc2VudCAqLworCQkJYnJlYWs7CisJCX0KKwor CQkvKgorCQkgKiBBdHRlbXB0IHRvIGxvb2sgdXAgdGhlIHBhZ2UuICBBbGxvY2F0ZQorCQkgKiBp ZiBub3QgZm91bmQgb3Igd2FpdCBhbmQgbG9vcCBpZiBidXN5LgorCQkgKi8KKwkJcGluZGV4ID0g T0ZGX1RPX0lEWChvZmYpOworCQlWTV9PQkpFQ1RfV0xPQ0sob2JqKTsKKwkJcGcgPSB2bV9wYWdl X2dyYWIob2JqLCBwaW5kZXgsIFZNX0FMTE9DX05PQlVTWSB8CisJCSAgICBWTV9BTExPQ19OT1JN QUwgfCBWTV9BTExPQ19XSVJFRCB8IFZNX0FMTE9DX1JFVFJZKTsKKworCQkvKgorCQkgKiBDaGVj ayBpZiBwYWdlIGlzIHZhbGlkIGZvciB3aGF0IHdlIG5lZWQsCisJCSAqIG90aGVyd2lzZSBpbml0 aWF0ZSBJL08uCisJCSAqIElmIHdlIGFscmVhZHkgdHVybmVkIHNvbWUgcGFnZXMgaW50byBtYnVm cywKKwkJICogc2VuZCB0aGVtIG9mZiBiZWZvcmUgd2UgY29tZSBoZXJlIGFnYWluIGFuZAorCQkg KiBibG9jay4KKwkJICovCisJCWlmIChwZy0+dmFsaWQgJiYgdm1fcGFnZV9pc192YWxpZChwZywg cGdvZmYsIHhmc2l6ZSkpCisJCQlWTV9PQkpFQ1RfV1VOTE9DSyhvYmopOworCQllbHNlIGlmICht ICE9IE5VTEwgJiYgd2FpdF9mb3JhbGwgPT0gMCkKKwkJCWVycm9yID0gRUFHQUlOOwkvKiBzZW5k IHdoYXQgd2UgYWxyZWFkeSBnb3QgKi8KKwkJZWxzZSBpZiAoZmxhZ3MgJiBTRl9OT0RJU0tJTykK KwkJCWVycm9yID0gRUJVU1k7CisJCWVsc2UgeworCQkJVk1fT0JKRUNUX1dVTkxPQ0sob2JqKTsK KworCQkJLyoKKwkJCSAqIEdldCB0aGUgcGFnZSBmcm9tIGJhY2tpbmcgc3RvcmUuCisJCQkgKiBY WFhNQUM6IEJlY2F1c2Ugd2UgZG9uJ3QgaGF2ZSBmcC0+Zl9jcmVkCisJCQkgKiBoZXJlLCB3ZSBw YXNzIGluIE5PQ1JFRC4gIFRoaXMgaXMgcHJvYmFibHkKKwkJCSAqIHdyb25nLCBidXQgaXMgY29u c2lzdGVudCB3aXRoIG91ciBvcmlnaW5hbAorCQkJICogaW1wbGVtZW50YXRpb24uCisJCQkgKi8K KwkJCWVycm9yID0gdm5fcmR3cihVSU9fUkVBRCwgdnAsIE5VTEwsIHJlYWRhaGVhZCwKKwkJCSAg 
ICB0cnVuY19wYWdlKG9mZiksIFVJT19OT0NPUFksIGlvZmxhZ3MgfAorCQkJICAgICgocmVhZGFo ZWFkIC8gYnNpemUpIDw8IElPX1NFUVNISUZUKSwKKwkJCSAgICB0ZC0+dGRfdWNyZWQsIGFjcmVk LCAmcmVzaWQsIHRkKTsKKwkJCVNGU1RBVF9JTkMoc2ZfaW9jbnQpOworCQkJaWYgKGVycm9yKQor CQkJCVZNX09CSkVDVF9XTE9DSyhvYmopOworCQl9CisJCWlmIChlcnJvcikgeworCQkJdm1fcGFn ZV9sb2NrKHBnKTsKKwkJCXZtX3BhZ2VfdW53aXJlKHBnLCAwKTsKKwkJCS8qCisJCQkgKiBTZWUg aWYgYW55b25lIGVsc2UgbWlnaHQga25vdyBhYm91dAorCQkJICogdGhpcyBwYWdlLiAgSWYgbm90 IGFuZCBpdCBpcyBub3QgdmFsaWQsCisJCQkgKiB0aGVuIGZyZWUgaXQuCisJCQkgKi8KKwkJCWlm IChwZy0+d2lyZV9jb3VudCA9PSAwICYmIHBnLT52YWxpZCA9PSAwICYmCisJCQkgICAgIXZtX3Bh Z2VfYnVzaWVkKHBnKSkKKwkJCQl2bV9wYWdlX2ZyZWUocGcpOworCQkJdm1fcGFnZV91bmxvY2so cGcpOworCQkJVk1fT0JKRUNUX1dVTkxPQ0sob2JqKTsKKwkJCWlmIChlcnJvciA9PSBFQUdBSU4g JiYgd2FpdF9mb3JhbGwgPT0gMCkKKwkJCQllcnJvciA9IDA7CS8qIG5vdCBhIHJlYWwgZXJyb3Ig Ki8KKwkJCWJyZWFrOworCQl9CisKKwkJLyoKKwkJICogR2V0IGEgc2VuZGZpbGUgYnVmLiAgV2hl biBhbGxvY2F0aW5nIHRoZQorCQkgKiBmaXJzdCBidWZmZXIgZm9yIG1idWYgY2hhaW4sIHdlIHVz dWFsbHkKKwkJICogd2FpdCBhcyBsb25nIGFzIG5lY2Vzc2FyeSwgYnV0IHRoaXMgd2FpdAorCQkg KiBjYW4gYmUgaW50ZXJydXB0ZWQuICBGb3IgY29uc2VxdWVudAorCQkgKiBidWZmZXJzLCBkbyBu b3Qgc2xlZXAsIHNpbmNlIHNldmVyYWwKKwkJICogdGhyZWFkcyBtaWdodCBleGhhdXN0IHRoZSBi dWZmZXJzIGFuZCB0aGVuCisJCSAqIGRlYWRsb2NrLgorCQkgKi8KKwkJc2YgPSBzZl9idWZfYWxs b2MocGcsIChtbncgfHwgKG0gIT0gTlVMTCAmJiB3YWl0X2ZvcmFsbCA9PSAwKSkgPworCQkgICAg U0ZCX05PV0FJVCA6IFNGQl9DQVRDSCk7CisJCWlmIChzZiA9PSBOVUxMKSB7CisJCQlTRlNUQVRf SU5DKHNmX2FsbG9jZmFpbCk7CisJCQl2bV9wYWdlX2xvY2socGcpOworCQkJdm1fcGFnZV91bndp cmUocGcsIDApOworCQkJS0FTU0VSVChwZy0+b2JqZWN0ICE9IE5VTEwsCisJCQkgICAgKCJrZXJu X3NlbmRmaWxlOiBvYmplY3QgZGlzYXBwZWFyZWQiKSk7CisJCQl2bV9wYWdlX3VubG9jayhwZyk7 CisJCQlpZiAobSA9PSBOVUxMKQorCQkJCWVycm9yID0gKG1udyA/IEVBR0FJTiA6IEVJTlRSKTsK KwkJCWJyZWFrOworCQl9CisKKwkJLyoKKwkJICogR2V0IGFuIG1idWYgYW5kIHNldCBpdCB1cCBh cyBoYXZpbmcKKwkJICogZXh0ZXJuYWwgc3RvcmFnZS4KKwkJICovCisJCW0wID0gbV9nZXQoKG1u 
dyA/IE1fTk9XQUlUIDogTV9XQUlUT0spLCBNVF9EQVRBKTsKKwkJaWYgKG0wID09IE5VTEwpIHsK KwkJCWVycm9yID0gKG1udyA/IEVBR0FJTiA6IEVOT0JVRlMpOworCQkJc2ZfYnVmX21leHQoTlVM TCwgc2YpOworCQkJYnJlYWs7CisJCX0KKwkJaWYgKG1fZXh0YWRkKG0wLCAoY2FkZHJfdCApc2Zf YnVmX2t2YShzZiksIFBBR0VfU0laRSwKKwkJICAgIHNmX2J1Zl9tZXh0LCBzZnMsIHNmLCBNX1JE T05MWSwgRVhUX1NGQlVGLAorCQkgICAgKG1udyA/IE1fTk9XQUlUIDogTV9XQUlUT0spKSAhPSAw KSB7CisJCQllcnJvciA9IChtbncgPyBFQUdBSU4gOiBFTk9CVUZTKTsKKwkJCXNmX2J1Zl9tZXh0 KE5VTEwsIHNmKTsKKwkJCW1fZnJlZW0obTApOworCQkJYnJlYWs7CisJCX0KKwkJbTAtPm1fZGF0 YSA9IChjaGFyICopc2ZfYnVmX2t2YShzZikgKyBwZ29mZjsKKwkJbTAtPm1fbGVuID0geGZzaXpl OworCisJCS8qIEFwcGVuZCB0byBtYnVmIGNoYWluLiAqLworCQlpZiAobXRhaWwgIT0gTlVMTCkK KwkJCW10YWlsLT5tX25leHQgPSBtMDsKKwkJZWxzZSBpZiAobSAhPSBOVUxMKQorCQkJbV9sYXN0 KG0pLT5tX25leHQgPSBtMDsKKwkJZWxzZQorCQkJbSA9IG0wOworCQltdGFpbCA9IG0wOworCisJ CS8qIEtlZXAgdHJhY2sgb2YgYml0cyBwcm9jZXNzZWQuICovCisJCWxvb3BieXRlcyArPSB4ZnNp emU7CisJCW9mZiArPSB4ZnNpemU7CisKKwkJaWYgKHNmcyAhPSBOVUxMKSB7CisJCQltdHhfbG9j aygmc2ZzLT5tdHgpOworCQkJc2ZzLT5jb3VudCsrOworCQkJbXR4X3VubG9jaygmc2ZzLT5tdHgp OworCQl9CisJfQorCSptcCA9IG07CisJKm10YWlscCA9IG10YWlsOworCXJldHVybiAoZXJyb3Ip OworfQorCiBpbnQKIGtlcm5fc2VuZGZpbGUoc3RydWN0IHRocmVhZCAqdGQsIHN0cnVjdCBzZW5k ZmlsZV9hcmdzICp1YXAsCiAgICAgc3RydWN0IHVpbyAqaGRyX3Vpbywgc3RydWN0IHVpbyAqdHJs X3VpbywgaW50IGNvbXBhdCkKQEAgLTE5NTksMTAgKzIxMjksOCBAQCBrZXJuX3NlbmRmaWxlKHN0 cnVjdCB0aHJlYWQgKnRkLCBzdHJ1Y3QgCiAJc3RydWN0IHZtX29iamVjdCAqb2JqID0gTlVMTDsK IAlzdHJ1Y3Qgc29ja2V0ICpzbyA9IE5VTEw7CiAJc3RydWN0IG1idWYgKm0gPSBOVUxMOwotCXN0 cnVjdCBzZl9idWYgKnNmOwotCXN0cnVjdCB2bV9wYWdlICpwZzsKIAlzdHJ1Y3QgdmF0dHIgdmE7 Ci0Jb2ZmX3Qgb2ZmLCB4ZnNpemUsIGZzYnl0ZXMgPSAwLCBzYnl0ZXMgPSAwLCByZW0gPSAwOwor CW9mZl90IG9mZiwgZnNieXRlcyA9IDAsIHNieXRlcyA9IDAsIHJlbSA9IDA7CiAJaW50IGVycm9y LCBoZHJsZW4gPSAwLCBtbncgPSAwOwogCWludCBic2l6ZTsKIAlzdHJ1Y3Qgc2VuZGZpbGVfc3lu YyAqc2ZzID0gTlVMTDsKQEAgLTIxMDUsNyArMjI3Myw2IEBAIGtlcm5fc2VuZGZpbGUoc3RydWN0 
IHRocmVhZCAqdGQsIHN0cnVjdCAKIAkgKi8KIAlmb3IgKG9mZiA9IHVhcC0+b2Zmc2V0OyA7ICkg ewogCQlzdHJ1Y3QgbWJ1ZiAqbXRhaWw7Ci0JCWludCBsb29wYnl0ZXM7CiAJCWludCBzcGFjZTsK IAkJaW50IGRvbmU7CiAKQEAgLTIxMTQsNyArMjI4MSw2IEBAIGtlcm5fc2VuZGZpbGUoc3RydWN0 IHRocmVhZCAqdGQsIHN0cnVjdCAKIAkJCWJyZWFrOwogCiAJCW10YWlsID0gTlVMTDsKLQkJbG9v cGJ5dGVzID0gMDsKIAkJc3BhY2UgPSAwOwogCQlkb25lID0gMDsKIApAQCAtMjE5MywxNTcgKzIz NTksMTUgQEAgcmV0cnlfc3BhY2U6CiAJCQlnb3RvIGRvbmU7CiAJCX0KIAotCQkvKgotCQkgKiBM b29wIGFuZCBjb25zdHJ1Y3QgbWF4aW11bSBzaXplZCBtYnVmIGNoYWluIHRvIGJlIGJ1bGsKLQkJ ICogZHVtcGVkIGludG8gc29ja2V0IGJ1ZmZlci4KLQkJICovCi0JCXdoaWxlIChzcGFjZSA+IGxv b3BieXRlcykgewotCQkJdm1fcGluZGV4X3QgcGluZGV4OwotCQkJdm1fb2Zmc2V0X3QgcGdvZmY7 Ci0JCQlzdHJ1Y3QgbWJ1ZiAqbTA7Ci0KLQkJCS8qCi0JCQkgKiBDYWxjdWxhdGUgdGhlIGFtb3Vu dCB0byB0cmFuc2Zlci4KLQkJCSAqIE5vdCB0byBleGNlZWQgYSBwYWdlLCB0aGUgRU9GLAotCQkJ ICogb3IgdGhlIHBhc3NlZCBpbiBuYnl0ZXMuCi0JCQkgKi8KLQkJCXBnb2ZmID0gKHZtX29mZnNl dF90KShvZmYgJiBQQUdFX01BU0spOwotCQkJaWYgKHVhcC0+bmJ5dGVzKQotCQkJCXJlbSA9ICh1 YXAtPm5ieXRlcyAtIGZzYnl0ZXMgLSBsb29wYnl0ZXMpOwotCQkJZWxzZQotCQkJCXJlbSA9IHZh LnZhX3NpemUgLQotCQkJCSAgICB1YXAtPm9mZnNldCAtIGZzYnl0ZXMgLSBsb29wYnl0ZXM7Ci0J CQl4ZnNpemUgPSBvbWluKFBBR0VfU0laRSAtIHBnb2ZmLCByZW0pOwotCQkJeGZzaXplID0gb21p bihzcGFjZSAtIGxvb3BieXRlcywgeGZzaXplKTsKLQkJCWlmICh4ZnNpemUgPD0gMCkgewotCQkJ CWRvbmUgPSAxOwkJLyogYWxsIGRhdGEgc2VudCAqLwotCQkJCWJyZWFrOwotCQkJfQotCi0JCQkv KgotCQkJICogQXR0ZW1wdCB0byBsb29rIHVwIHRoZSBwYWdlLiAgQWxsb2NhdGUKLQkJCSAqIGlm IG5vdCBmb3VuZCBvciB3YWl0IGFuZCBsb29wIGlmIGJ1c3kuCi0JCQkgKi8KLQkJCXBpbmRleCA9 IE9GRl9UT19JRFgob2ZmKTsKLQkJCVZNX09CSkVDVF9XTE9DSyhvYmopOwotCQkJcGcgPSB2bV9w YWdlX2dyYWIob2JqLCBwaW5kZXgsIFZNX0FMTE9DX05PQlVTWSB8Ci0JCQkgICAgVk1fQUxMT0Nf Tk9STUFMIHwgVk1fQUxMT0NfV0lSRUQgfCBWTV9BTExPQ19SRVRSWSk7Ci0KLQkJCS8qCi0JCQkg KiBDaGVjayBpZiBwYWdlIGlzIHZhbGlkIGZvciB3aGF0IHdlIG5lZWQsCi0JCQkgKiBvdGhlcndp c2UgaW5pdGlhdGUgSS9PLgotCQkJICogSWYgd2UgYWxyZWFkeSB0dXJuZWQgc29tZSBwYWdlcyBp 
bnRvIG1idWZzLAotCQkJICogc2VuZCB0aGVtIG9mZiBiZWZvcmUgd2UgY29tZSBoZXJlIGFnYWlu IGFuZAotCQkJICogYmxvY2suCi0JCQkgKi8KLQkJCWlmIChwZy0+dmFsaWQgJiYgdm1fcGFnZV9p c192YWxpZChwZywgcGdvZmYsIHhmc2l6ZSkpCi0JCQkJVk1fT0JKRUNUX1dVTkxPQ0sob2JqKTsK LQkJCWVsc2UgaWYgKG0gIT0gTlVMTCkKLQkJCQllcnJvciA9IEVBR0FJTjsJLyogc2VuZCB3aGF0 IHdlIGFscmVhZHkgZ290ICovCi0JCQllbHNlIGlmICh1YXAtPmZsYWdzICYgU0ZfTk9ESVNLSU8p Ci0JCQkJZXJyb3IgPSBFQlVTWTsKLQkJCWVsc2UgewotCQkJCXNzaXplX3QgcmVzaWQ7Ci0JCQkJ aW50IHJlYWRhaGVhZCA9IHNmcmVhZGFoZWFkICogTUFYQlNJWkU7Ci0KLQkJCQlWTV9PQkpFQ1Rf V1VOTE9DSyhvYmopOwotCi0JCQkJLyoKLQkJCQkgKiBHZXQgdGhlIHBhZ2UgZnJvbSBiYWNraW5n IHN0b3JlLgotCQkJCSAqIFhYWE1BQzogQmVjYXVzZSB3ZSBkb24ndCBoYXZlIGZwLT5mX2NyZWQK LQkJCQkgKiBoZXJlLCB3ZSBwYXNzIGluIE5PQ1JFRC4gIFRoaXMgaXMgcHJvYmFibHkKLQkJCQkg KiB3cm9uZywgYnV0IGlzIGNvbnNpc3RlbnQgd2l0aCBvdXIgb3JpZ2luYWwKLQkJCQkgKiBpbXBs ZW1lbnRhdGlvbi4KLQkJCQkgKi8KLQkJCQllcnJvciA9IHZuX3Jkd3IoVUlPX1JFQUQsIHZwLCBO VUxMLCByZWFkYWhlYWQsCi0JCQkJICAgIHRydW5jX3BhZ2Uob2ZmKSwgVUlPX05PQ09QWSwgSU9f Tk9ERUxPQ0tFRCB8Ci0JCQkJICAgIElPX1ZNSU8gfCAoKHJlYWRhaGVhZCAvIGJzaXplKSA8PCBJ T19TRVFTSElGVCksCi0JCQkJICAgIHRkLT50ZF91Y3JlZCwgTk9DUkVELCAmcmVzaWQsIHRkKTsK LQkJCQlTRlNUQVRfSU5DKHNmX2lvY250KTsKLQkJCQlpZiAoZXJyb3IpCi0JCQkJCVZNX09CSkVD VF9XTE9DSyhvYmopOwotCQkJfQotCQkJaWYgKGVycm9yKSB7Ci0JCQkJdm1fcGFnZV9sb2NrKHBn KTsKLQkJCQl2bV9wYWdlX3Vud2lyZShwZywgMCk7Ci0JCQkJLyoKLQkJCQkgKiBTZWUgaWYgYW55 b25lIGVsc2UgbWlnaHQga25vdyBhYm91dAotCQkJCSAqIHRoaXMgcGFnZS4gIElmIG5vdCBhbmQg aXQgaXMgbm90IHZhbGlkLAotCQkJCSAqIHRoZW4gZnJlZSBpdC4KLQkJCQkgKi8KLQkJCQlpZiAo cGctPndpcmVfY291bnQgPT0gMCAmJiBwZy0+dmFsaWQgPT0gMCAmJgotCQkJCSAgICAhdm1fcGFn ZV9idXNpZWQocGcpKQotCQkJCQl2bV9wYWdlX2ZyZWUocGcpOwotCQkJCXZtX3BhZ2VfdW5sb2Nr KHBnKTsKLQkJCQlWTV9PQkpFQ1RfV1VOTE9DSyhvYmopOwotCQkJCWlmIChlcnJvciA9PSBFQUdB SU4pCi0JCQkJCWVycm9yID0gMDsJLyogbm90IGEgcmVhbCBlcnJvciAqLwotCQkJCWJyZWFrOwot CQkJfQotCi0JCQkvKgotCQkJICogR2V0IGEgc2VuZGZpbGUgYnVmLiAgV2hlbiBhbGxvY2F0aW5n 
IHRoZQotCQkJICogZmlyc3QgYnVmZmVyIGZvciBtYnVmIGNoYWluLCB3ZSB1c3VhbGx5Ci0JCQkg KiB3YWl0IGFzIGxvbmcgYXMgbmVjZXNzYXJ5LCBidXQgdGhpcyB3YWl0Ci0JCQkgKiBjYW4gYmUg aW50ZXJydXB0ZWQuICBGb3IgY29uc2VxdWVudAotCQkJICogYnVmZmVycywgZG8gbm90IHNsZWVw LCBzaW5jZSBzZXZlcmFsCi0JCQkgKiB0aHJlYWRzIG1pZ2h0IGV4aGF1c3QgdGhlIGJ1ZmZlcnMg YW5kIHRoZW4KLQkJCSAqIGRlYWRsb2NrLgotCQkJICovCi0JCQlzZiA9IHNmX2J1Zl9hbGxvYyhw ZywgKG1udyB8fCBtICE9IE5VTEwpID8gU0ZCX05PV0FJVCA6Ci0JCQkgICAgU0ZCX0NBVENIKTsK LQkJCWlmIChzZiA9PSBOVUxMKSB7Ci0JCQkJU0ZTVEFUX0lOQyhzZl9hbGxvY2ZhaWwpOwotCQkJ CXZtX3BhZ2VfbG9jayhwZyk7Ci0JCQkJdm1fcGFnZV91bndpcmUocGcsIDApOwotCQkJCUtBU1NF UlQocGctPm9iamVjdCAhPSBOVUxMLAotCQkJCSAgICAoImtlcm5fc2VuZGZpbGU6IG9iamVjdCBk aXNhcHBlYXJlZCIpKTsKLQkJCQl2bV9wYWdlX3VubG9jayhwZyk7Ci0JCQkJaWYgKG0gPT0gTlVM TCkKLQkJCQkJZXJyb3IgPSAobW53ID8gRUFHQUlOIDogRUlOVFIpOwotCQkJCWJyZWFrOwotCQkJ fQotCi0JCQkvKgotCQkJICogR2V0IGFuIG1idWYgYW5kIHNldCBpdCB1cCBhcyBoYXZpbmcKLQkJ CSAqIGV4dGVybmFsIHN0b3JhZ2UuCi0JCQkgKi8KLQkJCW0wID0gbV9nZXQoKG1udyA/IE1fTk9X QUlUIDogTV9XQUlUT0spLCBNVF9EQVRBKTsKLQkJCWlmIChtMCA9PSBOVUxMKSB7Ci0JCQkJZXJy b3IgPSAobW53ID8gRUFHQUlOIDogRU5PQlVGUyk7Ci0JCQkJc2ZfYnVmX21leHQoTlVMTCwgc2Yp OwotCQkJCWJyZWFrOwotCQkJfQotCQkJaWYgKG1fZXh0YWRkKG0wLCAoY2FkZHJfdCApc2ZfYnVm X2t2YShzZiksIFBBR0VfU0laRSwKLQkJCSAgICBzZl9idWZfbWV4dCwgc2ZzLCBzZiwgTV9SRE9O TFksIEVYVF9TRkJVRiwKLQkJCSAgICAobW53ID8gTV9OT1dBSVQgOiBNX1dBSVRPSykpICE9IDAp IHsKLQkJCQllcnJvciA9IChtbncgPyBFQUdBSU4gOiBFTk9CVUZTKTsKLQkJCQlzZl9idWZfbWV4 dChOVUxMLCBzZik7Ci0JCQkJbV9mcmVlbShtMCk7Ci0JCQkJYnJlYWs7Ci0JCQl9Ci0JCQltMC0+ bV9kYXRhID0gKGNoYXIgKilzZl9idWZfa3ZhKHNmKSArIHBnb2ZmOwotCQkJbTAtPm1fbGVuID0g eGZzaXplOwotCi0JCQkvKiBBcHBlbmQgdG8gbWJ1ZiBjaGFpbi4gKi8KLQkJCWlmIChtdGFpbCAh PSBOVUxMKQotCQkJCW10YWlsLT5tX25leHQgPSBtMDsKLQkJCWVsc2UgaWYgKG0gIT0gTlVMTCkK LQkJCQltX2xhc3QobSktPm1fbmV4dCA9IG0wOwotCQkJZWxzZQotCQkJCW0gPSBtMDsKLQkJCW10 YWlsID0gbTA7Ci0KLQkJCS8qIEtlZXAgdHJhY2sgb2YgYml0cyBwcm9jZXNzZWQuICovCi0JCQls 
b29wYnl0ZXMgKz0geGZzaXplOwotCQkJb2ZmICs9IHhmc2l6ZTsKLQotCQkJaWYgKHNmcyAhPSBO VUxMKSB7Ci0JCQkJbXR4X2xvY2soJnNmcy0+bXR4KTsKLQkJCQlzZnMtPmNvdW50Kys7Ci0JCQkJ bXR4X3VubG9jaygmc2ZzLT5tdHgpOwotCQkJfQotCQl9CisJCS8qIENhbGwga2Vybl9zZW5kZmls ZV9tYnVmKCkgZm9yIHRoZSBpbm5lciBsb29wLiAqLworCQlpZiAodWFwLT5uYnl0ZXMpCisJCQly ZW0gPSAodWFwLT5uYnl0ZXMgLSBmc2J5dGVzKTsKKwkJZWxzZQorCQkJcmVtID0gdmEudmFfc2l6 ZSAtCisJCQkgICAgdWFwLT5vZmZzZXQgLSBmc2J5dGVzOworCQllcnJvciA9IGtlcm5fc2VuZGZp bGVfbWJ1Zih2cCwgb2JqLCAmbSwgJm10YWlsLCBvZmYsICZyZW0sCisJCSAgICBic2l6ZSwgJmRv bmUsIHNwYWNlLCBtbncsIDAsIHVhcC0+ZmxhZ3MsCisJCSAgICBJT19OT0RFTE9DS0VEIHwgSU9f Vk1JTywgc2ZzLCBOT0NSRUQsIHRkKTsKIAogCQlWT1BfVU5MT0NLKHZwLCAwKTsKIAotLS0gZnMv bmZzc2VydmVyL25mc19uZnNkcG9ydC5jLnNhdjMJMjAxNC0wMi0wNiAxODo0MjowMi4wMDAwMDAw MDAgLTA1MDAKKysrIGZzL25mc3NlcnZlci9uZnNfbmZzZHBvcnQuYwkyMDE0LTAyLTA2IDE4OjI2 OjQ5LjAwMDAwMDAwMCAtMDUwMApAQCAtNzQsNiArNzQsOCBAQCBzdGF0aWMgdWludDMyX3QgbmZz djRfc3lzaWQgPSAwOwogCiBzdGF0aWMgaW50IG5mc3N2Y19zcnZjYWxsKHN0cnVjdCB0aHJlYWQg Kiwgc3RydWN0IG5mc3N2Y19hcmdzICosCiAgICAgc3RydWN0IHVjcmVkICopOworc3RhdGljIGlu dCBuZnNydl9maWxlX21idWYoc3RydWN0IHZub2RlICosIHN0cnVjdCBuZnN2YXR0ciAqLCBvZmZf dCwKKyAgICBpbnQsIHN0cnVjdCBtYnVmICoqLCBzdHJ1Y3QgbWJ1ZiAqKiwgc3RydWN0IHVjcmVk ICosIHN0cnVjdCB0aHJlYWQgKik7CiAKIGludCBuZnNydl9lbmFibGVfY3Jvc3NtbnRwdCA9IDE7 CiBzdGF0aWMgaW50IG5mc19jb21taXRfYmxrczsKQEAgLTYxNyw4ICs2MTksOSBAQCBvdXQ6CiAg KiBSZWFkIHZub2RlIG9wIGNhbGwgaW50byBtYnVmIGxpc3QuCiAgKi8KIGludAotbmZzdm5vX3Jl YWQoc3RydWN0IHZub2RlICp2cCwgb2ZmX3Qgb2ZmLCBpbnQgY250LCBzdHJ1Y3QgdWNyZWQgKmNy ZWQsCi0gICAgc3RydWN0IHRocmVhZCAqcCwgc3RydWN0IG1idWYgKiptcHAsIHN0cnVjdCBtYnVm ICoqbXBlbmRwKQorbmZzdm5vX3JlYWQoc3RydWN0IHZub2RlICp2cCwgc3RydWN0IG5mc3ZhdHRy ICpudmFwLCBvZmZfdCBvZmYsIGludCBjbnQsCisgICAgc3RydWN0IHVjcmVkICpjcmVkLCBzdHJ1 Y3QgdGhyZWFkICpwLCBzdHJ1Y3QgbWJ1ZiAqKm1wcCwKKyAgICBzdHJ1Y3QgbWJ1ZiAqKm1wZW5k cCkKIHsKIAlzdHJ1Y3QgbWJ1ZiAqbTsKIAlpbnQgaTsKQEAgLTYyOSw2ICs2MzIsOSBAQCBuZnN2 
bm9fcmVhZChzdHJ1Y3Qgdm5vZGUgKnZwLCBvZmZfdCBvZmYsCiAJc3RydWN0IHVpbyBpbywgKnVp b3AgPSAmaW87CiAJc3RydWN0IG5mc2hldXIgKm5oOwogCisJZXJyb3IgPSBuZnNydl9maWxlX21i dWYodnAsIG52YXAsIG9mZiwgY250LCBtcHAsIG1wZW5kcCwgY3JlZCwgcCk7CisJaWYgKGVycm9y ID09IDApCisJCWdvdG8gb3V0OwogCWxlbiA9IGxlZnQgPSBORlNNX1JORFVQKGNudCk7CiAJbTMg PSBOVUxMOwogCS8qCkBAIC0zMjk1LDYgKzMzMDEsOTggQEAgbmZzcnZfYmFja3Vwc3RhYmxlKHZv aWQpCiAJfQogfQogCisvKgorICogVGhpcyBmdW5jdGlvbiB1c2VzIGtlcm5fc2VuZGZpbGVfbWJ1 ZigpIHRvIGdlbmVyYXRlIGEgbGlzdCBvZiBtYnVmcworICogdGhhdCBjYW4gYmUgdXNlZCBieSB0 aGUgTkZTIHNlcnZlciByZWFkIHJlcGx5LgorICovCitpbnQKK25mc3J2X2ZpbGVfbWJ1ZihzdHJ1 Y3Qgdm5vZGUgKnZwLCBzdHJ1Y3QgbmZzdmF0dHIgKm52YXAsIG9mZl90IG9mZiwKKyAgICBpbnQg bmJ5dGVzLCBzdHJ1Y3QgbWJ1ZiAqKm1wLCBzdHJ1Y3QgbWJ1ZiAqKm1lbmRwLCBzdHJ1Y3QgdWNy ZWQgKmFjcmVkLAorICAgIHN0cnVjdCB0aHJlYWQgKnRkKQoreworCXN0cnVjdCB2bV9vYmplY3Qg Km9iaiA9IE5VTEw7CisJc3RydWN0IG1idWYgKm0sICptMjsKKwlvZmZfdCByZW07CisJaW50IGJz aXplLCBjbnQsIGRvbmUsIGVycm9yLCBpOworCWNoYXIgKmNwOworCisJKm1wID0gTlVMTDsKKwkq bWVuZHAgPSBOVUxMOworCWVycm9yID0gMDsKKwlic2l6ZSA9IHZwLT52X21vdW50LT5tbnRfc3Rh dC5mX2lvc2l6ZTsKKwlpZiAobnZhcC0+bmFfc2l6ZSA+IG9mZikKKwkJcmVtID0gbnZhcC0+bmFf c2l6ZSAtIG9mZjsKKwllbHNlCisJCWdvdG8gb3V0OwkvKiBOb3RoaW5nIHRvIHJlYWQsIHNvIHdl IGFyZSBkb25lLiAqLworCXJlbSA9IG9taW4ocmVtLCBuYnl0ZXMpOworCWNudCA9IHJlbTsKKwlj bnQgPSBORlNNX1JORFVQKGNudCkgLSBjbnQ7CS8qIENudCBvZiBieXRlcyBvZiB0cmFpbGluZyAw cy4gKi8KKwlvYmogPSB2cC0+dl9vYmplY3Q7CisJaWYgKG9iaiAhPSBOVUxMKSB7CisJCS8qCisJ CSAqIERvIHdlIG5lZWQgdG8gYWNxdWlyZSBhIHJlZmVyZW5jZSBjbnQgb24gdGhlIG9iaiBsaWtl CisJCSAqIGtlcm5fc2VuZGZpbGUoKSBkb2VzLCBldmVuIGlmIHdlIG5ldmVyIHVubG9jayB2cD8K KwkJICovCisJCVZNX09CSkVDVF9XTE9DSyhvYmopOworCQlpZiAoKG9iai0+ZmxhZ3MgJiBPQkpf REVBRCkgPT0gMCkgeworI2lmZGVmIG5vdG5vdworCQkJdm1fb2JqZWN0X3JlZmVyZW5jZV9sb2Nr ZWQob2JqKTsKKyNlbmRpZgorCQkJVk1fT0JKRUNUX1dVTkxPQ0sob2JqKTsKKwkJfSBlbHNlIHsK KwkJCVZNX09CSkVDVF9XVU5MT0NLKG9iaik7CisJCQlvYmogPSBOVUxMOworCQl9CisJfQorCWlm 
IChvYmogPT0gTlVMTCkKKwkJZXJyb3IgPSBFSU5WQUw7CisKKwkvKiBTZXR1cCB0aGUgYXJncyBm b3Iga2Vybl9zZW5kZmlsZV9tYnVmKCkuICovCisJZG9uZSA9IDA7CisJaWYgKGVycm9yID09IDAp CisJCWVycm9yID0ga2Vybl9zZW5kZmlsZV9tYnVmKHZwLCBvYmosIG1wLCBtZW5kcCwgb2ZmLCAm cmVtLCBic2l6ZSwKKwkJICAgICZkb25lLCAyICogTUFYQlNJWkUsIDAsIDEsIDAsIElPX05PREVM T0NLRUQgfCBJT19WTUlPIHwKKwkJICAgIElPX05PTUFDQ0hFQ0ssIE5VTEwsIGFjcmVkLCB0ZCk7 CisJaWYgKGVycm9yID09IDAgJiYgY250ID4gMCkgeworCQkvKgorCQkgKiBTdW4gWERSIHJlcXVp cmVzIHRoYXQgZGF0YSBiZSBmaWxsZWQgdG8gYSBtdWx0aXBsZQorCQkgKiBvZiA0IGJ5dGVzIHdp dGggMCBieXRlcy4KKwkJICogU2luY2UgdGhlIGxpc3Qgb2YgbWJ1ZnMgcmV0dXJuZWQgYnkga2Vy bl9zZW5kZmlsZV9tYnVmCisJCSAqIHJlZmVyZW5jZSBwYWdlcyBmb3IgYSBmaWxlIGFuZCBjYW4n dCBiZSBtb2RpZmllZCwgdGhpcworCQkgKiByZXF1aXJlcyBhbiBhZGRpdGlvbmFsIG1idWYuCisJ CSAqLworCQltID0gKm1lbmRwOworCQlORlNNR0VUKG0yKTsKKwkJbS0+bV9uZXh0ID0gbTI7CisJ CSptZW5kcCA9IG0yOworCQljcCA9IG10b2QobTIsIGNoYXIgKik7CisJCW0yLT5tX2xlbiA9IGNu dDsKKwkJZm9yIChpID0gMDsgaSA8IGNudDsgaSsrKQorCQkJKmNwKysgPSAnXDAnOworCX0KKwlp ZiAoZXJyb3IgPT0gRVJFU1RBUlQpCisJCXBhbmljKCJuZnNydl9maWxlX21idWY6IEVSRVNUQVJU Iik7CitvdXQ6CisjaWZkZWYgbm90bm93CisJaWYgKG9iaiAhPSBOVUxMKSB7CisJCU5GU1ZPUFVO TE9DSyh2cCwgMCk7CisJCXZtX29iamVjdF9kZWFsbG9jYXRlKG9iaik7CisJCU5GU1ZPUExPQ0so dnAsIExLX1NIQVJFRCB8IExLX1JFVFJZKTsKKwkJaWYgKGVycm9yID09IDAgJiYgKHZwLT52X2lm bGFnICYgVklfRE9PTUVEKSAhPSAwKQorCQkJZXJyb3IgPSBFU1RBTEU7CisJfQorI2VuZGlmCisJ aWYgKGVycm9yICE9IDApIHsKKwkJaWYgKCptcCAhPSBOVUxMKSB7CisJCQltX2ZyZWVtKCptcCk7 CisJCQkqbXAgPSBOVUxMOworCQkJKm1lbmRwID0gTlVMTDsKKwkJfQorcHJpbnRmKCJlbyBuZnNy dl9maWxlX21idWYgZXJyPSVkXG4iLCBlcnJvcik7CisJfQorCXJldHVybiAoZXJyb3IpOworfQor CiBleHRlcm4gaW50ICgqbmZzZF9jYWxsX25mc2QpKHN0cnVjdCB0aHJlYWQgKiwgc3RydWN0IG5m c3N2Y19hcmdzICopOwogCiAvKgotLS0gZnMvbmZzc2VydmVyL25mc19uZnNkc2Vydi5jLnNhdjMJ MjAxNC0wMi0wNSAyMTozNToxNi4wMDAwMDAwMDAgLTA1MDAKKysrIGZzL25mc3NlcnZlci9uZnNf bmZzZHNlcnYuYwkyMDE0LTAyLTA1IDIxOjM1OjQ1LjAwMDAwMDAwMCAtMDUwMApAQCAtNzI2LDcg 
KzcyNiw3IEBAIG5mc3J2ZF9yZWFkKHN0cnVjdCBuZnNydl9kZXNjcmlwdCAqbmQsIF8KIAkJY250 ID0gcmVxbGVuOwogCW0zID0gTlVMTDsKIAlpZiAoY250ID4gMCkgewotCQluZC0+bmRfcmVwc3Rh dCA9IG5mc3Zub19yZWFkKHZwLCBvZmYsIGNudCwgbmQtPm5kX2NyZWQsIHAsCisJCW5kLT5uZF9y ZXBzdGF0ID0gbmZzdm5vX3JlYWQodnAsICZudmEsIG9mZiwgY250LCBuZC0+bmRfY3JlZCwgcCwK IAkJICAgICZtMywgJm0yKTsKIAkJaWYgKCEobmQtPm5kX2ZsYWcgJiBORF9ORlNWNCkpIHsKIAkJ CWdldHJldCA9IG5mc3Zub19nZXRhdHRyKHZwLCAmbnZhLCBuZC0+bmRfY3JlZCwgcCwgMSk7Ci0t LSBmcy9uZnMvbmZzX3Zhci5oLnNhdjMJMjAxNC0wMi0wNSAyMjoxNDozOC4wMDAwMDAwMDAgLTA1 MDAKKysrIGZzL25mcy9uZnNfdmFyLmgJMjAxNC0wMi0wNSAyMjoxNzoyOS4wMDAwMDAwMDAgLTA1 MDAKQEAgLTU3OSw4ICs1NzksOCBAQCB2b2lkIG5mc3Zub19zZXRwYXRoYnVmKHN0cnVjdCBuYW1l aWRhdGEgCiB2b2lkIG5mc3Zub19yZWxwYXRoYnVmKHN0cnVjdCBuYW1laWRhdGEgKik7CiBpbnQg bmZzdm5vX3JlYWRsaW5rKHZub2RlX3QsIHN0cnVjdCB1Y3JlZCAqLCBORlNQUk9DX1QgKiwgbWJ1 Zl90ICosCiAgICAgbWJ1Zl90ICosIGludCAqKTsKLWludCBuZnN2bm9fcmVhZCh2bm9kZV90LCBv ZmZfdCwgaW50LCBzdHJ1Y3QgdWNyZWQgKiwgTkZTUFJPQ19UICosCi0gICAgbWJ1Zl90ICosIG1i dWZfdCAqKTsKK2ludCBuZnN2bm9fcmVhZCh2bm9kZV90LCBzdHJ1Y3QgbmZzdmF0dHIgKiwgb2Zm X3QsIGludCwgc3RydWN0IHVjcmVkICosCisgICAgTkZTUFJPQ19UICosIG1idWZfdCAqLCBtYnVm X3QgKik7CiBpbnQgbmZzdm5vX3dyaXRlKHZub2RlX3QsIG9mZl90LCBpbnQsIGludCwgaW50LCBt YnVmX3QsCiAgICAgY2hhciAqLCBzdHJ1Y3QgdWNyZWQgKiwgTkZTUFJPQ19UICopOwogaW50IG5m c3Zub19jcmVhdGVzdWIoc3RydWN0IG5mc3J2X2Rlc2NyaXB0ICosIHN0cnVjdCBuYW1laWRhdGEg KiwKLS0tIHN5cy9zeXNjYWxsc3Vici5oLnNhdjMJMjAxNC0wMi0wNCAyMDozMDo0NC4wMDAwMDAw MDAgLTA1MDAKKysrIHN5cy9zeXNjYWxsc3Vici5oCTIwMTQtMDItMDYgMTg6MjQ6NDguMDAwMDAw MDAwIC0wNTAwCkBAIC0yNTMsNiArMjUzLDExIEBAIGludAlrZXJuX3dhaXQ2KHN0cnVjdCB0aHJl YWQgKnRkLCBlbnVtIGkKIGludAlrZXJuX3dyaXRldihzdHJ1Y3QgdGhyZWFkICp0ZCwgaW50IGZk LCBzdHJ1Y3QgdWlvICphdWlvKTsKIGludAlrZXJuX3NvY2tldHBhaXIoc3RydWN0IHRocmVhZCAq dGQsIGludCBkb21haW4sIGludCB0eXBlLCBpbnQgcHJvdG9jb2wsCiAJICAgIGludCAqcnN2KTsK K2ludAlrZXJuX3NlbmRmaWxlX21idWYoc3RydWN0IHZub2RlICp2cCwgc3RydWN0IHZtX29iamVj 
dCAqb2JqLAorCSAgICBzdHJ1Y3QgbWJ1ZiAqKm1wLCBzdHJ1Y3QgbWJ1ZiAqKm10YWlscCwgb2Zm X3Qgb2ZmLCBvZmZfdCAqcmVtcCwKKwkgICAgaW50IGJzaXplLCBpbnQgKmRvbmVwLCBpbnQgc3Bh Y2UsIGludCBtbncsIGludCB3YWl0X2ZvcmFsbCwKKwkgICAgaW50IGZsYWdzLCBpbnQgaW9mbGFn cywgdm9pZCAqc2ZzLCBzdHJ1Y3QgdWNyZWQgKmFjcmVkLAorCSAgICBzdHJ1Y3QgdGhyZWFkICp0 ZCk7CiAKIC8qIGZsYWdzIGZvciBrZXJuX3NpZ2FjdGlvbiAqLwogI2RlZmluZQlLU0FfT1NJR1NF VAkweDAwMDEJLyogdXNlcyBvc2lnYWN0X3QgKi8K ------=_Part_1982991_1248016500.1391738315025-- From owner-freebsd-net@FreeBSD.ORG Fri Feb 7 05:54:51 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A0BF06A8 for ; Fri, 7 Feb 2014 05:54:51 +0000 (UTC) Received: from mail.ipfw.ru (mail.ipfw.ru [IPv6:2a01:4f8:120:6141::2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 5FA52136F for ; Fri, 7 Feb 2014 05:54:51 +0000 (UTC) Received: from secured.by.ipfw.ru ([95.143.220.47] helo=ws.su29.net) by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.76 (FreeBSD)) (envelope-from ) id 1WBaYD-000OQu-Uf; Fri, 07 Feb 2014 05:48:18 +0400 Message-ID: <52F4751B.40100@FreeBSD.org> Date: Fri, 07 Feb 2014 09:54:35 +0400 From: "Alexander V. 
Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130728 Thunderbird/17.0.7 MIME-Version: 1.0 To: Nicolas DEFFAYET Subject: Re: IPsec filtertunnel broken on FreeBSD 10 References: <1391725273.22934.16.camel@fr-wks3.corp.novso.com> In-Reply-To: <1391725273.22934.16.camel@fr-wks3.corp.novso.com> X-Enigmail-Version: 1.5.1 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="----enig2IXOOOGDUBDIPNCITPSTP" Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Feb 2014 05:54:51 -0000 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) ------enig2IXOOOGDUBDIPNCITPSTP Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable On 07.02.2014 02:21, Nicolas DEFFAYET wrote: > Hello, > > The IPsec filtertunnel is broken on FreeBSD 10: incoming packets > decapsulated are not going to firewall and to the pseudo interface enc. > > This issue affect 10.0-RELEASE and 10.0-STABLE. > 9.1-RELEASE and 9.2-RELEASE are not affected. > > Of course the systctl show that filtertunnel is enabled: > net.inet.ipsec.filtertunnel=1 > net.inet6.ipsec.filtertunnel=1 > > This issue is serious as it's not possible to use firewall (ipfw/pf) for > secure a gre/gif/l2tp IPsec tunnel as the incoming packets decapsulated > are not seen by the firewall. > > Many peoples have reported the issue on forums.freebsd.org and a bug > report have been open: > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/185876 > > For try to provide a fix, i have run a diff on kernel source on net, > netinet, netinet6 and netipsec folders between 9.2-RELEASE and > 10.0-RELEASE but I didn't have found what change can break IPsec > filtertunnel. 
> > > Any expert or people knowing the code can help us please ? I'll take a look at this today. > > > Many thanks ! > > ------enig2IXOOOGDUBDIPNCITPSTP Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.20 (FreeBSD) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iEYEARECAAYFAlL0dR8ACgkQwcJ4iSZ1q2lHDgCfVvEpQ4bD9qr6PCu7m7H9u/+O NJMAnjUEdTnoXgzkE5qMDLsRySD9fZ6m =MHPX -----END PGP SIGNATURE----- ------enig2IXOOOGDUBDIPNCITPSTP-- From owner-freebsd-net@FreeBSD.ORG Fri Feb 7 07:43:57 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A99C91A3 for ; Fri, 7 Feb 2014 07:43:57 +0000 (UTC) Received: from quix.smartspb.net (quix.smartspb.net [217.119.16.133]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 5A8961B91 for ; Fri, 7 Feb 2014 07:43:57 +0000 (UTC) Received: from dyr.smartspb.net ([217.119.16.26] helo=[127.0.0.1]) by quix.smartspb.net with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.61 (FreeBSD)) (envelope-from ) id 1WBg6N-0004Mt-Kw for freebsd-net@freebsd.org; Fri, 07 Feb 2014 11:43:55 +0400 Message-ID: <52F48EB7.5010706@smartspb.net> Date: Fri, 07 Feb 2014 11:43:51 +0400 From: Dennis Yusupoff User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: PF states degrade? 
References: <52F3366D.3030202@smartspb.net> <52F3BAB6.7090304@shrew.net> In-Reply-To: <52F3BAB6.7090304@shrew.net> X-Enigmail-Version: 1.6 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Antivirus: avast! (VPS 140206-1, 06.02.2014), Outbound message X-Antivirus-Status: Clean X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Feb 2014 07:43:57 -0000 Hello, Matthew. Definitely not - see the limits defined in the pf.conf below. Moreover, we also tested after running "pfctl -Fa -f /etc/pf.conf && pfctl -d && pfctl -e" with traffic from only one customer. 06.02.2014 20:39, Matthew Grooms wrote: > On 2/6/2014 1:14 AM, Dennis Yusupoff wrote: >> ... >> set limit { states 1000000, frags 80000, src-nodes 100000, table-entries >> 500000} >> ... > Dennis, > > Did you run out of pf state table entries? You can use pfctl to list > the current limit and usage ... > > INFO: > Status: Enabled for 14 days 19:48:29 Debug: Urgent > > State Table Total Rate > current entries 4 > searches 2030427 1.6/s > inserts 64990 0.1/s > removals 64986 0.1/s > > LIMITS: > states hard limit 10000 > src-nodes hard limit 10000 > frags hard limit 5000 > table-entries hard limit 200000 > > .. If that is the case, you can increase your state table size by > inserting some configuration parameters at the top of your pf.conf > file. For example ... 
> > set limit states 50000 > set limit src-nodes 50000 > set limit frags 25000 > > -Matthew > _______________________________________________ > -- Best regards, Dennis Yusupoff, network engineer of Smart-Telecom ISP Russia, Saint-Petersburg From owner-freebsd-net@FreeBSD.ORG Fri Feb 7 11:31:47 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C5CFFFFC for ; Fri, 7 Feb 2014 11:31:47 +0000 (UTC) Received: from forward10.mail.yandex.net (forward10.mail.yandex.net [IPv6:2a02:6b8:0:202::5]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 6C4731080 for ; Fri, 7 Feb 2014 11:31:47 +0000 (UTC) Received: from smtp8.mail.yandex.net (smtp8.mail.yandex.net [77.88.61.54]) by forward10.mail.yandex.net (Yandex) with ESMTP id 534211020315; Fri, 7 Feb 2014 15:31:43 +0400 (MSK) Received: from smtp8.mail.yandex.net (localhost [127.0.0.1]) by smtp8.mail.yandex.net (Yandex) with ESMTP id 1771D1B600B9; Fri, 7 Feb 2014 15:31:43 +0400 (MSK) Received: from 95.108.170.136-red.dhcp.yndx.net (95.108.170.136-red.dhcp.yndx.net [95.108.170.136]) by smtp8.mail.yandex.net (nwsmtp/Yandex) with ESMTPSA id ZiVc0LK9Id-VgNKQD23; Fri, 7 Feb 2014 15:31:42 +0400 (using TLSv1 with cipher CAMELLIA256-SHA (256/256 bits)) (Client certificate not present) X-Yandex-Uniq: 90117219-d4ae-44d6-9009-a3c8a9d553c5 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1391772702; bh=9ueK8hqFy8Fm3xBEYECafP3Sq22rdXKKAGCQKo7xUv4=; h=Message-ID:Date:From:User-Agent:MIME-Version:To:Subject: References:In-Reply-To:X-Enigmail-Version:Content-Type: Content-Transfer-Encoding; b=oDXiqLr+Y96evoOD2W5ILZELfYtB0U2v+sEgu3iE09I4h8sV2jaKYkqeBkva1pb+o 
/PKOZA1LOZ0oMTwHUZQDQN32Bh5+A7cduJ9mmD5+VvFzXukVGx30RVzEoV6tLCdTPa ZvwmVwC6zQpOLIQbHREGnkE2igsHRwo1cMmdMSw0= Authentication-Results: smtp8.mail.yandex.net; dkim=pass header.i=@yandex.ru Message-ID: <52F4C41B.3030101@yandex.ru> Date: Fri, 07 Feb 2014 15:31:39 +0400 From: "Andrey V. Elsukov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: Nicolas DEFFAYET , freebsd-net@freebsd.org Subject: Re: IPsec filtertunnel broken on FreeBSD 10 References: <1391725273.22934.16.camel@fr-wks3.corp.novso.com> In-Reply-To: <1391725273.22934.16.camel@fr-wks3.corp.novso.com> X-Enigmail-Version: 1.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Feb 2014 11:31:47 -0000 On 07.02.2014 02:21, Nicolas DEFFAYET wrote: > Hello, > > The IPsec filtertunnel is broken on FreeBSD 10: incoming packets > decapsulated are not going to firewall and to the pseudo interface enc. > > This issue affect 10.0-RELEASE and 10.0-STABLE. > 9.1-RELEASE and 9.2-RELEASE are not affected. > > Of course the systctl show that filtertunnel is enabled: > net.inet.ipsec.filtertunnel=1 > net.inet6.ipsec.filtertunnel=1 Can you show what values do you have in the sysctl net.enc ? -- WBR, Andrey V. 
Elsukov From owner-freebsd-net@FreeBSD.ORG Fri Feb 7 12:44:40 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id B82F6825 for ; Fri, 7 Feb 2014 12:44:40 +0000 (UTC) Received: from smtp.novso.com (smtp1.novso.com [IPv6:2a00:14e8:28:3::5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 806371794 for ; Fri, 7 Feb 2014 12:44:40 +0000 (UTC) Message-ID: <1391777078.27201.2.camel@srv31.corp.novso.com> Subject: Re: IPsec filtertunnel broken on FreeBSD 10 From: Nicolas DEFFAYET To: "Andrey V. Elsukov" Date: Fri, 07 Feb 2014 12:44:38 +0000 In-Reply-To: <52F4C41B.3030101@yandex.ru> References: <1391725273.22934.16.camel@fr-wks3.corp.novso.com> <52F4C41B.3030101@yandex.ru> Organization: DEFFAYET.COM Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.4.4-3.noclutter Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Feb 2014 12:44:40 -0000 On Fri, 2014-02-07 at 15:31 +0400, Andrey V. Elsukov wrote: > On 07.02.2014 02:21, Nicolas DEFFAYET wrote: Hello Andrey, > > The IPsec filtertunnel is broken on FreeBSD 10: incoming packets > > decapsulated are not going to firewall and to the pseudo interface enc. > > > > This issue affect 10.0-RELEASE and 10.0-STABLE. > > 9.1-RELEASE and 9.2-RELEASE are not affected. > > > > Of course the systctl show that filtertunnel is enabled: > > net.inet.ipsec.filtertunnel=1 > > net.inet6.ipsec.filtertunnel=1 > > Can you show what values do you have in the > sysctl net.enc ? 
I use the default values (not tuned in /boot/loader.conf or /etc/sysctl.conf). FreeBSD 9.1-RELEASE: net.enc.in.ipsec_bpf_mask: 1 net.enc.in.ipsec_filter_mask: 1 net.enc.out.ipsec_bpf_mask: 3 net.enc.out.ipsec_filter_mask: 1 FreeBSD 10.0-RELEASE: net.enc.in.ipsec_bpf_mask: 1 net.enc.in.ipsec_filter_mask: 1 net.enc.out.ipsec_bpf_mask: 3 net.enc.out.ipsec_filter_mask: 1 Many thanks for your help -- Nicolas DEFFAYET From owner-freebsd-net@FreeBSD.ORG Fri Feb 7 13:40:42 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3CF3449A for ; Fri, 7 Feb 2014 13:40:42 +0000 (UTC) Received: from smtp.novso.com (smtp1.novso.com [IPv6:2a00:14e8:28:3::5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 03D511D48 for ; Fri, 7 Feb 2014 13:40:42 +0000 (UTC) Message-ID: <1391780440.28112.2.camel@srv31.corp.novso.com> Subject: Re: IPsec filtertunnel broken on FreeBSD 10 From: Nicolas DEFFAYET To: "Andrey V. 
Elsukov" Date: Fri, 07 Feb 2014 13:40:40 +0000 In-Reply-To: <1391777078.27201.2.camel@srv31.corp.novso.com> References: <1391725273.22934.16.camel@fr-wks3.corp.novso.com> <52F4C41B.3030101@yandex.ru> <1391777078.27201.2.camel@srv31.corp.novso.com> Organization: DEFFAYET.COM Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.4.4-3.noclutter Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Feb 2014 13:40:42 -0000 On Fri, 2014-02-07 at 12:44 +0000, Nicolas DEFFAYET wrote: Hello Andrey, Hmm, after a long time (more than 30 seconds), I finally see the packet exchange on FreeBSD 10.0-RELEASE: 13:32:46.135752 (authentic,confidential): SPI 0x06bb885e: IP ipwan-remote > ipwan-local: GREv0, length 64: IP iptunnel-remote.20044 > iptunnel-local.22: Flags [S], seq 209981237, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 1966114362 ecr 0], length 0 13:32:46.135852 (authentic,confidential): SPI 0x0ebc5f9b: IP ipwan-local > ipwan-remote: GREv0, length 64: IP iptunnel-local.22 > iptunnel-remote.20044: Flags [S.], seq 2240012658, ack 209981238, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 3945107127 ecr 1966114362], length 0 I don't know why it takes so long (I use the -n flag in tcpdump to disable name resolution). So people who don't see the packet exchange on enc0 may just be impatient like me. But the problem is still there, as you can see below: ipfw 00100 allow log logamount 100 ip from any to any via gre3 => packets not seen by rule 100, as there is nothing in the log and nothing in the counters pf @0 pass log quick on gre3 all flags S/SA keep state => packets not seen by rule 0, as there is nothing in the log and nothing in the counters To generate these packets, I use ICMP echo request/echo reply and an SSH client and server (TCP 22). 
Of course, I tested changing gre3 to em0 to make sure that ipfw and pf logging works. On FreeBSD 10.0-RELEASE - packets are visible on enc0 in both directions with default net.enc settings if you are patient - ipfw doesn't see the incoming packets (no match) - pf doesn't see the incoming packets (no match) On FreeBSD 9.1-RELEASE everything works fine with the same configuration. Gleb Smirnoff wrote (http://lists.freebsd.org/pipermail/freebsd-stable/2014-January/076903.html): "nothing has changed in pf in regards to its ipsec handling" So the bug _seem_ to be related to ipsec as both ipfw and pf don't see the packet. Thanks -- Nicolas DEFFAYET From owner-freebsd-net@FreeBSD.ORG Fri Feb 7 14:48:57 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 612E8B93 for ; Fri, 7 Feb 2014 14:48:57 +0000 (UTC) Received: from smarthost1.sentex.ca (smarthost1.sentex.ca [IPv6:2607:f3e0:0:1::12]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 24CF0149F for ; Fri, 7 Feb 2014 14:48:57 +0000 (UTC) Received: from [IPv6:2607:f3e0:0:4:f025:8813:7603:7e4a] (saphire3.sentex.ca [IPv6:2607:f3e0:0:4:f025:8813:7603:7e4a]) by smarthost1.sentex.ca (8.14.7/8.14.7) with ESMTP id s17EmiKM005454; Fri, 7 Feb 2014 09:48:45 -0500 (EST) (envelope-from mike@sentex.net) Message-ID: <52F4F24A.5000202@sentex.net> Date: Fri, 07 Feb 2014 09:48:42 -0500 From: Mike Tancsa Organization: Sentex Communications User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0 MIME-Version: 1.0 To: Nicolas DEFFAYET , "Andrey V. 
Elsukov" Subject: Re: IPsec filtertunnel broken on FreeBSD 10 References: <1391725273.22934.16.camel@fr-wks3.corp.novso.com> <52F4C41B.3030101@yandex.ru> <1391777078.27201.2.camel@srv31.corp.novso.com> <1391780440.28112.2.camel@srv31.corp.novso.com> In-Reply-To: <1391780440.28112.2.camel@srv31.corp.novso.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.74 Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Feb 2014 14:48:57 -0000 On 2/7/2014 8:40 AM, Nicolas DEFFAYET wrote: > > > So the bug _seem_ to be related to ipsec as both ipfw and pf don't see > the packet. If you do a tcpdump -s0 -nvei enc0 do you see decapsulated ipsec traffic ? ---Mike -- ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-net@FreeBSD.ORG Fri Feb 7 18:26:44 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8DAA0F62 for ; Fri, 7 Feb 2014 18:26:44 +0000 (UTC) Received: from smtp.novso.com (smtp1.novso.com [IPv6:2a00:14e8:28:3::5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 3DB061AF1 for ; Fri, 7 Feb 2014 18:26:44 +0000 (UTC) Message-ID: <1391797602.26050.2.camel@fr-wks3.corp.novso.com> Subject: Re: IPsec filtertunnel broken on FreeBSD 10 From: Nicolas DEFFAYET To: Mike Tancsa Date: Fri, 07 Feb 2014 19:26:42 +0100 In-Reply-To: <52F4F24A.5000202@sentex.net> References: 
<1391725273.22934.16.camel@fr-wks3.corp.novso.com> <52F4C41B.3030101@yandex.ru> <1391777078.27201.2.camel@srv31.corp.novso.com> <1391780440.28112.2.camel@srv31.corp.novso.com> <52F4F24A.5000202@sentex.net> Organization: DEFFAYET.COM Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.4.4-3 Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org, "Andrey V. Elsukov" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Feb 2014 18:26:44 -0000 On Fri, 2014-02-07 at 09:48 -0500, Mike Tancsa wrote: Hello Mike, > On 2/7/2014 8:40 AM, Nicolas DEFFAYET wrote: > > > > > > So the bug _seem_ to be related to ipsec as both ipfw and pf don't see > > the packet. > > > If you do a > tcpdump -s0 -nvei enc0 > > do you see decapsulated ipsec traffic ? Yes: ICMP ping 18:17:46.694009 (authentic,confidential): SPI 0x0407cfca: (tos 0x0, ttl 25, id 50699, offset 0, flags [none], proto GRE (47), length 108) ipwan-remote > ipwan-local: GREv0, Flags [none], proto IPv4 (0x0800), length 88 (tos 0x0, ttl 64, id 50699, offset 0, flags [none], proto ICMP (1), length 84) iptunnel-remote > iptunnel-local: ICMP echo request, id 44530, seq 0, length 64 18:17:46.694074 (authentic,confidential): SPI 0x0ad42248: (tos 0x0, ttl 30, id 55848, offset 0, flags [none], proto GRE (47), length 108, bad cksum 0 (->c314)!) ipwan-local > ipwan-remote: GREv0, Flags [none], proto IPv4 (0x0800), length 88 (tos 0x0, ttl 64, id 55848, offset 0, flags [none], proto ICMP (1), length 84) iptunnel-local > iptunnel-remote: ICMP echo reply, id 44530, seq 0, length 64 18:17:46.694087 (authentic,confidential): SPI 0x0ad42248: (tos 0x0, ttl 30, id 55848, offset 0, flags [none], proto GRE (47), length 108, bad cksum 0 (->c314)!) 
ipwan-local > ipwan-remote: GREv0, Flags [none], proto IPv4 (0x0800), length 88 (tos 0x0, ttl 64, id 55848, offset 0, flags [none], proto ICMP (1), length 84) iptunnel-local > iptunnel-remote: ICMP echo reply, id 44530, seq 0, length 64 18:17:47.696307 (authentic,confidential): SPI 0x0407cfca: (tos 0x0, ttl 25, id 50716, offset 0, flags [none], proto GRE (47), length 108) ipwan-remote > ipwan-local: GREv0, Flags [none], proto IPv4 (0x0800), length 88 (tos 0x0, ttl 64, id 50716, offset 0, flags [none], proto ICMP (1), length 84) iptunnel-remote > iptunnel-local: ICMP echo request, id 44530, seq 1, length 64 18:17:47.696373 (authentic,confidential): SPI 0x0ad42248: (tos 0x0, ttl 30, id 55859, offset 0, flags [none], proto GRE (47), length 108, bad cksum 0 (->c309)!) ipwan-local > ipwan-remote: GREv0, Flags [none], proto IPv4 (0x0800), length 88 (tos 0x0, ttl 64, id 55859, offset 0, flags [none], proto ICMP (1), length 84) iptunnel-local > iptunnel-remote: ICMP echo reply, id 44530, seq 1, length 64 18:17:47.696383 (authentic,confidential): SPI 0x0ad42248: (tos 0x0, ttl 30, id 55859, offset 0, flags [none], proto GRE (47), length 108, bad cksum 0 (->c309)!) 
ipwan-local > ipwan-remote: GREv0, Flags [none], proto IPv4 (0x0800), length 88 (tos 0x0, ttl 64, id 55859, offset 0, flags [none], proto ICMP (1), length 84) iptunnel-local > iptunnel-remote: ICMP echo reply, id 44530, seq 1, length 64 TCP 22 18:20:46.388423 (authentic,confidential): SPI 0x0407cfca: (tos 0x0, ttl 25, id 54835, offset 0, flags [none], proto GRE (47), length 84) ipwan-remote > ipwan-local: GREv0, Flags [none], proto IPv4 (0x0800), length 64 (tos 0x10, ttl 64, id 54835, offset 0, flags [DF], proto TCP (6), length 60) iptunnel-remote.11054 > iptunnel-local.22: Flags [S], cksum 0xea60 (correct), seq 1449355022, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 1985194722 ecr 0], length 0 18:20:46.388508 (authentic,confidential): SPI 0x0ad42248: (tos 0x0, ttl 30, id 56146, offset 0, flags [none], proto GRE (47), length 84, bad cksum 0 (->c202)!) ipwan-local > ipwan-remote: GREv0, Flags [none], proto IPv4 (0x0800), length 64 (tos 0x0, ttl 64, id 56146, offset 0, flags [DF], proto TCP (6), length 60) iptunnel-local.22 > iptunnel-remote.11054: Flags [S.], cksum 0xfbdf (correct), seq 2705433943, ack 1449355023, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2909993571 ecr 1985194722], length 0 18:20:46.388562 (authentic,confidential): SPI 0x0ad42248: (tos 0x0, ttl 30, id 56146, offset 0, flags [none], proto GRE (47), length 84, bad cksum 0 (->c202)!) 
ipwan-local > ipwan-remote: GREv0, Flags [none], proto IPv4 (0x0800), length 64 (tos 0x0, ttl 64, id 56146, offset 0, flags [DF], proto TCP (6), length 60) iptunnel-local.22 > iptunnel-remote.11054: Flags [S.], cksum 0xfbdf (correct), seq 2705433943, ack 1449355023, win 65535, options [mss 1460,nop,wscale 6,sackOK,TS val 2909993571 ecr 1985194722], length 0 18:20:46.396379 (authentic,confidential): SPI 0x0407cfca: (tos 0x0, ttl 25, id 54837, offset 0, flags [none], proto GRE (47), length 76) ipwan-remote > ipwan-local: GREv0, Flags [none], proto IPv4 (0x0800), length 56 (tos 0x10, ttl 64, id 54837, offset 0, flags [DF], proto TCP (6), length 52) iptunnel-remote.11054 > iptunnel-local.22: Flags [.], cksum 0x2693 (correct), ack 1, win 1040, options [nop,nop,TS val 1985194730 ecr 2909993571], length 0 18:20:46.428010 (authentic,confidential): SPI 0x0ad42248: (tos 0x0, ttl 30, id 56149, offset 0, flags [none], proto GRE (47), length 110, bad cksum 0 (->c1e5)!) ipwan-local > ipwan-remote: GREv0, Flags [none], proto IPv4 (0x0800), length 90 (tos 0x0, ttl 64, id 56149, offset 0, flags [DF], proto TCP (6), length 86) iptunnel-local.22 > iptunnel-remote.11054: Flags [P.], cksum 0xb16d (correct), seq 1:35, ack 1, win 1040, options [nop,nop,TS val 2909993610 ecr 1985194730], length 34 18:20:46.428024 (authentic,confidential): SPI 0x0ad42248: (tos 0x0, ttl 30, id 56149, offset 0, flags [none], proto GRE (47), length 110, bad cksum 0 (->c1e5)!) 
ipwan-local > ipwan-remote: GREv0, Flags [none], proto IPv4 (0x0800), length 90 (tos 0x0, ttl 64, id 56149, offset 0, flags [DF], proto TCP (6), length 86) iptunnel-local.22 > iptunnel-remote.11054: Flags [P.], cksum 0xb16d (correct), seq 1:35, ack 1, win 1040, options [nop,nop,TS val 2909993610 ecr 1985194730], length 34 18:20:46.536017 (authentic,confidential): SPI 0x0407cfca: (tos 0x0, ttl 25, id 54840, offset 0, flags [none], proto GRE (47), length 76) ipwan-remote > ipwan-local: GREv0, Flags [none], proto IPv4 (0x0800), length 56 (tos 0x10, ttl 64, id 54840, offset 0, flags [DF], proto TCP (6), length 52) iptunnel-remote.11054 > iptunnel-local.22: Flags [.], cksum 0x25be (correct), ack 35, win 1040, options [nop,nop,TS val 1985194870 ecr 2909993610], length 0 But nothing hit the firewall for the incoming traffic. I have tested both ipfw and pf as pf have been rewritten in FreeBSD. Many thanks -- Nicolas DEFFAYET From owner-freebsd-net@FreeBSD.ORG Fri Feb 7 19:02:05 2014 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 94D2ACE0; Fri, 7 Feb 2014 19:02:05 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 669AB1DFB; Fri, 7 Feb 2014 19:02:05 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id s17J2506021607; Fri, 7 Feb 2014 19:02:05 GMT (envelope-from hiren@freefall.freebsd.org) Received: (from hiren@localhost) by freefall.freebsd.org (8.14.8/8.14.8/Submit) id s17J25J2021606; Fri, 7 Feb 2014 19:02:05 GMT (envelope-from hiren) Date: Fri, 7 Feb 2014 19:02:05 GMT Message-Id: 
<201402071902.s17J25J2021606@freefall.freebsd.org> To: onwahe@gmail.com, hiren@FreeBSD.org, freebsd-net@FreeBSD.org From: hiren@FreeBSD.org Subject: Re: kern/159603: [netinet] [patch] in_ifscrubprefix() - network route can be installed for interfaces marked down X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Feb 2014 19:02:05 -0000 Synopsis: [netinet] [patch] in_ifscrubprefix() - network route can be installed for interfaces marked down State-Changed-From-To: open->closed State-Changed-By: hiren State-Changed-When: Fri Feb 7 19:01:19 UTC 2014 State-Changed-Why: Fixed. http://www.freebsd.org/cgi/query-pr.cgi?pr=159603 From owner-freebsd-net@FreeBSD.ORG Fri Feb 7 19:07:24 2014 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 973DFFA1; Fri, 7 Feb 2014 19:07:24 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 6BBC51E44; Fri, 7 Feb 2014 19:07:24 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id s17J7OFu022125; Fri, 7 Feb 2014 19:07:24 GMT (envelope-from hiren@freefall.freebsd.org) Received: (from hiren@localhost) by freefall.freebsd.org (8.14.8/8.14.8/Submit) id s17J7OSV022124; Fri, 7 Feb 2014 19:07:24 GMT (envelope-from hiren) Date: Fri, 7 Feb 2014 19:07:24 GMT Message-Id: <201402071907.s17J7OSV022124@freefall.freebsd.org> To: hiren@FreeBSD.org, freebsd-net@FreeBSD.org, luigi@FreeBSD.org From: 
hiren@FreeBSD.org Subject: Re: kern/181131: [netmap] [patch] sys/dev/netmap memory allocation improvement X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Feb 2014 19:07:24 -0000 Synopsis: [netmap] [patch] sys/dev/netmap memory allocation improvement Responsible-Changed-From-To: freebsd-net->luigi Responsible-Changed-By: hiren Responsible-Changed-When: Fri Feb 7 19:05:57 UTC 2014 Responsible-Changed-Why: To Luigi for further consideration. http://www.freebsd.org/cgi/query-pr.cgi?pr=181131 From owner-freebsd-net@FreeBSD.ORG Fri Feb 7 19:08:22 2014 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C6F7318F; Fri, 7 Feb 2014 19:08:22 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 978671E53; Fri, 7 Feb 2014 19:08:22 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id s17J8MR7022190; Fri, 7 Feb 2014 19:08:22 GMT (envelope-from hiren@freefall.freebsd.org) Received: (from hiren@localhost) by freefall.freebsd.org (8.14.8/8.14.8/Submit) id s17J8Mpf022189; Fri, 7 Feb 2014 19:08:22 GMT (envelope-from hiren) Date: Fri, 7 Feb 2014 19:08:22 GMT Message-Id: <201402071908.s17J8Mpf022189@freefall.freebsd.org> To: hiren@FreeBSD.org, freebsd-net@FreeBSD.org, luigi@FreeBSD.org From: hiren@FreeBSD.org Subject: Re: kern/181135: [netmap] [patch] sys/dev/netmap patch for Linux compatibility X-BeenThere: 
freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Feb 2014 19:08:22 -0000 Synopsis: [netmap] [patch] sys/dev/netmap patch for Linux compatibility Responsible-Changed-From-To: freebsd-net->luigi Responsible-Changed-By: hiren Responsible-Changed-When: Fri Feb 7 19:07:43 UTC 2014 Responsible-Changed-Why: To Luigi for further consideration. http://www.freebsd.org/cgi/query-pr.cgi?pr=181135 From owner-freebsd-net@FreeBSD.ORG Fri Feb 7 19:53:12 2014 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 10A6DC36; Fri, 7 Feb 2014 19:53:12 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id D612E12E8; Fri, 7 Feb 2014 19:53:11 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id s17JrBrj033316; Fri, 7 Feb 2014 19:53:11 GMT (envelope-from brueffer@freefall.freebsd.org) Received: (from brueffer@localhost) by freefall.freebsd.org (8.14.8/8.14.8/Submit) id s17JrBoi033315; Fri, 7 Feb 2014 20:53:11 +0100 (CET) (envelope-from brueffer) Date: Fri, 7 Feb 2014 20:53:11 +0100 (CET) Message-Id: <201402071953.s17JrBoi033315@freefall.freebsd.org> To: sven@vyatta.com, brueffer@FreeBSD.org, freebsd-net@FreeBSD.org From: brueffer@FreeBSD.org Subject: Re: kern/182847: [netinet6] [patch] Remove dead code X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Feb 2014 19:53:12 -0000 Synopsis: [netinet6] [patch] Remove dead code State-Changed-From-To: open->closed State-Changed-By: brueffer State-Changed-When: Fri Feb 7 20:51:16 CET 2014 State-Changed-Why: Your patch was applied in r260485, however your other PR kern/185148 was erroneously referenced in the commit message. Thanks for the contribution! http://www.freebsd.org/cgi/query-pr.cgi?pr=182847 From owner-freebsd-net@FreeBSD.ORG Sat Feb 8 00:12:57 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id EC4C7522; Sat, 8 Feb 2014 00:12:57 +0000 (UTC) Received: from mail-qc0-x22f.google.com (mail-qc0-x22f.google.com [IPv6:2607:f8b0:400d:c01::22f]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 9B557185F; Sat, 8 Feb 2014 00:12:57 +0000 (UTC) Received: by mail-qc0-f175.google.com with SMTP id x13so7251336qcv.34 for ; Fri, 07 Feb 2014 16:12:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:date:message-id:subject:from:to:content-type; bh=290YDpzC24+MbRtM0BeTpWV9PQF57zhBvZhb6GDqtZg=; b=oDJ85mloMepNIkphWUVLHQIwYe63m5k5+yNmyG2atgfvUnevtOdwVk50qxARPsTa44 iye8SIP8K3GhYly2CYQrnsX1Whv8erQOeTObIBfs6v5sFYZv+bM6VAzNo9wNaeQ6NJWV RobXfsyFrAsswtcyO9kuKibBFtfvoQkJLBYLIOAzidZ+jP0FgjHUBf6HyOdqTMoBRtbj Tz2ZsfLiuoaNtbQM2gTESBLPDdvcndlGacFTUTl/82SA8xnLAWDNBsdc0OeK4PR3TfFU dB5Z6Zf+Ygz62QWjdXiz/LHwDnSfuvLuz6rbZCzhpx54ibJ4pYsjiZYagYhl4fi0xLQC zAzQ== MIME-Version: 1.0 X-Received: by 10.140.50.235 with SMTP id s98mr3479529qga.12.1391818376799; Fri, 07 Feb 2014 16:12:56 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.224.52.8 with HTTP; 
Fri, 7 Feb 2014 16:12:56 -0800 (PST) Date: Fri, 7 Feb 2014 16:12:56 -0800 X-Google-Sender-Auth: Zrp8Eb97FVS4GLRmXRn0pUIHxw4 Message-ID: Subject: flowtable, collisions, locking and CPU affinity From: Adrian Chadd To: "freebsd-arch@freebsd.org" , FreeBSD Net Content-Type: text/plain; charset=ISO-8859-1 X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Feb 2014 00:12:58 -0000 Hi, I've been knee deep in the flowtable code looking at some of the less .. predictable ways it behaves. One of them is the collisions that do pop up from time to time. I dug into it in quite some depth and found out what's going on. This assumes it's a per-CPU flowtable. * A flowtable lookup is performed, on say CPU #0 * the flowtable lookup fails, so it goes to do a flowtable insert * .. but since in between the two, the flowtable "lock" is released so it can do a route/adjacency lookup, and that grabs a lock * .. then the flowtable insert is done on a totally different CPU * .. which happens to _have_ the flowtable entry already, so it fails as a collision which already has a matching entry. Now, the reason for this is primarily because there's no CPU pinning in the lookup path and if there's contention during the route lookup phase, the scheduler may decide to schedule the kernel thread on a totally different CPU to the one that was running the code when the lock was entered. Now, Gleb's recent changes seem to have made the instances of this drop, but he didn't set out to fix it. So there's something about his changes that has changed the locking/contention profile that I was using to easily reproduce it. In any case - the reason it's happening above is because there's no actual lock held over the whole lookup/insert path. 
It's a per-CPU critical enter/exit path, so the only way to guarantee consistency is to use sched_pin() for the entirety of the function. I'll go and test that out in a moment and see if it quietens the collisions that I see in lab testing. Has anyone already debugged/diagnosed this? Can anyone think of an alternate (better) way to fix this? Thanks, -a From owner-freebsd-net@FreeBSD.ORG Sat Feb 8 01:05:58 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CE638CF3; Sat, 8 Feb 2014 01:05:58 +0000 (UTC) Received: from mail-oa0-x22e.google.com (mail-oa0-x22e.google.com [IPv6:2607:f8b0:4003:c02::22e]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 7DD521BD7; Sat, 8 Feb 2014 01:05:58 +0000 (UTC) Received: by mail-oa0-f46.google.com with SMTP id n16so5145615oag.33 for ; Fri, 07 Feb 2014 17:05:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=X/E1R8j29/YJMQ0lt+NFFkfYeCo4xQyRTjCfcZ1UOKo=; b=0sBtERLIcX6RnMwD6GItOfhud47wuXL0TRTBQIqao8Cy48z+sdfyYKP9YQY2Nz5tP9 2A6EulqE6I6iyC4S/LdxeJFzB8u/k18yzAYBrvYbkpnuQkGR3nZ1x7CwZr/UU2ZXeaRq hiOkIpGiPEtYyyuoZO7nnUW+Ucedae7ZjKJV6zOJkn6J+VbsHRGuQoT3dTXWsHljcpJQ tUNeKz2IW7TIsW0GliOQXKWQ12oVrxLXj5IAcxrNCaHvI/76Vyg4xIJNLlwT86DjpkNS R7Zd2yMCLQ/RuD+THPBJ2aJwXs8p8psINaalTx4z+8o+06IYtfuFCKzn9L1v9r07M1xB gexw== MIME-Version: 1.0 X-Received: by 10.182.225.137 with SMTP id rk9mr14688254obc.51.1391821557788; Fri, 07 Feb 2014 17:05:57 -0800 (PST) Received: by 10.76.130.196 with HTTP; Fri, 7 Feb 2014 17:05:57 -0800 (PST) In-Reply-To: References: Date: Fri, 7 Feb 2014 20:05:57 -0500 Message-ID: Subject: Re: flowtable, 
collisions, locking and CPU affinity From: Ryan Stone To: Adrian Chadd Content-Type: text/plain; charset=ISO-8859-1 Cc: FreeBSD Net , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Feb 2014 01:05:58 -0000 On Fri, Feb 7, 2014 at 7:12 PM, Adrian Chadd wrote: > In any case - the reason it's happening above is because there's no > actual lock held over the whole lookup/insert path. It's a per-CPU > critical enter/exit path, so the only way to guarantee consistency is > to use sched_pin() for the entirety of the function. sched_pin seems like a very heavy hammer for what has to be a very rare and mostly harmless race. Why not redo the lookup after you have reacquired the lock, and if you don't have to do the insert anymore then don't and move on? From owner-freebsd-net@FreeBSD.ORG Sat Feb 8 01:13:02 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 864CFF4A; Sat, 8 Feb 2014 01:13:02 +0000 (UTC) Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 5D6A41C60; Sat, 8 Feb 2014 01:13:02 +0000 (UTC) Received: from h2.funkthat.com (localhost [127.0.0.1]) by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id s181CugE082530 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 7 Feb 2014 17:12:56 -0800 (PST) (envelope-from jmg@h2.funkthat.com) Received: (from jmg@localhost) by h2.funkthat.com (8.14.3/8.14.3/Submit) id s181Cubw082529; Fri, 7 Feb 2014 17:12:56 -0800 (PST) (envelope-from jmg) 
Date: Fri, 7 Feb 2014 17:12:56 -0800 From: John-Mark Gurney To: Ryan Stone Subject: Re: flowtable, collisions, locking and CPU affinity Message-ID: <20140208011256.GH89104@funkthat.com> Mail-Followup-To: Ryan Stone , Adrian Chadd , FreeBSD Net , "freebsd-arch@freebsd.org" References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Operating-System: FreeBSD 7.2-RELEASE i386 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-TipJar: bitcoin:13Qmb6AeTgQecazTWph4XasEsP7nGRbAPE X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger? X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.2 (h2.funkthat.com [127.0.0.1]); Fri, 07 Feb 2014 17:12:56 -0800 (PST) Cc: FreeBSD Net , Adrian Chadd , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Feb 2014 01:13:02 -0000 Ryan Stone wrote this message on Fri, Feb 07, 2014 at 20:05 -0500: > On Fri, Feb 7, 2014 at 7:12 PM, Adrian Chadd wrote: > > In any case - the reason it's happening above is because there's no > > actual lock held over the whole lookup/insert path. It's a per-CPU > > critical enter/exit path, so the only way to guarantee consistency is > > to use sched_pin() for the entirety of the function. > > sched_pin seems like a very heavy hammer for what has to be a very > rare and mostly harmless race. Why not redo the lookup after you have > reacquired the lock, and if you don't have to do the insert anymore > then don't and move on? Why not drop the work since the current CPU has the results? 
It sucks to throw away work, but the other option is to remember what CPU you were on, and put it there, but that would also be expensive... -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not." From owner-freebsd-net@FreeBSD.ORG Sat Feb 8 01:56:31 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 0C84E9E5; Sat, 8 Feb 2014 01:56:31 +0000 (UTC) Received: from mail-pa0-x22a.google.com (mail-pa0-x22a.google.com [IPv6:2607:f8b0:400e:c03::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id D09101FE4; Sat, 8 Feb 2014 01:56:30 +0000 (UTC) Received: by mail-pa0-f42.google.com with SMTP id kl14so3956256pab.1 for ; Fri, 07 Feb 2014 17:56:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=AYYGh5i5fu5dRqBRMnnDew3pMyPWxg2dWO2b7l4RjkU=; b=W4gvYc/GkDY/RyvL8+eBNSSf7gE3bFUetfEMopoAZbDMxwC/l+fKuKebyY/kat4PBV apiQNhD5AmYndoN8tebL/y0vfwE+vmZnmKTK+VtpxLnGg04aLHBlHRLt3Ctpa1t6TdS/ rXHIjRn+GOWXVccR8Lgk20nM5JtuHuLkXxKf2/6M5GGVjk0pM1YbL6piBfBnoW/8L+y/ Aoa9DTVFg+mpfm5qjMhuS9tuPI3ExRnr4pjZGk5+iIltvWZieleryXicjEqhWO3+U4cC thkBka3uaqmv3OJYa44ACV0PIFy1onNTCIP6hze9oyTPmYWNMx30mqUUfl83vGhXVek9 EM7w== MIME-Version: 1.0 X-Received: by 10.66.144.102 with SMTP id sl6mr11367016pab.96.1391824590415; Fri, 07 Feb 2014 17:56:30 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.70.103.174 with HTTP; Fri, 7 Feb 2014 17:56:29 -0800 (PST) In-Reply-To: References: Date: Fri, 7 Feb 2014 17:56:29 -0800 X-Google-Sender-Auth: DlLOcpG-6OiTJIEw5vOsRxf6oOc Message-ID: Subject: Re: flowtable, collisions, locking and CPU affinity
From: Adrian Chadd To: Ryan Stone Content-Type: text/plain; charset=ISO-8859-1 Cc: FreeBSD Net , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Feb 2014 01:56:31 -0000 On 7 February 2014 17:05, Ryan Stone wrote: > On Fri, Feb 7, 2014 at 7:12 PM, Adrian Chadd wrote: >> In any case - the reason it's happening above is because there's no >> actual lock held over the whole lookup/insert path. It's a per-CPU >> critical enter/exit path, so the only way to guarantee consistency is >> to use sched_pin() for the entirety of the function. > > sched_pin seems like a very heavy hammer for what has to be a very > rare and mostly harmless race. Why not redo the lookup after you have > reacquired the lock, and if you don't have to do the insert anymore > then don't and move on? You say "rare and harmless race"; I can trick the system into doing this: flowtable for IPv4: 57448 lookups 39543 hits 17905 misses 17820 collisions 382 free checks 82 frees .. and it gets stuck in a loop of never quite correctly updating/using the flowtable entries. So, it is mostly harmless, except exactly when it bites you in the ass. Yes, we could just reacquire the lock and insert if required, but I still have to be absolutely sure that the thread isn't preempted and migrated to another core. Otherwise we'd have the same issue again. I'll keep tickling it.
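For reference, the recheck-before-insert idea from this thread could look roughly like the userland sketch below. It is not the flowtable code: the table layout, `NCPU`, `NBUCKET`, `flow_lookup()`, `flow_insert()` and `flow_get()` are all hypothetical names made up for illustration, and thread migration is modelled simply by passing a different CPU index for the insert than for the first lookup. The second lookup and the insert use the same CPU index, which stands in for the assumption that both run without the thread being preempted and migrated in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NCPU    4   /* hypothetical CPU count */
#define NBUCKET 64  /* hypothetical per-CPU table size */

/* Hypothetical stand-in for a flowtable entry. */
struct flow_entry {
	uint32_t key;   /* stand-in for the flow hash */
	int      route; /* stand-in for the cached route */
	bool     used;
};

static struct flow_entry table[NCPU][NBUCKET]; /* one table per CPU */
static unsigned collisions;                    /* failed inserts */

static struct flow_entry *
flow_slot(int cpu, uint32_t key)
{
	assert(cpu >= 0 && cpu < NCPU);
	return (&table[cpu][key % NBUCKET]);
}

/* Return the entry for key in this CPU's table, or NULL on a miss. */
static struct flow_entry *
flow_lookup(int cpu, uint32_t key)
{
	struct flow_entry *fe = flow_slot(cpu, key);

	return ((fe->used && fe->key == key) ? fe : NULL);
}

/* Insert into this CPU's table; a matching entry counts as a collision. */
static int
flow_insert(int cpu, uint32_t key, int route)
{
	struct flow_entry *fe = flow_slot(cpu, key);

	if (fe->used && fe->key == key) {
		collisions++;
		return (-1);
	}
	fe->key = key;
	fe->route = route;
	fe->used = true;
	return (0);
}

/*
 * Lookup, then insert on a miss. insert_cpu may differ from lookup_cpu,
 * which models the thread migrating during the route lookup.
 */
static struct flow_entry *
flow_get(int lookup_cpu, int insert_cpu, uint32_t key, int route)
{
	struct flow_entry *fe = flow_lookup(lookup_cpu, key);

	if (fe != NULL)
		return (fe);

	/* ... the route/adjacency lookup would happen here ... */

	/* Recheck on the CPU we ended up on; reuse an existing entry. */
	fe = flow_lookup(insert_cpu, key);
	if (fe != NULL)
		return (fe);

	(void)flow_insert(insert_cpu, key, route);
	return (flow_lookup(insert_cpu, key));
}
```

With the recheck, a miss on CPU 0 followed by an insert attempt on CPU 1 that already holds the entry returns the existing entry instead of bumping the collision counter. The sketch deliberately says nothing about how the recheck and the insert are kept on one CPU in the kernel.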
Thanks, -a From owner-freebsd-net@FreeBSD.ORG Sat Feb 8 06:45:08 2014 Return-Path: Delivered-To: freebsd-net@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2249DD71; Sat, 8 Feb 2014 06:45:08 +0000 (UTC) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id E8D8C1E4F; Sat, 8 Feb 2014 06:45:07 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id s186j7q8038091; Sat, 8 Feb 2014 06:45:07 GMT (envelope-from delphij@freefall.freebsd.org) Received: (from delphij@localhost) by freefall.freebsd.org (8.14.8/8.14.8/Submit) id s186j7Ks038090; Sat, 8 Feb 2014 06:45:07 GMT (envelope-from delphij) Date: Sat, 8 Feb 2014 06:45:07 GMT Message-Id: <201402080645.s186j7Ks038090@freefall.freebsd.org> To: delphij@FreeBSD.org, silby@FreeBSD.org, freebsd-net@FreeBSD.org From: delphij@FreeBSD.org Subject: Re: kern/25986: Socket would hang at LAST_ACK forever. X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Feb 2014 06:45:08 -0000 Synopsis: Socket would hang at LAST_ACK forever. Responsible-Changed-From-To: silby->freebsd-net Responsible-Changed-By: delphij Responsible-Changed-When: Sat Feb 8 06:44:40 UTC 2014 Responsible-Changed-Why: Reassign to freebsd-net@. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=25986 From owner-freebsd-net@FreeBSD.ORG Sat Feb 8 13:39:57 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2B9E641E; Sat, 8 Feb 2014 13:39:57 +0000 (UTC) Received: from master.debian.org (master.debian.org [IPv6:2001:41b8:202:deb:216:36ff:fe40:4001]) (using TLSv1.2 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id E632C1C33; Sat, 8 Feb 2014 13:39:56 +0000 (UTC) Received: from localhost ([::1]) by master.debian.org with esmtp (Exim 4.80) (envelope-from ) id 1WC88Q-0002Sc-D1; Sat, 08 Feb 2014 13:39:54 +0000 Message-ID: <52F633A9.6020206@debian.org> Date: Sat, 08 Feb 2014 13:39:53 +0000 From: Robert Millan User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: glebius@freebsd.org Subject: in_control() rewrite breaks support for old (9.x) SIOCAIFADDR ABI X-Enigmail-Version: 1.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Feb 2014 13:39:57 -0000 Hi Gleb, We've noticed that your recent rewrite of in_control() (r257692) broke backward compatibility support with the old (9.x) SIOCAIFADDR ABI (pre-r228768). I have no idea why this happens, and I'm not sure whether this is intentional. It doesn't affect us anyway (we have 10.x userland now), but I thought you might like to know. More details are available at: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=732693 -- Robert Millan