From: Kubilay Kocak
Reply-To: koobs@FreeBSD.org
To: Ben, freebsd-net@freebsd.org
Subject: Re: kern/185967: Link Aggregation LAGG: LACP not working in 10.0
Date: Mon, 03 Feb 2014 21:29:14 +1100
Message-ID: <52EF6F7A.40101@FreeBSD.org>
In-Reply-To: <52EF6D61.7010505@niessen.ch>
List-Id: Networking and TCP/IP with FreeBSD

On 3/02/2014 9:20 PM, Ben wrote:
> Hi,
>
> It was Juniper's active/passive mode regarding LACP.
>
> It was set to passive and worked as you described without sending any
> packets. It is now set to active and works perfectly again.
>
> I couldn't try your patch easily as I didn't have the sources installed
> (and obviously no network connection).
>
> If time allows I will try your patch anyway.

It would be *great* if you could do that, Ben :)

Having a successful real-world test case will give Scott the confidence
to land a commit and merge it back to stable/10 so that everyone can
benefit as soon as possible.

> Thanks for your help!
>
> Regards
> Ben
>
> On 03.02.2014 10:58, Scott Long wrote:
>> Hi,
>>
>> If you can, please test the patch I sent and let me know the results.
>> I’ll check it into FreeBSD 11 and 10 if it works for you.
>>
>> Thanks,
>> Scott
>>
>> On Feb 3, 2014, at 2:51 AM, Ben wrote:
>>
>>> Thank you for your detailed explanation.
>>>
>>> If I understand correctly the switch is probably not set up
>>> correctly, right?
>>>
>>> I will try to have it configured correctly first.
>>>
>>> Thanks a lot for your help!
>>>
>>> Regards
>>> Ben
>>>
>>> On 03.02.2014 10:45, Scott Long wrote:
>>>> Ok, please try the patch I emailed earlier. Since you’re not seeing
>>>> any receive messages, it means that your switch isn’t generating any
>>>> LACP heartbeats. The difference between FreeBSD 9.x and 10 is that
>>>> in 9.x, it ran in “optimistic” mode, meaning that it didn’t rely on
>>>> getting receive messages from the switch, and only took a channel
>>>> down if the link state went down. In strict mode, it looks for the
>>>> receive messages and only transitions to a full operational state if
>>>> it gets them. So while I know it’s easy to point at the problem
>>>> being FreeBSD 10, seeing as FreeBSD 9 worked for you, please check
>>>> to make sure that your switch is set up correctly.
>>>>
>>>> I authored the original change that went into FreeBSD 10, and I
>>>> tried to make it so that strict_mode=0 would keep everything working
>>>> as it did in 9. I guess that since you’re getting no receive
>>>> messages from the switch at all that we need to disable strict mode
>>>> on setup, not afterwards. Apply the patch and everything should
>>>> work as it did in FreeBSD 9.
>>>>
>>>> Scott
>>>>
>>>> On Feb 3, 2014, at 2:29 AM, Ben wrote:
>>>>
>>>>> Yes, via sysctl and /etc/sysctl.conf
>>>>>
>>>>> I waited now roughly 20 minutes without touching it but no difference.
>>>>>
>>>>> No, I only see these transmit messages, no receive.
>>>>>
>>>>> Thanks
>>>>> Ben
>>>>>
>>>>> On 03.02.2014 10:25, Scott Long wrote:
>>>>>> Did you set it to 0 via the sysctl? You might need to wait for
>>>>>> several minutes if you set it after setting up the links.
>>>>>>
>>>>>> Also, the message that you’re seeing is from your machine
>>>>>> transmitting PDU packets. Are you seeing any “lacpdu receive”
>>>>>> messages on the console?
>>>>>>
>>>>>> Thanks,
>>>>>> Scott
>>>>>>
>>>>>> On Feb 3, 2014, at 2:10 AM, Ben wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I set strict mode to 0 but no use. I do receive PDU messages.
>>>>>>>
>>>>>>> igb0: lacpdu transmit
>>>>>>> actor=(...)
>>>>>>> actor.state=4d
>>>>>>> partner=(...)
>>>>>>> partner.state=0
>>>>>>> maxdelay=0
>>>>>>>
>>>>>>> Thanks
>>>>>>> Ben
>>>>>>>
>>>>>>> On 03.02.2014 10:03, Scott Long wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Unfortunately, you can’t control the strict mode globally. My
>>>>>>>> apologies for this mess, I’ll make sure that it’s fixed for
>>>>>>>> FreeBSD 10.1. If the sysctl doesn’t help then maybe consider
>>>>>>>> compiling a custom kernel with it defaulted to 0. You’ll need
>>>>>>>> to open /sys/net/ieee8023ad_lacp.c and look for the function
>>>>>>>> lacp_attach(). You’ll see the strict_mode assignment underneath
>>>>>>>> that. I’ll also send you a patch in a few minutes. Until then,
>>>>>>>> try enabling net.link.lagg.lacp.debug=1 and see if you’re
>>>>>>>> receiving heartbeat PDUs from your switch.
>>>>>>>>
>>>>>>>> Scott
>>>>>>>>
>>>>>>>> On Feb 3, 2014, at 1:40 AM, Ben wrote:
>>>>>>>>
>>>>>>>>> Hi Scott,
>>>>>>>>>
>>>>>>>>> I had tried to set it in /etc/sysctl.conf but it seems it didn't
>>>>>>>>> work. But I will try again and report back.
>>>>>>>>>
>>>>>>>>> The settings of the switch have not been changed and are set to
>>>>>>>>> LACP. It worked before so I guess the switch should not be the
>>>>>>>>> problem. Maybe some incompatibility between FreeBSD +
>>>>>>>>> igb-driver + switch (Juniper EX3300-48T).
>>>>>>>>>
>>>>>>>>> I will update you after setting the sysctl setting. It seems to
>>>>>>>>> be "dynamic", I guess 0 reflects the index of LACP lagg
>>>>>>>>> devices. Can I switch off the strict mode globally in
>>>>>>>>> /etc/sysctl.conf?
>>>>>>>>>
>>>>>>>>> Thanks for your help.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Ben
>>>>>>>>>
>>>>>>>>> On 03.02.2014 09:31, Scott Long wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> You’re probably running into the consequences of r253687.
>>>>>>>>>> Check to see the value of ‘sysctl
>>>>>>>>>> net.link.lagg.0.lacp.lacp_strict_mode’. If it’s ‘1’ then set
>>>>>>>>>> it to 0. My original intention was for this to default to 0,
>>>>>>>>>> but apparently that didn’t happen. However, the fact that
>>>>>>>>>> strict mode doesn’t seem to work at all for you might hint
>>>>>>>>>> that your switch either isn’t configured correctly for LACP,
>>>>>>>>>> or doesn’t actually support LACP at all. You might want to
>>>>>>>>>> investigate that.
>>>>>>>>>>
>>>>>>>>>> Scott
>>>>>>>>>>
>>>>>>>>>> On Feb 3, 2014, at 1:17 AM, Ben wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I upgraded from FreeBSD 9.2-RELEASE to 10.0-RELEASE. FreeBSD
>>>>>>>>>>> 9.2 was configured to use LACP with two igb devices.
>>>>>>>>>>>
>>>>>>>>>>> Now it stopped working after the upgrade.
>>>>>>>>>>>
>>>>>>>>>>> This is a screenshot of ifconfig -a after the upgrade to
>>>>>>>>>>> FreeBSD 10.0-RELEASE:
>>>>>>>>>>> http://tinypic.com/view.php?pic=28jvgpw&s=5#.Uu9PXT1dVPM
>>>>>>>>>>>
>>>>>>>>>>> A PR is currently open:
>>>>>>>>>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/185967
>>>>>>>>>>>
>>>>>>>>>>> Its priority is set to low, but I would like somebody to have
>>>>>>>>>>> a look into it as it obviously has a great influence on our
>>>>>>>>>>> infrastructure. The only way to "solve" it is currently
>>>>>>>>>>> switching back to FreeBSD 9.2.
>>>>>>>>>>>
>>>>>>>>>>> The suggested fix "use failover" seems not to work.
>>>>>>>>>>>
>>>>>>>>>>> Thank you for your help.
>>>>>>>>>>>
>>>>>>>>>>> Best regards
>>>>>>>>>>> Ben

--
Koobs
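
For anyone skimming the thread, the strict versus "optimistic" behaviour Scott describes above can be sketched as a toy model. This is only an illustration of the description in the quoted mails, not the kernel logic in lacp_attach() or anywhere else; the names PortStatus and port_is_active are made up for this sketch, and the real state machine has many more states and timers.

```python
from dataclasses import dataclass


@dataclass
class PortStatus:
    """Hypothetical snapshot of one lagg member port (illustration only)."""
    link_up: bool          # physical link state reported by the NIC
    lacpdu_received: bool  # has the switch sent us any LACPDU at all?


def port_is_active(port: PortStatus, strict_mode: bool) -> bool:
    """Toy model of the behaviour described in the thread.

    Optimistic mode (as described for 9.x): the port carries traffic
    unless the link itself is down.
    Strict mode (as described for 10.0): the port only becomes fully
    operational once LACPDUs have been received from the partner.
    """
    if not port.link_up:
        return False
    if strict_mode:
        return port.lacpdu_received
    return True


# A switch that never sends LACPDUs, as in Ben's debug output
# (only "lacpdu transmit", no "lacpdu receive"):
silent_switch = PortStatus(link_up=True, lacpdu_received=False)
print(port_is_active(silent_switch, strict_mode=False))
print(port_is_active(silent_switch, strict_mode=True))
```

Under these assumptions the same silent switch gives a working port in optimistic mode and a non-operational one in strict mode, which matches the symptom reported after the upgrade.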