Date: Thu, 22 Sep 2016 20:35:50 +0100
From: Steven Hartland <killing@multiplay.co.uk>
To: Gleb Smirnoff <glebius@FreeBSD.org>
Cc: Ryan Stone <rysto32@gmail.com>, Kubilay Kocak <koobs@freebsd.org>, freebsd-net <freebsd-net@freebsd.org>, Karl Pielorz <kpielorz_lst@tdx.co.uk>
Subject: Re: lagg Interfaces - don't do Gratuitous ARP?
Message-ID: <c8ce344f-79cf-9f0c-72a4-6b9a2dfb1a0d@multiplay.co.uk>
In-Reply-To: <20160922180359.GT1018@cell.glebi.us>
References: <20160921235703.GG1018@cell.glebi.us>
 <CAFMmRNwZBEJ9Me4FSh=W7fRNjm4344jiUGuJqX8KUB_0sWcajA@mail.gmail.com>
 <20160922025856.GH1018@cell.glebi.us>
 <348d534d-ef87-f90c-aa43-cc65c2f6283c@multiplay.co.uk>
 <20160922150940.GK1018@cell.glebi.us>
 <f4100561-4977-0b19-c245-0cd09438943d@multiplay.co.uk>
 <20160922154144.GO1018@cell.glebi.us>
 <0c678da4-bf72-5a81-aee1-d82a873661b7@multiplay.co.uk>
 <20160922160840.GP1018@cell.glebi.us>
 <80fd962a-fba3-d71e-a1cb-2b09181d3925@multiplay.co.uk>
 <20160922180359.GT1018@cell.glebi.us>

On 22/09/2016 19:03, Gleb Smirnoff wrote:
> On Thu, Sep 22, 2016 at 05:50:09PM +0100, Steven Hartland wrote:
> S> > S> We could but then what happens when it's IPv6 or $other protocol that
> S> > S> needs to know? That would require lagg to be edited with all the special
> S> > S> cases instead of allowing the protocol to handle it the way it needs.
> S> >
> S> > You just said that "without GARP devices can and do ignore", didn't you?
> S> > Let's take this as truth, although I doubt it. So, if this is the truth, that
> S> > means that if you are running IPv6 only, the switches won't reconfigure
> S> > themselves due to lack of gratuitous ARP.
> S> Not sure I follow you, gratuitous ARP is required for IPv4 to work, for
> S> IPv6 you need an unsolicited neighbour advertisement.
> S> > Other protocols, of which PPPoE is a good example, simply don't have any
> S> > analogs of ARP or ND. So what would your switches do in that case? And
> S> > what other layers are you going to hack, if you are going to run PPPoE
> S> > service with lagg failover?
> S> Good question, surely that's a good reason to have each protocol handle
> S> it and not to teach LAGG about every possible protocol?
>
> No. It is not a good reason to have each protocol handle it. It is a
> demonstration that this must be handled by a lower protocol layer - the L2,
> which is the level where the problem exists.
>
> S> > In reality, a layer 2 device must forward layer 2 traffic, and must
> S> > reconfigure its forwarding table based on source addresses seen on ports.
> S> > And that's what all devices I've seen do. So what if we actually try
> S> > the approach I suggested? I can write the patch for you if you want.
> S> The main problem with LAGG in failover mode is ensuring the traffic is
> S> sent to the correct port.
> S>
> S> When you have the scenario where a switch stack believes MAC XYZ is
> S> accessible by port ABC then unless you tell it otherwise it will
> S> continue to believe that and hence send traffic to said port. I'm sure
> S> we'll agree that the standard for doing this for IPv4 is ARP and for
> S> IPv6 is NA.
>
> No, we don't agree on that. I assert that ARP is the standard to map an IPv4
> address to a physical address, not to a port. Same for NA. The de-facto
> standard for a switch to believe that MAC XYZ is accessible by port ABC
> is looking at the source address of any packet on a port.
>
> S> When using LAGG and we lose the master port we need to correct the
> S> connected devices' view (both direct and remote) of the world such that
> S> traffic is now sent to a different physical port.
> S>
> S> Back in the day, when switches weren't so "smart", sending a correctly
> S> addressed packet from the new port would potentially help, but with
> S> smarter switches and stacking in the mix sticking to the "standards"
> S> helps maintain compatibility and hence functionality with things like LAGG.
> S>
> S> Having tested with a number of vendor switches (Cisco, Extreme and more
> S> recently Arista), only sending gratuitous ARP for IPv4 and unsolicited NA
> S> for IPv6 reliably resulted in rapid failover between LAGG ports.
> S>
> S> Other methods like sending correctly addressed output from the new port
> S> helped, we tested this with outbound pings from IPMI, but still resulted
> S> in noticeable recovery delay.
>
> This means that switches are "smart" and are violating standards. If you want
> to create a hack to deal with that, better keep this hack inside the module
> that is affected by "smart" switches, in the lagg driver. And not plow through
> all levels of the network stack to satisfy demands of standard violators.
>
> So, please send a self-made gratuitous ARP packet right from lagg(4). If the
> switches work as you describe, that would work regardless of the actual
> IPv4/IPv6/whatever configuration.
>
> S> > S> Overall, while the proposed change (https://reviews.freebsd.org/D4111)
> S> > S> does involve changes to multiple layers it still feels like the right
> S> > S> approach as it has the right layer dealing with the change instead of
> S> > S> hard-coded assumptions.
> S> >
> S> > Sorry, it doesn't feel like the right approach. :(
> S> Out of interest why has your opinion changed since your post here:
> S> https://lists.freebsd.org/pipermail/freebsd-net/2012-February/031340.html ?
>
> I'm sorry, I didn't look at D4111, expecting that it is exactly the patch
> that was backed out. I will review D4111.
>
They are similar in approach, but D4111 incorporates additional feedback.

Essentially it still follows your suggestion from 2012, which was:
> 1) Network protocols should register themselves on the ifnet_link_event
> EVENTHANDLER(9).
> 2) The inet4 should send gratuitous ARP on this event.
> 3) The inet6 should send NA.

Hence my confusion ;-)

    Regards
    Steve
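
For readers unfamiliar with the frame both sides keep referring to, here is a minimal userland sketch of a gratuitous ARP announcement for IPv4. It is not code from D4111 or from lagg(4). The helper name build_garp, the GARP_FRAME_LEN constant and the choice of the ARP request form (sender and target protocol address both set to the announced address) are assumptions made for illustration. The function only fills a buffer and does not transmit anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN	6
#define GARP_FRAME_LEN	42	/* 14-byte Ethernet header + 28-byte ARP payload */

/*
 * Hypothetical helper: fill buf with a gratuitous ARP request announcing
 * that IPv4 address ip (4 bytes, network order) lives at hardware address
 * mac. Returns the number of bytes written.
 */
static size_t
build_garp(uint8_t buf[GARP_FRAME_LEN], const uint8_t mac[ETHER_ADDR_LEN],
    const uint8_t ip[4])
{
	uint8_t *p = buf;

	assert(buf != NULL && mac != NULL && ip != NULL);

	/* Ethernet header: broadcast destination, our MAC as the source. */
	memset(p, 0xff, ETHER_ADDR_LEN);	p += ETHER_ADDR_LEN;
	memcpy(p, mac, ETHER_ADDR_LEN);		p += ETHER_ADDR_LEN;
	*p++ = 0x08; *p++ = 0x06;		/* EtherType: ARP */

	/* Fixed part of the ARP header. */
	*p++ = 0x00; *p++ = 0x01;		/* hardware type: Ethernet */
	*p++ = 0x08; *p++ = 0x00;		/* protocol type: IPv4 */
	*p++ = ETHER_ADDR_LEN;			/* hardware address length */
	*p++ = 4;				/* protocol address length */
	*p++ = 0x00; *p++ = 0x01;		/* operation: request */

	/* Addresses: sender and target protocol address are the same. */
	memcpy(p, mac, ETHER_ADDR_LEN);		p += ETHER_ADDR_LEN;	/* sender MAC */
	memcpy(p, ip, 4);			p += 4;			/* sender IP */
	memset(p, 0, ETHER_ADDR_LEN);		p += ETHER_ADDR_LEN;	/* target MAC: unset */
	memcpy(p, ip, 4);			p += 4;			/* target IP == sender IP */

	return ((size_t)(p - buf));
}
```

Note that the Ethernet source address in this frame is the interface MAC, so a switch that learns purely from source addresses would also update its forwarding table on seeing it, which is the behaviour Gleb describes above.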