Date:      Thu, 22 Sep 2016 11:03:59 -0700
From:      Gleb Smirnoff <glebius@FreeBSD.org>
To:        Steven Hartland <killing@multiplay.co.uk>
Cc:        Ryan Stone <rysto32@gmail.com>, Kubilay Kocak <koobs@freebsd.org>, freebsd-net <freebsd-net@freebsd.org>, Karl Pielorz <kpielorz_lst@tdx.co.uk>
Subject:   Re: lagg Interfaces - don't do Gratuitous ARP?
Message-ID:  <20160922180359.GT1018@cell.glebi.us>
In-Reply-To: <80fd962a-fba3-d71e-a1cb-2b09181d3925@multiplay.co.uk>
References:  <20160921235703.GG1018@cell.glebi.us> <CAFMmRNwZBEJ9Me4FSh=W7fRNjm4344jiUGuJqX8KUB_0sWcajA@mail.gmail.com> <20160922025856.GH1018@cell.glebi.us> <348d534d-ef87-f90c-aa43-cc65c2f6283c@multiplay.co.uk> <20160922150940.GK1018@cell.glebi.us> <f4100561-4977-0b19-c245-0cd09438943d@multiplay.co.uk> <20160922154144.GO1018@cell.glebi.us> <0c678da4-bf72-5a81-aee1-d82a873661b7@multiplay.co.uk> <20160922160840.GP1018@cell.glebi.us> <80fd962a-fba3-d71e-a1cb-2b09181d3925@multiplay.co.uk>

On Thu, Sep 22, 2016 at 05:50:09PM +0100, Steven Hartland wrote:
S> > S> We could but then what happens when it's IPv6 or $other protocol that
S> > S> needs to know? That would require lagg to be edited with all the special
S> > S> cases instead of allowing the protocol to handle it the way it needs.
S> >
S> > You just said that "without GARP devices can and do ignore", didn't you?
S> > Let's take this as truth, although I doubt it. So, if this is the truth, that
S> > means that if you are running IPv6 only, the switches won't reconfigure
S> > themselves due to lack of gratuitous ARP.
S> Not sure I follow you; gratuitous ARP is required for IPv4 to work, and for
S> IPv6 you need an unsolicited neighbour advertisement.
S> > Other protocols, of which PPPoE is a good example, simply don't have any
S> > analogs of ARP or ND. So what would your switches do in that case? And
S> > what other layers are you going to hack, if you are going to run PPPoE
S> > service with lagg failover?
S> Good question, surely that's a good reason to have each protocol handle 
S> it and not to teach LAGG about every possible protocol?

No. It is not a good reason to have each protocol handle it. It is a
demonstration that this must be handled by a lower protocol layer - L2,
which is the level where the problem exists.

S> > In reality, a layer 2 device must forward layer 2 traffic, and must
S> > reconfigure its forwarding table based on source addresses seen on ports.
S> > And that's what all devices I've seen do. So what if we actually try
S> > the approach I suggested? I can write the patch for you if you want.
S> The main problem with LAGG in failover mode is ensuring the traffic is 
S> sent to the correct port.
S> 
S> When you have the scenario where a switch stack believes MAC XYZ is 
S> accessible by port ABC then unless you tell it otherwise it will 
S> continue to believe that and hence send traffic to said port. I'm sure 
S> we'll agree that the standard for doing this for IPv4 is ARP and for 
S> IPv6 is NA.

No, we don't agree on that. I assert that ARP is the standard for mapping an
IPv4 address to a physical address, not to a port. Same for NA. The de facto
standard for a switch to believe that MAC XYZ is accessible by port ABC
is looking at the source address of any packet received on that port.
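
To make the source-address learning argument concrete, here is a minimal
sketch of a learning switch. It is a toy model under stated assumptions, not
any vendor's implementation: the class name LearningSwitch, the fdb
dictionary and the "flood" return value are all hypothetical names chosen
for illustration.

```python
class LearningSwitch:
    """Toy model of L2 source-address learning (illustrative only)."""

    def __init__(self):
        # Forwarding database: source MAC -> port it was last seen on.
        self.fdb = {}

    def receive(self, port, src_mac, dst_mac):
        # Learn (or relearn) the port from the source address of ANY frame,
        # regardless of the protocol carried inside it.
        self.fdb[src_mac] = port
        # Forward to the learned port, or flood if the destination is unknown.
        return self.fdb.get(dst_mac, "flood")


sw = LearningSwitch()
sw.receive("ABC", "XYZ", "ff:ff:ff:ff:ff:ff")   # MAC XYZ first seen on port ABC
sw.receive("DEF", "XYZ", "ff:ff:ff:ff:ff:ff")   # after failover, seen on port DEF
print(sw.fdb["XYZ"])
```

In this model the second frame alone moves MAC XYZ to the new port; nothing
in the frame's payload is inspected.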

S> When using LAGG and we lose the master port we need to correct the
S> connected devices' view (both direct and remote) of the world such that
S> traffic is now sent to a different physical port.
S> 
S> Back in the day, when switches weren't so "smart", sending a correctly
S> addressed packet from the new port would potentially help, but with
S> smarter switches and stacking in the mix sticking to the "standards" 
S> helps maintain compatibility and hence functionality with things like LAGG.
S> 
S> Having tested with a number of vendor switches (Cisco, Extreme and more
S> recently Arista), only sending gratuitous ARP for IPv4 and unsolicited NA
S> for IPv6 reliably resulted in rapid failover between LAGG ports.
S> 
S> Other methods like sending correctly addressed output from the new port 
S> helped, we tested this with outbound pings from IPMI, but still resulted 
S> in noticeable recovery delay.

This means that switches are "smart" and are violating standards. If you want
to create a hack to deal with that, better keep this hack inside the module
that is affected by "smart" switches, in the lagg driver. And not plow through
all levels of the network stack to satisfy the demands of standard violators.

So, please send a self-made gratuitous ARP packet right from lagg(4). If the
switches work as you describe, that would work regardless of the actual
IPv4/IPv6/whatever configuration.
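
As a rough illustration of what such a self-made frame contains, the sketch
below builds a gratuitous ARP request as raw bytes. It is written in Python
purely for readability; it is not lagg(4) code and not the D4111 patch. The
function name build_garp and the example MAC and IPv4 values are
hypothetical. The field layout is the standard Ethernet/IPv4 ARP format,
with sender and target protocol addresses set to the same value.

```python
import socket
import struct


def build_garp(mac: bytes, ipv4: str) -> bytes:
    """Build a broadcast gratuitous ARP request frame (illustrative sketch)."""
    ip = socket.inet_aton(ipv4)
    # Ethernet header: broadcast destination, our MAC as source, EtherType ARP.
    eth = b"\xff" * 6 + mac + struct.pack("!H", 0x0806)
    # ARP header: htype=Ethernet(1), ptype=IPv4, hlen=6, plen=4, oper=request(1).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # Sender MAC/IP, zeroed target MAC, target IP equal to the sender IP.
    arp += mac + ip + b"\x00" * 6 + ip
    return eth + arp


# Example values are assumptions: a locally administered MAC and a
# documentation-range IPv4 address.
frame = build_garp(bytes.fromhex("020000000001"), "192.0.2.1")
print(len(frame))
```

The part a learning switch acts on is the Ethernet source address in the
first 14 bytes; the ARP payload is what the "smart" switches described above
would additionally look at.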

S> > S> Overall, while the proposed change (https://reviews.freebsd.org/D4111)
S> > S> does involve changes to multiple layers it still feels like the right
S> > S> approach as it has the right layer dealing with the change instead of
S> > S> hard-coded assumptions.
S> >
S> > Sorry, it doesn't feel like the right approach. :(
S> Out of interest, why has your opinion changed since your post here:
S> https://lists.freebsd.org/pipermail/freebsd-net/2012-February/031340.html ?

I'm sorry, I didn't look at D4111, expecting that it is exactly the patch
that was backed out. I will review D4111.

-- 
Totus tuus, Glebius.


