Date: Fri, 18 Dec 2015 02:46:30 +0300
From: Gleb Smirnoff <glebius@FreeBSD.org>
To: Steven Hartland <steven@multiplay.co.uk>
Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r292379 - in head/sys: netinet netinet6
Message-ID: <20151217234630.GX42340@FreeBSD.org>
In-Reply-To: <567344BC.20501@multiplay.co.uk>
References: <201512162226.tBGMQSvs098886@repo.freebsd.org> <20151217003824.GG42340@FreeBSD.org> <5672C6AE.7070407@freebsd.org> <20151217191630.GL42340@FreeBSD.org> <567344BC.20501@multiplay.co.uk>
On Thu, Dec 17, 2015 at 11:26:52PM +0000, Steven Hartland wrote:
S> You may not have read all the detail in the review, so you might not have
S> noticed that I identified that carp IPv6 NA was broken by r251584, which
S> was committed 2 1/2 years ago. I'm guessing not many people use it for IPv6.

My suggestion is to look at this regression separately from the lagg failover
and fix it separately.

S> > The "link aggregation" itself refers to an aggregation of links between
S> > two logical devices. If you build lagg(4) interface on top of two ports
S> > that are plugged into different switches, you are calling for trouble.
S>
S> While multiple switches complicate the matter, it's not the only issue, as
S> you can reproduce this with a single switch and two NICs in LAGG failover
S> mode with a simple ifconfig <nic1> down. At this point any traffic entering
S> the switch for the LAGG member will black-hole instead of being received by
S> the other NIC.
S>
S> It is much more common in networking now to have multiple physical switches
S> configured as part of bigger logical devices using protocols such as
S> MLAG, which is what we're using with Ciscos and Aristas, so not some
S> cheapo network ;-)

Right, you are confirming what I said above. Multiple physical devices, but
still one logical on each side of lagg.

S> > Nevertheless, someone wants to give a kick to this initially broken
S> > network design and run it somehow. And this "somehow" implies Layer2
S> > upcalling into upper layers to do something, since there is no
S> > established standard layer2 heartbeat packet. I have chatted with
S> > networking gurus at my job, and they said that they don't know
S> > any decent network equipment that supports such a setup. However, they
S> > noticed that Windows is capable of such failover. I haven't yet
S> > learned how Windows solves the problem. Actually, those who
S> > pushed committing 156226 should have done these investigations.
S> > Probably Windows does exactly the same, sends gratuitous ARP or
S> > its IPv6 analog. Or maybe it does something better, like sending
S> > a useless L2 datagram, but with a proper source hardware address.
S>
S> Actually our testing here showed both Windows and Linux worked as
S> expected, and from my reading doing the GARP / UNA is actually expected
S> in this situation, for this very reason.

Is it possible for you to sniff the traffic and see what actually happens
in there? My expectations are the same, but I want to be sure.

S> I'd like to step back for a second and get your feedback on the changes
S> that were reverted, which didn't have the DELAY in the callout. What were
S> the issues as you saw them? So we don't spam people any more I've reopened
S> the review so we can take this there: https://reviews.freebsd.org/D4111

Before going into implementation, can we first settle on the protocol? Could
be that GARP/NA is the only solution there, but let's be sure first.

-- 
Totus tuus, Glebius.
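For readers following the thread: the gratuitous ARP (GARP) discussed above is an ordinary ARP packet in which the sender and target protocol addresses are the same, broadcast so that switches and neighbours relearn which port and hardware address own the IP. The sketch below only builds such a frame as bytes, in Python. It is an illustration of the packet layout, not the lagg(4) or kernel code under review; the function name and the sample MAC and IP values are hypothetical, and it uses the request-style announcement form (opcode 1, zeroed target hardware address).

```python
import struct

ETHERTYPE_ARP = 0x0806   # Ethernet type for ARP
ARP_HTYPE_ETHER = 1      # hardware type: Ethernet
ARP_PTYPE_IPV4 = 0x0800  # protocol type: IPv4
ARP_OP_REQUEST = 1       # announcement sent as an ARP request


def build_gratuitous_arp(mac: bytes, ip: bytes) -> bytes:
    """Return a 42-byte Ethernet frame carrying a gratuitous ARP.

    Hypothetical helper for illustration: `mac` is the 6-byte hardware
    address of the port that is now active, `ip` is the 4-byte IPv4
    address being announced.
    """
    if len(mac) != 6 or len(ip) != 4:
        raise ValueError("expected a 6-byte MAC and a 4-byte IPv4 address")

    broadcast = b"\xff" * 6
    # Ethernet header: destination, source, ethertype.
    eth = broadcast + mac + struct.pack("!H", ETHERTYPE_ARP)
    # Fixed ARP header: htype, ptype, hlen, plen, opcode.
    arp = struct.pack("!HHBBH", ARP_HTYPE_ETHER, ARP_PTYPE_IPV4, 6, 4,
                      ARP_OP_REQUEST)
    # Sender MAC/IP, then target MAC (zeroed) and target IP (same as sender).
    arp += mac + ip + b"\x00" * 6 + ip
    return eth + arp


frame = build_gratuitous_arp(bytes.fromhex("020000000001"),
                             bytes([192, 0, 2, 10]))
```

The source hardware address in the Ethernet header is what lets the switch update its forwarding table for the surviving port; the matching sender and target IP fields are what make the ARP "gratuitous".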