Date: Wed, 16 Jan 2019 19:16:10 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 235005] r342051 "pfsync: Performance improvement" breaks CARP when used with pfsync
Message-ID: <bug-235005-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235005

            Bug ID: 235005
           Summary: r342051 "pfsync: Performance improvement" breaks CARP
                    when used with pfsync
           Product: Base System
           Version: 12.0-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: thomas@gibfest.dk

After quite a few buildworlds+kernel and reboots I've managed to isolate base
r342051 "pfsync: Performance improvement" as the reason why lagg stopped
working for me.

I've been building a couple of carp+pf routers/firewalls, originally with
12-BETA2 but they were recently upgraded to 12-STABLE base r342254 which is
when both carp nodes started being MASTER instead of one MASTER and one BACKUP
node.

The notes from my bisecting are below. All tests are with the same
configuration. As you can see, base r342051 is the commit where it broke.

12-STABLE base r339946 MASTER/BACKUP
12-STABLE base r341100 MASTER/BACKUP
12-STABLE base r341677 MASTER/BACKUP
12-STABLE base r341965 MASTER/BACKUP
12-STABLE base r342037 MASTER/BACKUP
12-STABLE base r342050 MASTER/BACKUP
12-STABLE base r342051 MASTER/MASTER
12-STABLE base r342055 MASTER/MASTER
12-STABLE base r342073 MASTER/MASTER
12-STABLE base r342109 MASTER/MASTER
12-STABLE base r342254 MASTER/MASTER

I've further confirmed pfsync to be at fault, when pfsync is not enabled the
two nodes are MASTER and BACKUP as they should be. Immediately after I start
pfsync the BACKUP node becomes MASTER and logs these messages:

Jan 16 16:34:56 fwclu2b kernel: carp: demoted by -240 to -240 (pfsync bulk done)
Jan 16 16:34:56 fwclu2b kernel: carp: 1 at lagg2.52: BACKUP -> MASTER (preempting a slower master)
Jan 16 16:34:56 fwclu2b kernel: carp: 1 at lagg2.51: BACKUP -> MASTER (preempting a slower master)
Jan 16 16:34:56 fwclu2b kernel: carp: 1 at lagg3: BACKUP -> MASTER (preempting a slower master)

...but the MASTER also stays MASTER, and chaos ensues, nothing works on the
network.

Stopping pfsync doesn't resolve the situation; only a reboot with pfsync
disabled restores normal carp functionality.

I suggest maybe backing out base r342051 while we investigate the cause, if a
fix can't be found quickly. I suspect it could have something to do with the
pfsync carp demotion code, which the log messages above seem to confirm, but I
don't know.

Let me know if further info is needed about my configuration or anything. See
also this thread on -stable
https://lists.freebsd.org/pipermail/freebsd-stable/2019-January/090421.html
which confirms I am not the only one experiencing this.
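
For anyone wanting to repeat the bisect, the search over the revisions listed
above can be sketched roughly as follows. This is only an illustrative sketch
in Python: the names RESULTS and first_bad are hypothetical, the table is
copied from the notes in this report, and in practice each "is this revision
bad?" check means a buildworld, a kernel build and a reboot of both nodes
rather than a dictionary lookup.

```python
# Observed CARP states per 12-STABLE base revision, copied from the report.
RESULTS = {
    339946: "MASTER/BACKUP",
    341100: "MASTER/BACKUP",
    341677: "MASTER/BACKUP",
    341965: "MASTER/BACKUP",
    342037: "MASTER/BACKUP",
    342050: "MASTER/BACKUP",
    342051: "MASTER/MASTER",
    342055: "MASTER/MASTER",
    342073: "MASTER/MASTER",
    342109: "MASTER/MASTER",
    342254: "MASTER/MASTER",
}


def first_bad(revisions, is_bad):
    """Binary-search a sorted list of revisions for the first bad one.

    Assumes every good revision precedes every bad one. Returns None
    if even the newest revision is good.
    """
    lo, hi = 0, len(revisions) - 1
    if not is_bad(revisions[hi]):
        return None
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid          # first bad revision is at mid or earlier
        else:
            lo = mid + 1      # first bad revision is after mid
    return revisions[lo]


revs = sorted(RESULTS)
# "Bad" here means both nodes ended up MASTER (hypothetical stand-in
# for actually building and booting that revision).
culprit = first_bad(revs, lambda r: RESULTS[r] == "MASTER/MASTER")
print(f"first bad revision: r{culprit}")
```

With the table above, the search settles on the same revision the manual
bisect identified.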