Date:      Wed, 16 Jan 2019 19:16:10 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 235005] r342051 "pfsync: Performance improvement" breaks CARP when used with pfsync
Message-ID:  <bug-235005-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235005

            Bug ID: 235005
           Summary: r342051 "pfsync: Performance improvement" breaks CARP
                    when used with pfsync
           Product: Base System
           Version: 12.0-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: thomas@gibfest.dk

After quite a few buildworlds+kernel and reboots I've managed to isolate base
r342051 "pfsync: Performance improvement" as the reason why lagg stopped
working for me.

I've been building a couple of carp+pf routers/firewalls, originally with
12-BETA2 but they were recently upgraded to 12-STABLE base r342254 which is
when both carp nodes started being MASTER instead of one MASTER and one BACKUP
node.

The notes from my bisecting are below. All tests are with the same
configuration. As you can see, base r342051 is the commit where it broke.

12-STABLE base r339946 MASTER/BACKUP
12-STABLE base r341100 MASTER/BACKUP
12-STABLE base r341677 MASTER/BACKUP
12-STABLE base r341965 MASTER/BACKUP
12-STABLE base r342037 MASTER/BACKUP
12-STABLE base r342050 MASTER/BACKUP
12-STABLE base r342051 MASTER/MASTER
12-STABLE base r342055 MASTER/MASTER
12-STABLE base r342073 MASTER/MASTER
12-STABLE base r342109 MASTER/MASTER
12-STABLE base r342254 MASTER/MASTER
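
For anyone wanting to double-check the breakpoint, the notes above can be
checked mechanically. This is only a minimal sketch in Python, assuming the
notes keep the "branch base rNNNNNN STATE" form shown; the function name
first_bad_revision is made up for illustration and is not part of any tool.

```python
import re

# Bisect notes copied from the list above.
NOTES = """\
12-STABLE base r339946 MASTER/BACKUP
12-STABLE base r341100 MASTER/BACKUP
12-STABLE base r341677 MASTER/BACKUP
12-STABLE base r341965 MASTER/BACKUP
12-STABLE base r342037 MASTER/BACKUP
12-STABLE base r342050 MASTER/BACKUP
12-STABLE base r342051 MASTER/MASTER
12-STABLE base r342055 MASTER/MASTER
12-STABLE base r342073 MASTER/MASTER
12-STABLE base r342109 MASTER/MASTER
12-STABLE base r342254 MASTER/MASTER
"""


def first_bad_revision(notes, good="MASTER/BACKUP"):
    """Return the lowest revision whose recorded CARP state is not `good`.

    Raises ValueError if a good result appears after the first bad one,
    since the notes would then not describe a single clean breakpoint.
    """
    results = []
    for line in notes.splitlines():
        # Each line ends with "r<revision> <state>".
        m = re.search(r"\br(\d+)\s+(\S+)$", line.strip())
        if m:
            results.append((int(m.group(1)), m.group(2)))
    results.sort()

    bad = [rev for rev, state in results if state != good]
    if not bad:
        return None
    first = min(bad)
    if any(rev > first and state == good for rev, state in results):
        raise ValueError("good result after first bad revision")
    return first


print(first_bad_revision(NOTES))
```

The extra check for a good result after the first bad one is there because a
bisect is only meaningful if the behaviour changes exactly once in the range.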

I've further confirmed pfsync to be at fault: when pfsync is not enabled, the
two nodes are MASTER and BACKUP as they should be. Immediately after I start
pfsync the BACKUP node becomes MASTER and logs these messages:

Jan 16 16:34:56 fwclu2b kernel: carp: demoted by -240 to -240 (pfsync bulk done)
Jan 16 16:34:56 fwclu2b kernel: carp: 1 at lagg2.52: BACKUP -> MASTER (preempting a slower master)
Jan 16 16:34:56 fwclu2b kernel: carp: 1 at lagg2.51: BACKUP -> MASTER (preempting a slower master)
Jan 16 16:34:56 fwclu2b kernel: carp: 1 at lagg3: BACKUP -> MASTER (preempting a slower master)

...but the MASTER also stays MASTER, and chaos ensues: nothing works on the
network. Stopping pfsync doesn't resolve the situation; only a reboot with
pfsync disabled restores normal carp functionality.

I suggest maybe backing out base r342051 while we investigate the cause, if a
fix can't be found quickly. I suspect it could have something to do with the
pfsync carp demotion code, which the log messages above seem to confirm, but I
don't know.

Let me know if further info is needed about my configuration or anything. See
also this thread on -stable
https://lists.freebsd.org/pipermail/freebsd-stable/2019-January/090421.html
which confirms I am not the only one experiencing this.

-- 
You are receiving this mail because:
You are the assignee for the bug.


