Date:      Wed, 16 Jan 2019 17:39:39 +0100
From:      Thomas Steen Rasmussen <thomas@gibfest.dk>
To:        Steven Hartland <killing@multiplay.co.uk>, Pete French <petefrench@ingresso.co.uk>, freebsd-stable@freebsd.org
Subject:   Re: CARP stopped working after upgrade from 11 to 12
Message-ID:  <a066e772-bd59-c787-90f1-00ad661983de@gibfest.dk>
In-Reply-To: <a130ba8f-9c30-212d-8ca3-c46047cd3ecb@multiplay.co.uk>
References:  <E1gjlxg-000DSh-Oi@dilbert.ingresso.co.uk> <a7b651e4-68a9-e52c-0033-e8d46508590e@gibfest.dk> <a130ba8f-9c30-212d-8ca3-c46047cd3ecb@multiplay.co.uk>

On 1/16/19 3:53 PM, Steven Hartland wrote:

I have confirmed that pfsync is the culprit. Read on for details.

> I can't see how any of those would impact carp unless pf is now 
> incorrectly blocking carp packets, which seems unlikely from that commit.
>

Well, I would agree, but nevertheless, here we are.


> Questions:
>
>  * Are you running a firewall?


Yes, pf, but it permits CARP packets, and MASTER/BACKUP works well up to
and including r342050.

After rebuilding to r342051 with the exact same configuration, both
nodes are MASTER.


>  * What does sysctl net.inet.carp report?

net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.dscp: 56
net.inet.carp.allow: 1

>  * What exactly does ifconfig report about your carp on both hosts?


with 12-STABLE r342050:

[tykling@fwclu2a ~]$ uname -a
FreeBSD fwclu2a 12.0-STABLE FreeBSD 12.0-STABLE r342050 GENERIC amd64
[tykling@fwclu2a ~]$ ifconfig | grep carp
         carp: MASTER vhid 1 advbase 1 advskew 100
         carp: MASTER vhid 1 advbase 1 advskew 100
         carp: MASTER vhid 1 advbase 1 advskew 100
[tykling@fwclu2a ~]$

[tykling@fwclu2b ~]$ uname -a
FreeBSD fwclu2b 12.0-STABLE FreeBSD 12.0-STABLE r342050 GENERIC amd64
[tykling@fwclu2b ~]$ ifconfig | grep carp
         carp: BACKUP vhid 1 advbase 1 advskew 200
         carp: BACKUP vhid 1 advbase 1 advskew 200
         carp: BACKUP vhid 1 advbase 1 advskew 200
[tykling@fwclu2b ~]$

and with 12-STABLE r342051:

[tykling@fwclu2a ~]$ uname -a
FreeBSD fwclu2a 12.0-STABLE FreeBSD 12.0-STABLE r342051 GENERIC amd64
[tykling@fwclu2a ~]$ ifconfig | grep carp
         carp: MASTER vhid 1 advbase 1 advskew 100
         carp: MASTER vhid 1 advbase 1 advskew 100
         carp: MASTER vhid 1 advbase 1 advskew 100
[tykling@fwclu2a ~]$

[tykling@fwclu2b ~]$ uname -a
FreeBSD fwclu2b 12.0-STABLE FreeBSD 12.0-STABLE r342051 GENERIC amd64
[tykling@fwclu2b ~]$ ifconfig | grep carp
         carp: MASTER vhid 1 advbase 1 advskew 200
         carp: MASTER vhid 1 advbase 1 advskew 200
         carp: MASTER vhid 1 advbase 1 advskew 200
[tykling@fwclu2b ~]$
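
For anyone who wants to check for this condition automatically, here is
a minimal sketch in Python. It only parses the `ifconfig | grep carp`
output shown above. The function names are made up for illustration, and
it assumes a simple active/passive pair in which only one host should
ever report MASTER (an active/active layout with several vhids would
need a per-vhid comparison instead).

```python
import re

# Matches lines such as: "carp: MASTER vhid 1 advbase 1 advskew 100"
CARP_LINE = re.compile(
    r"carp:\s+(MASTER|BACKUP|INIT)\s+vhid\s+(\d+)\s+advbase\s+(\d+)\s+advskew\s+(\d+)"
)


def parse_carp(ifconfig_output):
    """Return one (state, vhid, advskew) tuple per carp line found."""
    rows = []
    for line in ifconfig_output.splitlines():
        match = CARP_LINE.search(line)
        if match:
            state, vhid, _advbase, advskew = match.groups()
            rows.append((state, int(vhid), int(advskew)))
    return rows


def is_split_brain(host_outputs):
    """True if more than one host reports MASTER on any carp line.

    host_outputs maps a host name to its `ifconfig | grep carp` text.
    Assumption: active/passive pair, so two masters is always wrong.
    """
    masters = [
        host
        for host, output in host_outputs.items()
        if any(state == "MASTER" for state, _vhid, _skew in parse_carp(output))
    ]
    return len(masters) > 1
```

Fed the r342050 output from both hosts this returns False, and with the
r342051 output it returns True.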

>  * Have you tried enabling more detailed carp logging using sysctl
>    net.inet.carp.log?
>
It is at 1 and increasing it to 2 doesn't appear to log anything new.


I tried disabling pfsync and rebooting both nodes; they then came up as
MASTER/BACKUP.

Then I tried enabling pfsync and starting it, and on the BACKUP node I
immediately got:

Jan 16 16:34:56 fwclu2b kernel: carp: demoted by -240 to -240 (pfsync bulk done)
Jan 16 16:34:56 fwclu2b kernel: carp: 1@lagg2.52: BACKUP -> MASTER (preempting a slower master)
Jan 16 16:34:56 fwclu2b kernel: carp: 1@lagg2.51: BACKUP -> MASTER (preempting a slower master)
Jan 16 16:34:56 fwclu2b kernel: carp: 1@lagg3: BACKUP -> MASTER (preempting a slower master)
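
A rough sketch of why a demotion of -240 would produce exactly these
messages. It rests on two assumptions that are not confirmed anywhere in
this thread: that the advertised skew is the configured advskew plus the
demotion counter, clamped to the range 0..240, and that with preempt
enabled and equal advbase the node with the lower advertised skew takes
over. It also assumes the demotion counter on the other node is 0. The
function names are hypothetical.

```python
CARP_MAXSKEW = 240  # assumed upper bound for the advertised skew


def effective_advskew(advskew, demotion):
    """Configured advskew plus demotion, clamped to 0..CARP_MAXSKEW (assumed)."""
    return max(0, min(CARP_MAXSKEW, advskew + demotion))


def backup_preempts(master_advskew, backup_advskew,
                    backup_demotion, master_demotion=0):
    """True if the backup would advertise a lower skew than the master.

    Simplification: both nodes use the same advbase and preempt is on.
    """
    backup = effective_advskew(backup_advskew, backup_demotion)
    master = effective_advskew(master_advskew, master_demotion)
    return backup < master


# fwclu2a has advskew 100, fwclu2b has advskew 200.
print(backup_preempts(100, 200, backup_demotion=0))     # normal case
print(backup_preempts(100, 200, backup_demotion=-240))  # after "demoted by -240"
```

Under those assumptions, 200 - 240 is clamped to 0, which is lower than
100, so fwclu2b would consider fwclu2a the "slower master" and take
over. With a demotion of 0 it stays at 200 and remains the backup.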

Stopping pfsync again does not make it go back to BACKUP.


Best regards,

Thomas Steen Rasmussen





