From: Thomas Steen Rasmussen
Subject: Re: CARP stopped working after upgrade from 11 to 12
To: Steven Hartland, Pete French, freebsd-stable@freebsd.org
Date: Wed, 16 Jan 2019 17:39:39 +0100

On 1/16/19 3:53 PM, Steven Hartland wrote:

I have confirmed that pfsync is the culprit. Read on for details.

> I can't see how any of those would impact carp unless pf is now
> incorrectly blocking carp packets, which seems unlikely from that commit.

Well, I would agree, but nevertheless, here we are.

> Questions:
>
>  * Are you running a firewall?

Yes, pf, but it permits CARP packets, and MASTER/BACKUP works well up to
and including r342050. After rebuilding to r342051 with the exact same
configuration, both nodes are MASTER.

>  * What does sysctl net.inet.carp report?

net.inet.carp.ifdown_demotion_factor: 240
net.inet.carp.senderr_demotion_factor: 240
net.inet.carp.demotion: 0
net.inet.carp.log: 1
net.inet.carp.preempt: 1
net.inet.carp.dscp: 56
net.inet.carp.allow: 1

>  * What exactly does ifconfig report about your carp on both hosts?

With 12-STABLE r342050:

[tykling@fwclu2a ~]$ uname -a
FreeBSD fwclu2a 12.0-STABLE FreeBSD 12.0-STABLE r342050 GENERIC amd64
[tykling@fwclu2a ~]$ ifconfig | grep carp
        carp: MASTER vhid 1 advbase 1 advskew 100
        carp: MASTER vhid 1 advbase 1 advskew 100
        carp: MASTER vhid 1 advbase 1 advskew 100
[tykling@fwclu2a ~]$

[tykling@fwclu2b ~]$ uname -a
FreeBSD fwclu2b 12.0-STABLE FreeBSD 12.0-STABLE r342050 GENERIC amd64
[tykling@fwclu2b ~]$ ifconfig | grep carp
        carp: BACKUP vhid 1 advbase 1 advskew 200
        carp: BACKUP vhid 1 advbase 1 advskew 200
        carp: BACKUP vhid 1 advbase 1 advskew 200
[tykling@fwclu2b ~]$

And with 12-STABLE r342051:

[tykling@fwclu2a ~]$ uname -a
FreeBSD fwclu2a 12.0-STABLE FreeBSD 12.0-STABLE r342051 GENERIC amd64
[tykling@fwclu2a ~]$ ifconfig | grep carp
        carp: MASTER vhid 1 advbase 1 advskew 100
        carp: MASTER vhid 1 advbase 1 advskew 100
        carp: MASTER vhid 1 advbase 1 advskew 100
[tykling@fwclu2a ~]$

[tykling@fwclu2b ~]$ uname -a
FreeBSD fwclu2b 12.0-STABLE FreeBSD 12.0-STABLE r342051 GENERIC amd64
[tykling@fwclu2b ~]$ ifconfig | grep carp
        carp: MASTER vhid 1 advbase 1 advskew 200
        carp: MASTER vhid 1 advbase 1 advskew 200
        carp: MASTER vhid 1 advbase 1 advskew 200
[tykling@fwclu2b ~]$

>  * Have you tried enabling more detailed carp logging using sysctl
>    net.inet.carp.log?

It is at 1, and increasing it to 2 doesn't appear to log anything new.

I tried disabling pfsync and rebooting both nodes; they then came up as
MASTER/BACKUP.
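
The per-host `ifconfig | grep carp` comparison above can also be checked mechanically. The following is a minimal sketch, assuming the output format shown in this message; the function names `parse_carp` and `split_brain_vhids` are hypothetical helpers, not part of any FreeBSD tool.

```python
import re

# Matches lines such as "carp: MASTER vhid 1 advbase 1 advskew 100"
CARP_RE = re.compile(
    r"carp:\s+(\w+)\s+vhid\s+(\d+)\s+advbase\s+(\d+)\s+advskew\s+(\d+)"
)


def parse_carp(ifconfig_text):
    """Return a list of (state, vhid, advbase, advskew) tuples."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)), int(m.group(4)))
        for m in CARP_RE.finditer(ifconfig_text)
    ]


def split_brain_vhids(host_a_text, host_b_text):
    """Return the sorted vhids that are MASTER on both hosts.

    The grep output drops interface names, so entries are keyed by
    vhid only; that is a simplification of this sketch.
    """
    masters_a = {v for state, v, _, _ in parse_carp(host_a_text) if state == "MASTER"}
    masters_b = {v for state, v, _, _ in parse_carp(host_b_text) if state == "MASTER"}
    return sorted(masters_a & masters_b)


# Output copied from the message above (one line per carp interface).
FWCLU2A = "        carp: MASTER vhid 1 advbase 1 advskew 100\n" * 3
FWCLU2B_R342050 = "        carp: BACKUP vhid 1 advbase 1 advskew 200\n" * 3
FWCLU2B_R342051 = "        carp: MASTER vhid 1 advbase 1 advskew 200\n" * 3

print(split_brain_vhids(FWCLU2A, FWCLU2B_R342050))  # []
print(split_brain_vhids(FWCLU2A, FWCLU2B_R342051))  # [1]
```

A non-empty result means at least one vhid has two masters at the same time, which is the symptom seen on r342051.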

Then I tried enabling pfsync and starting it, and on the BACKUP node I
immediately got:

Jan 16 16:34:56 fwclu2b kernel: carp: demoted by -240 to -240 (pfsync bulk done)
Jan 16 16:34:56 fwclu2b kernel: carp: 1@lagg2.52: BACKUP -> MASTER (preempting a slower master)
Jan 16 16:34:56 fwclu2b kernel: carp: 1@lagg2.51: BACKUP -> MASTER (preempting a slower master)
Jan 16 16:34:56 fwclu2b kernel: carp: 1@lagg3: BACKUP -> MASTER (preempting a slower master)

Stopping pfsync again does not make it go back to BACKUP.

Best regards,

Thomas Steen Rasmussen
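
The "demoted by -240 to -240" line followed by "preempting a slower master" is consistent with a negative demotion counter making fwclu2b look faster than fwclu2a. The sketch below is a simplified model, not the kernel's code. It assumes the advertisement interval is advbase + skew/256 seconds (as described in carp(4)), that the effective skew is the configured advskew plus the demotion counter, that only the upper bound (240) is clamped, and that fwclu2a has a demotion counter of 0. All names are hypothetical.

```python
CARP_MAXSKEW = 240  # assumed upper clamp on the effective skew


def effective_skew(advskew, demotion):
    """Configured advskew plus demotion counter, clamped at the top only.

    The missing lower clamp is an assumption of this model.
    """
    return min(advskew + demotion, CARP_MAXSKEW)


def advertised_interval(advbase, advskew, demotion):
    """Advertisement interval in seconds: advbase + effective_skew / 256."""
    return advbase + effective_skew(advskew, demotion) / 256


# Values taken from the ifconfig output and log lines in this message.
master = advertised_interval(1, 100, 0)            # fwclu2a, demotion assumed 0
backup_normal = advertised_interval(1, 200, 0)     # fwclu2b, no demotion
backup_demoted = advertised_interval(1, 200, -240)  # fwclu2b after "demoted by -240 to -240"

print(master)          # 1.390625
print(backup_normal)   # 1.78125
print(backup_demoted)  # 0.84375

# With preempt enabled, a node that would advertise more often than the
# current master takes over; under this model that is fwclu2b.
print(backup_demoted < master)  # True
```

Under these assumptions, fwclu2b stays BACKUP with a demotion counter of 0 and preempts only once the counter drops to -240, which matches the order of events in the log.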