From: Dave Cottlehuber
To: freebsd-pf@freebsd.org, freebsd-net@freebsd.org
Subject: NATted outbound traffic sometimes uses backup CARP IP on LACP/LAGG interface
Date: Thu, 14 Sep 2017 16:21:08 +0200
Message-Id: <1505398868.955393.1106053824.42CA3E40@webmail.messagingengine.com>

Hi,

Outgoing traffic (from a jail) via PF NAT over a LAGG/LACP sometimes has
the *backup* CARP IP address assigned to it. Obviously, as this IP is only
active on the "other" server, the return TCP connection traffic never
actually gets back to our CARP master, and the other server sees spurious
TCP connections.

This is very reproducible and appears to be deterministic, like a round
robin IP allocation. In practice, inside a jail, `curl $URL` will fail
repeatedly.

Hopefully this is some misconfiguration on my part - what am I doing
wrong?

BTW I wrote this up a while back on the forums, where the config files
are easier to read: https://forums.freebsd.org/threads/61552

###############################
# /etc/rc.conf
# network
ifconfig_igb0="up"
ifconfig_igb1="up"
cloned_interfaces="${cloned_interfaces} lagg0"
defaultrouter="1.2.3.81"
ipv6_defaultrouter="1:2:3:4::1"
ifconfig_lagg0="inet 1.2.3.83/28 laggproto lacp laggport igb0 laggport igb1"
ifconfig_lagg0_ipv6="inet6 1:2:3:4::83/64"

# carp on
kld_list="${kld_list} carp"
ifconfig_lagg0_aliases="\
    inet vhid 1 advskew 100 pass pw1 1.2.3.84/32 \
    inet6 vhid 2 advskew 100 pass pw2 1:2:3:4::84/64 \
    inet vhid 3 advskew 0 pass pw3 1.2.3.85/32 \
    inet6 vhid 4 advskew 0 pass pw4 1:2:3:4::85/64 \
    "

# jail networks use their own separate cloned if
cloned_interfaces="${cloned_interfaces} lo1"
ifconfig_lo1_aliases="inet 10.241.0.0-15/16"

###############################
# /etc/pf.conf
# macros
protocols = "{ tcp, udp, icmp }"

# interfaces
extl_if="lagg0"
intl_if="lo0"
jail_if="lo1"

# networks
intl_net = $intl_if:network
jail_net = $jail_if:network
internet = $extl_if:network

# limits
set limit { states 200000, frags 80000, src-nodes 80000 }
set timeout { adaptive.start 180000, adaptive.end 200000 }

# clean packets are happy packets
scrub in all

# jails are allowed outbound connections but not inbound
nat on $extl_if proto $protocols from $jail_net to any -> ($extl_if)

# o ye of little faith
pass in all
pass out all

###############################
######## running configs ######

pfctl indeed shows it's a round-robin

###############################
# pfctl -vnf /etc/pf.conf
protocols = "{ tcp, udp, icmp }"
extl_if = "lagg0"
intl_if = "lo0"
jail_if = "lo1"
intl_net = "lo0:network"
jail_net = "lo1:network"
internet = "lagg0:network"
set limit states 200000
set limit frags 80000
set limit src-nodes 80000
set timeout adaptive.start 180000
set timeout adaptive.end 200000
scrub in all fragment reassemble
nat on lagg0 inet proto tcp from 10.241.0.0/16 to any -> (lagg0) round-robin
nat on lagg0 inet proto tcp from 10.241.0.1 to any -> (lagg0) round-robin
... repeated for each IP

###############################
# ifconfig
lagg0: flags=8943 metric 0 mtu 1500
        options=6403bb
        ether 78:45:c4:fa:d2:99
        inet 1.2.3.82 netmask 0xfffffff0 broadcast 1.2.3.95
*       inet 1.2.3.84 netmask 0xffffffff broadcast 1.2.3.84 vhid 1
*       inet 1.2.3.85 netmask 0xffffffff broadcast 1.2.3.85 vhid 3 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
        inet6 fe80::7a45:c4ff:fefa:d299%lagg0 prefixlen 64 scopeid 0x4
        inet6 1:2:3:4::82 prefixlen 64
*       inet6 1:2:3:4::84 prefixlen 64 vhid 2
*       inet6 1:2:3:4::85 prefixlen 64 vhid 4
        nd6 options=21
        media: Ethernet autoselect
        status: active
*       carp: MASTER vhid 1 advbase 1 advskew 0
*       carp: BACKUP vhid 3 advbase 1 advskew 100 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
*       carp: MASTER vhid 2 advbase 1 advskew 0
*       carp: BACKUP vhid 4 advbase 1 advskew 100
        groups: lagg
        laggproto lacp lagghash l2,l3,l4
*       laggport: igb0 flags=1c
*       laggport: igb1 flags=1c

# I removed the lines appended with !!!!!!!!!!!.. so that the system
# actually works atm
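
For comparison, here is a minimal, untested sketch of two ways the nat
rule in /etc/pf.conf above might be narrowed so that it no longer rotates
over every address configured on lagg0. It assumes that the `:0`
interface modifier described in pf.conf(5) ("do not include interface
aliases") also skips the CARP alias addresses on a lagg interface, which
has not been verified here. The macro name nat_addr is made up for
illustration only.

```
# Option A (assumption: ":0" drops the alias/CARP addresses, leaving
# only the primary lagg0 address as the NAT source)
nat on $extl_if proto $protocols from $jail_net to any -> ($extl_if:0)

# Option B (hypothetical macro nat_addr, to be defined with the other
# macros): pin NAT to one explicit address instead of the interface.
# 1.2.3.84 is the vhid 1 address that is MASTER on this host in the
# ifconfig output above; it would move to the peer on failover.
# nat_addr = "1.2.3.84"
# nat on $extl_if proto $protocols from $jail_net to any -> $nat_addr
```

Only one of the two rules would be active at a time. Which source address
a translated connection actually gets can then be checked in the state
table with `pfctl -s state` while running `curl $URL` inside the jail.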