From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 191832] carp breaks the network
Date: Sat, 12 Jul 2014 12:32:13 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.0-RELEASE
X-Bugzilla-Severity: Affects Many People
X-Bugzilla-Who: smh@FreeBSD.org
X-Bugzilla-Status: Needs Triage
X-Bugzilla-Assigned-To: freebsd-bugs@FreeBSD.org

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191832

--- Comment #1 from Steven
Hartland ---
The problem occurs when we reboot one of the machines that have jails with supporting CARP IPs.

An example jail.conf entry:

== machine01 ==
test01 {
    host.hostname = "test01a";
    ip4.addr = "10.10.10.5";
    ip4.addr += "10.10.10.11";
    ip4.addr += "10.10.10.12";
    exec.prestart += "/sbin/ifconfig igb0 vhid 1 pass testpass alias 10.10.10.11/32";
    exec.prestart += "/sbin/ifconfig igb0 vhid 2 pass testpass alias 10.10.10.12/32";
}

== machine02 ==
test01 {
    host.hostname = "test01b";
    ip4.addr = "10.10.10.6";
    ip4.addr += "10.10.10.11";
    ip4.addr += "10.10.10.12";
    exec.prestart += "/sbin/ifconfig igb0 vhid 1 pass testpass advskew 100 alias 10.10.10.11/32";
    exec.prestart += "/sbin/ifconfig igb0 vhid 2 pass testpass advskew 100 alias 10.10.10.12/32";
}

On reboot of machine02 the machines complain that their IPs are in use, e.g.

Jul 12 01:12:50 machine01 kernel: Trying to mount root from zfs:tank/root []...
Jul 12 01:12:51 machine01 ntpd[1136]: ntpd 4.2.4p5-a (1)
Jul 12 01:12:51 machine01 kernel: .
Jul 12 01:12:53 machine01 kernel:
Jul 12 01:12:53 machine01 kernel: arp: 00:00:5e:00:01:02 is using my IP address 10.10.10.12 on igb0!
Jul 12 01:12:53 machine01 kernel: igb0: promiscuous mode enabled
Jul 12 01:12:53 machine01 kernel: carp: VHID 1@igb0: INIT -> BACKUP
Jul 12 01:12:54 machine01 kernel: arp: 00:00:5e:00:01:01 is using my IP address 10.10.10.11 on igb0!
-----------
Jul 12 01:12:53 machine02 kernel: arp: 10.10.10.10 moved from 00:00:5e:00:01:01 to 00:25:90:79:67:9a on igb0

In our particular case we have 6 CARP interfaces on each machine, but I don't believe that's a factor.

The machines are both connected to Cisco 6509 routers, and when this happens the Ciscos end up with an ARP entry for the CARP IPs pointing to the physical NIC MAC instead of the CARP MAC, e.g.
> sh ip arp 10.10.10.11
> Protocol  Address      Age (min)  Hardware Addr   Type  Interface
> Internet  10.10.10.11  78         0025.9079.679a  ARPA  Vlan10

We also have the following settings in sysctl.conf:

net.inet.carp.preempt=1
net.inet.carp.senderr_demotion_factor=0

The first setting is there because we want the main master to stay master if it's running. The second setting is for when we've used CARP on top of LAGG, to prevent CARP breaking while LAGG negotiates, after which it would never recover. This however is not the case here, as these machines aren't using LAGG.

-- 
You are receiving this mail because:
You are the assignee for the bug.
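
For anyone wanting to check their own switch or router ARP tables for the same symptom, here is a rough sketch in Python. It assumes the CARP virtual MAC is 00:00:5e:00:01 followed by the VHID as the last octet, which matches the 00:00:5e:00:01:01 and 00:00:5e:00:01:02 addresses in the kernel messages above. The function names and the IP-to-VHID mapping are made up for illustration; they are not part of any FreeBSD or Cisco tool.

```python
import re

# Assumed prefix for CARP virtual MACs on IPv4; the last octet is the VHID.
# This matches the addresses seen in the kernel log lines in this report.
CARP_MAC_PREFIX = "00:00:5e:00:01"


def carp_mac(vhid: int) -> str:
    """Return the virtual MAC expected for a given VHID (1-255)."""
    if not 1 <= vhid <= 255:
        raise ValueError(f"vhid out of range: {vhid}")
    return f"{CARP_MAC_PREFIX}:{vhid:02x}"


def normalize_mac(mac: str) -> str:
    """Convert a fully padded MAC (colon form or Cisco dotted form)
    to lower-case colon-separated form."""
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def is_carp_mac(mac: str) -> bool:
    """True if the MAC falls in the assumed CARP virtual range."""
    return normalize_mac(mac).startswith(CARP_MAC_PREFIX + ":")


def check_arp_entry(ip: str, mac: str, vhid_by_ip: dict) -> str:
    """Report whether the ARP entry for a CARP IP points at the CARP MAC."""
    expected = carp_mac(vhid_by_ip[ip])
    actual = normalize_mac(mac)
    if actual == expected:
        return f"{ip}: ok ({actual})"
    return f"{ip}: STALE, has {actual}, expected {expected}"


if __name__ == "__main__":
    # Hypothetical mapping taken from the example jail.conf entries above.
    vhid_by_ip = {"10.10.10.11": 1, "10.10.10.12": 2}
    # MAC as shown by "sh ip arp 10.10.10.11" on the Cisco.
    print(check_arp_entry("10.10.10.11", "0025.9079.679a", vhid_by_ip))
```

Run against the Cisco entry quoted above, this flags 10.10.10.11 as stale, since the router holds the physical NIC MAC rather than the VHID 1 address. It only detects the bad entry; it does nothing to explain or fix why the router learned it.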