From owner-freebsd-net@FreeBSD.ORG Fri Mar 18 16:12:52 2011
From: Viktor Petersson
Date: Fri, 18 Mar 2011 16:43:59 +0100
Message-Id: <00612801-A0F4-4EDC-9BED-3364A86E4F9C@gmail.com>
To: freebsd-net@freebsd.org
Subject: Possible CARP bug?
List-Id: Networking and TCP/IP with FreeBSD

Hey guys,

First, a big thanks to the developers for all the hard work. You guys rock!

Now to the issue. I've been using CARP on a few servers in the past without any issues. It usually works without any hiccups. Now I'm planning to move our company's infrastructure from physical hardware to a virtual environment over at CloudSigma (http://www.cloudsigma.com). Unfortunately, I'm having some issues getting CARP to work there. For the record, they're using Qemu as the virtualization platform.

Let me start by describing my setup in more detail.

I have two nodes: nas0 and nas1. Both nodes have two interfaces, one public and one private. I'm using the private one for CARP. nas0 uses the IP 192.168.1.11 and nas1 uses 192.168.1.12. The CARP interface is configured to use 192.168.1.10. The internal network uses a dedicated VLAN, and only these two nodes are on it, to eliminate any possible conflicts.

I've also disabled all software firewalls, so we should be able to exclude that from the equation.

Both nodes are running FreeBSD 8.2, and both the internal and external interfaces are working (i.e. the two nodes can ping each other on the private interfaces).
In rc.conf on nas0, I have the following lines:

    cloned_interfaces="carp0"
    ifconfig_carp0="vhid 1 pass foobar 192.168.1.10/24"

On nas1 (which is the failover), the equivalent lines are:

    cloned_interfaces="carp0"
    ifconfig_carp0="vhid 1 advskew 100 pass foobar 192.168.1.10/24"

(note the advskew value on nas1)

To verify that CARP is enabled and configured, here's the sysctl output (same on both nodes):

    net.inet.carp.allow: 1
    net.inet.carp.preempt: 1
    net.inet.carp.log: 1
    net.inet.carp.arpbalance: 0
    net.inet.carp.suppress_preempt: 0

Normally, that should be it. nas0 should automatically become the master, and nas1 the backup/failover. Unfortunately, that doesn't happen. Instead, I get this on the node with the lowest advskew value (nas0, but if I raise the advskew on nas0, the error moves to nas1):

    Mar 7 14:42:57 nas0 kernel: carp0: MASTER -> BACKUP (more frequent advertisement received)
    Mar 7 14:42:57 nas0 kernel: carp0: 2 link states coalesced
    Mar 7 14:42:57 nas0 kernel: carp0: link state changed to DOWN

When checking the CARP interface status, I get the following on nas0:

    carp0: flags=49 metric 0 mtu 1500
            inet 192.168.1.10 netmask 0xffffff00
            carp: BACKUP vhid 1 advbase 1 advskew 0

and the following on nas1:

    carp0: flags=49 metric 0 mtu 1500
            inet 192.168.1.10 netmask 0xffffff00
            carp: BACKUP vhid 1 advbase 1 advskew 100

I've googled this error (carp0: 2 link states coalesced), and some of the forum posts mentioned that they've seen this with faulty NICs or switches. However, I've reached out to CloudSigma, and they've been very helpful and set up a replica of the setup (but on 8.1). Their head network guy was able to reproduce the same errors as I got, and he was also able to confirm that the packets were indeed sent and received on both nodes (using tcpdump).
His conclusion was that this was likely a bug in CARP (or possibly a driver).

It is also worth mentioning that CARP does work under OpenBSD 4.3, and VRRP works under Linux.

Since it's also in their interest to get this working for us (as this is what is holding us back from moving), they've been kind enough to provide access to their CARP test nodes to any developer who wants to take a stab at it. I have the credentials and details. I don't want to post them here, but I will provide them to anyone interested.

Regards,
Viktor Petersson
WireLoad
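
P.S. For reference, here is the election outcome I would expect from the settings above. This is only a minimal sketch in Python, assuming the advertisement interval documented in carp(4) (advbase seconds plus advskew/256 seconds). The function names are made up for illustration and are not part of any FreeBSD tool.

```python
from fractions import Fraction


def advertisement_interval(advbase: int, advskew: int) -> Fraction:
    """Seconds between CARP advertisements: advbase + advskew/256."""
    return Fraction(advbase) + Fraction(advskew, 256)


def expected_master(nodes: dict) -> str:
    """Return the node name with the shortest advertisement interval."""
    return min(nodes, key=lambda name: advertisement_interval(*nodes[name]))


# (advbase, advskew) as reported by ifconfig on each node
nodes = {"nas0": (1, 0), "nas1": (1, 100)}

for name, (advbase, advskew) in nodes.items():
    print(name, float(advertisement_interval(advbase, advskew)), "s")
print("expected master:", expected_master(nodes))
```

By this arithmetic nas0 advertises every 1.0 s and nas1 every 1.390625 s, so nas0 should win the election. As I understand it, the "more frequent advertisement received" message means nas0 saw an advertisement it judged at least as frequent as its own, which nas1's settings alone would not explain.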