Date: Sun, 19 Jul 2009 16:21:58 -0500
From: Matthew Grooms <mgrooms@shrew.net>
To: freebsd-net@freebsd.org
Cc: max@love2party.net
Subject: FreeBSD + carp on VMWare ESX
Message-ID: <4A638E76.2060706@shrew.net>

Hi all,

I was having problems running carp on VMWare ESX 4 and did a little investigative work to determine the cause. There are several posts on the VMWare forums from other users having the same difficulty, so I know it's not just me :)

For carp to have a chance of working on ESX, you have to enable promiscuous mode on the vSwitch the port group is associated with. But after doing this, carp interfaces immediately go into the BACKUP state. If net.inet.carp.allow is set to 0, they immediately move into the MASTER state. Of course, this isn't useful if you actually want carp to work.

tcpdump output showed multiple copies of the carp packets being bounced back to the host that emitted them. This made me suspect that the host was seeing its own advertisement, evaluating it as having been sent by another host, and placing its own carp interface into the BACKUP state as a result.

To solve this, my first inclination was to add a pf rule to block all inbound carp traffic from the host itself on a given interface. Unfortunately, that didn't seem to work for some reason. I ended up writing a small kernel patch that does basically the same thing (IPv4 only), and it works without any problem that I can see. I don't have much experience with the FreeBSD kernel, though, so I assume that it's not safe to walk the interface address list without holding the appropriate lock.

Would someone please have a look at this? I really need this to work in a production system. Others would likely be very happy to have this work as well, even if they have to apply a patch.

Thanks in advance,

-Matthew

Index: ip_carp.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/ip_carp.c,v
retrieving revision 1.52.2.3
diff -u -r1.52.2.3 ip_carp.c
--- ip_carp.c	9 May 2009 00:35:38 -0000	1.52.2.3
+++ ip_carp.c	19 Jul 2009 20:12:49 -0000
@@ -533,7 +533,9 @@
 {
 	struct ip *ip = mtod(m, struct ip *);
 	struct carp_header *ch;
-	int iplen, len;
+	struct ifnet *ifp = m->m_pkthdr.rcvif;
+	struct ifaddr *ifa;
+	int len, iplen;
 
 	carpstats.carps_ipackets++;
 
@@ -543,21 +545,39 @@
 	}
 
 	/* check if received on a valid carp interface */
-	if (m->m_pkthdr.rcvif->if_carp == NULL) {
+	if (ifp->if_carp == NULL) {
 		carpstats.carps_badif++;
 		CARP_LOG("carp_input: packet received on non-carp "
 		    "interface: %s\n",
-		    m->m_pkthdr.rcvif->if_xname);
+		    ifp->if_xname);
 		m_freem(m);
 		return;
 	}
 
+	/*
+	 * verify that the source address is not valid
+	 * for the interface it was received on. this
+	 * tends to happen with VMWare ESX vSwitches.
+	 */
+	TAILQ_FOREACH(ifa, &ifp->if_addrlist, ifa_list) {
+		struct in_addr in;
+		in.s_addr = ifatoia(ifa)->ia_addr.sin_addr.s_addr;
+		if (ifa->ifa_addr->sa_family == AF_INET &&
+		    in.s_addr == ip->ip_src.s_addr ) {
+			m_freem(m);
+			return;
+		}
+	}
+
 	/* verify that the IP TTL is 255.  */
 	if (ip->ip_ttl != CARP_DFLTTL) {
 		carpstats.carps_badttl++;
 		CARP_LOG("carp_input: received ttl %d != 255i on %s\n",
 		    ip->ip_ttl,
-		    m->m_pkthdr.rcvif->if_xname);
+		    ifp->if_xname);
 		m_freem(m);
 		return;
 	}
@@ -592,7 +612,7 @@
 		carpstats.carps_badlen++;
 		CARP_LOG("carp_input: packet too short %d on %s\n",
 		    m->m_pkthdr.len,
-		    m->m_pkthdr.rcvif->if_xname);
+		    ifp->if_xname);
 		m_freem(m);
 		return;
 	}
@@ -609,7 +629,7 @@
 	if (carp_cksum(m, len - iplen)) {
 		carpstats.carps_badsum++;
 		CARP_LOG("carp_input: checksum failed on %s\n",
-		    m->m_pkthdr.rcvif->if_xname);
+		    ifp->if_xname);
 		m_freem(m);
 		return;
 	}
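
The self-address check added by the patch can be sketched outside the kernel as follows. This is only a userland illustration under stated assumptions, not the patch itself: `src_is_local` and its array-of-pointers argument are hypothetical stand-ins for walking `ifp->if_addrlist`, and no locking is modelled. One deliberate difference from the diff is that the address family is tested before an entry is read as an IPv4 address, so non-IPv4 entries are never accessed through the wrong type.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper (not part of ip_carp.c): returns 1 if `src`
 * matches one of the IPv4 addresses in `addrs`, 0 otherwise.
 * `addrs` stands in for the interface address list; entries that
 * are NULL or not AF_INET are skipped before any cast is made.
 */
static int
src_is_local(const struct sockaddr *const *addrs, size_t naddrs,
    struct in_addr src)
{
	size_t i;

	for (i = 0; i < naddrs; i++) {
		const struct sockaddr_in *sin;

		/* Check the family first; only then treat it as IPv4. */
		if (addrs[i] == NULL || addrs[i]->sa_family != AF_INET)
			continue;

		sin = (const struct sockaddr_in *)(const void *)addrs[i];
		if (sin->sin_addr.s_addr == src.s_addr)
			return (1);	/* packet claims to come from us */
	}
	return (0);
}
```

A caller would drop the advertisement when `src_is_local()` returns 1 and continue with the normal checks otherwise. Whether the equivalent kernel loop needs to hold an interface address lock is the open question raised in the message, and this sketch does not answer it.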