Date: Sun, 26 Jul 2009 18:25:10 -0500 From: Matthew Grooms <mgrooms@shrew.net> To: Max Laier <max@love2party.net> Cc: freebsd-net@freebsd.org Subject: Re: FreeBSD + carp on VMWare ESX Message-ID: <4A6CE5D6.3020709@shrew.net> In-Reply-To: <200907201318.08122.max@love2party.net> References: <4A638E76.2060706@shrew.net> <4A63A4B3.6090500@modulus.org> <3D3254E2-4E45-4C67-84D2-DB05660D768F@shrew.net> <200907201318.08122.max@love2party.net>
next in thread | previous in thread | raw e-mail | index | archive | help
This is a multi-part message in MIME format. --------------030001080308090008010309 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Max Laier wrote: > > There is clearly something very wrong with how the vswitch works and it's not > really FreeBSD's job to work around these issues. The patch you posted is > rather intrusive and certainly not something we want in the tree. You should > talk to VMWare's support to fix the obvious short-comings in the vswitch > design. > I agree completely. We have a support contract with VMWare and I intend to open a ticket. However I think the likelihood of them providing a fix for this just about zero. Who knows, maybe they will surprise me. > As for your patch - you want "IF_ADDR_[UN]LOCK(ifp);" around walking the > address list. Don't forget to unlock before the return. > Thanks for the help. Here is an updated patch against 7.2 if anyone else has a VMWare + FreeBSD + CARP problem child and needs a fix today ... http://www.shrew.net/static/patches/esx-carp.diff The IPv6 code path is untested. Also, the changes were placed under a sysctl conditional so the following is required in /etc/sysctl.conf to enable it at boot time ... net.inet.carp.drop_echoed=1 Thanks again, -Matthew --------------030001080308090008010309 Content-Type: text/plain; name="esx-carp.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="esx-carp.diff" Index: ip_carp.c =================================================================== RCS file: /home/ncvs/src/sys/netinet/ip_carp.c,v retrieving revision 1.52.2.3 diff -u -r1.52.2.3 ip_carp.c --- ip_carp.c 9 May 2009 00:35:38 -0000 1.52.2.3 +++ ip_carp.c 26 Jul 2009 22:46:19 -0000 @@ -143,6 +143,8 @@ &carp_opts[CARPCTL_LOG], 0, "log bad carp packets"); SYSCTL_INT(_net_inet_carp, CARPCTL_ARPBALANCE, arpbalance, CTLFLAG_RW, &carp_opts[CARPCTL_ARPBALANCE], 0, "balance arp responses"); +SYSCTL_INT(_net_inet_carp, CARPCTL_DROPECHOED, drop_echoed, CTLFLAG_RW, + &carp_opts[CARPCTL_DROPECHOED], 0, "drop packets echoed to sender"); SYSCTL_INT(_net_inet_carp, OID_AUTO, suppress_preempt, CTLFLAG_RD, &carp_suppress_preempt, 0, "Preemption is suppressed"); @@ -552,6 +554,28 @@ return; } + /* + * verify that the source address is not valid + * for the interface it was received on. this + * tends to happen with VMWare ESX vSwitches. + */ + if (carp_opts[CARPCTL_DROPECHOED]) { + struct ifnet *ifp = m->m_pkthdr.rcvif; + struct ifaddr *ifa; + IF_ADDR_LOCK(ifp); + TAILQ_FOREACH(ifa, &ifp->if_addrlist, ifa_list) { + struct in_addr in4; + in4 = ifatoia(ifa)->ia_addr.sin_addr; + if (ifa->ifa_addr->sa_family == AF_INET && + in4.s_addr == ip->ip_src.s_addr) { + IF_ADDR_UNLOCK(ifp); + m_freem(m); + return; + } + } + IF_ADDR_UNLOCK(ifp); + } + /* verify that the IP TTL is 255. */ if (ip->ip_ttl != CARP_DFLTTL) { carpstats.carps_badttl++; @@ -644,6 +668,28 @@ return (IPPROTO_DONE); } + /* + * verify that the source address is not valid + * for the interface it was received on. this + * tends to happen with VMWare ESX vSwitches. + */ + if (carp_opts[CARPCTL_DROPECHOED]) { + struct ifnet *ifp = m->m_pkthdr.rcvif; + struct ifaddr *ifa; + IF_ADDR_LOCK(ifp); + TAILQ_FOREACH(ifa, &ifp->if_addrlist, ifa_list) { + struct in6_addr in6; + in6 = ifatoia6(ifa)->ia_addr.sin6_addr; + if (ifa->ifa_addr->sa_family == AF_INET6 && + memcmp(&in6, &ip6->ip6_src, sizeof(in6)) == 0) { + IF_ADDR_UNLOCK(ifp); + m_freem(m); + return (IPPROTO_DONE); + } + } + IF_ADDR_UNLOCK(ifp); + } + /* verify that the IP TTL is 255 */ if (ip6->ip6_hlim != CARP_DFLTTL) { carpstats.carps_badttl++; Index: ip_carp.h =================================================================== RCS file: /home/ncvs/src/sys/netinet/ip_carp.h,v retrieving revision 1.3 diff -u -r1.3 ip_carp.h --- ip_carp.h 1 Dec 2006 18:37:41 -0000 1.3 +++ ip_carp.h 26 Jul 2009 22:46:19 -0000 @@ -140,7 +140,8 @@ #define CARPCTL_LOG 3 /* log bad packets */ #define CARPCTL_STATS 4 /* statistics (read-only) */ #define CARPCTL_ARPBALANCE 5 /* balance arp responses */ -#define CARPCTL_MAXID 6 +#define CARPCTL_DROPECHOED 6 /* drop packets echoed to the sender */ +#define CARPCTL_MAXID 7 #define CARPCTL_NAMES { \ { 0, 0 }, \ --------------030001080308090008010309--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4A6CE5D6.3020709>