Date:      Mon, 28 Apr 2008 10:50:30 GMT
From:      Auke Zaaiman <a.zaaiman@nouzelle.com>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/123166: CARP messages filtered by Realtek driver on > 6.2
Message-ID:  <200804281050.m3SAoUtV075737@www.freebsd.org>
Resent-Message-ID: <200804281100.m3SB0BFY055021@freefall.freebsd.org>


>Number:         123166
>Category:       kern
>Synopsis:       CARP messages filtered by Realtek driver on > 6.2
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Apr 28 11:00:11 UTC 2008
>Closed-Date:
>Last-Modified:
>Originator:     Auke Zaaiman
>Release:        6.2-RELEASE
>Organization:
Nouzelle Internet Services
>Environment:
FreeBSD loadbalance01.nouzelle.local 6.2-RELEASE FreeBSD 6.2-RELEASE #1: Sun Apr 27 18:37:02 CEST 2008 root@loadbalance01.nouzelle.local:/log/obj/log/src/sys/SMP  amd64

>Description:
In our testing environment we have the following configuration for failover/loadbalancing:
2 machines, each with:
AMD Sempron(tm) Processor 3000+
1GB RAM
2x RealTek 8169S Single-chip Gigabit Ethernet (re0 and re1)
1x VIA VT6102 Rhine II 10/100BaseTX (vr0)

Our initial setup was built on 6.2-RELEASE.
The setup is:
- vr0 is configured with an internal IP on both machines. Additionally there is a CARP device on it to create a global gateway for the internal network. pfsync also runs over vr0, but that is not really relevant here.
- re0 has no IP on either machine and is only used for VLANs. The VLANs are all configured with their own IPs in separate IP ranges (e.g. 172.29.27.0/24, 172.29.24.0/24). For every VLAN device there is a related CARP device to provide a global gateway for the network behind that VLAN.
- re1 is the external interface on both machines, with two dedicated IPs on each, in two separate IP ranges. Again there is a CARP device, to provide failover and eventually loadbalancing for the external IPs.

In addition to the above machines we have several machines in the backend and frontend, running on different hardware (non-Realtek NICs), which are also set up for failover.

The whole setup ran fine for more than 3 months. CARP worked fine, everything was communicating fine, etc.

When 7.0 was released we decided to upgrade all machines in the environment to this release. The upgrades went fine, but a problem appeared.
The loadbalancers failed to see each other's CARP messages. This only happens on the Realtek NICs; CARP running on top of vr0 works fine.
- We checked whether firewalls were suddenly in the way (which would be surprising, as nothing in the configs changed).
- We checked whether the ULE scheduler was causing trouble; that wasn't the problem either.
- Any other machine could still send traffic through the loadbalancers;
- The switch didn't report any errors, nor did the local interfaces on the loadbalancers.

As we were stumped, we decided to downgrade them to 6.3-RELEASE, hoping this would fix it. The problem remained, though. After letting it rest for a few weeks, I looked into the problem again.
I checked everything above again and then decided to look with tcpdump on the loadbalancers and on a few frontend and backend machines to see what was happening. And there was the surprise:
- loadbalancers are sending out CARP messages;
- frontend and backend machines are receiving those CARP messages;
- frontend and backend machines are also sending out CARP messages for themselves;
- loadbalancers are not receiving any CARP messages on their Realtek NICs.
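
To make explicit what "CARP message" means in the tcpdump check above: CARP advertisements are IP protocol 112 packets sent to the multicast group 224.0.0.18, with the vhid in the second byte of the CARP header. The following is only a minimal sketch of that classification, not the tooling actually used here. The function names and the address 172.29.27.2 are hypothetical and serve only as illustration.

```python
import socket
import struct

CARP_PROTO = 112           # IP protocol number used by CARP (shared with VRRP)
CARP_GROUP = "224.0.0.18"  # multicast group CARP advertisements are sent to


def parse_carp(ip_packet: bytes):
    """Return (source address, vhid) for an IPv4 CARP packet, else None."""
    if len(ip_packet) < 20:
        return None
    # Fixed 20-byte part of the IPv4 header.
    ver_ihl, _tos, _len, _id, _frag, _ttl, proto, _cksum, src, dst = struct.unpack(
        "!BBHHHBBH4s4s", ip_packet[:20]
    )
    if ver_ihl >> 4 != 4:
        return None
    ihl = (ver_ihl & 0x0F) * 4  # header length in bytes, including options
    if ihl < 20 or len(ip_packet) < ihl + 2:
        return None
    if proto != CARP_PROTO or socket.inet_ntoa(dst) != CARP_GROUP:
        return None
    vhid = ip_packet[ihl + 1]  # second byte of the CARP header
    return socket.inet_ntoa(src), vhid


def build_test_packet(src, vhid, proto=CARP_PROTO, dst=CARP_GROUP):
    """Hypothetical helper: a truncated packet that is just long enough to parse."""
    header = struct.pack(
        "!BBHHHBBH4s4s", 0x45, 0, 22, 0, 0, 255, proto, 0,
        socket.inet_aton(src), socket.inet_aton(dst),
    )
    # 0x21 is assumed to encode CARP version 2, type 1 (advertisement).
    return header + bytes([0x21, vhid])


if __name__ == "__main__":
    print(parse_carp(build_test_packet("172.29.27.2", 3)))
```

On the loadbalancers, packets matching this description are seen leaving the Realtek interfaces, but none are seen arriving on them.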

Although I haven't been able to find any changes regarding the Realtek NICs between 6.2 and 6.3, I suspect there are some anyway.

So the next thing I did was downgrade the loadbalancers to 6.2-RELEASE, and as soon as they booted everything was fine.

Conclusion: incoming CARP messages are somehow being filtered on Realtek NICs in releases after 6.2-RELEASE (seen on both 6.3-RELEASE and 7.0).

>How-To-Repeat:

>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:


