Date:      Fri, 1 Apr 2011 11:00:36 -0500
From:      Brandon Gooch <jamesbrandongooch@gmail.com>
To:        Steve Polyack <korvus@comcast.net>
Cc:        freebsd-net@freebsd.org, Frederique Rijsdijk <frederique@isafeelin.org>
Subject:   Re: Network stack unstable after arp flapping
Message-ID:  <AANLkTi=TmSRFx5q=PYvU3gpdNug97CwMmPndJc280SH-@mail.gmail.com>
In-Reply-To: <4D95E62A.5000109@comcast.net>
References:  <20110401141655.GA5350@deta.isafeelin.org> <4D95E62A.5000109@comcast.net>

On Fri, Apr 1, 2011 at 9:50 AM, Steve Polyack <korvus@comcast.net> wrote:
> On 04/01/11 10:16, Frederique Rijsdijk wrote:
>>
>> Hi,
>>
>> We (hosting provider) are in the process of implementing IPv6 in our
>> network (yay). Yesterday one of the final steps in configuring and
>> updating our core routers was taken, which did not go entirely as
>> planned. As a result, the default gateway MAC addresses for all our
>> machines changed about 800 times in a time span of about 4 minutes.
>>
>> Here's a small piece of the logging:
>>
>> Mar 31 18:36:12 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0
>> Mar 31 18:36:12 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0
>> Mar 31 18:36:13 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0
>> Mar 31 18:36:14 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0
>> Mar 31 18:36:14 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0
>> Mar 31 18:36:14 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0
>> Mar 31 18:36:15 srv01 kernel: arp: x.x.x.1 moved from 00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0
>>
>> The x.x.x.1 is always the same IP, the gateway of the machine.
>>
>> As a result, loads of FreeBSD machines (6.x, 7.x and 8.x) developed
>> serious network issues, mainly no traffic or slow traffic to and from
>> other (FreeBSD) machines across different VLANs in our own network.
>>
>> First thing that comes to mind is the network itself, but all Linux
>> machines (Ubuntu, Red Hat and CentOS) had no issues at all. Only BSD.
>>
>> An arp -ad on both machines where problems occurred didn't solve
>> anything. What worked better was /etc/rc.d/netif restart and a
>> /etc/rc.d/routing restart. Some machines even had to be rebooted in
>> order to get networking back to normal.
>>
>> This almost sounds like a bug in the network stack in BSD, but I cannot
>> imagine that I'm right. The BSD networking stack is considered to be one
>> of the best.
>>
>> Any ideas anyone?
>
> We experienced a similar issue here, but IIRC only on our 8.x systems (we
> don't have any 7.x). Disabling flowtable cleared everything up
> immediately. You can try that and see if it helps. It seems like the
> flowtable caches and associates the next-hop router MAC address with each
> flow, and unfortunately this doesn't get purged when the kernel senses and
> logs an ARP change. The only other solution I've seen was to stop all
> network traffic on the machine until the flows/cache entries expired.
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=155604 has more details of my
> run-in with this. The title should be corrected though, as I found
> shortly after that all traffic is affected.
>
> - Steve
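
The behaviour Steve describes (a per-flow cache that keeps the next-hop MAC
after ARP has moved on) can be pictured with a toy model. This is only an
illustrative sketch in Python, not FreeBSD kernel code: the class names, the
flow tuple and its 10.x addresses are hypothetical, and the real flowtable may
differ in how and when entries expire.

```python
# Toy model only: NOT FreeBSD kernel code. All names below are hypothetical.

class ArpTable:
    """Maps a next-hop IP to the MAC address most recently learned for it."""

    def __init__(self):
        self.entries = {}

    def update(self, ip, mac):
        self.entries[ip] = mac

    def lookup(self, ip):
        return self.entries[ip]


class FlowCache:
    """Remembers the resolved next-hop MAC per flow and never re-checks ARP."""

    def __init__(self, arp):
        self.arp = arp
        self.flows = {}

    def next_hop_mac(self, flow, gateway_ip):
        # Only a cache miss consults the ARP table; hits reuse the old answer.
        if flow not in self.flows:
            self.flows[flow] = self.arp.lookup(gateway_ip)
        return self.flows[flow]

    def flush(self):
        # Stand-in for disabling the cache or waiting for entries to expire.
        self.flows.clear()


arp = ArpTable()
cache = FlowCache(arp)
flow = ("10.0.0.5", "10.0.1.9", 80)  # hypothetical (src, dst, port) flow key

arp.update("x.x.x.1", "00:00:0c:9f:f0:3d")
before = cache.next_hop_mac(flow, "x.x.x.1")

# The gateway "moves" to the other MAC, as in the quoted log lines.
arp.update("x.x.x.1", "00:00:0c:07:ac:3d")
stale = cache.next_hop_mac(flow, "x.x.x.1")   # still the old MAC

cache.flush()
fresh = cache.next_hop_mac(flow, "x.x.x.1")   # now follows the ARP table
```

In this model the flow keeps sending to the old MAC until the cache is
flushed, which matches the observation that arp -ad alone did not help while
disabling flowtable or waiting for the entries to expire did.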

FYI, the FLOWTABLE option has been removed from the DEFAULT kernel
config on HEAD, a change which will be MFC'd in a couple of days to
8-STABLE...

-Brandon
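
For anyone wanting to check how often a gateway flapped on a given host, the
"arp: ... moved from ... to ... on ..." kernel messages quoted above can be
counted with a few lines of Python. This is a rough sketch that assumes the
message format matches the quoted log lines exactly; the function name and the
unrelated third sample line are made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the log lines quoted in this thread (assumed format).
ARP_MOVED = re.compile(
    r"arp: (?P<ip>\S+) moved from (?P<old>[0-9a-f:]{17}) "
    r"to (?P<new>[0-9a-f:]{17}) on (?P<ifname>\S+)"
)


def count_arp_moves(lines):
    """Count 'moved' messages per (IP address, interface) pair."""
    counts = Counter()
    for line in lines:
        match = ARP_MOVED.search(line)
        if match:
            counts[(match.group("ip"), match.group("ifname"))] += 1
    return counts


sample = [
    "Mar 31 18:36:12 srv01 kernel: arp: x.x.x.1 moved from "
    "00:00:0c:9f:f0:3d to 00:00:0c:07:ac:3d on bge0",
    "Mar 31 18:36:12 srv01 kernel: arp: x.x.x.1 moved from "
    "00:00:0c:07:ac:3d to 00:00:0c:9f:f0:3d on bge0",
    # Hypothetical non-matching line, included to show it is ignored.
    "Mar 31 18:36:13 srv01 kernel: some unrelated message",
]
moves = count_arp_moves(sample)
```

Feeding it the lines of a log file instead of the sample list gives a per-IP,
per-interface count of how many moves the kernel logged.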


