Date:      Mon, 02 Jan 2012 19:53:36 -0800
From:      Doug Barton <dougb@FreeBSD.org>
To:        freebsd-net@freebsd.org
Subject:   openbgpds not talking each other since 8.2-STABLE upgrade
Message-ID:  <4F027BC0.1080101@FreeBSD.org>
In-Reply-To: <99A5FFD9-8815-4CCC-9868-FB2E3D799566@gridfury.com>
References:  <99A5FFD9-8815-4CCC-9868-FB2E3D799566@gridfury.com>

We have a pair of physical FreeBSD systems configured as routers
designed to operate in an active/standby CARP configuration. Everything
used to work fine, but since an upgrade to 8.2-STABLE on December 29th
the two routers don't speak BGP to each other anymore. They both
function fine individually, and failover works. It is only the openbgpd
communication between them that's not flowing.

They have OpenBGPd (openbgpd-4.9.20110612_1 from ports) installed.  The
active router takes full BGP route feeds from our peers and *should*
feed them to the standby router via a direct connection (crossover cable
between physical em2 ports).

The relevant "bgpctl show" output reports:

10.0.0.2           12345          0          0     0 Never    Active

or

10.0.0.2           12345          0          0     0 Never    Connect

The bgp daemon for the active server periodically reports:

bgpd[6773]: neighbor 10.0.0.2: socket error: Operation timed out

There is not a connectivity problem between the two hosts; ssh for
example works fine.  Telnet'ing to the bgp port times out, even from the
same machine.
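
As a scriptable stand-in for the telnet test above, here is a minimal
Python sketch that attempts a TCP connect with a timeout and reports how
the attempt ended. The function name probe_tcp and the 5-second default
timeout are assumptions chosen for illustration; nothing here is specific
to openbgpd.

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Attempt a TCP connect and classify the outcome as a short string."""
    try:
        # create_connection performs the full three-way handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        # No usable reply within the timeout, as with the telnet test.
        return "timed out"
    except ConnectionRefusedError:
        # The peer answered with a RST: nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # Anything else (no route, network down, ...) is reported verbatim.
        return "error: %s" % exc
```

Calling probe_tcp("10.0.0.2", 179) from the other router would be the
equivalent of the telnet attempt described above.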

There is no firewall configured on that interface.

TCP-MD5 is *not* configured on the bgpd side.  We did try enabling it
(properly) between the two machines via /etc/ipsec.conf to see if it
would make a difference, but that also had no effect on this problem.

We've tried tcpdump, and both machines can clearly see the TCP SYN and
SYN-ACK setup packets flowing in both directions, but the final ACK of
the handshake is never sent.  In netstat -an, the opening side gets:

tcp4       0      0 10.0.0.2.16797     10.0.0.1.179      SYN_SENT

and the receiving side gets:

tcp4       0      0 10.0.0.1.179       10.0.0.2.16797    SYN_RCVD
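
To watch for this condition over time, rather than eyeballing netstat,
something like the following Python sketch could pick out sessions on
the BGP port that are stuck mid-handshake. It assumes the six-column
FreeBSD "netstat -an" TCP layout shown above (address and port joined by
a dot); the function names are made up for illustration.

```python
def parse_netstat_line(line):
    """Split one 'netstat -an' TCP line into its fields, or return None."""
    fields = line.split()
    if len(fields) != 6 or not fields[0].startswith("tcp"):
        return None
    proto, _recvq, _sendq, local, foreign, state = fields
    return {"proto": proto, "local": local, "foreign": foreign, "state": state}


def stuck_handshakes(lines, port="179"):
    """Return entries involving `port` that are in SYN_SENT or SYN_RCVD."""
    stuck = []
    for line in lines:
        entry = parse_netstat_line(line)
        if entry is None:
            continue
        # FreeBSD prints addresses as a.b.c.d.port, so the port is the
        # part after the last dot.
        ports = (entry["local"].rsplit(".", 1)[-1],
                 entry["foreign"].rsplit(".", 1)[-1])
        if port in ports and entry["state"] in ("SYN_SENT", "SYN_RCVD"):
            stuck.append(entry)
    return stuck


sample = [
    "tcp4       0      0 10.0.0.2.16797     10.0.0.1.179      SYN_SENT",
    "tcp4       0      0 10.0.0.1.179       10.0.0.2.16797    SYN_RCVD",
]
for entry in stuck_handshakes(sample):
    print(entry["local"], entry["foreign"], entry["state"])
```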

Just to make sure pf can't possibly be affecting this, right at the top
of pf.conf on both machines:

##  Pass inter-router traffic
pass quick on em2 from 10.0.0.2 to 10.0.0.1
pass quick on em2 from 10.0.0.1 to 10.0.0.2

We know these rules are sufficient because we can connect to bgpd with nc:

$ nc -S 10.0.0.2 179
????????????????-??Z?^w?A??

That produces:

$ netstat -an | fgrep 10.0.0.2
tcp4       0      0 10.0.0.1.25711     10.0.0.2.179      ESTABLISHED

and

$ netstat -an | fgrep 10.0.0.1
tcp4       0      0 10.0.0.2.179       10.0.0.1.25711    ESTABLISHED

So this appears to be some sort of weird problem specific to openbgpd
and the updated kernel.

At this point I'm at a loss as to how to proceed, so any suggestions on
how to fix, or even debug this will be greatly appreciated.


Doug


