From owner-freebsd-net@FreeBSD.ORG Wed Feb 2 20:10:13 2011
Date: Wed, 2 Feb 2011 20:10:12 GMT
Message-Id: <201102022010.p12KACV5039592@freefall.freebsd.org>
To: freebsd-net@FreeBSD.org
From: Mark Boolootian
Reply-To: Mark Boolootian
Subject: Re: kern/146792: [flowtable] flowcleaner 100% cpu's core load
List-Id: Networking and TCP/IP with FreeBSD

The following reply was made to PR kern/146792; it has been noted by GNATS.

From: Mark Boolootian
To: bug-followup@FreeBSD.org, niko@gtelecom.ru
Subject: Re: kern/146792: [flowtable] flowcleaner 100% cpu's core load
Date: Wed, 2 Feb 2011 11:37:16 -0800

Hi folks,

I hit this problem on a pair of anycast name servers.
What I'm running:

FreeBSD ns1.example.com 8.1-RELEASE FreeBSD 8.1-RELEASE #0: Mon Jul 19 02:36:49
UTC 2010     root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64

Here's a peek at ps:

ns1b# ps auxwww | head
USER     PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED        TIME COMMAND
root      11 100.0  0.0     0    32  ??  RL   11Jan11 59960:15.43 [idle]
root      21 100.0  0.0     0    16  ??  RL   11Jan11  1112:01.24 [flowcleaner]
root       0   0.0  0.0     0    96  ??  DLs  11Jan11     0:02.94 [kernel]
root       1   0.0  0.0  3204   556  ??  ILs  11Jan11     0:00.01 /sbin/init --
root       2   0.0  0.0     0    16  ??  DL   11Jan11     0:52.87 [g_event]
root       3   0.0  0.0     0    16  ??  DL   11Jan11     0:10.10 [g_up]
root       4   0.0  0.0     0    16  ??  DL   11Jan11     0:15.18 [g_down]
root       5   0.0  0.0     0    16  ??  DL   11Jan11     0:00.00 [mpt_recovery0]

The box is running Quagga with a single OSPF adjacency.  It has about 500
routes.  Both anycast instances of ns1 hit this problem, but neither instance
of ns2, which is configured identically, saw the trouble.  The ns1 name
servers are much busier than ns2.

It appears that one instance of ns1 died almost a week ago, which went
unnoticed :-(  This morning, the second instance died.  At that point, it was
hard not to notice :-)

Traffic on the mailing list suggests that 'sysctl net.inet.flowtable.enable=0'
is a work-around.  We'll pursue that path and hope for a bug fix in the
not-too-distant future.

thanks,
mark