From: Nikolay Denev
Date: Tue, 24 Jan 2012 10:20:32 +0200
To: Andre Oppermann
Cc: freebsd-net@freebsd.org
Subject: Re: ICMP attacks against TCP and PMTUD
Message-Id: <3D2ACF41-3CFB-42BE-B89D-06B202AFD2B0@gmail.com>
In-Reply-To: <4F1DCE54.8060107@freebsd.org>

On Jan 23, 2012, at 11:17 PM, Andre Oppermann wrote:

> On 23.01.2012 16:01, Nikolay Denev wrote:
>>
>> On Jan 20, 2012, at 10:32 AM, Nikolay Denev wrote:
>>
>>> On Jan 15, 2012, at 9:52 PM, Nikolay Denev wrote:
>>>
>>>> On 15.01.2012, at 21:35, Andrey Zonov wrote:
>>>>
>>>>> This helped me:
>>>>> /boot/loader.conf
>>>>> net.inet.tcp.hostcache.hashsize=65536
>>>>> net.inet.tcp.hostcache.cachelimit=1966080
>>>>>
>>>>> Actually, this is a workaround.  As I remember, real problem is in
>>>>> tcp_ctlinput(), it could not update MTU for destination IP if hostcache
>>>>> allocation fails.  tcp_hc_updatemtu() should returns NULL if
>>>>> tcp_hc_insert() returns NULL and tcp_ctlinput() should check this case
>>>>> and sets updated MTU for this particular connection if
>>>>> tcp_hc_updatemtu() fails.  Otherwise we've got infinite loop in MTU
>>>>> discovery.
>>>>>
>>>>>
>>>>> On 15.01.2012 22:59, Nikolay Denev wrote:
>>>>>>
>>>>>> % uptime
>>>>>> 7:57PM up 608 days, 4:06, 1 user, load averages: 0.30, 0.21, 0.17
>>>>>>
>>>>>> % vmstat -z|grep hostcache
>>>>>> hostcache: 136, 15372, 15136, 236, 44946965, 10972760
>>>>>>
>>>>>>
>>>>>> Hmm... probably I should increase this....
>>>>>>
>>>>>
>>>>> --
>>>>> Andrey Zonov
>>>>
>>>> Thanks, I will test this asap!
>>>>
>>>> Regards,
>>>> Nikolay
>>>
>>> I've upgraded from 7.3-STABLE to 8.2-STABLE and bumped significantly the hostcache tunables.
>>> So far so good, I'll report back if I see similar traffic spikes.
>>>
>>
>> Seems like I have been wrong about these traffic spikes being attacks, and
>> actually the problem seems to be the pmtu infinite loop Andrey described.
>> I'm now running 8.2-STABLE with hostcache significantly bumped and regularly
>> have more than 20K hostcache entries, which was more than the default limit of 15K I was running with before.
>
> The bug is real.  Please try the attached patch to fix the issue for IPv4.
> It's against current but should apply to 8 or 9 as well.
>
> --
> Andre
>
> http://people.freebsd.org/~andre/tcp_subr.c-pmtud-20120123.diff
>
> Index: netinet/tcp_subr.c
> ===================================================================
> --- netinet/tcp_subr.c	(revision 230489)
> +++ netinet/tcp_subr.c	(working copy)
> @@ -1410,9 +1410,11 @@
>  					     */
>  					    if (mtu <= tcp_maxmtu(&inc, NULL))
>  						tcp_hc_updatemtu(&inc, mtu);
> -					}
> -
> -					inp = (*notify)(inp, inetctlerrmap[cmd]);
> +					    /* XXXAO: Slighly hackish. */
> +					    inp = (*notify)(inp, mtu);
> +					} else
> +						inp = (*notify)(inp,
> +						    inetctlerrmap[cmd]);
>  				}
>  			}
>  			if (inp != NULL)
> @@ -1656,12 +1658,15 @@
>   * based on the new value in the route.  Also nudge TCP to send something,
>   * since we know the packet we just sent was dropped.
>   * This duplicates some code in the tcp_mss() function in tcp_input.c.
> + *
> + * XXXAO: Slight abuse of 'errno'.
>   */
>  struct inpcb *
>  tcp_mtudisc(struct inpcb *inp, int errno)
>  {
>  	struct tcpcb *tp;
>  	struct socket *so;
> +	int mtu;
>
>  	INP_WLOCK_ASSERT(inp);
>  	if ((inp->inp_flags & INP_TIMEWAIT) ||
> @@ -1671,7 +1676,12 @@
>  	tp = intotcpcb(inp);
>  	KASSERT(tp != NULL, ("tcp_mtudisc: tp == NULL"));
>
> -	tcp_mss_update(tp, -1, NULL, NULL);
> +	/* Extract the MTU from errno for IPv4. */
> +	if (errno > PRC_NCMDS)
> +		mtu = errno;
> +	else
> +		mtu = -1;
> +	tcp_mss_update(tp, mtu, NULL, NULL);
>
>  	so = inp->inp_socket;
>  	SOCKBUF_LOCK(&so->so_snd);

Hi Andre,

Thanks for the patch. I will apply it as soon as possible.

I'll probably first try to reproduce the problem locally since I've increased the hostcache
on my nginx balancers already, and changes require reboots which I'm not able to do at the moment.

Will let you know as soon as I have results.

Thanks!
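
For anyone following the thread, here is a rough userland sketch of the trick the patch relies on: the second argument of the notify callback normally carries an errno value, and the patch reuses it to carry the MTU taken from the ICMP message, telling the two apart by comparing against PRC_NCMDS. This is only a model of that one comparison, not kernel code. The names sketch_mtu_from_notify_arg and SKETCH_PRC_NCMDS are made up for illustration, and the value 22 is an assumption about what PRC_NCMDS is in sys/protosw.h.

```c
#include <stdio.h>

/*
 * Assumption: stand-in for the kernel's PRC_NCMDS.  The real value lives
 * in sys/protosw.h; 22 is assumed here purely for illustration.
 */
#define SKETCH_PRC_NCMDS 22

/*
 * Hypothetical helper, not a kernel function.  Mirrors the check added to
 * tcp_mtudisc() in the patch: a value above the command range is treated
 * as an MTU, anything else means "no MTU supplied" (-1), in which case
 * tcp_mss_update() falls back to what the route and hostcache say.
 */
static int
sketch_mtu_from_notify_arg(int arg)
{
	if (arg > SKETCH_PRC_NCMDS)
		return (arg);
	return (-1);
}

/* Small printer so the behaviour can be eyeballed from a test program. */
static void
sketch_show(int arg)
{
	printf("notify arg %d -> mtu %d\n", arg,
	    sketch_mtu_from_notify_arg(arg));
}
```

Calling sketch_show(1400) reports an MTU of 1400, while sketch_show(0) reports -1. The overload only works because a plausible IPv4 MTU (68 bytes at minimum) is larger than the assumed command count, which is presumably why the patch itself flags it as a slight abuse of 'errno'.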