Date: Mon, 10 Oct 2016 16:03:39 +0200
From: Julien Charbon <jch@freebsd.org>
To: Slawa Olhovchenkov <slw@zxy.spb.ru>
Cc: Konstantin Belousov <kostikbel@gmail.com>, freebsd-stable@FreeBSD.org, hiren panchasara <hiren@strugglingcoder.info>
Subject: Re: 11.0 stuck on high network load
Message-ID: <23f1200e-383e-befb-b76d-c88b3e1287b0@freebsd.org>
In-Reply-To: <20161010133220.GU54003@zxy.spb.ru>
References: <e4e0188c-b22b-29af-ed15-b650c3ec4553@gmail.com>
 <20160923200143.GG2840@zxy.spb.ru> <20160925124626.GI2840@zxy.spb.ru>
 <dc2798ff-2ace-81f7-a563-18ffa1ace990@gmail.com>
 <20160926172159.GA54003@zxy.spb.ru>
 <62453d9c-b1e4-1129-70ff-654dacea37f9@gmail.com>
 <20160928115909.GC54003@zxy.spb.ru>
 <a0425aad-a421-05bc-c1a8-c6fe06b83833@freebsd.org>
 <20161006111043.GH54003@zxy.spb.ru>
 <1431484c-c00e-24c5-bd76-714be8ae5ed5@freebsd.org>
 <20161010133220.GU54003@zxy.spb.ru>

 Hi Slawa,

On 10/10/16 3:32 PM, Slawa Olhovchenkov wrote:
> On Mon, Oct 10, 2016 at 01:26:12PM +0200, Julien Charbon wrote:
>> On 10/6/16 1:10 PM, Slawa Olhovchenkov wrote:
>>> On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote:
>>>
>>>> 2. thread1: In tcp_close() the inp is marked with the INP_DROPPED flag, the
>>>> process continues and calls INP_WUNLOCK() here:
>>>>
>>>> https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568
>>>
>>> Look also at sys/netinet/tcp_timewait.c:488
>>>
>>> And check other locks from r160549
>>
>> You are right, and here is a fix proposal for this issue:
>>
>> Fix a double-free when an inp transitions to INP_TIMEWAIT state after
>> having been dropped
>> https://reviews.freebsd.org/D8211
>>
>> It basically enforces in_pcbdrop() logic in tcp_input(): An INP_DROPPED
>> inpcb should never be processed further.
>>
>> Slawa, as you are the only one able to reproduce this issue currently,
>> could you test this patch? (And remove the temporary patch I provided
>> to you before.)
>>
>> I will wait for your test results before pushing further.
>>
>> Thanks!
>>
>> diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c
>> index c72f01f..37f27e0 100644
>> --- a/sys/netinet/tcp_input.c
>> +++ b/sys/netinet/tcp_input.c
>> @@ -921,6 +921,16 @@ findpcb:
>>                 goto dropwithreset;
>>         }
>>         INP_WLOCK_ASSERT(inp);
>> +       /*
>> +        * While waiting for inp lock during the lookup, another thread
>> +        * can have dropped the inpcb, in which case we need to loop back
>> +        * and try to find a new inpcb to deliver to.
>> +        */
>> +       if (inp->inp_flags & INP_DROPPED) {
>> +               INP_WUNLOCK(inp);
>> +               inp = NULL;
>> +               goto findpcb;
>
> Are you sure about this goto?
> Could this cause an infinite loop by finding the same inpcb?
> Maybe dropping the packet is more correct?

 Good question: An infinite loop is not possible here, as the next TCP
hash lookup will return either NULL or a fresh inp that has not been
dropped. You can check the other current usages of goto findpcb in
tcp_input(). The rationale here is:

 - Behavior before the patch: If the inp we found was deleted, then goto
   findpcb.
 - Behavior after the patch: If the inp we found was deleted or dropped,
   then goto findpcb.

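 The retry argument above can be modeled with a small userland sketch.
This is not the kernel code: every toy_ name below is hypothetical,
locking is not modeled at all, and the sketch assumes, as stated above,
that a dropped inp is no longer returned by the next lookup. The race
callback stands in for the other thread that drops the inp while
tcp_input() waits for the inp lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
#define TOY_INP_DROPPED 0x1
#define TOY_TABLE_SIZE  4

struct toy_inpcb {
    int port;   /* lookup key */
    int flags;  /* TOY_INP_DROPPED once the inp has been dropped */
};

/* Toy replacement for the TCP hash table. */
static struct toy_inpcb *toy_table[TOY_TABLE_SIZE];

/* Callback simulating what another thread does during the lock wait. */
typedef void (*toy_race_fn)(struct toy_inpcb *);

/* Put an inp in the first free slot; false if the table is full. */
static bool
toy_insert(struct toy_inpcb *inp)
{
    for (size_t i = 0; i < TOY_TABLE_SIZE; i++) {
        if (toy_table[i] == NULL) {
            toy_table[i] = inp;
            return true;
        }
    }
    return false;
}

/* Return the inp registered for this port, or NULL if there is none. */
static struct toy_inpcb *
toy_lookup(int port)
{
    for (size_t i = 0; i < TOY_TABLE_SIZE; i++) {
        if (toy_table[i] != NULL && toy_table[i]->port == port)
            return toy_table[i];
    }
    return NULL;
}

/* Assumption: dropping marks the inp AND removes it from the table. */
static void
toy_drop(struct toy_inpcb *inp)
{
    inp->flags |= TOY_INP_DROPPED;
    for (size_t i = 0; i < TOY_TABLE_SIZE; i++) {
        if (toy_table[i] == inp)
            toy_table[i] = NULL;
    }
}

/*
 * Find the inp to deliver to, retrying when the one we found was
 * dropped in the meantime. *lookups counts the lookups performed.
 */
static struct toy_inpcb *
toy_deliver(int port, toy_race_fn race, int *lookups)
{
    struct toy_inpcb *inp;

    assert(lookups != NULL);
findpcb:
    inp = toy_lookup(port);
    (*lookups)++;
    if (inp == NULL)
        return NULL;    /* nothing to deliver to */
    if (race != NULL) {
        race(inp);      /* the "other thread" runs here, once */
        race = NULL;
    }
    if (inp->flags & TOY_INP_DROPPED) {
        inp = NULL;
        goto findpcb;   /* cannot find the same inp again */
    }
    return inp;
}
```

 With a single inp registered and toy_drop passed as the race callback,
toy_deliver() performs exactly two lookups and returns NULL: the second
lookup cannot return the dropped inp because toy_drop() already removed
it from the table. If a fresh inp for the same port had been inserted in
the meantime, the second lookup would return that one instead.
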
 I just prefer having the same behavior applied everywhere: If
tcp_input() loses the inp lock race and the inp was deleted or dropped,
then retry to find a new inpcb to deliver to.

 But you are right, dropping the packet here will also fix the issue.

 Then the review process becomes quite helpful because people can argue:
Dropping here is better because "blah", or goto findpcb is better
because "bluh", etc. And at the end of the review you have a nice final
patch.

https://reviews.freebsd.org/D8211

--
Julien