Date: Mon, 10 Oct 2016 17:29:41 +0300 From: Slawa Olhovchenkov <slw@zxy.spb.ru> To: Julien Charbon <jch@freebsd.org> Cc: Konstantin Belousov <kostikbel@gmail.com>, freebsd-stable@FreeBSD.org, hiren panchasara <hiren@strugglingcoder.info> Subject: Re: 11.0 stuck on high network load Message-ID: <20161010142941.GV54003@zxy.spb.ru> In-Reply-To: <23f1200e-383e-befb-b76d-c88b3e1287b0@freebsd.org> References: <20160925124626.GI2840@zxy.spb.ru> <dc2798ff-2ace-81f7-a563-18ffa1ace990@gmail.com> <20160926172159.GA54003@zxy.spb.ru> <62453d9c-b1e4-1129-70ff-654dacea37f9@gmail.com> <20160928115909.GC54003@zxy.spb.ru> <a0425aad-a421-05bc-c1a8-c6fe06b83833@freebsd.org> <20161006111043.GH54003@zxy.spb.ru> <1431484c-c00e-24c5-bd76-714be8ae5ed5@freebsd.org> <20161010133220.GU54003@zxy.spb.ru> <23f1200e-383e-befb-b76d-c88b3e1287b0@freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, Oct 10, 2016 at 04:03:39PM +0200, Julien Charbon wrote: > > Hi Slawa, > > On 10/10/16 3:32 PM, Slawa Olhovchenkov wrote: > > On Mon, Oct 10, 2016 at 01:26:12PM +0200, Julien Charbon wrote: > >> On 10/6/16 1:10 PM, Slawa Olhovchenkov wrote: > >>> On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote: > >>> > >>>> 2. thread1: In tcp_close() the inp is marked with INP_DROPPED flag, the > >>>> process continues and calls INP_WUNLOCK() here: > >>>> > >>>> https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568 > >>> > >>> Look also to sys/netinet/tcp_timewait.c:488 > >>> > >>> And check other locks from r160549 > >> > >> You are right, and here the a fix proposal for this issue: > >> > >> Fix a double-free when an inp transitions to INP_TIMEWAIT state after > >> having been dropped > >> https://reviews.freebsd.org/D8211 > >> > >> It basically enforces in_pcbdrop() logic in tcp_input(): A INP_DROPPED > >> inpcb should never be proceed further. > >> > >> Slawa, as you are the only one to reproduce this issue currently, could > >> test this patch? (And remove the temporary patch I did provided to you > >> before). > >> > >> I will wait for your tests results before pushing further. > >> > >> Thanks! > >> > >> diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c > >> index c72f01f..37f27e0 100644 > >> --- a/sys/netinet/tcp_input.c > >> +++ b/sys/netinet/tcp_input.c > >> @@ -921,6 +921,16 @@ findpcb: > >> goto dropwithreset; > >> } > >> INP_WLOCK_ASSERT(inp); > >> + /* > >> + * While waiting for inp lock during the lookup, another thread > >> + * can have droppedt the inpcb, in which case we need to loop back > >> + * and try to find a new inpcb to deliver to. > >> + */ > >> + if (inp->inp_flags & INP_DROPPED) { > >> + INP_WUNLOCK(inp); > >> + inp = NULL; > >> + goto findpcb; > > > > Are you sure about this goto? > > Can this cause infinite loop by found same inpcb? > > May be drop packet is more correct? > > Good question: Infinite loop is not possible here, as the next TCP > hash lookup will return NULL or a fresh new and not dropped inp. You I am not expert in this api and don't see cause of this: I am assume hash lookup don't remove from hash returned args and I am don't see any removing of this inp. Why hash lookup don't return same inp? (assume this input patch interrupt callout code on the same CPU core). > can check the current other usages of goto findpcb in tcp_input(). The > rational here being: > > - Behavior before the patch: If the inp we found was deleted then goto > findpcb. > - Behavior after the patch: If the inp we found was deleted or dropped > then goto findpcb. > > I just prefer having the same behavior applied everywhere: If > tcp_input() loses the inp lock race and the inp was deleted or dropped > then retry to find a new inpcb to deliver to. > > But you are right dropping the packet here will also fix the issue. > > Then the review process becomes quite helpful because people can argue: > Dropping here is better because "blah", or goto findpcb is better > because "bluh", etc. And at the review end you have a nice final patch. > > https://reviews.freebsd.org/D8211 I am not sure, I am see to sys/netinet/in_pcb.h:#define INP_DROPPED 0x04000000 /* protocol drop flag */ and think this is a flag 'all packets must be droped'
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20161010142941.GV54003>