From: Julien Charbon
To: Slawa Olhovchenkov
Cc: Konstantin Belousov, freebsd-stable@FreeBSD.org, hiren panchasara
Subject: Re: 11.0 stuck on high network load
Date: Mon, 10 Oct 2016 17:44:21 +0200
Message-ID: <52d634aa-639c-bef7-1f10-c46dbadc4d85@freebsd.org>
In-Reply-To: <20161010142941.GV54003@zxy.spb.ru>
List-Id: Production branch of FreeBSD source code

 Hi,

On 10/10/16 4:29 PM, Slawa Olhovchenkov wrote:
> On Mon, Oct 10, 2016 at 04:03:39PM +0200, Julien Charbon wrote:
>> On 10/10/16 3:32 PM, Slawa Olhovchenkov wrote:
>>> On Mon, Oct 10, 2016 at 01:26:12PM +0200, Julien Charbon wrote:
>>>> On 10/6/16 1:10 PM, Slawa Olhovchenkov wrote:
>>>>> On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote:
>>>>>
>>>>>> 2. thread1:  In tcp_close() the inp is marked with INP_DROPPED flag, the
>>>>>> process continues and calls INP_WUNLOCK() here:
>>>>>>
>>>>>> https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568
>>>>>
>>>>> Look also to sys/netinet/tcp_timewait.c:488
>>>>>
>>>>> And check other locks from r160549
>>>>
>>>>  You are right, and here the a fix proposal for this issue:
>>>>
>>>> Fix a double-free when an inp transitions to INP_TIMEWAIT state after
>>>> having been dropped
>>>> https://reviews.freebsd.org/D8211
>>>>
>>>>  It basically enforces in_pcbdrop() logic in tcp_input():  A INP_DROPPED
>>>> inpcb should never be proceed further.
>>>>
>>>>  Slawa, as you are the only one to reproduce this issue currently, could
>>>> test this patch?  (And remove the temporary patch I did provided to you
>>>> before).
>>>>
>>>>  I will wait for your tests results before pushing further.
>>>>
>>>>  Thanks!
>>>>
>>>> diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c
>>>> index c72f01f..37f27e0 100644
>>>> --- a/sys/netinet/tcp_input.c
>>>> +++ b/sys/netinet/tcp_input.c
>>>> @@ -921,6 +921,16 @@ findpcb:
>>>>                 goto dropwithreset;
>>>>         }
>>>>         INP_WLOCK_ASSERT(inp);
>>>> +       /*
>>>> +        * While waiting for inp lock during the lookup, another thread
>>>> +        * can have droppedt the inpcb, in which case we need to loop back
>>>> +        * and try to find a new inpcb to deliver to.
>>>> +        */
>>>> +       if (inp->inp_flags & INP_DROPPED) {
>>>> +               INP_WUNLOCK(inp);
>>>> +               inp = NULL;
>>>> +               goto findpcb;
>>>
>>> Are you sure about this goto?
>>> Can this cause infinite loop by found same inpcb?
>>> May be drop packet is more correct?
>>
>>  Good question:  Infinite loop is not possible here, as the next TCP
>> hash lookup will return NULL or a fresh new and not dropped inp.  You
>
> I am not expert in this api and don't see cause of this: I am assume
> hash lookup don't remove from hash returned args and I am don't see
> any removing of this inp. Why hash lookup don't return same inp?
>
> (assume this input patch interrupt callout code on the same CPU core).
>
>> can check the current other usages of goto findpcb in tcp_input().  The
>> rational here being:
>>
>>  - Behavior before the patch:  If the inp we found was deleted then goto
>> findpcb.
>>  - Behavior after the patch:  If the inp we found was deleted or dropped
>> then goto findpcb.
>>
>>  I just prefer having the same behavior applied everywhere:  If
>> tcp_input() loses the inp lock race and the inp was deleted or dropped
>> then retry to find a new inpcb to deliver to.
>>
>>  But you are right dropping the packet here will also fix the issue.
>>
>>  Then the review process becomes quite helpful because people can argue:
>> Dropping here is better because "blah", or goto findpcb is better
>> because "bluh", etc.  And at the review end you have a nice final patch.
>>
>> https://reviews.freebsd.org/D8211
>
> I am not sure, I am see to
>
> sys/netinet/in_pcb.h:#define    INP_DROPPED             0x04000000 /* protocol drop flag */
>
> and think this is a flag 'all packets must be droped'

 Hm, I believe this flag means "this inp has been dropped by the TCP
stack, so don't use it anymore".
 Actually, this flag is better described in the function that sets it:

"(INP_DROPPED) is used by TCP to mark an inpcb as unused and avoid
future packet delivery or event notification when a socket remains open
but TCP has closed."

https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_pcb.c#L1320

/*
 * in_pcbdrop() removes an inpcb from hashed lists, releasing its address and
 * port reservation, and preventing it from being returned by inpcb lookups.
 *
 * It is used by TCP to mark an inpcb as unused and avoid future packet
 * delivery or event notification when a socket remains open but TCP has
 * closed.  This might occur as a result of a shutdown()-initiated TCP close
 * or a RST on the wire, and allows the port binding to be reused while still
 * maintaining the invariant that so_pcb always points to a valid inpcb until
 * in_pcbdetach().
 *
 */
void
in_pcbdrop(struct inpcb *inp)
{

	inp->inp_flags |= INP_DROPPED;
	...

 Note the first sentence of that comment:  in_pcbdrop() also removes the
inpcb from the hashed lists, so the next lookup cannot return the same
inp again.

 The classic example where "goto findpcb" is useful:  You receive a new
connection request as a TCP SYN packet, and this packet is unlucky and
reaches an inp that is being dropped:

 - with the "goto findpcb" approach, the next lookup will most likely
find the LISTEN inp and start the TCP handshake as usual

 - with the "drop the packet" approach, the TCP client will need to
retransmit its TCP SYN packet

 Just because a packet was unlucky once does not mean it deserves to be
dropped.
:)

--
Julien