From: Julien Charbon
To: Slawa Olhovchenkov
Cc: Konstantin Belousov, freebsd-stable@FreeBSD.org, hiren panchasara
Subject: Re: 11.0 stuck on high network load
Date: Mon, 10 Oct 2016 16:03:39 +0200
Message-ID: <23f1200e-383e-befb-b76d-c88b3e1287b0@freebsd.org>
In-Reply-To: <20161010133220.GU54003@zxy.spb.ru>
References: <20160923200143.GG2840@zxy.spb.ru>
 <20160925124626.GI2840@zxy.spb.ru> <20160926172159.GA54003@zxy.spb.ru>
 <62453d9c-b1e4-1129-70ff-654dacea37f9@gmail.com>
 <20160928115909.GC54003@zxy.spb.ru> <20161006111043.GH54003@zxy.spb.ru>
 <1431484c-c00e-24c5-bd76-714be8ae5ed5@freebsd.org>
 <20161010133220.GU54003@zxy.spb.ru>
List-Id: Production branch of FreeBSD source code

 Hi Slawa,

On 10/10/16 3:32 PM, Slawa
Olhovchenkov wrote:
> On Mon, Oct 10, 2016 at 01:26:12PM +0200, Julien Charbon wrote:
>> On 10/6/16 1:10 PM, Slawa Olhovchenkov wrote:
>>> On Thu, Oct 06, 2016 at 09:28:06AM +0200, Julien Charbon wrote:
>>>
>>>> 2. thread1: In tcp_close() the inp is marked with INP_DROPPED flag, the
>>>> process continues and calls INP_WUNLOCK() here:
>>>>
>>>> https://github.com/freebsd/freebsd/blob/releng/11.0/sys/netinet/tcp_subr.c#L1568
>>>
>>> Look also at sys/netinet/tcp_timewait.c:488
>>>
>>> And check other locks from r160549
>>
>>  You are right, and here is a fix proposal for this issue:
>>
>> Fix a double-free when an inp transitions to INP_TIMEWAIT state after
>> having been dropped
>> https://reviews.freebsd.org/D8211
>>
>>  It basically enforces in_pcbdrop() logic in tcp_input(): An INP_DROPPED
>> inpcb should never be processed further.
>>
>>  Slawa, as you are the only one to reproduce this issue currently, could
>> you test this patch? (And remove the temporary patch I provided to you
>> before).
>>
>>  I will wait for your test results before pushing further.
>>
>>  Thanks!
>>
>> diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c
>> index c72f01f..37f27e0 100644
>> --- a/sys/netinet/tcp_input.c
>> +++ b/sys/netinet/tcp_input.c
>> @@ -921,6 +921,16 @@ findpcb:
>>                 goto dropwithreset;
>>         }
>>         INP_WLOCK_ASSERT(inp);
>> +       /*
>> +        * While waiting for inp lock during the lookup, another thread
>> +        * can have dropped the inpcb, in which case we need to loop back
>> +        * and try to find a new inpcb to deliver to.
>> +        */
>> +       if (inp->inp_flags & INP_DROPPED) {
>> +               INP_WUNLOCK(inp);
>> +               inp = NULL;
>> +               goto findpcb;
>
> Are you sure about this goto?
> Can this cause an infinite loop by finding the same inpcb?
> Maybe dropping the packet is more correct?

 Good question: an infinite loop is not possible here, because the next
TCP hash lookup will return either NULL or a fresh inp that has not been
dropped. You can check the other existing uses of goto findpcb in
tcp_input().
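 To make the termination argument concrete, here is a minimal userland
sketch of the retry pattern. It is not the kernel code: every toy_* name,
the one-entry-per-slot table, and the simulated race are hypothetical and
exist only for illustration. The sketch assumes, as stated above, that a
dropped inp is no longer returned by the next lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_DROPPED 0x1		/* hypothetical stand-in for INP_DROPPED */
#define TOY_SLOTS   4

struct toy_inp {
	int  flags;
	bool locked;		/* hypothetical stand-in for the inp write lock */
};

/* Toy lookup table: at most one entry per slot (an assumption of this sketch). */
static struct toy_inp *toy_hash[TOY_SLOTS];

/* When set, the next successful lookup loses the lock race exactly once. */
static bool toy_race_pending;
static struct toy_inp *toy_race_replacement;

/* Look up an entry and return it locked, or NULL if the slot is empty. */
static struct toy_inp *
toy_lookup(unsigned key)
{
	unsigned slot = key % TOY_SLOTS;
	struct toy_inp *inp = toy_hash[slot];

	if (inp == NULL)
		return (NULL);
	if (toy_race_pending) {
		/*
		 * Simulated "other thread": while we wait for the lock it
		 * drops the entry, unhashes it, and may install a fresh one.
		 */
		toy_race_pending = false;
		inp->flags |= TOY_DROPPED;
		toy_hash[slot] = toy_race_replacement;
	}
	inp->locked = true;
	return (inp);
}

static void
toy_unlock(struct toy_inp *inp)
{
	assert(inp->locked);
	inp->locked = false;
}

/* Return a locked, non-dropped entry, or NULL; count retries for inspection. */
static struct toy_inp *
toy_find_deliverable(unsigned key, int *retries)
{
	struct toy_inp *inp;

findpcb:
	inp = toy_lookup(key);
	if (inp == NULL)
		return (NULL);	/* nothing to deliver to */
	assert(inp->locked);
	if (inp->flags & TOY_DROPPED) {
		/* Lost the race: release the stale entry and look again. */
		toy_unlock(inp);
		inp = NULL;
		(*retries)++;
		goto findpcb;
	}
	return (inp);
}
```

 In this model a stale entry costs exactly one extra lookup: the retry
returns either NULL or the replacement entry, never the dropped one,
because the dropped entry is already out of the table.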
 The rationale here is:

 - Behavior before the patch: If the inp we found was deleted, then goto
findpcb.
 - Behavior after the patch: If the inp we found was deleted or dropped,
then goto findpcb.

 I just prefer having the same behavior applied everywhere: if
tcp_input() loses the inp lock race and the inp was deleted or dropped,
then retry to find a new inpcb to deliver to.

 But you are right, dropping the packet here would also fix the issue.

 This is where the review process becomes quite helpful, because people
can argue: dropping here is better because "blah", or goto findpcb is
better because "bluh", etc. And at the end of the review you have a nice
final patch.

https://reviews.freebsd.org/D8211

--
Julien