Date: Tue, 8 Aug 2017 14:33:52 +0300
From: Slawa Olhovchenkov <slw@zxy.spb.ru>
To: Hans Petter Selasky <hps@selasky.org>
Cc: Ben RUBSON <ben.rubson@gmail.com>, FreeBSD Net <freebsd-net@freebsd.org>, jch <jch@FreeBSD.org>, hiren <hiren@strugglingcoder.info>, FreeBSD Stable <freebsd-stable@freebsd.org>
Subject: Re: mlx4en, timer irq @100%... (11.0 stuck on high network load ???)
Message-ID: <20170808113352.GH18123@zxy.spb.ru>
In-Reply-To: <c05c2b1c-b5a8-c39c-6dff-e6cc0d8642bf@selasky.org>
References: <4C91C6E5-0725-42E7-9813-1F3ACF3DDD6E@gmail.com>
 <5840c25e-7472-3276-6df9-1ed4183078ad@selasky.org>
 <2ADA8C57-2C2D-4F97-9F0B-82D53EDDC649@gmail.com>
 <061cdf72-6285-8239-5380-58d9d19a1ef7@selasky.org>
 <92BEE83D-498F-47D5-A53C-39DCDC00A0FD@gmail.com>
 <5d8960d8-e1ff-8719-320f-d3ae84054714@selasky.org>
 <6B4A35F7-5694-4945-9575-19ADB678F9FA@gmail.com>
 <297a784a-3d80-b1a6-652e-a78621fe5a8b@selasky.org>
 <3ECCFBF1-18D9-4E33-8F39-0C366C3BB8B4@gmail.com>
 <c05c2b1c-b5a8-c39c-6dff-e6cc0d8642bf@selasky.org>

On Tue, Aug 08, 2017 at 10:31:33AM +0200, Hans Petter Selasky wrote:
> Here is the conclusion:
>
> The following code is going in an infinite loop:
>
>
> > for (;;) {
> >     TW_RLOCK(V_tw_lock);
> >     tw = TAILQ_FIRST(&V_twq_2msl);
> >     if (tw == NULL || (!reuse && (tw->tw_time - ticks) > 0)) {
> >         TW_RUNLOCK(V_tw_lock);
> >         break;
> >     }
> >     KASSERT(tw->tw_inpcb != NULL, ("%s: tw->tw_inpcb == NULL",
> >         __func__));
> >
> >     inp = tw->tw_inpcb;
> >     in_pcbref(inp);
> >     TW_RUNLOCK(V_tw_lock);
> >
> >     if (INP_INFO_TRY_RLOCK(&V_tcbinfo)) {
> >
> >         INP_WLOCK(inp);
> >         tw = intotw(inp);
> >         if (in_pcbrele_wlocked(inp)) {
>
> in_pcbrele_wlocked() returns (1) because INP_FREED (16) is set in
> inp->inp_flags2. I guess you have invariants disabled, because the
> KASSERT() below should have caused a panic.
>
> >             KASSERT(tw == NULL, ("%s: held last inp "
> >                 "reference but tw not NULL", __func__));
> >             INP_INFO_RUNLOCK(&V_tcbinfo);
> >             continue;
> >         }
>
> This is a regression issue after:
>
> > commit 5630210a7f1dbbd903b77b2aef939cd47c63da58
> > Author: jch <jch@FreeBSD.org>
> > Date: Thu Oct 30 08:53:56 2014 +0000
> >
> > Fix a race condition in TCP timewait between tcp_tw_2msl_reuse() and
> > tcp_tw_2msl_scan(). This race condition drives unplanned timewait
> > timeout cancellation. Also simplify implementation by holding inpcb
> > reference and removing tcptw reference counting.
>
> Suggested fix attached.
Hmm, I am not sure. IMHO, between
    TW_RUNLOCK(V_tw_lock);
and
    if (INP_INFO_TRY_WLOCK(&V_tcbinfo)) {
`inp` can be invalidated and freed, so this pointer may be invalid?
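
To make the question concrete, here is a minimal user-space sketch of the reference-counting pattern the quoted loop appears to rely on: in_pcbref() is called before TW_RUNLOCK(), so the memory should stay allocated until the matching in_pcbrele_wlocked(), even if the owner marks the inpcb as freed in between. Everything below (struct toy_pcb, toy_ref, toy_rele, toy_free) is hypothetical and only models that assumption; it is not the kernel code, and it says nothing about whether the real in_pcbref()/in_pcbrele_wlocked() pair is sufficient here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct inpcb; not the kernel structure. */
struct toy_pcb {
    int  refcount;    /* number of outstanding references */
    bool freed_flag;  /* models INP_FREED: the owner has let go */
    bool released;    /* set instead of calling free(), so it can be inspected */
};

/* Models in_pcbref(): take an extra reference. */
static void
toy_ref(struct toy_pcb *p)
{
    p->refcount++;
}

/*
 * Models in_pcbrele_wlocked() as described in the quoted mail (assumption):
 * returns true when the object was released, or when it is already marked
 * freed even though other references remain.
 */
static bool
toy_rele(struct toy_pcb *p)
{
    assert(p->refcount > 0);
    if (--p->refcount > 0)
        return (p->freed_flag);  /* not the last reference */
    p->released = true;          /* last reference: memory goes away here */
    return (true);
}

/* Models the owner freeing the object: mark it, then drop the owner's reference. */
static void
toy_free(struct toy_pcb *p)
{
    p->freed_flag = true;
    (void)toy_rele(p);
}
```

In this model, a scanner that called toy_ref() before dropping its lock still sees valid memory after toy_free(); the pointer only becomes invalid once the scanner's own toy_rele() has run. What can change in between is the state, which is the freed flag here.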
> Index: sys/netinet/tcp_timewait.c
> ===================================================================
> --- sys/netinet/tcp_timewait.c (revision 321981)
> +++ sys/netinet/tcp_timewait.c (working copy)
> @@ -709,10 +709,11 @@
>              INP_WLOCK(inp);
>              tw = intotw(inp);
>              if (in_pcbrele_wlocked(inp)) {
> -                KASSERT(tw == NULL, ("%s: held last inp "
> -                    "reference but tw not NULL", __func__));
>                  INP_INFO_RUNLOCK(&V_tcbinfo);
> -                continue;
> +                if (tw == NULL)
> +                    continue;
> +                else
> +                    break;  /* INP_FREED flag is set */
>              }
>
>              if (tw == NULL) {
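
The spin described in the quoted analysis, and the effect of the proposed patch, can be modelled in user space. This is a rough sketch under stated assumptions, not the kernel code: struct toy_entry and toy_scan are hypothetical names, a plain linked list stands in for the TAILQ, and it is assumed that an entry whose tw state is already gone is no longer at the head of the queue. An iteration cap replaces the real infinite loop so the function can return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a timewait queue entry. */
struct toy_entry {
    bool freed;              /* models INP_FREED being set on the inpcb */
    bool has_tw;             /* models intotw(inp) != NULL */
    struct toy_entry *next;
};

/*
 * Scan from the head of the list. Returns the number of iterations used,
 * or -1 when max_iters is exhausted (the model's stand-in for "spins forever").
 */
static int
toy_scan(struct toy_entry *head, bool patched, int max_iters)
{
    int iters = 0;

    assert(max_iters > 0);
    for (;;) {
        struct toy_entry *e = head;  /* TAILQ_FIRST() analogue */

        if (e == NULL)
            break;
        if (++iters > max_iters)
            return (-1);
        if (e->freed) {
            if (!e->has_tw) {
                head = e->next;      /* assumption: already off the queue */
                continue;
            }
            if (patched)
                break;               /* patched: stop this pass */
            continue;                /* original: retry, same head again */
        }
        head = e->next;              /* normal path: entry is processed and removed */
    }
    return (iters);
}
```

With a freed entry that still carries tw state at the head, toy_scan(head, false, n) hits the cap for any n, while toy_scan(head, true, n) stops after one iteration. Entries behind that head are left for a later pass in the patched variant.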
