From nobody Mon Mar 18 21:01:28 2024 X-Original-To: current@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4Tz6j70tPnz5F0jt; Mon, 18 Mar 2024 21:01:31 +0000 (UTC) (envelope-from glebius@freebsd.org) Received: from smtp.freebsd.org (smtp.freebsd.org [IPv6:2610:1c1:1:606c::24b:4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Tz6j70C1fz4P0K; Mon, 18 Mar 2024 21:01:31 +0000 (UTC) (envelope-from glebius@freebsd.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1710795691; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=gg1RjIp8mwiSaxQ/2AAJtWTaI4RJI/I+/aEogie+sdQ=; b=pdJ5gAcHXxjvvjdCQkCDLFe+ZyHuBgWi3NKnO6HXkP/VRE+WuX3VZb6r1ZKjVWh9pu3UVl mRzTEe8aoRk+y2Y7QqieZg5Iknw+CwG2C4/E6t1PL+xOYGNtjzaFGleu1/O43GRzuvTCdl TPnu1N0OptcJuuSIXPKo+Yy3c97SbBfYuR19e/8mLr1NkZF8gso0magySUZv3rEuQ8KBMi iZdJ8lDv39VxGf98EL314IUAdc348ewYsRz/eWIG97GLn9miy/HhkMlgCD+VB/U6gujGG5 6zJmEEFt4x60PWi0GDPb+VnmWxNZVckH4fmmOZq2NW8j9tQ92Ggv17YxETu37w== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1710795691; a=rsa-sha256; cv=none; b=o+Ijj9d64kXee8jgUaMvamrP9Sjzp1hdOnwEfWJLvcsLLTuZYMOSHvo2o0alQTbqsYOipk caAEoxWd0pikcyRDs4Zl36P239i/RU2x+yo210OM6i0TLrwPCWdBBskETd3z2gKS8B7AoK 69trVsc0y9FTCs6eErb57omNyto0CZdUtI7GjTw+sC2dsz9wdG9GVm/PruFky+/Ep5Z5Jz Zrnme+uJoQp5sZsIMQcBxvVq2cgLzyY8nmYprrZttwJM/CzeO9FdrI2Ugm8g8KLeHUMQnh hoLjxOx2O5jsbbIzwjzuUxlcC16aOEW+TxwsD3JW4/ZhfWdGPxb/0sh3iLrWoQ== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1710795691; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=gg1RjIp8mwiSaxQ/2AAJtWTaI4RJI/I+/aEogie+sdQ=; b=unISR5GF2XLZEG4RxBm04xhgeQuARiDZzL3XNpWfLKCMB+8D/5k/GBSNR3xuojtdLm/fZu CyYcSxYudfVR4U70vRbQAM3GubDALsf6LQmW1Fwzi9TPaZYD5N6P2HiRuAF0RnIBJhGFt/ erkK9cFM2OipVqvaZSuGrbO4Hl2RoJpRK5eZg0qSCfqu7tFrobGBdzV3huFOYcqfoiYZSr GcxpRVNabIslhpShnjTOTeeF+VT+KU1qEFdn76j3jEAY2OMtIKcaRXTsm2Kcd1RdNwh1nj qehrOP3wpidfz47cmpPsf71FsMEaUIYwe6JhfrBF6urfbA4ZlZCFgX9fb8FxYA== Received: from cell.glebi.us (glebi.us [162.251.186.162]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) (Authenticated sender: glebius) by smtp.freebsd.org (Postfix) with ESMTPSA id 4Tz6j64JpZzRcY; Mon, 18 Mar 2024 21:01:30 +0000 (UTC) (envelope-from glebius@freebsd.org) Date: Mon, 18 Mar 2024 14:01:28 -0700 From: Gleb Smirnoff To: dev-commits-src-main@freebsd.org, current@freebsd.org Subject: Re: git: e34ea0196f44 - main - tcp: clear all TCP timers in tcp_timer_stop() when in callout Message-ID: References: <202403182057.42IKvAad009676@gitrepo.freebsd.org> List-Id: Discussions about the use of FreeBSD-current List-Archive: https://lists.freebsd.org/archives/freebsd-current List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-current@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <202403182057.42IKvAad009676@gitrepo.freebsd.org> Hi! This commit is supposed to fix a problem we do not have a reproducer for. The problem was that sometimes a TCP connection enters tcp_discardcb() with a scheduled timer. A temporary patch 57e27ff07aff was committed to mask the problem. This patch is supposed to fix the root cause. However, again, we were not able to reproduce the problem reliably and thus were not able to test that the just committed patch indeed fixes the root cause. But the assertion is put back to tcp_discardcb(). In case you got a panic in tcp_discardcb(), please email me ASAP. On Mon, Mar 18, 2024 at 08:57:10PM +0000, Gleb Smirnoff wrote: T> The branch main has been updated by glebius: T> T> URL: https://cgit.FreeBSD.org/src/commit/?id=e34ea0196f4497d3f3939025aff3a89ee77b652e T> T> commit e34ea0196f4497d3f3939025aff3a89ee77b652e T> Author: Gleb Smirnoff T> AuthorDate: 2024-03-18 20:57:00 +0000 T> Commit: Gleb Smirnoff T> CommitDate: 2024-03-18 20:57:00 +0000 T> T> tcp: clear all TCP timers in tcp_timer_stop() when in callout T> T> When a TCP callout decides to disable self, e.g. tcp_timer_2msl() calling T> tcp_close(), we must also clear all other possible timers. Otherwise, T> upon return, the callout would be scheduled again in tcp_timer_enter(). T> T> Revert 57e27ff07aff, which was a temporary partial revert of otherwise T> correct 62d47d73b7eb, that exposed the problem being fixed now. Add an T> extra assertion in tcp_timer_enter() to check we aren't arming callout for T> a closed connection. T> T> Reviewed by: rscheff T> --- T> sys/netinet/tcp_subr.c | 3 +-- T> sys/netinet/tcp_timer.c | 6 ++++-- T> 2 files changed, 5 insertions(+), 4 deletions(-) T> T> diff --git a/sys/netinet/tcp_subr.c b/sys/netinet/tcp_subr.c T> index f618bc1ba04b..a6f84c297688 100644 T> --- a/sys/netinet/tcp_subr.c T> +++ b/sys/netinet/tcp_subr.c T> @@ -2395,10 +2395,9 @@ tcp_discardcb(struct tcpcb *tp) T> #endif T> T> INP_WLOCK_ASSERT(inp); T> + MPASS(!callout_active(&tp->t_callout)); T> MPASS(TAILQ_EMPTY(&tp->snd_holes)); T> T> - tcp_timer_stop(tp); T> - T> /* free the reassembly queue, if any */ T> tcp_reass_flush(tp); T> T> diff --git a/sys/netinet/tcp_timer.c b/sys/netinet/tcp_timer.c T> index ed50659abf8e..785f68be5621 100644 T> --- a/sys/netinet/tcp_timer.c T> +++ b/sys/netinet/tcp_timer.c T> @@ -881,6 +881,7 @@ tcp_timer_enter(void *xtp) T> if (tp_valid) { T> tcp_bblog_timer(tp, which, TT_PROCESSED, 0); T> if ((which = tcp_timer_next(tp, &precision)) != TT_N) { T> + MPASS(tp->t_state > TCPS_CLOSED); T> callout_reset_sbt_on(&tp->t_callout, T> tp->t_timers[which], precision, tcp_timer_enter, T> tp, inp_to_cpuid(inp), C_ABSOLUTE); T> @@ -939,8 +940,7 @@ tcp_timer_active(struct tcpcb *tp, tt_which which) T> /* T> * Stop all timers associated with tcpcb. T> * T> - * Called only on tcpcb destruction. The tcpcb shall already be dropped from T> - * the pcb lookup database and socket is not losing the last reference. T> + * Called when tcpcb moves to TCPS_CLOSED. T> * T> * XXXGL: unfortunately our callout(9) is not able to fully stop a locked T> * callout even when only two threads are involved: the callout itself and the T> @@ -963,6 +963,8 @@ tcp_timer_stop(struct tcpcb *tp) T> T> stopped = callout_stop(&tp->t_callout); T> MPASS(stopped == 0); T> + for (tt_which i = 0; i < TT_N; i++) T> + tp->t_timers[i] = SBT_MAX; T> } else while(__predict_false(callout_stop(&tp->t_callout) == 0)) { T> INP_WUNLOCK(inp); T> kern_yield(PRI_UNCHANGED); -- Gleb Smirnoff