From owner-freebsd-current@FreeBSD.ORG Tue Aug 30 03:05:57 2005 Return-Path: X-Original-To: current@freebsd.org Delivered-To: freebsd-current@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 96A2116A41F; Tue, 30 Aug 2005 03:05:57 +0000 (GMT) (envelope-from demizu@dd.iij4u.or.jp) Received: from r-dd.iij4u.or.jp (r-dd.iij4u.or.jp [210.130.0.70]) by mx1.FreeBSD.org (Postfix) with ESMTP id F3B9643D46; Tue, 30 Aug 2005 03:05:56 +0000 (GMT) (envelope-from demizu@dd.iij4u.or.jp) Received: from localhost (h124.p049.iij4u.or.jp [210.130.49.124]) by r-dd.iij4u.or.jp (4U-MR/r-dd) id j7U35oTn020072; Tue, 30 Aug 2005 12:05:51 +0900 (JST) Date: Tue, 30 Aug 2005 12:06:04 +0900 (JST) Message-Id: <20050830.120604.97293728.Noritoshi@Demizu.ORG> From: Noritoshi Demizu To: Paul Saab In-Reply-To: <431389C1.4080805@freebsd.org> References: <20050829221700.GC1118@unixpages.org> <431389C1.4080805@freebsd.org> X-Mailer: Mew version 4.1 on Emacs 21 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: current@freebsd.org Subject: Re: panic: sackhint rexmit == 0 X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 30 Aug 2005 03:05:57 -0000 > > #23 0xc063ae06 in tcp_input (m=0xc1e56400, off0=20) > > at /usr/home/build/src/sys/netinet/tcp_input.c:1915 > > 1915 KASSERT(tp->sackhint. > Should already be fixed in -current. The change of tcp_input.c rev 1.283 does not *fix* the problem. It is just a workaround of the problem. I believe the real problem is that tp->snd_cwnd, tp->snd_ssthresh, tp->snd_recover, and tp->snd_nxt are falsely recovered by an algorithm using tp->t_badrxtwin in the following scenario. 1. An Retransmission Timeout occurs. Now, t_rxtshift is 1. (snd_cwnd, snd_ssthresh, snd_recover are saved in tcp_timer_rexmt().) 2. Before t_rxtshift is reset to zero, Fast Retransmit is triggered and TCP enters Fast Recovery. (Note: t_rxtshift is reset to zero by tcp_xmit_timer() when a new RTT measurement is taken. If my memory serves correctly, this behavior is specified in RFC2988). 3. A partial ACK or a full ACK is received before "ticks" reaches at tp->t_badrxtwin. In this case, lines 2047-2056 in tcp_input.c 1.283 recovers snd_cwnd, snd_ssthresh, snd_recover and snd_nxt. (Note: t_rxtshift has not been reset to zero. It will be reset at line 2074 or 2076 of tcp_input.c 1.283.) I believe those variables must not be recovered in this case. Since snd_recover is recovered, snd_recover becomes smaller than the actual value. Hence, the condition at line 2134 of tcp_input.c 1.283 falsely becomes true, and TCP falsely exits Fast Recovery. It breaks internal states of TCP SACK. In tcp_input.c 1.282 or before, the possible corruption was detected by the lines removed in the change of tcp_input.c 1.283. Since the check was not so significant, the check was removed as a workaround. I believe the following change fixes the problem by avoiding step 3 above. Introducing a new flag indicating whether t_badrxtwin is valid would be a better solution because a TCP connection may live longer than 2^32 ticks of time. <> 1915: ENTER_FASTRECOVERY(tp); 1916: tp->snd_recover = tp->snd_max; 1917: callout_stop(tp->tt_rexmt); 1918: tp->t_rtttime = 0; + tp->t_badrxtwin = ticks; 1919: if (tp->sack_enable) { Regards, Noritoshi Demizu