From owner-freebsd-bugs@FreeBSD.ORG Thu Feb 1 23:40:20 2007
Return-Path:
X-Original-To: freebsd-bugs@hub.freebsd.org
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
    by hub.freebsd.org (Postfix) with ESMTP id 17F2C16A407
    for ; Thu, 1 Feb 2007 23:40:20 +0000 (UTC)
    (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [69.147.83.40])
    by mx1.freebsd.org (Postfix) with ESMTP id 5950213C4A7
    for ; Thu, 1 Feb 2007 23:40:19 +0000 (UTC)
    (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
    by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id l11NeJoo035201
    for ; Thu, 1 Feb 2007 23:40:19 GMT
    (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.13.4/8.13.4/Submit) id l11NeJWQ035200;
    Thu, 1 Feb 2007 23:40:19 GMT
    (envelope-from gnats)
Resent-Date: Thu, 1 Feb 2007 23:40:19 GMT
Resent-Message-Id: <200702012340.l11NeJWQ035200@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, dave baukus
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
    by hub.freebsd.org (Postfix) with ESMTP id A702216A401
    for ; Thu, 1 Feb 2007 23:39:09 +0000 (UTC)
    (envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [69.147.83.33])
    by mx1.freebsd.org (Postfix) with ESMTP id 9CE3813C441
    for ; Thu, 1 Feb 2007 23:39:09 +0000 (UTC)
    (envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
    by www.freebsd.org (8.13.1/8.13.1) with ESMTP id l11Nd9BF049933
    for ; Thu, 1 Feb 2007 23:39:09 GMT
    (envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
    by www.freebsd.org (8.13.1/8.13.1/Submit) id l11Nd9nf049932;
    Thu, 1 Feb 2007 23:39:09 GMT
    (envelope-from nobody)
Message-Id: <200702012339.l11Nd9nf049932@www.freebsd.org>
Date: Thu,
 1 Feb 2007 23:39:09 GMT
From: dave baukus
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.0
Cc:
Subject: kern/108670: TCP connection ETIMEDOUT
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 01 Feb 2007 23:40:20 -0000

>Number:         108670
>Category:       kern
>Synopsis:       TCP connection ETIMEDOUT
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Feb 01 23:40:18 GMT 2007
>Closed-Date:
>Last-Modified:
>Originator:     dave baukus
>Release:        FreeBSD6.1
>Organization:
FNC
>Environment:
FreeBSD krakatoa 6.1-RELEASE FreeBSD 6.1-RELEASE #23: Wed Aug 9 13:33:37 CDT 2006 dbaukus@krakatoa:/home/dbaukus/kern/i386/compile/KRAKATOA-FNC i386
>Description:
There is a bug in tcp_output(), in at least FreeBSD 6.1, that causes a
perfectly good TCP connection to be dropped by its retransmit timer;
the application receives ETIMEDOUT.

Consider a TCP that never transmits data (the receive end of the ttcp
utility is an example). While the connection is established,
snd_max == snd_una == snd_nxt == (iss + 1), and the retransmit timer
should never be started.

If the retransmit timer is started, it is never stopped by
tcp_input() or tcp_output(), because snd_max == snd_una == snd_nxt
always holds. Once started, the timer keeps backing off until
tp->t_rxtshift == 12, and the connection that never transmitted
gets falsely killed.

The bug is to rely blindly on the return value of ip_output(). If
ip_output() returns ENOBUFS, the retransmit timer is activated.

From the end of tcp_output():

out:
        SOCKBUF_UNLOCK_ASSERT(&so->so_snd);     /* Check gotos. */
        if (error == ENOBUFS) {
                if (!callout_active(tp->tt_rexmt) &&
                    !callout_active(tp->tt_persist))
                        callout_reset(tp->tt_rexmt, tp->t_rxtcur,
                            tcp_timer_rexmt, tp);
                tp->snd_cwnd = tp->t_maxseg;
                return (0);
        }

My simple-minded fix would be not to start the retransmit timer here;
if tcp_output() had wanted to time this transmit, it would have started
the timer earlier in the function.

This ETIMEDOUT problem is easily recreated on any old machine using a
single slow ethernet device and the ttcp test utility. First, fire up
a couple of ttcp receivers. Second, flood the same interface with
enough ttcp transmitters to cause the driver's transmit ring and
interface queue to back up. Eventually, one of the ttcp receivers will
get ENOBUFS from ip_output(), and the retransmit timer will be wrongly
activated for a pure ACK segment.
>How-To-Repeat:
I was able to do it with the following on FreeBSD 6.1:

box1:
        ttcp -s -l 16384 -p 9444 -v -b 128000 -r
        ttcp -s -l 16384 -p 9445 -v -b 128000 -r
        ttcp -s -n 6553600 -l 4096 -p 9446 -v -b 128000 -t 192.168.222.13
        ttcp -s -n 9999999 -l 333 -p 9447 -v -b 128000 -t 192.168.222.13
        ttcp -s -n 9999999 -l 8192 -p 9448 -v -b 128000 -t 192.168.222.13
        ttcp -s -n 9999999 -l 333 -p 9449 -v -b 128000 -t 192.168.222.13
        ttcp -s -n 9999999 -l 8192 -p 9450 -v -b 128000 -t 192.168.222.13

box2:
        ttcp -s -n 6553600 -l 8192 -p 9444 -v -b 128000 -t 192.168.222.222
        ttcp -s -n 9999999 -l 128 -p 9445 -v -b 128000 -t 192.168.222.222
        ttcp -s -l 16384 -p 9446 -v -b 128000 -r
        ttcp -s -l 16384 -p 9447 -v -b 128000 -r
        ttcp -s -l 16384 -p 9448 -v -b 128000 -r
        ttcp -s -l 16384 -p 9449 -v -b 128000 -r
        ttcp -s -l 16384 -p 9450 -v -b 128000 -r
>Fix:
Do not start the retransmit timer based on error codes from ip_output()?
>Release-Note:
>Audit-Trail:
>Unformatted:
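
To illustrate the failure mode described in this report, here is a toy
user-space model of it. This is only a sketch, not kernel code: every
name prefixed with toy_ is hypothetical, the timer is reduced to a flag
and a counter, and the drop threshold simply follows the report's
wording (t_rxtshift == 12). It assumes that the only thing that ever
stops the retransmit timer is an ACK that advances snd_una up to
snd_max, which cannot happen on a connection that never sends data.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit, taken from the report's "t_rxtshift == 12". */
#define TOY_MAXRXTSHIFT 12

/* Modular sequence comparison, so wraparound is handled. */
#define TOY_SEQ_GT(a, b) ((int32_t)((a) - (b)) > 0)
#define TOY_SEQ_LEQ(a, b) ((int32_t)((a) - (b)) <= 0)

/* Drastically simplified stand-in for the TCP control block. */
struct toy_tcpcb {
    uint32_t snd_una;       /* oldest unacknowledged sequence number */
    uint32_t snd_max;       /* highest sequence number sent */
    bool     rexmt_armed;   /* retransmit timer pending */
    bool     persist_armed; /* persist timer pending */
    int      t_rxtshift;    /* retransmit backoff counter */
    int      so_error;      /* error reported to the application */
};

/*
 * Error path after a send attempt. With arm_on_enobufs == true this
 * models the behaviour quoted in the report; with false it models the
 * proposed fix of not arming the timer on ENOBUFS.
 */
static void
toy_output_error(struct toy_tcpcb *tp, int error, bool arm_on_enobufs)
{
    assert(tp != NULL);
    if (error == ENOBUFS && arm_on_enobufs &&
        !tp->rexmt_armed && !tp->persist_armed)
        tp->rexmt_armed = true;
}

/* ACK processing: the timer stops only when all sent data is acked. */
static void
toy_input_ack(struct toy_tcpcb *tp, uint32_t ack)
{
    if (TOY_SEQ_GT(ack, tp->snd_una) && TOY_SEQ_LEQ(ack, tp->snd_max)) {
        tp->snd_una = ack;
        if (tp->snd_una == tp->snd_max)
            tp->rexmt_armed = false;
    }
}

/* Timer expiry: back off, and drop the connection at the limit. */
static void
toy_timer_rexmt(struct toy_tcpcb *tp)
{
    if (!tp->rexmt_armed)
        return;
    if (++tp->t_rxtshift >= TOY_MAXRXTSHIFT) {
        tp->rexmt_armed = false;
        tp->so_error = ETIMEDOUT;
    }
}

/*
 * Simulate a pure receiver whose single ACK-only segment hit ENOBUFS.
 * Returns the error the application would see (0 if none).
 */
int
toy_run_receiver(bool arm_on_enobufs)
{
    struct toy_tcpcb tp = { .snd_una = 1001, .snd_max = 1001 };

    toy_output_error(&tp, ENOBUFS, arm_on_enobufs);
    for (int tick = 0; tick < 64 && tp.so_error == 0; tick++) {
        /* The peer's ACKs never advance snd_una: nothing was sent. */
        toy_input_ack(&tp, tp.snd_una);
        toy_timer_rexmt(&tp);
    }
    return tp.so_error;
}
```

In this model, toy_run_receiver(true) ends with ETIMEDOUT even though
no data was ever outstanding, while toy_run_receiver(false) leaves the
connection alone. A middle ground, not proposed in the report, would be
to arm the timer only when snd_max != snd_una.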