From: Bill Baumann
To: freebsd-net@FreeBSD.ORG
Date: Wed, 23 Oct 2002 18:44:57 -0700 (PDT)
Subject: tcp_input's header prediction and a collapsing send window
In-Reply-To: <20021015115315.U7412-100000@mammoth.eat.frenchfries.net>

I'm experiencing a bug where snd_wnd collapses. I see snd_wnd approach zero even though data is sent, received, and ACKed successfully. After taking a close look at tcp_input, I think I see a scenario where this could happen.

Say header prediction handles ~2 GB of data without problems, then a retransmission happens. snd_wnd starts collapsing, as it should. The header prediction code is correctly skipped because snd_wnd no longer matches the advertised window. We recover from the retransmission, *BUT* the code that reopens the window is skipped because of rolled-over sequence numbers.

In the ACK processing code (step 6), the variable snd_wl1 tracks the newest sequence number that we've seen. It helps prevent snd_wnd from being reopened on retransmitted data. If snd_wl1 is greater than the received sequence number, we skip the window update.
This is fine unless we're 2^31 bytes ahead and SEQ_LT says we're behind. Since snd_wl1 is only updated if the condition is true, we're stuck.

snd_wl1 is only updated within the SYN/FIN processing code and in step 6. So if we process 2 GB in the header prediction code, where step 6 never executes, and then somehow reach step 6, snd_wnd collapses and tcp_output stops sending.

I have a trace mechanism that dumps various tcp_input variables, and its output corroborates this theory. I have lined the trace up with tcpdump. The trace shows snd_wnd collapsing and snd_wl1 > th_seq even as healthy traffic is transmitted and received. The outcome is a halted transmitter.

Possible remedy: update snd_wl1 in the header prediction code.

What do you all think? Is this real? Or am I missing something?

Regards,
Bill Baumann
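
For anyone who wants to poke at the wraparound itself, here is a minimal sketch in C. It is not the kernel code: struct win_state and update_window() are hypothetical stand-ins for the relevant tcpcb fields and the step 6 window-update test, and SEQ_LT is written in the style of the BSD macro. The sketch assumes the step 6 condition has roughly the shape shown; check tcp_input.c for the real thing.

```c
#include <stdint.h>

typedef uint32_t tcp_seq;

/* Modular "less than" on 32-bit sequence numbers, in the style of the
 * BSD SEQ_LT macro. The result is only meaningful while the two values
 * are less than 2^31 apart. */
#define SEQ_LT(a, b) ((int32_t)(uint32_t)((a) - (b)) < 0)

/* Hypothetical, simplified stand-in for the send-window fields. */
struct win_state {
    tcp_seq  snd_wl1;   /* th_seq of the segment used for the last window update */
    tcp_seq  snd_wl2;   /* th_ack of the segment used for the last window update */
    uint32_t snd_wnd;   /* current send window */
};

/* Simplified model of the step 6 window update (an assumption, not a
 * copy of tcp_input). Returns 1 if the window was updated, 0 if the
 * segment was judged old and skipped. */
int
update_window(struct win_state *s, tcp_seq th_seq, tcp_seq th_ack,
    uint32_t tiwin)
{
    if (SEQ_LT(s->snd_wl1, th_seq) ||
        (s->snd_wl1 == th_seq &&
         (SEQ_LT(s->snd_wl2, th_ack) ||
          (s->snd_wl2 == th_ack && tiwin > s->snd_wnd)))) {
        s->snd_wnd = tiwin;
        s->snd_wl1 = th_seq;
        s->snd_wl2 = th_ack;
        return 1;
    }
    return 0;
}
```

If snd_wl1 is left untouched while more than 2^31 bytes arrive, SEQ_LT(snd_wl1, th_seq) turns false for perfectly fresh segments and update_window() keeps returning 0, so snd_wnd can never be reopened. Advancing snd_wl1 along the fast path, as suggested above, keeps the two values close enough for the comparison to stay valid.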