From: Andre Oppermann
Date: Mon, 05 Nov 2012 10:19:22 +0100
To: Manfred Antar
Cc: freebsd-current@freebsd.org
Subject: Re: weird network problems on current since 10/28/2012
Message-ID: <5097849A.30603@freebsd.org>
In-Reply-To: <201211050139.qA51daHj019870@pozo.com>

On 05.11.2012 02:39, Manfred Antar wrote:
> At 01:57 PM 11/4/2012, you wrote:
>> On 04.11.2012 21:15, Andreas Tobler wrote:
>>> On 04.11.12 14:57, Andre Oppermann wrote:
>>>> On 04.11.2012 13:11, Kim Culhan wrote:
>>>>> On Sun, November 4, 2012 6:21 am, Dimitry Andric wrote:
>>>>>> On 2012-11-04 02:13,
Manfred Antar wrote:
>>>>>>> At 03:29 PM 11/3/2012, Adrian Chadd wrote:
>>>>>> After the commit, there was a small discussion thread on svn-src-head@
>>>>>> about the possible problems with the approach. Maybe you are
>>>>>> experiencing those?
>>>>>>
>>>>>> As the commit message says, you should be able to turn the feature off
>>>>>> using:
>>>>>>
>>>>>>   sysctl net.inet.tcp.experimental.initcwnd10=0
>>>>>>
>>>>>> Can you please try that, and see if the problems go away?
>>>>>
>>>>> FWIW this did not make the problem go away on 2 machines.
>>>>
>>>> Yes, this very much looks like the same problem as in PR/173309.
>>>>
>>>> Please try the attached patch. It fixes the connection hang issue.
>>>> There may be a second issue that I am currently debugging, based on
>>>> the feedback from Fabian Keil.
>>>
>>> I jump into this thread since I have a similar network issue.
>>>
>>> My scenario:
>>>
>>> 'make installkernel DESTDIR=/netboot/test' to an NFS-mounted drive.
>>> The NFS drive on the server is a UFS filesystem. No ZFS.
>>>
>>> Up to r242261 I can install the kernel (or world) in a fluent way to the
>>> NFS destination.
>>>
>>> From r242262 it doesn't work smoothly. I have stalls; sometimes my
>>> patience is not enough and I kill the process.
>>>
>>> I tried r242266 with the above-mentioned patch. No real success.
>>>
>>> How can I help/test?
>>
>> Please try the attached patch instead of the above-mentioned one.
>>
>> --
>> Andre
>>
>> Index: netinet/tcp_output.c
>> ===================================================================
>> --- netinet/tcp_output.c (revision 242577)
>> +++ netinet/tcp_output.c (working copy)
>> @@ -228,7 +228,7 @@
>>       tso = 0;
>>       mtu = 0;
>>       off = tp->snd_nxt - tp->snd_una;
>> -     sendwin = min(tp->snd_wnd, tp->snd_cwnd);
>> +     sendwin = ulmax(ulmin(tp->snd_wnd - off, tp->snd_cwnd), 0);
>>
>>       flags = tcp_outflags[tp->t_state];
>>       /*
>> @@ -249,7 +249,7 @@
>>           (p = tcp_sack_output(tp, &sack_bytes_rxmt))) {
>>               long cwin;
>>
>> -             cwin = min(tp->snd_wnd, tp->snd_cwnd) - sack_bytes_rxmt;
>> +             cwin = ulmin(tp->snd_wnd - off, tp->snd_cwnd) - sack_bytes_rxmt;
>>               if (cwin < 0)
>>                       cwin = 0;
>>               /* Do not retransmit SACK segments beyond snd_recover */
>> @@ -355,7 +355,7 @@
>>                        * sending new data, having retransmitted all the
>>                        * data possible in the scoreboard.
>>                        */
>> -                     len = ((long)ulmin(so->so_snd.sb_cc, tp->snd_wnd)
>> +                     len = ((long)ulmin(so->so_snd.sb_cc, tp->snd_wnd - off)
>>                           - off);
>>                       /*
>>                        * Don't remove this (len > 0) check !
>
> This doesn't seem to make a difference.
> I have an ssh window that's been trying to connect for the past 5 minutes.
> This is on a local network 192.168.0.4 >===========SSH==============> 192.168.0.5
> Also pop from the same machines is endlessly trying to connect.
> Hopefully this mail will get through, otherwise I will need to reboot to the old kernel.

I've backed out the change with r242601 as it still exhibits too many
problems. I'll fix these problems in the next few days, but in the
meantime HEAD should be in a working state.

I'm sorry for the trouble.

--
Andre
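
For readers following the arithmetic in the quoted patch, here is a minimal userland sketch of the first changed line. It is not kernel code and makes no claim about what caused the stalls reported in this thread. It assumes that snd_wnd and snd_cwnd are unsigned long and that off is an int, which the use of ulmin()/ulmax() suggests but which the mail itself does not state. The helpers ulmin_sketch() and ulmax_sketch() and both sendwin_*() functions are hypothetical stand-ins written for this illustration.

```c
/* Hypothetical stand-ins for the helpers named in the patch; assumed to
 * take and return unsigned long. */
static unsigned long
ulmin_sketch(unsigned long a, unsigned long b)
{
	return (a < b ? a : b);
}

static unsigned long
ulmax_sketch(unsigned long a, unsigned long b)
{
	return (a > b ? a : b);
}

/* Send window computed the way the first patched line does it.  With
 * unsigned operands, "snd_wnd - off" wraps around when off > snd_wnd,
 * and taking the maximum with 0 cannot clamp an unsigned value. */
static unsigned long
sendwin_patched(unsigned long snd_wnd, unsigned long snd_cwnd, int off)
{
	return (ulmax_sketch(ulmin_sketch(snd_wnd - off, snd_cwnd), 0));
}

/* One possible alternative (an assumption, not the author's fix): test the
 * operands before subtracting so the usable window bottoms out at zero. */
static unsigned long
sendwin_guarded(unsigned long snd_wnd, unsigned long snd_cwnd, int off)
{
	unsigned long usable = 0;

	if (off >= 0 && (unsigned long)off < snd_wnd)
		usable = snd_wnd - (unsigned long)off;
	return (ulmin_sketch(usable, snd_cwnd));
}
```

With a 1000-byte peer window, a 14600-byte congestion window and 200 bytes in flight, both functions return 800. With 1500 bytes in flight, the patched form returns the full 14600 because the subtraction wraps, while the guarded form returns 0.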