Date: Mon, 25 Sep 2006 08:56:58 -0700
From: John-Mark Gurney <gurney_j@resnet.uoregon.edu>
To: Dan Nelson <dnelson@allantgroup.com>
Cc: mohans@FreeBSD.org, Andre Oppermann <andre@FreeBSD.org>, current@FreeBSD.org, net@FreeBSD.org
Subject: Re: odd TCP rtt/retransmit timeout issue...
Message-ID: <20060925155658.GB80527@funkthat.com>
In-Reply-To: <20060925154659.GE73717@dan.emsphone.com>
References: <20060925095745.GA80527@funkthat.com> <20060925154659.GE73717@dan.emsphone.com>

Dan Nelson wrote this message on Mon, Sep 25, 2006 at 10:46 -0500:
> In the last episode (Sep 25), John-Mark Gurney said:
> > I was bringing up another interface that I just added to /etc/rc.conf
> > and ran the command /etc/rc.d/netif start to initialize it...  But
> > then my connection never came back....  I found that the shell was
> > still active as I could type commands like sleep 5, and another
> > session's w would see sleep 5 run on the session...  even filling up
> > the send-q w/ 32k of data didn't get the HEAD box to send any data to
> > the client...
> >
> > With the help of silby, I managed to find that the t_rxtcur value in
> > the tcpcb was getting a very large value.  The session that hung had
> > a retransmit timeout of 19 days...  This led us to find that the
> > TCPT_RANGESET macro was letting very large tvmin values override the
> > more sane tvmax values due to an extra else.  I have added that so we
> > shouldn't see any more multi-day timeouts, but we still apparently
> > have a problem where the rtt value calculated is wildly incorrect...
> >
> > It appears that each connection will get a different "random" rtt
> > value...  From a few connections to my machine:
> > (kgdb) print ((struct tcpcb *)0xc3a34af8)->t_rxtcur
> > $3 = 64000
> > (kgdb) print ((struct tcpcb *)0xc3a3457c)->t_rxtcur
> > $6 = 1662654093
> > (kgdb) print ((struct tcpcb *)0xc3a343a8)->t_rxtcur
> > $12 = 1358
> > (kgdb) print ((struct tcpcb *)0xc3a9e1d4)->t_rxtcur
> > $17 = 203
> > (kgdb) print ((struct tcpcb *)0xc3a9e000)->t_rxtcur
> > $19 = 284155863
>
> Do you have net.inet.tcp.inflight.enable=1 ?  You might be hitting

Yes.

> something related to kern/75122.  You'll want to pull the raw gnats
> repository file to read it; the query-pr.cgi web interface doesn't
> parse the file right and it loses all the replies.

Doesn't look like it...
I just disabled inflight, and my first connection got:
(kgdb) print ((struct tcpcb *)0xc3a4857c)->t_rxtcur
$1 = 921479340

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
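
For anyone following along, here is a rough sketch of the clamping problem described in the quoted text. It is not the FreeBSD macro itself: the names RANGESET_WITH_ELSE, RANGESET_NO_ELSE and the two wrapper functions are made up for illustration, any slop term is left out, and the cap of 64000 is borrowed from the kgdb output above purely as an assumed tvmax.

```c
/*
 * Simplified stand-ins for a TCPT_RANGESET-style clamp.
 * All names here are hypothetical, not taken from the kernel source.
 */

/* With the extra "else", the tvmax check is skipped whenever the
 * tvmin branch fires, so a huge tvmin is stored unchecked. */
#define RANGESET_WITH_ELSE(tv, value, tvmin, tvmax) do {		\
	(tv) = (value);							\
	if ((unsigned long)(tv) < (unsigned long)(tvmin))		\
		(tv) = (tvmin);						\
	else if ((unsigned long)(tv) > (unsigned long)(tvmax))		\
		(tv) = (tvmax);						\
} while (0)

/* Without the "else", the tvmax check always runs last, so the
 * result can never exceed tvmax. */
#define RANGESET_NO_ELSE(tv, value, tvmin, tvmax) do {			\
	(tv) = (value);							\
	if ((unsigned long)(tv) < (unsigned long)(tvmin))		\
		(tv) = (tvmin);						\
	if ((unsigned long)(tv) > (unsigned long)(tvmax))		\
		(tv) = (tvmax);						\
} while (0)

/* Function wrappers so the two variants are easy to compare. */
static int
clamp_with_else(int value, int tvmin, int tvmax)
{
	int tv;

	RANGESET_WITH_ELSE(tv, value, tvmin, tvmax);
	return (tv);
}

static int
clamp_no_else(int value, int tvmin, int tvmax)
{
	int tv;

	RANGESET_NO_ELSE(tv, value, tvmin, tvmax);
	return (tv);
}
```

Feeding both variants a bogus tvmin such as 1662654093 with a cap of 64000 shows the difference: the first hands back the bogus minimum, the second hands back the cap. Neither explains where the bad rtt-derived value comes from in the first place; the second form only bounds the damage.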