From: John-Mark Gurney
Date: Mon, 25 Sep 2006 02:57:45 -0700
To: current@FreeBSD.org, net@FreeBSD.org
Cc: Andre Oppermann, mohans@FreeBSD.org
Subject: odd TCP rtt/retransmit timeout issue...
Message-ID: <20060925095745.GA80527@funkthat.com>

I was bringing up another interface that I had just added to /etc/rc.conf,
and ran /etc/rc.d/netif start to initialize it... But then my connection
never came back. I found that the shell was still active, as I could type
commands like sleep 5, and another session's w would show sleep 5 running
on that session. Even filling up the send-q with 32k of data didn't get
the HEAD box to send any data to the client...

With the help of silby, I managed to find that the t_rxtcur value in the
tcpcb was getting a very large value. The session that hung had a
retransmit timeout of 19 days... This led us to find that the
TCPT_RANGESET macro was letting very large tvmin values override the more
sane tvmax values due to an extra else. I have fixed that, so we shouldn't
see any more multi-day timeouts, but we apparently still have a problem
where the calculated rtt value is wildly incorrect...

It appears that each connection gets a different "random" rtt value. From
a few connections to my machine:

(kgdb) print ((struct tcpcb *)0xc3a34af8)->t_rxtcur
$3 = 64000
(kgdb) print ((struct tcpcb *)0xc3a3457c)->t_rxtcur
$6 = 1662654093
(kgdb) print ((struct tcpcb *)0xc3a343a8)->t_rxtcur
$12 = 1358
(kgdb) print ((struct tcpcb *)0xc3a9e1d4)->t_rxtcur
$17 = 203
(kgdb) print ((struct tcpcb *)0xc3a9e000)->t_rxtcur
$19 = 284155863

Most connections are stable around the "picked" value, though I have seen
some connections oscillate between 64000 and a really large value.
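
To make the TCPT_RANGESET problem described above concrete, here is a
minimal, simplified sketch of how an "else" between the two bounds checks
lets an oversized lower bound win. This is not the kernel's actual macro:
the names RANGESET_WITH_ELSE, RANGESET_NO_ELSE, clamp_with_else,
clamp_no_else and ticks_to_days are hypothetical, and the 64000 cap and
hz = 1000 are assumptions picked to match the kgdb output above.

```c
/*
 * Simplified illustration only; all names here are hypothetical and the
 * real kernel macro may differ in detail.
 */

/* Shape with the extra "else": once the lower bound fires, the upper
 * bound is never checked, so a huge tvmin escapes the tvmax cap. */
#define RANGESET_WITH_ELSE(tv, value, tvmin, tvmax) do {		\
	(tv) = (value);							\
	if ((unsigned long)(tv) < (unsigned long)(tvmin))		\
		(tv) = (tvmin);						\
	else if ((unsigned long)(tv) > (unsigned long)(tvmax))		\
		(tv) = (tvmax);						\
} while (0)

/* Shape without the "else": the upper bound is always enforced last. */
#define RANGESET_NO_ELSE(tv, value, tvmin, tvmax) do {			\
	(tv) = (value);							\
	if ((unsigned long)(tv) < (unsigned long)(tvmin))		\
		(tv) = (tvmin);						\
	if ((unsigned long)(tv) > (unsigned long)(tvmax))		\
		(tv) = (tvmax);						\
} while (0)

/* Function wrappers so the two shapes can be compared directly. */
int
clamp_with_else(int value, int tvmin, int tvmax)
{
	int tv;

	RANGESET_WITH_ELSE(tv, value, tvmin, tvmax);
	return (tv);
}

int
clamp_no_else(int value, int tvmin, int tvmax)
{
	int tv;

	RANGESET_NO_ELSE(tv, value, tvmin, tvmax);
	return (tv);
}

/* Convert a timeout in ticks to whole days; hz (ticks per second) is
 * supplied by the caller and is an assumption, not read from a kernel. */
long
ticks_to_days(long ticks, int hz)
{
	return (ticks / hz / 86400);
}
```

With a bogus lower bound of 1662654093 and a cap of 64000,
clamp_with_else(3000, 1662654093, 64000) returns 1662654093, while
clamp_no_else(3000, 1662654093, 64000) returns 64000. Assuming hz = 1000,
ticks_to_days(1662654093, 1000) gives 19, which lines up with the 19 day
timeout seen on the hung session.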

I was trying to track this down, and a kernel as of 9/17 exhibits the
problem. I had managed to track it down to a RELENG_6 commit (which
obviously would affect HEAD) when I realized that each connection gets a
different value, and that in my older tests I was just getting lucky in
not having a bad timeout...

To obtain these values, I used kgdb kernel /dev/mem, and put the value
from the first column of netstat -Aanfinet in as the tcpcb pointer above.
(Why is the column named Socket, when it's the control block struct and
not the socket struct?)

Anyone want to track down why we are getting such large values in there?
I'll try to backtrack further to see how old this issue is.

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."