To: freebsd-current@freebsd.org
From: Bakul Shah
Date: Sat, 10 May 2008 23:12:18 -0700
Message-Id: <20080511061218.D62E15B47@mail.bitblocks.com>
Subject: tcp over slow links broken?

Sometime during the past month or so, TCP output over slow links
seems to have broken, and the symptom is really puzzling to me!

Here is my setup:

    A----B----DSL ~ ~ ~ ... ~ ~ DSL----C

Copying a file from A to C works fine.
Copying a file from B to C works only if
a) the file is really small, OR
b) I slow down the rate of transmission to below the DSL cap so
   that there is no congestion.

A and B run identical kernels.
A is a 1.5 GHz P4 with 1 GB.
B is a 200 MHz PPro with 64 MB.
The A-B link is 100 Mbps.
The B-DSL link is 10 Mbps.
DSL upload bandwidth is capped at 512 Kbps.

My first thought was that B was running out of space, so I temporarily
removed a lot of processes, stripped the kernel down to the bare
minimum, etc., but that didn't change anything. Playing with various
TCP flags didn't help either. But when I reverted B to 7.0-RELEASE,
the problem went away!
I still have the -current kernel around.

Note that this is not program dependent. I have seen the same
thing with ftp and with outgoing email (and that is how I realized
there was a problem -- after some emails started bouncing!).
Older kernels, going back to at least mid-April, seem to behave the
same way.

The first trace shows a file copy from B to C which stalls after
sending 4744 bytes. The stall point moves further out if data is
sent at a lower rate. The second trace shows a complete transfer of
the same file. Both traces were captured on B, on the link to DSL.
B does NAT for A with pf.

I looked through the TCP-related changes in the sys commit logs.
There are a lot of them lately.... However, this comment in a
checkin of 2008-4-17 21:38:18 UTC worries me :-)

    This change should introduce (ideally) little functional change.
    However, it lays the groundwork for significantly increased
    parallelism in the TCP/IP code.

Trace 1
-------
14:22:41.490370 IP B.55535 > C.ssh: S 2917833942:2917833942(0) win 65535
14:22:41.514221 IP C.ssh > B.55535: S 2419865928:2419865928(0) ack 2917833943 win 65535
14:22:41.514720 IP B.55535 > C.ssh: . ack 1 win 65535
14:22:41.554910 IP C.ssh > B.55535: P 1:40(39) ack 1 win 65535
14:22:41.569904 IP B.55535 > C.ssh: P 1:40(39) ack 40 win 65535
14:22:41.610699 IP C.ssh > B.55535: P 40:776(736) ack 40 win 65535
14:22:41.611184 IP B.55535 > C.ssh: P 40:792(752) ack 776 win 64964
14:22:41.758791 IP C.ssh > B.55535: . ack 792 win 65535
14:22:41.759050 IP B.55535 > C.ssh: P 792:816(24) ack 776 win 65535
14:22:41.791497 IP C.ssh > B.55535: P 776:928(152) ack 816 win 65535
14:22:41.828289 IP B.55535 > C.ssh: P 816:960(144) ack 928 win 65535
14:22:41.869674 IP C.ssh > B.55535: P 928:1584(656) ack 960 win 65535
14:22:41.931648 IP B.55535 > C.ssh: P 960:976(16) ack 1584 win 65535
14:22:42.056370 IP C.ssh > B.55535: . ack 976 win 65535
14:22:42.056689 IP B.55535 > C.ssh: P 976:1024(48) ack 1584 win 65535
14:22:42.082622 IP C.ssh > B.55535: P 1584:1632(48) ack 1024 win 65535
14:22:42.086459 IP B.55535 > C.ssh: P 1024:1088(64) ack 1632 win 65535
14:22:42.115857 IP C.ssh > B.55535: P 1632:1696(64) ack 1088 win 65535
14:22:42.116942 IP B.55535 > C.ssh: P 1088:1328(240) ack 1696 win 65535
14:22:42.152161 IP C.ssh > B.55535: P 1696:1888(192) ack 1328 win 65535
14:22:42.195766 IP B.55535 > C.ssh: P 1328:1712(384) ack 1888 win 65535
14:22:42.236584 IP C.ssh > B.55535: P 1888:1920(32) ack 1712 win 65535
14:22:42.245601 IP B.55535 > C.ssh: P 1712:1760(48) ack 1920 win 65535
14:22:42.271932 IP C.ssh > B.55535: P 1920:1968(48) ack 1760 win 65535
14:22:42.274508 IP B.55535 > C.ssh: P 1760:1824(64) ack 1968 win 65535
14:22:42.301430 IP C.ssh > B.55535: P 1968:2016(48) ack 1824 win 65535
14:22:42.317057 IP B.55535 > C.ssh: . 1824:3284(1460) ack 2016 win 65535
14:22:42.317313 IP B.55535 > C.ssh: . 3284:4744(1460) ack 2016 win 65535
14:22:42.317406 IP B.55535 > C.ssh: . 4744:6204(1460) ack 2016 win 65535
14:22:42.317489 IP B.55535 > C.ssh: . 6204:7664(1460) ack 2016 win 65535
14:22:42.410739 IP C.ssh > B.55535: . ack 4744 win 64240
14:22:42.411144 IP B.55535 > C.ssh: . 7664:9124(1460) ack 2016 win 65535
14:22:42.411259 IP B.55535 > C.ssh: . 9124:10584(1460) ack 2016 win 65535
14:22:42.468350 IP C.ssh > B.55535: . ack 4744 win 65535
14:22:42.490556 IP C.ssh > B.55535: . ack 4744 win 65535
14:22:42.830171 IP B.55535 > C.ssh: . 4744:6204(1460) ack 2016 win 65535
14:22:43.470135 IP B.55535 > C.ssh: . 4744:6204(1460) ack 2016 win 65535
14:22:44.549944 IP B.55535 > C.ssh: . 4744:6204(1460) ack 2016 win 65535
14:22:46.509750 IP B.55535 > C.ssh: . 4744:6204(1460) ack 2016 win 65535
14:22:50.229210 IP B.55535 > C.ssh: . 4744:6204(1460) ack 2016 win 65535

Trace 2
-------
14:22:17.859900 IP A.58229 > C.ssh: S 3812262125:3812262125(0) win 65535
14:22:17.884226 IP C.ssh > A.58229: S 3421051426:3421051426(0) ack 3812262126 win 65535
14:22:17.885436 IP A.58229 > C.ssh: . ack 1 win 65535
14:22:17.920229 IP C.ssh > A.58229: P 1:40(39) ack 1 win 65535
14:22:17.921559 IP A.58229 > C.ssh: P 1:40(39) ack 40 win 65535
14:22:17.960656 IP C.ssh > A.58229: P 40:776(736) ack 40 win 65535
14:22:17.962767 IP A.58229 > C.ssh: P 40:792(752) ack 776 win 64964
14:22:18.111015 IP C.ssh > A.58229: . ack 792 win 65535
14:22:18.112436 IP A.58229 > C.ssh: P 792:816(24) ack 776 win 65535
14:22:18.143169 IP C.ssh > A.58229: P 776:928(152) ack 816 win 65535
14:22:18.148737 IP A.58229 > C.ssh: P 816:960(144) ack 928 win 65535
14:22:18.190569 IP C.ssh > A.58229: P 928:1584(656) ack 960 win 65535
14:22:18.199752 IP A.58229 > C.ssh: P 960:976(16) ack 1584 win 65535
14:22:18.324428 IP C.ssh > A.58229: . ack 976 win 65535
14:22:18.325989 IP A.58229 > C.ssh: P 976:1024(48) ack 1584 win 65535
14:22:18.353817 IP C.ssh > A.58229: P 1584:1632(48) ack 1024 win 65535
14:22:18.355349 IP A.58229 > C.ssh: P 1024:1088(64) ack 1632 win 65535
14:22:18.384918 IP C.ssh > A.58229: P 1632:1696(64) ack 1088 win 65535
14:22:18.386325 IP A.58229 > C.ssh: P 1088:1328(240) ack 1696 win 65535
14:22:18.421324 IP C.ssh > A.58229: P 1696:1888(192) ack 1328 win 65535
14:22:18.427757 IP A.58229 > C.ssh: P 1328:1712(384) ack 1888 win 65535
14:22:18.466104 IP C.ssh > A.58229: P 1888:1920(32) ack 1712 win 65535
14:22:18.467915 IP A.58229 > C.ssh: P 1712:1760(48) ack 1920 win 65535
14:22:18.496785 IP C.ssh > A.58229: P 1920:1968(48) ack 1760 win 65535
14:22:18.498193 IP A.58229 > C.ssh: P 1760:1808(48) ack 1968 win 65535
14:22:18.524278 IP C.ssh > A.58229: P 1968:2016(48) ack 1808 win 65535
14:22:18.527950 IP A.58229 > C.ssh: . 1808:3268(1460) ack 2016 win 65535
14:22:18.528354 IP A.58229 > C.ssh: . 3268:4728(1460) ack 2016 win 65535
14:22:18.528746 IP A.58229 > C.ssh: . 4728:6188(1460) ack 2016 win 65535
14:22:18.529153 IP A.58229 > C.ssh: . 6188:7648(1460) ack 2016 win 65535
14:22:18.622482 IP C.ssh > A.58229: . ack 4728 win 64240
14:22:18.624592 IP A.58229 > C.ssh: . 7648:9108(1460) ack 2016 win 65535
14:22:18.625058 IP A.58229 > C.ssh: . 9108:10568(1460) ack 2016 win 65535
14:22:18.676089 IP C.ssh > A.58229: . ack 7648 win 64240
14:22:18.677848 IP A.58229 > C.ssh: P 10568:11584(1016) ack 2016 win 65535
14:22:18.728117 IP C.ssh > A.58229: . ack 10568 win 64240
14:22:18.747845 IP C.ssh > A.58229: P 2016:2176(160) ack 11584 win 65535
14:22:18.749963 IP C.ssh > A.58229: P 2176:2240(64) ack 11584 win 65535
14:22:18.751424 IP A.58229 > C.ssh: . ack 2240 win 65535
14:22:18.751746 IP A.58229 > C.ssh: P 11584:11616(32) ack 2240 win 65535
14:22:18.751893 IP A.58229 > C.ssh: F 11616:11616(0) ack 2240 win 65535
14:22:18.777294 IP C.ssh > A.58229: . ack 11617 win 65535
14:22:18.778770 IP C.ssh > A.58229: F 2240:2240(0) ack 11617 win 65535
14:22:18.779929 IP A.58229 > C.ssh: . ack 2241 win 65534
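
The stall in Trace 1 is easier to see if the repeated segments are picked out mechanically. Below is a minimal sketch in Python, assuming the old-style tcpdump text format shown in the traces above (one packet per line); the helper names `find_retransmissions` and `gaps` are hypothetical, and the sample lines are copied from Trace 1.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches old-style tcpdump lines that carry a sequence range, e.g.
# 14:22:42.830171 IP B.55535 > C.ssh: . 4744:6204(1460) ack 2016 win 65535
SEGMENT = re.compile(
    r"^(?P<ts>\d\d:\d\d:\d\d\.\d{6}) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"\S+ (?P<start>\d+):(?P<end>\d+)\((?P<len>\d+)\)"
)


def find_retransmissions(lines):
    """Map (src, dst, start, end) to its send times, for segments seen more than once."""
    seen = defaultdict(list)
    for line in lines:
        m = SEGMENT.match(line.strip())
        if m is None or int(m.group("len")) == 0:
            continue  # skip pure ACKs and zero-length SYN/FIN segments
        ts = datetime.strptime(m.group("ts"), "%H:%M:%S.%f")
        key = (m.group("src"), m.group("dst"),
               int(m.group("start")), int(m.group("end")))
        seen[key].append(ts)
    return {k: v for k, v in seen.items() if len(v) > 1}


def gaps(timestamps):
    """Seconds between consecutive transmissions of the same segment."""
    return [round((b - a).total_seconds(), 3)
            for a, b in zip(timestamps, timestamps[1:])]


# Lines copied from the end of Trace 1.
SAMPLE = """\
14:22:42.317406 IP B.55535 > C.ssh: . 4744:6204(1460) ack 2016 win 65535
14:22:42.317489 IP B.55535 > C.ssh: . 6204:7664(1460) ack 2016 win 65535
14:22:42.410739 IP C.ssh > B.55535: . ack 4744 win 64240
14:22:42.830171 IP B.55535 > C.ssh: . 4744:6204(1460) ack 2016 win 65535
14:22:43.470135 IP B.55535 > C.ssh: . 4744:6204(1460) ack 2016 win 65535
14:22:44.549944 IP B.55535 > C.ssh: . 4744:6204(1460) ack 2016 win 65535
14:22:46.509750 IP B.55535 > C.ssh: . 4744:6204(1460) ack 2016 win 65535
14:22:50.229210 IP B.55535 > C.ssh: . 4744:6204(1460) ack 2016 win 65535
""".splitlines()

if __name__ == "__main__":
    for (src, dst, start, end), times in find_retransmissions(SAMPLE).items():
        print(f"{src} > {dst} {start}:{end} sent {len(times)} times, gaps {gaps(times)}")
```

On the sample lines this reports the 4744:6204 segment sent six times, with gaps of roughly 0.51, 0.64, 1.08, 1.96 and 3.72 seconds. That pattern looks like an ordinary retransmit timer backing off, with C never acknowledging anything past 4744.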