From owner-freebsd-bugs@FreeBSD.ORG Sun May 16 10:00:15 2010
Return-Path:
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id EAA1010656A5
	for ; Sun, 16 May 2010 10:00:15 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (unknown [IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id EBE5F8FC30
	for ; Sun, 16 May 2010 10:00:14 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o4GA0EBR080596
	for ; Sun, 16 May 2010 10:00:14 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o4GA0ELB080595;
	Sun, 16 May 2010 10:00:14 GMT
	(envelope-from gnats)
Resent-Date: Sun, 16 May 2010 10:00:14 GMT
Resent-Message-Id: <201005161000.o4GA0ELB080595@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Matthew Luckie
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5FF141065672
	for ; Sun, 16 May 2010 09:52:48 +0000 (UTC)
	(envelope-from mjl@luckie.org.nz)
Received: from mailfilter68.ihug.co.nz (mailfilter68.ihug.co.nz [203.109.136.68])
	by mx1.freebsd.org (Postfix) with ESMTP id EDD7B8FC1F
	for ; Sun, 16 May 2010 09:52:47 +0000 (UTC)
Received: from 118-93-81-147.dsl.dyn.ihug.co.nz (HELO spandex.luckie.org.nz) ([118.93.81.147])
	by cust.filter4.content.vf.net.nz with ESMTP/TLS/DHE-RSA-AES256-SHA;
	16 May 2010 21:52:44 +1200
Received: from mylar.luckie.org.nz ([192.168.1.24])
	by spandex.luckie.org.nz with esmtps (TLSv1:AES256-SHA:256)
	(Exim 4.71 (FreeBSD))
	(envelope-from )
	id 1ODaWZ-000Awh-NZ
	for FreeBSD-gnats-submit@freebsd.org; Sun, 16 May 2010 21:52:43 +1200
Received: from mjl by mylar.luckie.org.nz with
	local (Exim 4.71 (FreeBSD))
	(envelope-from )
	id 1ODaWh-0000Sh-Ql
	for FreeBSD-gnats-submit@freebsd.org; Sun, 16 May 2010 21:52:51 +1200
Message-Id:
Date: Sun, 16 May 2010 21:52:51 +1200
From: Matthew Luckie
To: FreeBSD-gnats-submit@FreeBSD.org
X-Send-Pr-Version: 3.113
Cc:
Subject: kern/146628: [patch] TCP does not clear DF when MTU is below a threshold
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Matthew Luckie
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 16 May 2010 10:00:16 -0000

>Number:         146628
>Category:       kern
>Synopsis:       [patch] TCP does not clear DF when MTU is below a threshold
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun May 16 10:00:14 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Matthew Luckie
>Release:        FreeBSD 8.0-STABLE i386
>Organization:
>Environment:
System: FreeBSD mylar.luckie.org.nz 8.0-STABLE FreeBSD 8.0-STABLE #3: Sun May 16 21:31:15 NZST 2010 root@mylar.luckie.org.nz:/usr/src/sys/i386/compile/mylar i386

>Description:
FreeBSD, like most operating systems, will refuse to lower TCP's segment
size below a threshold in response to an ICMP needfrag. In FreeBSD's
case, this is 512 bytes. If a needfrag with a next-hop MTU of 256 is
received, FreeBSD will reduce the connection's segment size to 512 bytes
and will then resend the presumed missing packet, but without first
clearing the DF bit. If the Path MTU is in fact less than 512 bytes,
FreeBSD will get another needfrag, which it will ignore.

The patch below will cause subsequent segments to be sent without the DF
bit set, and does not change FreeBSD's default behaviour of refusing to
reduce its segment size below a defined threshold.
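
The behaviour the patch aims for can be sketched in user space as follows.
This is only an illustration, not the kernel code: the names sketch_conn and
sketch_needfrag are hypothetical, the constants are assumed to match the
8.0-STABLE defaults for V_tcp_minmss (216) and V_tcp_mssdflt (512), and
set_df stands in for the per-connection TF_IPDF flag added by the patch.

```c
#include <stdbool.h>

/* Assumed values; the kernel uses tunables and sizeof(struct tcpiphdr). */
#define SKETCH_TCPIP_HDR  40   /* IPv4 + TCP header bytes */
#define SKETCH_MIN_MSS    216  /* assumed default of V_tcp_minmss */
#define SKETCH_MSS_DFLT   512  /* assumed default of V_tcp_mssdflt */
#define SKETCH_MTU_FLOOR  296  /* literal floor used in the needfrag check */

/* Hypothetical stand-in for the relevant per-connection state. */
struct sketch_conn {
	int  mtu;     /* MTU currently used to size segments */
	bool set_df;  /* models TF_IPDF: set DF on outgoing segments */
};

static int
sketch_max(int a, int b)
{
	return (a > b ? a : b);
}

/* Handle an ICMP needfrag that advertises nexthop_mtu. */
static void
sketch_needfrag(struct sketch_conn *c, int nexthop_mtu)
{
	int mtu = nexthop_mtu;

	/* An advertised MTU below the threshold is treated as unusable. */
	if (mtu < sketch_max(SKETCH_MTU_FLOOR,
	    SKETCH_MIN_MSS + SKETCH_TCPIP_HDR))
		mtu = 0;

	if (mtu == 0) {
		/* Fall back to the default segment size ... */
		mtu = SKETCH_MSS_DFLT + SKETCH_TCPIP_HDR;
		/* ... and stop setting DF so routers may fragment. */
		c->set_df = false;
	}

	/* Only ever lower the MTU in response to a needfrag. */
	if (mtu < c->mtu)
		c->mtu = mtu;
}
```

Under these assumptions, a connection at MTU 1500 that receives a needfrag
for 256 ends up at MTU 552 (segment size 512) with DF cleared, while a
needfrag for 1400 lowers the MTU to 1400 and leaves DF set.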
>How-To-Repeat:
install net/scamper
scamper -F ipfw -I "tbit -M 256 -u '' -i "

>Fix:

--- patch-pmtud begins here ---
--- tcp_var.h.orig	2009-08-03 20:13:06.000000000 +1200
+++ tcp_var.h	2010-05-14 21:03:42.000000000 +1200
@@ -234,6 +234,7 @@
 #define	TF_ECN_PERMIT	0x4000000	/* connection ECN-ready */
 #define	TF_ECN_SND_CWR	0x8000000	/* ECN CWR in queue */
 #define	TF_ECN_SND_ECE	0x10000000	/* ECN ECE in queue */
+#define	TF_IPDF		0x20000000	/* set the DF bit */
 
 #define IN_FASTRECOVERY(tp)	(tp->t_flags & TF_FASTRECOVERY)
 #define ENTER_FASTRECOVERY(tp)	tp->t_flags |= TF_FASTRECOVERY
--- tcp_subr.c.orig	2009-08-03 20:13:06.000000000 +1200
+++ tcp_subr.c	2010-05-16 21:26:50.000000000 +1200
@@ -656,7 +656,9 @@
 		tlen += sizeof (struct tcpiphdr);
 		ip->ip_len = tlen;
 		ip->ip_ttl = V_ip_defttl;
-		if (V_path_mtu_discovery)
+		if (tp != NULL && tp->t_flags & TF_IPDF)
+			ip->ip_off |= IP_DF;
+		else if (tp == NULL && V_path_mtu_discovery)
 			ip->ip_off |= IP_DF;
 	}
 	m->m_len = tlen;
@@ -757,6 +759,9 @@
 		tp->t_flags = (TF_REQ_SCALE|TF_REQ_TSTMP);
 	if (V_tcp_do_sack)
 		tp->t_flags |= TF_SACK_PERMIT;
+	if (V_path_mtu_discovery)
+		tp->t_flags |= TF_IPDF;
+
 	TAILQ_INIT(&tp->snd_holes);
 	tp->t_inpcb = inp;	/* XXX */
 	/*
@@ -1361,9 +1366,11 @@
 			    if (mtu < max(296, V_tcp_minmss
 						 + sizeof(struct tcpiphdr)))
 				mtu = 0;
-			    if (!mtu)
+			    if (!mtu) {
 				mtu = V_tcp_mssdflt
 				    + sizeof(struct tcpiphdr);
+				tp->t_flags &= ~TF_IPDF;
+			    }
 			    /*
 			     * Only cache the the MTU if it
 			     * is smaller than the interface
--- tcp_syncache.c.orig	2010-05-16 21:30:21.000000000 +1200
+++ tcp_syncache.c	2010-05-16 21:31:00.000000000 +1200
@@ -779,6 +779,9 @@
 	if (sc->sc_flags & SCF_ECN)
 		tp->t_flags |= TF_ECN_PERMIT;
 
+	if (V_path_mtu_discovery)
+		tp->t_flags |= TF_IPDF;
+
 	/*
 	 * Set up MSS and get cached values from tcp_hostcache.
 	 * This might overwrite some of the defaults we just set.
--- tcp_output.c.orig	2009-11-18 05:17:11.000000000 +1300
+++ tcp_output.c	2010-05-16 20:38:25.000000000 +1200
@@ -1181,7 +1181,7 @@
 	 * Section 2. However the tcp hostcache migitates the problem
 	 * so it affects only the first tcp connection with a host.
 	 */
-	if (V_path_mtu_discovery)
+	if (tp->t_flags & TF_IPDF)
 		ip->ip_off |= IP_DF;
 
 	error = ip_output(m, tp->t_inpcb->inp_options, NULL,
--- patch-pmtud ends here ---

>Release-Note:
>Audit-Trail:
>Unformatted: