From: George Neville-Neil
Date: Thu, 12 Jul 2012 10:55:16 -0400
To: Navdeep Parhar
Cc: net@freebsd.org
Subject: Re: Interface MTU question...
In-Reply-To: <4FFDF6C7.3030301@FreeBSD.org>
References: <86liiqrnnq.wl%gnn@neville-neil.com> <4FFDF6C7.3030301@FreeBSD.org>
List-Id: Networking and TCP/IP with FreeBSD

On Jul 11, 2012, at 17:57, Navdeep Parhar wrote:

> On 07/11/12 14:30, gnn@freebsd.org wrote:
>> Howdy,
>>
>> Does anyone know the reason for this particular check in
>> ip_output.c?
>>
>>         if (rte != NULL && (rte->rt_flags & (RTF_UP|RTF_HOST))) {
>>                 /*
>>                  * This case can happen if the user changed the MTU
>>                  * of an interface after enabling IP on it.  Because
>>                  * most netifs don't keep track of routes pointing to
>>                  * them, there is no way for one to update all its
>>                  * routes when the MTU is changed.
>>                  */
>>                 if (rte->rt_rmx.rmx_mtu > ifp->if_mtu)
>>                         rte->rt_rmx.rmx_mtu = ifp->if_mtu;
>>                 mtu = rte->rt_rmx.rmx_mtu;
>>         } else {
>>                 mtu = ifp->if_mtu;
>>         }
>>
>> To my mind the > ought to be != so that any change, up or down, of the
>> interface MTU is eventually reflected in the route.  Also, this code
>> does not check if it is both a HOST route and UP, but only if it is
>> one or the other, so don't be fooled by that, this check happens
>> for any route we have if it's up.
>
> I believe rmx_mtu could be low due to some intermediate node between
> this host and the final destination.  An increase in the MTU of the
> local interface should not increase the path MTU if the limit was due
> to someone else along the route.

Yes, it turns out to be complex.  We have several places that store the
MTU: the interface, which knows the MTU of the directly connected link;
the route; and the host cache.  All three are used to determine the
maximum segment size (MSS) of a TCP segment.  The route and the
interface bound the MTU from which the MSS is derived, but if there is
an entry in the host cache then it is preferred over either of the
first two.  See tcp_mss_update() in tcp_input.c to see what I'm talking
about.

I believe that the quoted code above has been wrong from the day it was
written, in that what it really says is "if the route is up" and not
"if the route is up and is a host route", which is how I believe people
read it.
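
To illustrate the point about the flag test, here is a minimal user-space sketch of the difference between "either flag set" and "both flags set". The DEMO_RTF_* macros and both helper functions are hypothetical names made up for this example; the real flags are defined in <net/route.h> and nothing here is kernel code.

```c
#include <stdbool.h>

/* Illustrative flag values only; the real RTF_* constants come from <net/route.h>. */
#define DEMO_RTF_UP   0x1
#define DEMO_RTF_HOST 0x4

/* The test as written in the quoted code: true if EITHER flag is set. */
bool either_flag_set(int flags)
{
    return (flags & (DEMO_RTF_UP | DEMO_RTF_HOST)) != 0;
}

/* The stricter reading: true only if BOTH flags are set. */
bool both_flags_set(int flags)
{
    const int mask = DEMO_RTF_UP | DEMO_RTF_HOST;

    return (flags & mask) == mask;
}
```

For a route that is up but is not a host route (DEMO_RTF_UP only), either_flag_set() is true while both_flags_set() is false, which is the gap between what the code does and how it tends to be read.
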
If the belief is that this code is really only there for host routes,
then the proper fix is to make the sense of the first if match that
belief and, again, to change the > to != so that when the administrator
of the box bumps the MTU in either direction the route reflects this.
It is not possible for PMTU on a single link to a host route to bump
the number down if the interface says it's not to be bumped.  And, even
so, any host cache entry will override and avoid this code.

Best,
George
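
The two comparisons under discussion can be contrasted in a small user-space sketch. This is only an illustration under stated assumptions: clamp_route_mtu() and sync_route_mtu() are hypothetical helpers, with route_mtu standing in for rte->rt_rmx.rmx_mtu and if_mtu for ifp->if_mtu. It is not the kernel code and not a tested patch.

```c
/*
 * Hypothetical helpers, not kernel code: route_mtu stands in for
 * rte->rt_rmx.rmx_mtu and if_mtu for ifp->if_mtu.
 */

/* Current behaviour ('>'): the route MTU is only ever lowered. */
unsigned long clamp_route_mtu(unsigned long route_mtu, unsigned long if_mtu)
{
    return (route_mtu > if_mtu) ? if_mtu : route_mtu;
}

/* Proposed behaviour ('!='): the route MTU follows the interface either way. */
unsigned long sync_route_mtu(unsigned long route_mtu, unsigned long if_mtu)
{
    return (route_mtu != if_mtu) ? if_mtu : route_mtu;
}
```

With a route MTU of 1500 and an interface raised to 9000, the first helper leaves the route at 1500 and the second raises it to 9000. That second result is exactly the case Navdeep's objection is about: if the 1500 came from a node further along the path, raising it would be wrong, which is why the proposal restricts the change to host routes.
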