From owner-freebsd-net@FreeBSD.ORG Thu Sep 9 17:15:17 2004
Date: Thu, 9 Sep 2004 10:15:15 -0700
From: John-Mark Gurney
To: Andre Oppermann
Cc: freebsd-net@freebsd.org, freebsd-arch@freebsd.org
Subject: Re: better MTU support...
Message-ID: <20040909171515.GL72089@funkthat.com>
References: <20040906050435.GA72089@funkthat.com> <41408D4C.E33B6F98@freebsd.org>
In-Reply-To: <41408D4C.E33B6F98@freebsd.org>

Andre Oppermann wrote this message on Thu, Sep 09, 2004 at 19:05 +0200:
> John-Mark Gurney wrote:
> >
> > In a recent experiment w/ Jumbo frames, I found out that sending IP
> > frames completely ignores the MTU set on host routes.  This makes it
> > difficult (or next to impossible) to support a network that has both
> > regular and jumbo frames on it as you can't restrict some hosts to the
> > smaller frames.
>
> What you should do instead is to set the MTU on the interface to 9018
> or so and then have a default route with MTU 1500 for everything else.
> Now you can specify larger MTUs for hosts that support it.
>
> Otherwise you are opening a can of worms...

Actually, this will still be broken, since host routes on the local
segment will be cloned from the link/net route, and IP will still try
to use the interface MTU, which is set to 9018.  TCP works fine; the
problem is using UDP in a mixed environment.

> > I now have a patch to ip_output that makes it obey the MTU set on the
> > route instead of that of the interface.
>
> Your patch corrects a problem in ip_output where a smaller MTU on an
> rtentry was ignored but that is only for the non-TCP cases.  When you
> open a TCP session the MTU will be honored (see tcp_subr.c:tcp_maxmtu).
> If not it would be a bug.

TCP works fine; the problem is with ICMP, UDP and the other protocols.
Duplicating the MTU logic in each of them would seem excessive.

> Could you try your large MTU setup again using the procedure I described
> above?

Turns out that my hub can't do jumbo frames, so I can't completely test
it beyond a simulation with 1000 as the normal MTU and 1500 as the
"jumbo" MTU.

> That should solve your immediate problem.

As I said, I'm pretty sure this would still break other hosts, since
the issue I'm talking about doesn't touch the default route.

> For the general 'bug' in ip_output that it doesn't honour a smaller MTU
> on a route I'd like to do a more thorough fix.  Routes should be
> created with MTU 0 if the MTU is not different from the if_mtu.  Only
> in those cases where you want to have a lower MTU you set it.  For cloned
> routes the MTU would be cloned from the parent.  This range of changes is
> more intrusive.  On top of that comes the new ARP code which will have a
> MTU field as well.  This one is supposed to store different MTUs for mixed
> MTU L2 networks.  How to transport the MTU information is a separate
> discussion.
>
> If the fix above works for you I'd like to do the real fix later (< end
> of year) and not change the current behaviour in ip_output at the moment.

Sorry, I can't test it.. :(  I stupidly assumed that all GigE equipment
supported jumbo frames.  It was never a high priority, but as GigE
stuff is cheap, this is going to become more of an issue, and I wanted
to preemptively fix it.  Especially given how cool 5.3-R is from a
networking perspective.

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
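
The MTU choice argued over in this thread can be modelled in a few
lines.  What follows is only a sketch in Python, not the ip_output
patch and not FreeBSD kernel code.  The names Interface, Route,
interface_only_mtu, effective_mtu, clone_host_route and
fragment_payloads are all hypothetical, as are the interface name
"em0" and the addresses.  It assumes the convention proposed above (a
route MTU of 0 means "use if_mtu") and a 20-byte IP header without
options.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Interface:
    name: str
    mtu: int  # stands in for if_mtu


@dataclass
class Route:
    dest: str
    ifp: Interface
    mtu: int = 0  # assumption: 0 means "no override, use the interface MTU"


def interface_only_mtu(route: Route) -> int:
    """Behaviour complained about in the thread: the route MTU is ignored."""
    return route.ifp.mtu


def effective_mtu(route: Route) -> int:
    """Honour a smaller MTU set on the route, else fall back to if_mtu."""
    if route.mtu and route.mtu < route.ifp.mtu:
        return route.mtu
    return route.ifp.mtu


def clone_host_route(parent: Route, host: str) -> Route:
    """A cloned host route inherits the MTU of the route it is cloned from."""
    return Route(dest=host, ifp=parent.ifp, mtu=parent.mtu)


def fragment_payloads(payload_len: int, mtu: int,
                      ip_header_len: int = 20) -> List[int]:
    """Payload bytes carried by each IP fragment for the chosen MTU.

    Every fragment except the last carries a multiple of 8 bytes,
    because fragment offsets are counted in 8-byte units.
    """
    if payload_len + ip_header_len <= mtu:
        return [payload_len]
    per_fragment = (mtu - ip_header_len) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small to carry any payload")
    sizes = []
    remaining = payload_len
    while remaining > per_fragment:
        sizes.append(per_fragment)
        remaining -= per_fragment
    sizes.append(remaining)
    return sizes


# Hypothetical mixed segment: jumbo-capable interface, one non-jumbo neighbour.
em0 = Interface("em0", 9018)
net_route = Route("192.168.0.0/24", em0)
neighbour = clone_host_route(net_route, "192.168.0.10")
neighbour.mtu = 1500  # operator restricts this host to regular frames
```

With this setup interface_only_mtu(neighbour) still yields 9018, while
effective_mtu(neighbour) yields 1500, which is the difference that
matters for UDP and ICMP since they have no equivalent of tcp_maxmtu.
fragment_payloads(4000, effective_mtu(neighbour)) then shows how a
4000-byte datagram payload would be split for that host.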