Date: Mon, 24 Aug 2015 17:53:54 -0700
From: John-Mark Gurney
To: freebsd-net@FreeBSD.org
Subject: CFT: Jumbo and non-Jumbo hosts on same subnet
Message-ID: <20150825005354.GL33167@funkthat.com>

I've had this idea for a long time (I fixed the kernel to support it in
r162205[1]) and even used a manual version of it a long time ago in
production for NFS servers, but never got around to producing an
automatic version of it. Now I have:
https://github.com/jmgurney/automtud

It's a simple script that, when run, will configure an interface to its
largest supported MTU, set the network route to the normal default 1500
byte MTU so that communication w/ other machines works as normal, and
then monitor and probe other hosts to figure out just how large an MTU
each host will accept.

Sample run:
# sh automtud.sh -i re0
setting up: re0
Setting MTU on interface re0 to 6122.
setting normal mtu on interface re0 for network 192.168.0.0/24
change net 192.168.0.0: gateway re0
machine 192.168.0.2 add on interface re0
adjusting 192.168.0.2 mtu to 6122
machine 192.168.0.14 add on interface re0
adjusting 192.168.0.14 mtu to 1504

The adjustment to 1504 on .14 is because the interface also has VLANs,
but .14 is untagged, and so we can sneak an extra 4 bytes into the
packet. The bad part of this is that the iface still appears to have
the normal 1500 byte MTU:
npe1: flags=8843 metric 0 mtu 1500
	options=80008

The script should work on all modern releases of FreeBSD, though
feedback on where it breaks is welcome.

I'd like to hear of any issues that you may run into, but I'm pretty
sure that often the answer will be "we need to fix driver X", as most
drivers do not handle jumbo frames well.
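
To illustrate the probing step described above (figuring out how large
an MTU a given host will accept), here is a minimal sketch of one way
such a search could work. It is not taken from automtud.sh; the function
names and the probe callback are hypothetical. It assumes that if a host
accepts a packet of some size, it also accepts every smaller size, and
that the normal 1500 byte MTU always works. A real probe might send a
ping with the don't-fragment bit set and a payload of MTU minus 28
bytes (20 byte IPv4 header plus 8 byte ICMP header).

```python
from typing import Callable

# 20-byte IPv4 header (no options) + 8-byte ICMP echo header.
IP_ICMP_OVERHEAD = 28


def icmp_payload_for_mtu(mtu: int) -> int:
    """Payload size an ICMP echo needs so the IP packet is exactly `mtu` bytes."""
    return mtu - IP_ICMP_OVERHEAD


def largest_accepted_mtu(probe: Callable[[int], bool],
                         lo: int = 1500, hi: int = 9000) -> int:
    """Binary-search the largest MTU in [lo, hi] for which probe(mtu) is True.

    `probe` is a hypothetical callback that returns True when the remote
    host answered a packet of that size. Assumes `lo` is known to work
    and that acceptance is monotonic in packet size.
    """
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            lo = mid              # mid works; the answer is mid or larger
        else:
            hi = mid - 1          # mid fails; the answer is smaller
    return lo


if __name__ == "__main__":
    # Simulated host that accepts up to 1504 bytes, like .14 in the sample run.
    print(largest_accepted_mtu(lambda mtu: mtu <= 1504, lo=1500, hi=6122))
```

With an interface MTU of 6122 this takes about 13 probes per host, which
is why the per-probe timeout matters for overall probing time.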

In most cases, the driver needs to be changed to use either regular
(2k) clusters or page-sized clusters instead of 9k or 16k clusters when
configured for jumbo frames. On my old 9.2-R box that I tried it on, I
got tons of 9k cluster allocation failures. I probably just need to
increase the limit, but it'd still be better to use page-sized clusters
instead:

ITEM                   SIZE  LIMIT   USED   FREE      REQ   FAIL SLEEP
mbuf_jumbo_9k:         9216,  6400,     0,  1110,65208532,814941,   0

I haven't looked at fixing the em driver yet.

[1] https://svnweb.freebsd.org/changeset/base/r162205

P.S. Probing time could be made faster if ping -t supported sub-second
values: if a host on a local segment hasn't replied in, say, 100ms,
it's probably not going to, or you need to fix the network.

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."