From owner-freebsd-net@freebsd.org Fri Jul 27 20:53:28 2018
Date: Fri, 27 Jul 2018 16:53:24 -0400 (EDT)
From: Garrett Wollman
Message-Id: <201807272053.w6RKrO1o053565@hergotha.csail.mit.edu>
To: ryan@ixsystems.com
Cc: freebsd-net@freebsd.org
Subject: Re: 9k jumbo clusters
List-Id: Networking and TCP/IP with FreeBSD

In article ryan@ixsystems.com
writes:

>I have seen some work in the direction of avoiding larger than page size
>jumbo clusters in 12-CURRENT. Many existing drivers avoid the 9k cluster
>size already. The code for larger cluster sizes in iflib is #ifdef'd out
>so it maxes out at the page size jumbo clusters until "CONTIGMALLOC_WORKS"
>(apparently it doesn't).

My view, which I've expressed before, is that we should have a special
pool allocator that provides much larger buffers for systems with
high-speed network interfaces that can benefit from them. On a machine
with 96 GB of RAM (a small file server in my world), it would not hurt
at all to reserve a few 2 GB pages' worth of physical memory to be used
as very large network buffers, say 64k in length, with the constraint
that all of the "very large" buffers have to be the same length.

This could be set up in early initialization via tunables, with the
default being to reserve no space, so it doesn't affect memory
allocation on systems that aren't configured for it. (If you're
building a high-performance file server, you obviously are going to
need to tune more than just network buffers anyway!)

I thought a bit about trying to implement this a few years ago, when
the 9k cluster issue was really biting me, but instead I just diked out
the 9k cluster code in the NIC drivers I was using.

-GAWollman