From: Garrett Wollman
Date: Wed, 29 Jan 2014 22:22:55 -0500 (EST)
Message-Id: <201401300322.s0U3Mt3s010029@hergotha.csail.mit.edu>
To: nparhar@gmail.com
Cc: freebsd-net@freebsd.org
Subject: Re: Big physically contiguous mbuf clusters
In-Reply-To: <20140129231138$3db6@grapevine.csail.mit.edu>
References: <21225.20047.947384.390241@khavrinen.csail.mit.edu>
List-Id: Networking and TCP/IP with FreeBSD

In article <20140129231138$3db6@grapevine.csail.mit.edu>,
nparhar@gmail.com writes:

>I think this would be very useful.  For example, a zone_jumbo32 would
>hit a sweet spot -- enough to fit 3 jumbo frames and some loose change
>for metadata.  I'd like to see us improve our allocators and VM system
>to work better with larger contiguous allocations, rather than
>deprecating the larger zones.  It seems backwards to push towards
>smaller allocation units when installed physical memory in a typical
>system continues to rise.

In order to resist fragmentation, you need to be willing to dedicate
some partition of physical memory to larger allocations.  That's fine
for a special-purpose device like a switch, but is not so good for a
general-purpose operating system.  But if you were willing to reserve,
say, 1/64th of physical memory at boot time, make it all direct-mapped
using superpages, and allocate it in fixed-power-of-two-sized chunks,
you would probably get a performance win.  But the chunks *have* to be
fixed-size, otherwise you are nearly guaranteed to get your arena
checkerboarded.  I'd consider giving 2 GB on a 128-GB machine for
that.

For NFS performance, you'd probably want to be able to take a whole
chunk, read the desired data into it in a single VOP, then pass the
whole thing to the socket layer wrapped in an mbuf.

-GAWollman
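
A rough illustration of the reservation arithmetic and the fixed-size argument above, as a user-space toy model in Python rather than anything resembling kernel code. The class name ChunkArena, its methods, and the 32 KiB chunk size are assumptions made for the example (the message does not name a chunk size); only the 1/64th fraction and the 128 GB machine come from the text.

```python
GIB = 1 << 30
KIB = 1 << 10


class ChunkArena:
    """Toy model of a reserved region carved into equal-sized chunks."""

    def __init__(self, phys_bytes, divisor=64, chunk_bytes=32 * KIB):
        # Chunk size must be a power of two, per the proposal.
        # The 32 KiB default is an assumption, not from the message.
        if chunk_bytes <= 0 or chunk_bytes & (chunk_bytes - 1):
            raise ValueError("chunk_bytes must be a power of two")
        self.chunk_bytes = chunk_bytes
        self.arena_bytes = phys_bytes // divisor
        self.nchunks = self.arena_bytes // chunk_bytes
        # Every free chunk is interchangeable, so a plain stack of
        # chunk indices is enough; no coalescing or best-fit search.
        self.free = list(range(self.nchunks))

    def alloc(self):
        """Return the byte offset of a free chunk, or None if none left."""
        if not self.free:
            return None
        return self.free.pop() * self.chunk_bytes

    def release(self, offset):
        """Return a chunk, identified by its byte offset, to the free stack."""
        if offset % self.chunk_bytes:
            raise ValueError("offset is not chunk-aligned")
        self.free.append(offset // self.chunk_bytes)


arena = ChunkArena(128 * GIB)
print(arena.arena_bytes // GIB, "GiB reserved,", arena.nchunks, "chunks")

# Fill the arena, then free every other chunk to scatter the holes.
held = [arena.alloc() for _ in range(arena.nchunks)]
for offset in held[::2]:
    arena.release(offset)

# Each hole is exactly one chunk, so every freed chunk is reusable.
reused = [arena.alloc() for _ in range(arena.nchunks // 2)]
print(all(offset is not None for offset in reused))
```

With a single chunk size, any hole left by a free is exactly the size of the next request, so scattered frees never strand memory. An arena that handed out variable-sized pieces would need coalescing and could still end up with free space split into holes too small to use, which is the checkerboarding the message warns about.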