Date: Sun, 11 Nov 2012 02:53:33 -0800
From: Peter Wemm <peter@wemm.org>
To: Alfred Perlstein <bright@mu.org>
Cc: svn-src-head@freebsd.org, Alexey Dokuchaev <danfe@freebsd.org>,
    src-committers@freebsd.org, svn-src-all@freebsd.org
Subject: Re: svn commit: r242847 - in head/sys: i386/include kern
Message-ID: <CAGE5yCpn7znceWZsDqdw2tfCusSLroRjg6+=QJnonxoTQ8RjaA@mail.gmail.com>
In-Reply-To: <509F72B0.90201@mu.org>
References: <CAF6rxg=HPmQS1T-LFsZ=DuKEqH30iJFpkz+JGhLr4OBL8nohjg@mail.gmail.com>
    <509DC25E.5030306@mu.org> <509E3162.5020702@FreeBSD.org>
    <509E7E7C.9000104@mu.org>
    <CAF6rxgmV8dx-gsQceQKuMQEsJ+GkExcKYxEvQ3kY+5_nSjvA3w@mail.gmail.com>
    <509E830D.5080006@mu.org>
    <1352568275.17290.85.camel@revolution.hippie.lan>
    <CAGE5yCp4N7fML05-Tomm0TM-ROBSka5+b9EKJTFR+yUpFuGj5Q@mail.gmail.com>
    <20121111061517.H1208@besplex.bde.org>
    <CAGE5yCpExfeJHeUuO0FEEFMgeNzftaFSWT=D-yKGdP+1xnjZ4A@mail.gmail.com>
    <20121111073352.GA96046@FreeBSD.org> <509F72B0.90201@mu.org>
On Sun, Nov 11, 2012 at 1:41 AM, Alfred Perlstein <bright@mu.org> wrote:
> The real conversation goes like this:
>
> user: "Why is my box seeing terrible network performance?"
> bsdguy: "Increase nmbclusters."
> user: "what is that?"
> bsdguy: "Oh those are the mbufs, just tell me your current value."
> user: "oh it's like 128000"
> bsdguy: "hmm try doubling that, go sysctl kern.ipc.nmbclusters=512000 on
> the command line."
> user: "ok"
> .... an hour passes ...
> user: "hmm now I can't fork any more copies of apache.."
> bsdguy: "oh, ok, you need to increase maxproc for that."
> user: "so sysctl kern.ipc.maxproc=10000?"
> bsdguy: "no... one second..."
> ....
> bsdguy: "ok, so that's sysctl kern.maxproc=10000"
> user: "ok... bbiaf"
> ....
> user: "so now i'm getting log messages about can't open sockets..."
> bsdguy: "oh you need to increase sockets bro... one second..."
> user: "sysctl kern.maxsockets?"
> bsdguy: "oh no.. it's actually back to kern.ipc.maxsockets"
> user: "alrighty then.."
> ....
> ....
> bsdguy: "so how is freebsd since I helped you tune it?"
> user: "well i kept hitting other resource limits, boss made me switch to
> Linux, it works out of the box and doesn't require an expert tuner to
> run a large scale server. Y'know as a last ditch effort I looked around
> for this 'maxusers' thing but it seems like some eggheads retired it and
> instead of putting my job at risk, I just went with Linux, no one gets
> fired for using Linux."
> bsdguy: "managers are lame!"
> user: "yeah! managers..."
>
> -Alfred

Now Alfred, I know that deliberately playing dumb is fun, but there is no
network difference between doubling "kern.maxusers" in loader.conf (the
only place it can be set; it isn't runtime-tunable) and doubling
"kern.ipc.nmbclusters" in the same place.  We've always allowed people to
fine-tune the derived settings at runtime where that is possible.
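For concreteness, both knobs land in the same file at boot time.  A
hypothetical loader.conf fragment might look like the following (the
numbers are taken from the dialogue above purely as examples, not as
recommendations):

```shell
# /boot/loader.conf -- boot-time tunables, read before the kernel starts.
# kern.maxusers can ONLY be set here; it is not runtime-tunable.
kern.maxusers="512"               # example value only
kern.ipc.nmbclusters="512000"     # value from the dialogue above

# Settings derived from maxusers can still be fine-tuned at runtime
# where the kernel permits it, e.g.:
#   sysctl kern.ipc.nmbclusters=512000
#   sysctl kern.maxproc=10000
```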
My position is still that instead of dicking around with maxusers curve
slopes to try and somehow get the scaling right, we should be setting
sensible defaults right from the start.  The current scaling was written
when we had severe kva constraints, did reservations, etc.  Now, on most
platforms, these values are merely caps on dynamic allocators.  "Sensible"
defaults would be *way* higher than the current maxusers-derived scaling
curves.

My quick survey:

  8G ram   -> 65088 clusters -> clusters capped at 6.2% of physical ram (running head)
  3.5G ram -> 25600 clusters -> clusters capped at 5.0% of physical ram (running an old head)
  32G ram  -> 25600 clusters -> clusters capped at 1.5% of physical ram (running 9.1-stable)
  72G ram  -> 25600 clusters -> clusters capped at 0.06% of physical ram (9.1-stable again)

As I've been saying from the beginning: since these are limits on dynamic
allocators, not reservations, they should be as high as we can comfortably
set them without risking running out of other resources.

As the code stands now, the derived limits for the 4k, 9k and 16k jumbo
clusters are each approximately the same space as the 2k clusters (ie:
1 x 4k cluster per 2 x 2k clusters, 1 x 16k cluster per 8 x 2k clusters,
and so on).  If we set a constant 6% for nmbclusters (since that's roughly
where we're at now for smaller machines after Alfred's changes), then the
worst-case scenarios for 4k, 9k and 16k clusters are 6% each, ie: 24% of
wired, physical ram.  Plus all the other values derived from the
nmbclusters tunable at boot.

I started writing this with the intention of suggesting 10%, but that
might be a bit high given that:

  kern_mbuf.c:    nmbjumbop = nmbclusters / 2;
  kern_mbuf.c:    nmbjumbo9 = nmbclusters / 4;
  kern_mbuf.c:    nmbjumbo16 = nmbclusters / 8;

.. basically quadruples the worst-case limits.  Out of the box, though,
6% is infinitely better than the 0.06% we currently get on a 9-stable
machine with 72G of ram.
But I object to dicking around with "maxusers" to derive the default
network buffer space limits.  If we settle on something like 6%, then it
should be 6%.  That makes the tunable easy to document and its meaning
easy to explain.

-- 
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell