From: Alfred Perlstein <bright@mu.org>
To: Peter Wemm
Cc: svn-src-head@freebsd.org, Alexey Dokuchaev, src-committers@freebsd.org, svn-src-all@freebsd.org
Subject: Re: svn commit: r242847 - in head/sys: i386/include kern
Date: Sun, 11 Nov 2012 08:47:24 -0800

I think there are two issues here.

One: you have a much better idea of how to tune nmbclusters than I do. Cool!
Please put that into the code. I really think that's great, and the time
you've put into giving it serious thought is helpful to all.

Two: you want to divorce nmbclusters (and therefore maxsockets and some other
tunables) from maxusers, even though that has been the way to flip a big
switch for ages now. This, I think, is very wrong.

"oh you only have to change 1 thing!"

Wait... what was that sound? Oh, it was a toilet flushing away 15 years of
mailing list information, FAQs, and user knowledge, because the word
"maxusers" is no longer hip to the community. That is bad. Please don't do
that.

On Nov 11, 2012, at 2:53 AM, Peter Wemm wrote:

> On Sun, Nov 11, 2012 at 1:41 AM, Alfred Perlstein wrote:
>> The real conversation goes like this:
>>
>> user: "Why is my box seeing terrible network performance?"
>> bsdguy: "Increase nmbclusters."
>> user: "what is that?"
>> bsdguy: "Oh those are the mbufs, just tell me your current value."
>> user: "oh it's like 128000"
>> bsdguy: "hmm try doubling that, go sysctl kern.ipc.nmbclusters=512000 on
>> the command line."
>> user: "ok"
>> .... an hour passes ...
>> user: "hmm now I can't fork any more copies of apache.."
>> bsdguy: "oh, ok, you need to increase maxproc for that."
>> user: "so sysctl kern.ipc.maxproc=10000?"
>> bsdguy: "no... one second..."
>> ....
>> bsdguy: "ok, so that's sysctl kern.maxproc=10000"
>> user: "ok... bbiaf"
>> ....
>> user: "so now i'm getting log messages about can't open sockets..."
>> bsdguy: "oh you need to increase sockets bro... one second..."
>> user: "sysctl kern.maxsockets?"
>> bsdguy: "oh no.. it's actually back to kern.ipc.maxsockets"
>> user: "alrighty then.."
>> ....
>> ....
>> bsdguy: "so how is freebsd since I helped you tune it?" >> user: "well i kept hitting other resource limits, boss made me switch to >> Linux, it works out of the box and doesn't require an expert tuner to run= a >> large scale server. Y'know as a last ditch effort I looked around for th= is >> 'maxusers' thing but it seems like some eggheads retired it and instead o= f >> putting my job at risk, I just went with Linux, no one gets fired for usi= ng >> Linux." >> bsdguy: "managers are lame!" >> user: "yeah! managers..." >>=20 >> -Alfred >=20 > Now Albert.. I know that deliberately playing dumb is fun, but there > is no network difference between doubling "kern.maxusers" in > loader.conf (the only place it can be set, it isn't runtime tuneable) > and doubling "kern.ipc.nmbclusters" in the same place. We've always > allowed people to fine-tune derived settings at runtime where it is > possible. >=20 > My position still is that instead of trying to dick around with > maxusers curve slopes to try and somehow get the scaling right, we > should instead be setting sensibly right from the start, by default. >=20 > The current scaling was written when we had severe kva constraints, > did reservations, etc. Now they're a cap on dynamic allocators on > most platforms. >=20 > "Sensible" defaults would be *way* higher than the current maxusers > derived scaling curves. >=20 > My quick survey: > 8G ram -> 65088 clusters -> clusters capped at 6.2% of physical ram > (running head) > 3.5G ram -> 25600 clusters -> clusters capped at 5.0% of physical ram > (running an old head) > 32G ram -> 25600 clusters -> clusters capped at 1.5% of physical ram > (running 9.1-stable) > 72G ram -> 25600 clusters -> clusters capped at 0.06% of physical ram > (9.1-stable again) >=20 > As I've been saying from the beginning.. As these are limits on > dynamic allocators, not reservations, they should be as high as we can > comfortably set them without risking running out of other resources. >=20 > As the code stands now.. the derived limits for 4k, 9k and 16k jumbo > clusters is approximately the same space as 2K clusters. (ie: 1 x 4k > cluster per 2 x 2k clusters, 1 x 16k cluster per 8 2k clusters, and so > on). If we set a constant 6% for nmbclusters (since that's roughly > where we're at now for smaller machines after albert's changes), then > the worse case scenarios for 4k, 9k and 16k clusters are 6% each. ie: > 24% of wired, physical ram. >=20 > Plus all the other values derived from the nmbclusters tunable at boot. >=20 > I started writing this with the intention of suggesting 10% but that > might be a bit high given that: > kern_mbuf.c: nmbjumbop =3D nmbclusters / 2; > kern_mbuf.c: nmbjumbo9 =3D nmbclusters / 4; > kern_mbuf.c: nmbjumbo16 =3D nmbclusters / 8; > .. basically quadruples the worst case limits. >=20 > Out of the box, 6% is infinitely better than we 0.06% we currently get > on a 9-stable machine with 72G ram. >=20 > But I object to dicking around with "maxusers" to derive network > buffer space default limits. If we settle on something like 6%, then > it should be 6%. That's easy to document and explain the meaning of > the tunable. > --=20 > Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJ= V > "All of this is for nothing if we don't go to the stars" - JMS/B5 > "If Java had true garbage collection, most programs would delete > themselves upon execution." -- Robert Sewell