Date:      Sun, 11 Nov 2012 08:47:24 -0800
From:      Alfred Perlstein <bright@mu.org>
To:        Peter Wemm <peter@wemm.org>
Cc:        "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>, Alexey Dokuchaev <danfe@freebsd.org>, "src-committers@freebsd.org" <src-committers@freebsd.org>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>
Subject:   Re: svn commit: r242847 - in head/sys: i386/include kern
Message-ID:  <15512D0F-D403-4341-92D7-DFA03FDC2D88@mu.org>
In-Reply-To: <CAGE5yCpn7znceWZsDqdw2tfCusSLroRjg6+=QJnonxoTQ8RjaA@mail.gmail.com>
References:  <CAF6rxg=HPmQS1T-LFsZ=DuKEqH30iJFpkz+JGhLr4OBL8nohjg@mail.gmail.com> <509DC25E.5030306@mu.org> <509E3162.5020702@FreeBSD.org> <509E7E7C.9000104@mu.org> <CAF6rxgmV8dx-gsQceQKuMQEsJ+GkExcKYxEvQ3kY+5_nSjvA3w@mail.gmail.com> <509E830D.5080006@mu.org> <1352568275.17290.85.camel@revolution.hippie.lan> <CAGE5yCp4N7fML05-Tomm0TM-ROBSka5+b9EKJTFR+yUpFuGj5Q@mail.gmail.com> <20121111061517.H1208@besplex.bde.org> <CAGE5yCpExfeJHeUuO0FEEFMgeNzftaFSWT=D-yKGdP+1xnjZ4A@mail.gmail.com> <20121111073352.GA96046@FreeBSD.org> <509F72B0.90201@mu.org> <CAGE5yCpn7znceWZsDqdw2tfCusSLroRjg6+=QJnonxoTQ8RjaA@mail.gmail.com>

I think there are two issues here.

One: you have a much better idea of how to tune nmbclusters than I do. Cool! Please put that into the code. I really think that's great, and the time you've put into giving it serious thought is helpful to all.

Two: you want to divorce nmbclusters (and therefore maxsockets and some other tunables) from maxusers even though that has been the way to flip a big switch for ages now. This I think is very wrong.

"oh you only have to change 1 thing!"

Wait... What was that sound?  Oh, it was a toilet flushing away 15 years of mailing list information, FAQs and user knowledge because the word "maxusers" is no longer hip to the community. That is bad. Please don't do that.




On Nov 11, 2012, at 2:53 AM, Peter Wemm <peter@wemm.org> wrote:

> On Sun, Nov 11, 2012 at 1:41 AM, Alfred Perlstein <bright@mu.org> wrote:
>> The real conversation goes like this:
>>
>> user: "Why is my box seeing terrible network performance?"
>> bsdguy: "Increase nmbclusters."
>> user: "what is that?"
>> bsdguy: "Oh those are the mbufs, just tell me your current value."
>> user: "oh it's like 128000"
>> bsdguy: "hmm try doubling that, go sysctl kern.ipc.nmbclusters=512000 on the
>> command line."
>> user: "ok"
>> .... an hour passes ...
>> user: "hmm now I can't fork any more copies of apache.."
>> bsdguy: "oh, ok, you need to increase maxproc for that."
>> user: "so sysctl kern.ipc.maxproc=10000?"
>> bsdguy: "no... one second..."
>> ....
>> bsdguy: "ok, so that's sysctl kern.maxproc=10000"
>> user: "ok... bbiaf"
>> ....
>> user: "so now i'm getting log messages about can't open sockets..."
>> bsdguy: "oh you need to increase sockets bro... one second..."
>> user: "sysctl kern.maxsockets?"
>> bsdguy: "oh no.. it's actually back to kern.ipc.maxsockets"
>> user: "alrighty then.."
>> ....
>> ....
>> bsdguy: "so how is freebsd since I helped you tune it?"
>> user: "well i kept hitting other resource limits, boss made me switch to
>> Linux, it works out of the box and doesn't require an expert tuner to run a
>> large scale server.  Y'know as a last ditch effort I looked around for this
>> 'maxusers' thing but it seems like some eggheads retired it and instead of
>> putting my job at risk, I just went with Linux, no one gets fired for using
>> Linux."
>> bsdguy: "managers are lame!"
>> user: "yeah!  managers..."
>>
>> -Alfred
>
> Now Alfred.. I know that deliberately playing dumb is fun, but there
> is no practical difference between doubling "kern.maxusers" in
> loader.conf (the only place it can be set, it isn't runtime tunable)
> and doubling "kern.ipc.nmbclusters" in the same place.  We've always
> allowed people to fine-tune derived settings at runtime where it is
> possible.
>
> My position still is that instead of trying to dick around with
> maxusers curve slopes to try and somehow get the scaling right, we
> should instead be setting it sensibly right from the start, by default.
>
> The current scaling was written when we had severe kva constraints,
> did reservations, etc.  Now they're a cap on dynamic allocators on
> most platforms.
>
> "Sensible" defaults would be *way* higher than the current maxusers
> derived scaling curves.
>
> My quick survey:
> 8G ram -> 65088 clusters -> clusters capped at 6.2% of physical ram
> (running head)
> 3.5G ram -> 25600 clusters -> clusters capped at 5.0% of physical ram
> (running an old head)
> 32G ram -> 25600 clusters -> clusters capped at 1.5% of physical ram
> (running 9.1-stable)
> 72G ram -> 25600 clusters -> clusters capped at 0.06% of physical ram
> (9.1-stable again)
>
> As I've been saying from the beginning..  As these are limits on
> dynamic allocators, not reservations, they should be as high as we can
> comfortably set them without risking running out of other resources.
>
> As the code stands now..  the derived limits for 4k, 9k and 16k jumbo
> clusters each come to approximately the same space as 2K clusters.  (ie:
> 1 x 4k cluster per 2 x 2k clusters, 1 x 16k cluster per 8 x 2k clusters,
> and so on).  If we set a constant 6% for nmbclusters (since that's roughly
> where we're at now for smaller machines after Alfred's changes), then
> the worst case scenarios for 4k, 9k and 16k clusters are 6% each.  ie:
> 24% of wired, physical ram.
>
> Plus all the other values derived from the nmbclusters tunable at boot.
>
> I started writing this with the intention of suggesting 10% but that
> might be a bit high given that:
> kern_mbuf.c:        nmbjumbop = nmbclusters / 2;
> kern_mbuf.c:        nmbjumbo9 = nmbclusters / 4;
> kern_mbuf.c:        nmbjumbo16 = nmbclusters / 8;
> .. basically quadruples the worst case limits.
>
> Out of the box, 6% is infinitely better than the 0.06% we currently get
> on a 9-stable machine with 72G ram.
>
> But I object to dicking around with "maxusers" to derive network
> buffer space default limits.  If we settle on something like 6%, then
> it should be 6%.  That's easy to document and explain the meaning of
> the tunable.
> -- 
> Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
> "All of this is for nothing if we don't go to the stars" - JMS/B5
> "If Java had true garbage collection, most programs would delete
> themselves upon execution." -- Robert Sewell
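
For reference, the worst-case arithmetic in the quoted message can be sketched in C. This is only an illustration under stated assumptions, not the kernel's code: limits_from_ram() and worst_case_bytes() are hypothetical helper names, the "percent of RAM" sizing rule is the proposal under discussion rather than anything in the tree, and the cluster sizes are assumed to be 2048, 4096, 9 * 1024 and 16 * 1024 bytes. Only the three divisions are taken from the kern_mbuf.c lines quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cluster sizes in bytes (the 4k size is assumed to be one page). */
#define CL_2K   2048ULL
#define CL_4K   4096ULL
#define CL_9K   (9ULL * 1024)
#define CL_16K  (16ULL * 1024)

struct mbuf_limits {
	uint64_t nmbclusters;	/* 2k clusters */
	uint64_t nmbjumbop;	/* 4k jumbo clusters */
	uint64_t nmbjumbo9;	/* 9k jumbo clusters */
	uint64_t nmbjumbo16;	/* 16k jumbo clusters */
};

/*
 * Hypothetical helper: let 2k clusters use at most `pct` percent of
 * physical RAM, then derive the jumbo limits with the same divisions
 * as the kern_mbuf.c lines quoted in the thread.
 */
struct mbuf_limits
limits_from_ram(uint64_t physmem_bytes, unsigned pct)
{
	struct mbuf_limits l;

	assert(pct > 0 && pct <= 100);
	l.nmbclusters = physmem_bytes / 100 * pct / CL_2K;
	l.nmbjumbop = l.nmbclusters / 2;
	l.nmbjumbo9 = l.nmbclusters / 4;
	l.nmbjumbo16 = l.nmbclusters / 8;
	return (l);
}

/* Bytes consumed if every pool were filled to its limit at once. */
uint64_t
worst_case_bytes(const struct mbuf_limits *l)
{
	return (l->nmbclusters * CL_2K + l->nmbjumbop * CL_4K +
	    l->nmbjumbo9 * CL_9K + l->nmbjumbo16 * CL_16K);
}
```

Under these assumptions, 8 GiB of RAM at 6% gives nmbclusters = 251658 and a combined worst case just under 25% of RAM. The 2k, 4k and 16k pools each come to about 6%, and the 9k pool slightly more because 9216 / 4 is a little over 2048.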
