Date: Sun, 11 Nov 2012 02:53:33 -0800
From: Peter Wemm <peter@wemm.org>
To: Alfred Perlstein <bright@mu.org>
Cc: svn-src-head@freebsd.org, Alexey Dokuchaev <danfe@freebsd.org>,
    src-committers@freebsd.org, svn-src-all@freebsd.org
Subject: Re: svn commit: r242847 - in head/sys: i386/include kern
Message-ID: <CAGE5yCpn7znceWZsDqdw2tfCusSLroRjg6+=QJnonxoTQ8RjaA@mail.gmail.com>
In-Reply-To: <509F72B0.90201@mu.org>
References: <CAF6rxg=HPmQS1T-LFsZ=DuKEqH30iJFpkz+JGhLr4OBL8nohjg@mail.gmail.com>
    <509DC25E.5030306@mu.org> <509E3162.5020702@FreeBSD.org>
    <509E7E7C.9000104@mu.org>
    <CAF6rxgmV8dx-gsQceQKuMQEsJ+GkExcKYxEvQ3kY+5_nSjvA3w@mail.gmail.com>
    <509E830D.5080006@mu.org>
    <1352568275.17290.85.camel@revolution.hippie.lan>
    <CAGE5yCp4N7fML05-Tomm0TM-ROBSka5+b9EKJTFR+yUpFuGj5Q@mail.gmail.com>
    <20121111061517.H1208@besplex.bde.org>
    <CAGE5yCpExfeJHeUuO0FEEFMgeNzftaFSWT=D-yKGdP+1xnjZ4A@mail.gmail.com>
    <20121111073352.GA96046@FreeBSD.org> <509F72B0.90201@mu.org>
On Sun, Nov 11, 2012 at 1:41 AM, Alfred Perlstein <bright@mu.org> wrote:
> The real conversation goes like this:
>
> user: "Why is my box seeing terrible network performance?"
> bsdguy: "Increase nmbclusters."
> user: "what is that?"
> bsdguy: "Oh those are the mbufs, just tell me your current value."
> user: "oh it's like 128000"
> bsdguy: "hmm try doubling that, go sysctl kern.ipc.nmbclusters=512000 on
> the command line."
> user: "ok"
> .... an hour passes ...
> user: "hmm now I can't fork any more copies of apache.."
> bsdguy: "oh, ok, you need to increase maxproc for that."
> user: "so sysctl kern.ipc.maxproc=10000?"
> bsdguy: "no... one second..."
> ....
> bsdguy: "ok, so that's sysctl kern.maxproc=10000"
> user: "ok... bbiaf"
> ....
> user: "so now i'm getting log messages about can't open sockets..."
> bsdguy: "oh you need to increase sockets bro... one second..."
> user: "sysctl kern.maxsockets?"
> bsdguy: "oh no.. it's actually back to kern.ipc.maxsockets"
> user: "alrighty then.."
> ....
> ....
> bsdguy: "so how is freebsd since I helped you tune it?"
> user: "well i kept hitting other resource limits, boss made me switch to
> Linux, it works out of the box and doesn't require an expert tuner to
> run a large scale server. Y'know as a last ditch effort I looked around
> for this 'maxusers' thing but it seems like some eggheads retired it and
> instead of putting my job at risk, I just went with Linux, no one gets
> fired for using Linux."
> bsdguy: "managers are lame!"
> user: "yeah! managers..."
>
> -Alfred

Now Alfred, I know that deliberately playing dumb is fun, but there is no
network difference between doubling "kern.maxusers" in loader.conf (the
only place it can be set; it isn't runtime-tunable) and doubling
"kern.ipc.nmbclusters" in the same place.  We've always allowed people to
fine-tune the derived settings at runtime where that is possible.
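For concreteness, both knobs land in the same file at boot time.  A
hypothetical loader.conf fragment might look like the following (the
numbers are taken from the dialogue above purely as examples, not as
recommendations):

```shell
# /boot/loader.conf -- boot-time tunables, read before the kernel starts.
# kern.maxusers can ONLY be set here; it is not runtime-tunable.
kern.maxusers="512"               # example value only
kern.ipc.nmbclusters="512000"     # value from the dialogue above

# Settings derived from maxusers can still be fine-tuned at runtime
# where the kernel permits it, e.g.:
#   sysctl kern.ipc.nmbclusters=512000
#   sysctl kern.maxproc=10000
```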
My position is still that instead of dicking around with maxusers curve
slopes to try and somehow get the scaling right, we should be setting
sensible defaults right from the start.  The current scaling was written
when we had severe kva constraints, did reservations, etc.  Now, on most
platforms, these values are merely caps on dynamic allocators.  "Sensible"
defaults would be *way* higher than the current maxusers-derived scaling
curves.

My quick survey:

  8G ram   -> 65088 clusters -> clusters capped at 6.2% of physical ram (running head)
  3.5G ram -> 25600 clusters -> clusters capped at 5.0% of physical ram (running an old head)
  32G ram  -> 25600 clusters -> clusters capped at 1.5% of physical ram (running 9.1-stable)
  72G ram  -> 25600 clusters -> clusters capped at 0.06% of physical ram (9.1-stable again)

As I've been saying from the beginning: since these are limits on dynamic
allocators, not reservations, they should be as high as we can comfortably
set them without risking running out of other resources.

As the code stands now, the derived limits for the 4k, 9k and 16k jumbo
clusters are each approximately the same space as the 2k clusters (ie:
1 x 4k cluster per 2 x 2k clusters, 1 x 16k cluster per 8 x 2k clusters,
and so on).  If we set a constant 6% for nmbclusters (since that's roughly
where we're at now for smaller machines after Alfred's changes), then the
worst-case scenarios for 4k, 9k and 16k clusters are 6% each, ie: 24% of
wired, physical ram.  Plus all the other values derived from the
nmbclusters tunable at boot.

I started writing this with the intention of suggesting 10%, but that
might be a bit high given that:

  kern_mbuf.c:    nmbjumbop = nmbclusters / 2;
  kern_mbuf.c:    nmbjumbo9 = nmbclusters / 4;
  kern_mbuf.c:    nmbjumbo16 = nmbclusters / 8;

.. basically quadruples the worst-case limits.  Out of the box, though,
6% is infinitely better than the 0.06% we currently get on a 9-stable
machine with 72G of ram.
But I object to dicking around with "maxusers" to derive the default
network buffer space limits.  If we settle on something like 6%, then it
should be 6%.  That makes the tunable easy to document and its meaning
easy to explain.

-- 
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell