From: Peter Wemm <peter@wemm.org>
To: Alfred Perlstein
Cc: svn-src-head@freebsd.org, Alexey Dokuchaev, src-committers@freebsd.org, svn-src-all@freebsd.org
Date: Sun, 11 Nov 2012 02:53:33 -0800
Subject: Re: svn commit: r242847 - in head/sys: i386/include kern
List-Id: SVN commit messages for the entire src tree (svn-src-all@freebsd.org)

On Sun, Nov 11, 2012 at 1:41 AM, Albert Perlstein wrote:
> The real conversation goes like this:
>
> user: "Why is my box seeing terrible network performance?"
> bsdguy: "Increase nmbclusters."
> user: "what is that?"
> bsdguy: "Oh those are the mbufs, just tell me your current value."
> user: "oh it's like 128000"
> bsdguy: "hmm try doubling that, go sysctl kern.ipc.nmbclusters=512000 on
> the command line."
> user: "ok"
> .... an hour passes ...
> user: "hmm now I can't fork any more copies of apache.."
> bsdguy: "oh, ok, you need to increase maxproc for that."
> user: "so sysctl kern.ipc.maxproc=10000?"
> bsdguy: "no... one second..."
> ....
> bsdguy: "ok, so that's sysctl kern.maxproc=10000"
> user: "ok... bbiaf"
> ....
> user: "so now i'm getting log messages about can't open sockets..."
> bsdguy: "oh you need to increase sockets bro... one second..."
> user: "sysctl kern.maxsockets?"
> bsdguy: "oh no.. it's actually back to kern.ipc.maxsockets"
> user: "alrighty then.."
> ....
> ....
> bsdguy: "so how is freebsd since I helped you tune it?"
> user: "well i kept hitting other resource limits, boss made me switch to
> Linux, it works out of the box and doesn't require an expert tuner to run
> a large scale server. Y'know as a last ditch effort I looked around for
> this 'maxusers' thing but it seems like some eggheads retired it and
> instead of putting my job at risk, I just went with Linux, no one gets
> fired for using Linux."
> bsdguy: "managers are lame!"
> user: "yeah! managers..."
>
> -Alfred

Now Albert.. I know that deliberately playing dumb is fun, but there is no
network difference between doubling "kern.maxusers" in loader.conf (the
only place it can be set; it isn't runtime tunable) and doubling
"kern.ipc.nmbclusters" in the same place. We've always allowed people to
fine-tune the derived settings at runtime where that is possible.

My position is still that instead of dicking around with maxusers curve
slopes to try and somehow get the scaling right, we should be setting
sensible defaults right from the start. The current scaling was written
when we had severe kva constraints, did reservations, etc. Now these values
are caps on dynamic allocators on most platforms. "Sensible" defaults would
be *way* higher than the current maxusers-derived scaling curves.

My quick survey:

  8G ram   -> 65088 clusters -> clusters capped at 6.2% of physical ram (running head)
  3.5G ram -> 25600 clusters -> clusters capped at 5.0% of physical ram (running an old head)
  32G ram  -> 25600 clusters -> clusters capped at 1.5% of physical ram (running 9.1-stable)
  72G ram  -> 25600 clusters -> clusters capped at 0.06% of physical ram (9.1-stable again)

As I've been saying from the beginning: since these are limits on dynamic
allocators, not reservations, they should be as high as we can comfortably
set them without risking running out of other resources.

As the code stands now, the derived limits for the 4k, 9k and 16k jumbo
clusters are each approximately the same space as the 2k clusters.
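Taking the nmbjumbop/nmbjumbo9/nmbjumbo16 divisors quoted from kern_mbuf.c
at face value, that space equivalence is easy to check. A quick sketch (the
per-cluster byte sizes 2048/4096/9216/16384 and the nmbclusters value here
are illustrative assumptions, not taken from the commit):

```python
# Illustrative check: each jumbo pool's byte cap lands near the 2k pool's cap.
nmbclusters = 512000            # example value from the conversation above

# Derived limits, using the divisors quoted from kern_mbuf.c:
nmbjumbop  = nmbclusters // 2   # page-size (4k) jumbo clusters
nmbjumbo9  = nmbclusters // 4   # 9k jumbo clusters
nmbjumbo16 = nmbclusters // 8   # 16k jumbo clusters

pool_2k  = nmbclusters * 2048
pool_4k  = nmbjumbop  * 4096
pool_9k  = nmbjumbo9  * 9216
pool_16k = nmbjumbo16 * 16384

assert pool_4k == pool_2k              # 4k pool caps at exactly the 2k pool's bytes
assert pool_16k == pool_2k             # same for the 16k pool
assert pool_9k == nmbclusters * 2304   # 9k pool runs about 12% larger

# With all four pools full, the worst case is ~4x the 2k pool alone:
worst = pool_2k + pool_4k + pool_9k + pool_16k
print(worst / pool_2k)          # 4.125
```

That ~4x factor is what turns a per-pool percentage cap into roughly four
times as much wired ram in the all-pools-full worst case.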
(ie: 1 x 4k cluster per 2 x 2k clusters, 1 x 16k cluster per 8 x 2k
clusters, and so on). If we set a constant 6% for nmbclusters (since that's
roughly where we're at now for smaller machines after albert's changes),
then the worst-case scenarios for 4k, 9k and 16k clusters are 6% each, ie:
24% of wired, physical ram. Plus all the other values derived from the
nmbclusters tunable at boot.

I started writing this with the intention of suggesting 10%, but that might
be a bit high given that:

  kern_mbuf.c:    nmbjumbop = nmbclusters / 2;
  kern_mbuf.c:    nmbjumbo9 = nmbclusters / 4;
  kern_mbuf.c:    nmbjumbo16 = nmbclusters / 8;

.. basically quadruples the worst-case limits.

Out of the box, 6% is infinitely better than the 0.06% we currently get on
a 9-stable machine with 72G ram. But I object to dicking around with
"maxusers" to derive the default limits on network buffer space. If we
settle on something like 6%, then it should be 6%. A flat percentage is
easy to document, and the meaning of the tunable is easy to explain.

--
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete themselves
upon execution." -- Robert Sewell