Date: Fri, 30 Nov 2001 21:32:34 -0500
From: Leo Bicknell
To: Luigi Rizzo
Cc: Mike Silbersack, Alfred Perlstein, freebsd-hackers@FreeBSD.ORG
Subject: Re: TCP Performance Graphs

On Fri, Nov 30, 2001 at 05:48:16PM -0800, Luigi Rizzo wrote:
> On Fri, Nov 30, 2001 at 08:39:05PM -0500, Leo Bicknell wrote:
> > Note that if we implement a 'fair share' buffering scheme we would
> > never get a failure, which would be a good thing. Unfortunately
> > fair share is relatively complicated.
>
> i don't get this. There is no relation among the max number
> of mbufs and their potential consumers, such as network interfaces,
> sockets, dummynet pipes, and others. And so it is unavoidable
> that even giving 1 mbuf each, you'll eventually fail an allocation.

Well, this is true.
If the number of sockets exceeds the number of MBUFs you will
run out, no matter how well you allocate them.  A corner case that
should be handled delicately, no doubt, but one much less likely to
happen.

If each client were limited to one, or even two, MBUFs, total
throughput would be so slow that the admin of the box would notice.
That, added to the fact that there are thousands of MBUFs by default,
makes it nearly impossible that the "ignorant sysadmin" (aka the
"desktop, it should just work" user) would run into this case.

So, I will rephrase.  I think a fair-share scheme would solve this for
at least five nines (99.999%) of cases.

> But note that what you say about bad failures is not really true.
> Many pieces of the kernel now are pretty robust in the face of
> failures -- certainly dummynet pipes, and the "sis" and "dc" drivers

By 'bad failures' I don't so much mean that the box would crash or
otherwise completely break itself.  Rather, my experience with
exhausting MBUFs is that you can get a sort of "capture" situation,
where one or more busy connections essentially starve out inactive
connections.  Those inactive connections may well be the ssh session
where you're trying to fix it.  Network performance when MBUFs are
exhausted is erratic at best, and at worst completely stopped for a
large number of processes on the system today.

The nasty QoS word popped up when we talked about this before: a QoS
scheme could ensure some connections got MBUFs, or, even if there were
more connections than MBUFs, ensure that connections got two at a time
in a 'round robin' fashion or by some other scheme to keep everything
moving.

If I could redesign buffering (from a TCP point of view) from the
ground up I would:

  - Make the buffer size dynamic.  Perhaps not at interrupt time, but
    in a "unified vm" the network should be able to take resources if
    it is active.

  - Make the buffers dynamically track individual connections.

  - Implement a fair-share mechanism.
  - Provide instrumentation to track when connections are slowed for
    lack of MBUFs.

  - Provide tuning parameters and maybe QoS parameters to be able to
    manage total buffer usage, individual connection buffer usage, and
    connection priorities.

-- 
Leo Bicknell - bicknell@ufp.org - CCIE 3440
        PGP keys at http://www.ufp.org/~bicknell/
Read TMBG List - tmbg-list-request@tmbg.org, www.tmbg.org
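
To make the fair-share idea above concrete, here is a minimal sketch of
a max-min split of a fixed buffer pool.  It is only an illustration and
not FreeBSD code: the function name fair_share, the connection labels,
and the pool size are all made up, and a real kernel allocator would
have to do this incrementally as buffers are taken and freed.

```python
def fair_share(pool_size, demands):
    """Max-min fair split of pool_size buffers across connections.

    demands maps a (hypothetical) connection label to the number of
    buffers that connection would like.  Returns a dict mapping each
    label to the number of buffers granted.
    """
    grants = {conn: 0 for conn in demands}
    remaining = pool_size
    # Connections that still want more than they have been granted.
    unsatisfied = sorted(c for c in demands if demands[c] > 0)
    while remaining > 0 and unsatisfied:
        # Equal share of what is left, but at least one buffer per turn.
        share = max(remaining // len(unsatisfied), 1)
        still_hungry = []
        for conn in unsatisfied:
            if remaining == 0:
                break
            give = min(share, demands[conn] - grants[conn], remaining)
            grants[conn] += give
            remaining -= give
            if grants[conn] < demands[conn]:
                still_hungry.append(conn)
        unsatisfied = still_hungry
    return grants


if __name__ == "__main__":
    # One greedy bulk transfer, an idle ssh session, a modest web client.
    print(fair_share(10, {"bulk": 100, "ssh": 2, "web": 4}))
```

With a pool of 10, the ssh session gets the 2 buffers it asked for and
the other two connections get 4 each, so the busy connection cannot
starve the quiet one.  When there are more connections than buffers,
this sketch simply hands one buffer each to the first few labels in
sorted order; the round-robin rotation mentioned above would have to be
layered on top.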