From: Bosko Milekic
To: Wes Peters
cc: freebsd-current@freebsd.org
Date: Thu, 3 Jun 2004 10:03:52 -0700
Subject: Re: [HEADS-UP] mbuma is in the tree
Message-ID: <20040603170352.GA9029@freefall.freebsd.org>

Wes Peters wrote:
>It may also be worthwhile investigating eliminating clusters entirely. This
>is the point Poul-Henning, Robert and I were trying to make at the end of
>your talk at BSDCan.

I believe only Poul-Henning was actually suggesting something along
these lines at the end of the talk. As I explained then, this is not a
good approach.

First of all, it pessimizes the send case. You no longer have 2K of
space for payload; you have 2K minus whatever the mbuf struct takes up.
Secondly, even on the receiver side, it is probably not worth the
complication, especially with mbuma now.
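To make the send-side point concrete, here is a rough arithmetic sketch. It is not taken from the mbuma code: the 64-byte header overhead and the names ASSUMED_HDR_BYTES, payload_per_buffer, buffers_needed and compare are assumptions made up for illustration, and the real header size depends on the platform and kernel version. It only counts payload buffers, comparing a 2K cluster that holds 2048 bytes of data against a hypothetical 2K mbuf that must also hold its own header.

```python
# Illustrative arithmetic only. The header size is an assumption, not the
# real size of the mbuf header on any particular platform.
CLUSTER_BYTES = 2048      # payload room in a 2K cluster (external storage)
ASSUMED_HDR_BYTES = 64    # hypothetical mbuf header overhead


def payload_per_buffer(buf_bytes, hdr_bytes):
    """Bytes left for data when the header lives inside the buffer."""
    return buf_bytes - hdr_bytes


def buffers_needed(total_bytes, payload_bytes):
    """Buffers required to carry total_bytes (ceiling division)."""
    return -(-total_bytes // payload_bytes)


def compare(total_bytes):
    """Return (buffers using 2K clusters, buffers using 2K large mbufs)."""
    with_clusters = buffers_needed(total_bytes, CLUSTER_BYTES)
    large_payload = payload_per_buffer(CLUSTER_BYTES, ASSUMED_HDR_BYTES)
    with_large_mbufs = buffers_needed(total_bytes, large_payload)
    return with_clusters, with_large_mbufs


if __name__ == "__main__":
    for nbytes in (2048, 65536):
        print(nbytes, compare(nbytes))
```

Under these assumptions, a 2048-byte write that fits one cluster exactly would spill into a second buffer once the header shares the same 2K, and any write sized in multiples of 2K picks up the same kind of extra buffer.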
You should really both read the paper and read the code before you go
ahead and toy around with an idea like this. You will notice that when
we need both a cluster and an mbuf, there is no longer a
double-allocation in the common case. It is a single object allocation
as we maintain a secondary zone which caches mbufs with pre-allocated
clusters. That means you're looking at one pcpu UMA lock in the common
case right now and, if I have my way eventually, no locks whatsoever in
the common UMA allocation path.

By increasing the mbuf size, you will create a third type of object
called a "large mbuf." You'll still need clusters for the sender side,
and regular mbufs for smaller packets (there are many of these, refer
to the paper). So in actual fact, you're not fixing anything, you're
just introducing another object type that the mbuf code now needs to
identify and free appropriately from the free routine.

How about instead of this, you look into creating mini-mbufs, which are
sort of like regular mbufs, but without the internal data region, and
which are ONLY used for external storage. They work for all types of
external storage, waste less space, and can be cached within a UMA zone
and thus allocated as effectively a single object in the common case;
this is exactly what happens now already with m_getcl(), except that
there is some additional wastage due to the internal mbuf data region
not being used.

>Since the double allocation required to create a cluster makes the locking
>(and cache slushing) requirements go up, it is probably worthwhile to
>investigate if raising the nominal mbuf size doesn't end up decreasing
>overall memory pressure. If you allocate more memory, but the allocation
>takes less time due to the simpler locking, you may actually decrease the
>total memory need.

No. See above.

>This is worth investigating partly because it is such a simple change. I
>propose investigating with mbuf size of 2K, large enough to fit standard
>ethernet frames, and a cluster size of 8K, which means a cluster mbuf is
>large enough to hold a 9K jumbo frame.

LOL.

>Now that you've got mbuma in the tree, I can test this for you, unless this
>proposal catches your interest enough to give it a try. I'll see if I
>can't get a couple of our beefier machines at work updated to -CURRENT in
>the next week.
>
>Thanks for the good work.

Sure. You can test whatever you like.

-Bosko