From: Andrew Gallatin
Date: Fri, 12 Jul 2002 18:55:37 -0400 (EDT)
To: Julian Elischer
Cc: Bosko Milekic, freebsd-net@FreeBSD.ORG
Subject: Re: mbuf external buffer reference counters
Message-ID: <15663.24169.445698.304534@grasshopper.cs.duke.edu>

Julian Elischer writes:
 > On Fri, 12 Jul 2002, Giorgos Keramidas wrote:
 >
 > > On 2002-07-12 07:45 +0000, Bosko Milekic wrote:
 > > >
 > > > So I guess that what we're dealing with isn't really a
 > > > "monodirectional" ring.  Right?
 > >
 > > No it isn't.  It looks more like the "dining philosophers" problem.
 > > But that problem's solution would require at least one mutex for every
 > > part of the ring :-(
 >
 > The stuff under consideration originally came from OSF/1 which became
 > true-64
 >
 > that was heavily SMP
 > can anyone find out what they did?

From looking at a Tru64 5.1 header file, it looks like they do per-ext
locking and declare an MBUF_EXT_LOCK(m) macro.  It is not clear how one
is supposed to use this, and it appears to be undocumented.  Tru64 also
has a global mbuf lock.

Tru64 4.x does not appear to have MBUF_EXT_LOCK (so I think it uses
just the global MBUF_LOCK for all mbuf manipulations, and I'll bet that
just does a 'splimp' on UP systems).

AIX also has this nice ext_refq structure, and it also appears to be
doing per-ext locking.  From mbuf.h, AIX's ext mbufs are all just
malloc'ed memory.  This jibes with the pain & suffering I had when
writing an ethernet driver for AIX & finding mbufs which cross page
boundaries.

MacOS-X seems to have both a refq and a refcnt array like in -stable.
It appears to use the refq for externally managed data and the refcnt
for system clusters.  As for locking, it looks a lot like Tru64 4.x --
it has a global mbuf lock.  Perhaps this is what the original Mach did?

WRT using refqs -- I think that Bosko's system in -current is just as
nice from a user's perspective, and if we can work out an acceptable
solution for doing refcnts, let's not revert to refqs.

I agree with John about where to put the refcnts: I think we should
have a big hunk of memory for the refcnts like in -stable.  My
understanding is that the larger virtually contig mbufs are the only
thing that would cause a problem for this, or is that incorrect?  If
so, then why not just put their counter elsewhere?

One concrete example against putting the refcnts into the cluster is
that it would cause NFS servers & clients to use 25% more mbufs for a
typical 8K read or write request.

Drew
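
The two placements discussed in the message can be illustrated with a
minimal user-space sketch. It is not FreeBSD kernel code: every name
here (cluster_pool, cl_refcnt, cl_index, cl_ref, cl_unref,
clusters_needed) is hypothetical, and the 2048-byte cluster size, the
16-cluster pool, and the unsigned int counter are assumptions made for
the example. The first half keeps the counters in a separate array
indexed by cluster position, in the spirit of the "big hunk of memory"
approach; the last function shows one way to read the 25% figure, since
an 8K payload stops fitting in four clusters once a counter is carved
out of each one.

```c
#include <assert.h>
#include <stddef.h>

/* All names and sizes below are hypothetical, chosen for illustration. */
#define CL_BYTES   2048u	/* assumed cluster size */
#define NCLUSTERS  16u		/* assumed number of clusters in the pool */

/* Cluster storage and a separate side table holding one counter each. */
static unsigned char cluster_pool[NCLUSTERS * CL_BYTES];
static unsigned int  cl_refcnt[NCLUSTERS];

/* Map any address inside the pool to the index of its cluster. */
static size_t
cl_index(const void *p)
{
	ptrdiff_t off = (const unsigned char *)p - cluster_pool;

	assert(off >= 0 && (size_t)off < sizeof(cluster_pool));
	return ((size_t)off / CL_BYTES);
}

/*
 * Take or drop a reference and return the new count.  There is no
 * locking and there are no atomics here; SMP-safe code would need
 * one or the other.
 */
static unsigned int
cl_ref(const void *p)
{
	return (++cl_refcnt[cl_index(p)]);
}

static unsigned int
cl_unref(const void *p)
{
	size_t i = cl_index(p);

	assert(cl_refcnt[i] > 0);
	return (--cl_refcnt[i]);
}

/* Clusters needed for a payload when each cluster offers 'usable' bytes. */
static size_t
clusters_needed(size_t payload, size_t usable)
{
	return ((payload + usable - 1) / usable);
}
```

Under these assumptions, clusters_needed(8192, CL_BYTES) gives 4 with
the side table, while clusters_needed(8192, CL_BYTES - sizeof(unsigned
int)) gives 5 once the counter lives inside the cluster: one extra
cluster out of four.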