From owner-freebsd-net Fri Jul 5 7:14:41 2002
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 5BCD837B400
	for ; Fri, 5 Jul 2002 07:14:38 -0700 (PDT)
Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 9F14143E31
	for ; Fri, 5 Jul 2002 07:14:37 -0700 (PDT)
	(envelope-from gallatin@cs.duke.edu)
Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu [152.3.145.30])
	by duke.cs.duke.edu (8.9.3/8.9.3) with ESMTP id KAA22717;
	Fri, 5 Jul 2002 10:14:35 -0400 (EDT)
Received: (from gallatin@localhost)
	by grasshopper.cs.duke.edu (8.11.6/8.9.1) id g65EE5W28982;
	Fri, 5 Jul 2002 10:14:05 -0400 (EDT)
	(envelope-from gallatin@cs.duke.edu)
From: Andrew Gallatin
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <15653.43437.658461.49860@grasshopper.cs.duke.edu>
Date: Fri, 5 Jul 2002 10:14:05 -0400 (EDT)
To: Bosko Milekic
Cc: "Kenneth D. Merry" , net@FreeBSD.ORG
Subject: Re: virtually contig jumbo mbufs (was Re: new zero copy sockets snapshot)
In-Reply-To: <20020705093435.A25047@unixdaemons.com>
References: <20020619120641.A18434@unixdaemons.com>
	<15633.17238.109126.952673@grasshopper.cs.duke.edu>
	<20020619233721.A30669@unixdaemons.com>
	<15633.62357.79381.405511@grasshopper.cs.duke.edu>
	<20020620114511.A22413@unixdaemons.com>
	<15634.534.696063.241224@grasshopper.cs.duke.edu>
	<20020620134723.A22954@unixdaemons.com>
	<15652.46870.463359.853754@grasshopper.cs.duke.edu>
	<20020705002056.A5365@unixdaemons.com>
	<15653.35919.24295.698563@grasshopper.cs.duke.edu>
	<20020705093435.A25047@unixdaemons.com>
X-Mailer: VM 6.75 under 21.1 (patch 12) "Channel Islands" XEmacs Lucid
Sender: owner-freebsd-net@FreeBSD.ORG
Precedence: bulk
List-ID: 
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe: 
List-Unsubscribe: 
X-Loop: FreeBSD.org

Bosko Milekic writes:
 > 
 >  [ -current trimmed ]
 > 
 > On Fri, Jul 05, 2002 at 08:08:47AM -0400, Andrew Gallatin wrote:
 > > Would this be easier or harder than simple, physically contiguous
 > > buffers?  I think that it's only worth doing if it's easier to manage
 > > at the system level; otherwise you might as well use physically
 > > contiguous mbufs.  My main goal is to see the per-driver caches of
 > > physical memory disappear ;)
 > 
 >   It would be much easier.  The problem with getting physically
 > contiguous memory is that shortly after the system gets going, memory
 > becomes fragmented.  So, from a system's perspective, it's very hard
 > to get physically contiguous pages.  This is why you see most drivers
 > that actually do this sort of thing pre-allocate a pool of such beasts
 > early on during boot up.  Unfortunately, this means that they'll be
 > eating up a lot of memory (some of which may stay unused forever) at a
 > very early stage.

What's worse, you might have two drivers, one with buffers but no load
and another with load but no buffers.  And contigmalloc() often fails
for loadable modules.
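For the curious, the pre-allocation pattern being complained about looks
roughly like the sketch below.  It is not taken from any particular
driver; the pool size, buffer size, and names are invented purely for
illustration:

/*
 * Rough sketch of the boot-time pre-allocation pattern described above:
 * grab a pool of physically contiguous ~9KB receive buffers with
 * contigmalloc(9) before physical memory fragments.  NJUMBO, JUMBO_BUFSZ
 * and jumbo_pool are illustrative names, not from any real driver.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/errno.h>
#include <sys/malloc.h>
#include <vm/vm.h>
#include <vm/vm_extern.h>

#define NJUMBO		256		/* reserved up front, used or not */
#define JUMBO_BUFSZ	(9 * 1024)	/* one physically contiguous buffer */

static caddr_t jumbo_pool[NJUMBO];

static int
jumbo_pool_init(void)
{
	int i;

	for (i = 0; i < NJUMBO; i++) {
		/* page-aligned, anywhere in physical memory, no boundary */
		jumbo_pool[i] = contigmalloc(JUMBO_BUFSZ, M_DEVBUF, M_NOWAIT,
		    0, ~0UL, PAGE_SIZE, 0);
		if (jumbo_pool[i] == NULL) {
			/* common after boot, e.g. from a loadable module */
			while (--i >= 0)
				contigfree(jumbo_pool[i], JUMBO_BUFSZ,
				    M_DEVBUF);
			return (ENOMEM);
		}
	}
	return (0);
}

Every driver that does this sits on its NJUMBO buffers whether or not it
ever sees traffic, which is exactly the "buffers but no load" problem
above.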
 >   As for the guarantee that the data region will start at a page
 > boundary, yes, I can ensure that as long as you don't tamper with
 > offsetting the m_data field of the mbuf after the allocator hands it
 > to you.  That is, I can guarantee this:
 > 
 >   [ mbuf     ]
 >   [          ]
 >   [ m_data  -]--->[ jumbo buf  ]
 >                   [  (page 1)  ]
 >                   [------------]
 >                   [  (page 2)  ]
 >                   [------------]
 >                   [  (page 3)  ]
 > 
 >   So, as you see, it is definitely doable to have all the jumbo bufs
 > start at a page boundary; however, it may be more worthwhile to have
 > some of them start later.  We would have to implement it first and we
 > would have to do some measurements to see what really works best.
 > 
 >   What I hate about jumbo bufs is that there's a lot of wastage in the
 > last (3rd) page.  I can't exactly use the last half of that last page
 > for, say, a 2K cluster, because then I wouldn't be respecting the
 > bucket "infrastructure" in mb_alloc that allows easy implementation of
 > page freeing.  Pretty much the only "realistic" thing I could do is
 > allocate jumbo bufs in groups of 3 or 4; this would ensure much less
 > wastage but would mean that not all of them would start at page
 > boundaries.

I think this would be fine, but we'd need to know more about the
hardware limitations of the popular GigE boards out there.  We know the
Tigon-II can handle 4 scatters, but are there any that can handle 3 but
not 4?

Drew

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-net" in the body of the message
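To put the scatter question in concrete terms, the driver side of a
virtually contiguous jumbo buf comes down to a page walk like the
sketch below.  This is a hypothetical fragment: the jumbo_sg layout and
JUMBO_SG_MAX are made up, and it only assumes the allocator hands back
a virtually contiguous buffer:

/*
 * Sketch of the driver-side page walk: split a virtually contiguous
 * (but possibly physically discontiguous) jumbo buf at page boundaries
 * and hand each piece to the NIC's scatter/gather list.  The jumbo_sg
 * layout and JUMBO_SG_MAX are hypothetical.
 */
#include <sys/param.h>
#include <vm/vm.h>		/* for vtophys */
#include <vm/pmap.h>		/* for vtophys */

#define JUMBO_SG_MAX	4	/* e.g. a Tigon-II can take 4 scatters */

struct jumbo_sg {		/* hypothetical ring-descriptor fragment */
	vm_offset_t	sg_paddr;
	int		sg_len;
};

static int
jumbo_build_sg(caddr_t buf, int len, struct jumbo_sg *sg)
{
	int nsegs, seglen;

	for (nsegs = 0; len > 0; nsegs++) {
		if (nsegs == JUMBO_SG_MAX)
			return (-1);	/* too many pieces for this NIC */
		/* run to the end of the current page, or of the data */
		seglen = PAGE_SIZE - ((vm_offset_t)buf & PAGE_MASK);
		if (seglen > len)
			seglen = len;
		sg[nsegs].sg_paddr = vtophys(buf);
		sg[nsegs].sg_len = seglen;
		buf += seglen;
		len -= seglen;
	}
	return (nsegs);
}

A jumbo buf that starts on a page boundary and covers three pages comes
out to three entries; letting some of them start mid-page, as in the
groups-of-3-or-4 scheme, can push the worst case to four, which is why
the hardware limits matter.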