From: Bosko Milekic
To: Luigi Rizzo
Cc: freebsd-net@FreeBSD.ORG
Date: Thu, 11 Jul 2002 17:12:55 -0400
Subject: Re: mbuf external buffer reference counters
Message-ID: <20020711171255.A19014@unixdaemons.com>
In-Reply-To: <20020711135608.A32460@iguana.icir.org>

On Thu, Jul 11, 2002 at 01:56:08PM -0700, Luigi Rizzo wrote:
> example: userland does an 8KB write, in the old case this requires
> 4 clusters, with the new one you end up using 4 clusters and stuff
> the remaining 16 bytes in a regular mbuf, then depending on the
> relative producer-consumer speed the next write will try to fill
> the mbuf and attach a new cluster, and so on... and when TCP hits
> these data-in-mbuf blocks will have to copy rather than reference
> the data blocks...
>
> Maybe it is irrelevant for performance, maybe it is not,
> i am not sure.

I see what you're saying. I think this simply means that the `optimal'
chunk of data to send is a different size: instead of 8192 bytes, it
becomes something like 8176 bytes (to account for the counters). In
other words, it really depends on how often userland applications do
sends of exactly 8192 bytes. This is a good observation if we're going
to be benchmarking, but I'm not sure the repercussions are that
important (unless, as I said, a lot of applications send exactly
8192-byte chunks?).

Basically, what we're doing is shifting the optimal send size for
exactly 4 clusters, in this case, to (8192 - 16) bytes. We can still
send with exactly 4 clusters; it's just that the optimal send size is a
little different (this usually produces a small shift in block send
benchmark curves).

> > The problem with this approach is that I'm probably going to be
> > allocating jumbo bufs from the same map, in which case you would have
> > huge `gaps' in your address <-> ref. count location map and, as a
>
> how huge ? and do you really need to use the same map rather than
> two different ones ?

Well, I can use a different map, I guess (I use a separate map for
mbufs so that huge cluster allocations can't eat up all of the address
space reserved for mbufs). However, jumbo bufs and clusters seem
logically equivalent (your driver will use either one or the other), so
it would make sense to have them share the same `chunk' of address
space.

As for the gaps, they are quite huge. When we discussed jumbo bufs a
week or so ago, I think we calculated that we would probably end up
allocating them in chunks of 3 or 4 at a time. That would mean `holes'
of at least ~9 pages in the address space from which clusters are
allocated, and so at least ~18 counters wasted for every hole. With the
number of jumbo bufs we would have, that can really add up.
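
The two back-of-the-envelope figures above (the shifted send size and
the counters wasted per hole) can be reproduced with a rough sketch
like the one below. Everything in it is an assumption made for
illustration: 2048-byte clusters, 4096-byte pages, a 4-byte counter
stored inside each cluster (which is what 16 bytes over 4 clusters
implies), a hypothetical 9216-byte jumbo buf, and one counter slot per
cluster-sized piece of the address range. The function names are made
up for this example and are not part of any existing mbuf code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; none are stated in the thread. */
#define CLUSTER_BYTES	2048u	/* assumed cluster size */
#define PAGE_BYTES	4096u	/* assumed page size */
#define COUNTER_BYTES	4u	/* assumed: 16 bytes spread over 4 clusters */
#define JUMBO_BYTES	9216u	/* hypothetical 9 KB jumbo buf */

/*
 * Largest write that still fits in exactly nclusters clusters when each
 * cluster gives up counter_bytes of its space to a reference counter.
 */
size_t
optimal_send_size(size_t nclusters, size_t cluster_bytes, size_t counter_bytes)
{
	assert(counter_bytes < cluster_bytes);
	return (nclusters * (cluster_bytes - counter_bytes));
}

/* Pages covered by one chunk of njumbo jumbo bufs, rounded up. */
size_t
hole_pages(size_t njumbo, size_t jumbo_bytes, size_t page_bytes)
{
	assert(page_bytes > 0);
	return ((njumbo * jumbo_bytes + page_bytes - 1) / page_bytes);
}

/*
 * Counter slots left unused by one hole, assuming the address-indexed
 * table holds one slot per cluster-sized piece of the address range.
 */
size_t
wasted_counters(size_t pages, size_t page_bytes, size_t cluster_bytes)
{
	assert(cluster_bytes > 0);
	return (pages * page_bytes / cluster_bytes);
}
```

With those assumed values, optimal_send_size(4, CLUSTER_BYTES,
COUNTER_BYTES) gives 8176, hole_pages(4, JUMBO_BYTES, PAGE_BYTES) gives
9, and wasted_counters(9, PAGE_BYTES, CLUSTER_BYTES) gives 18. A chunk
of 3 jumbo bufs rounds up to 7 pages, i.e. 14 unused slots per hole.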

> cheers
> luigi

-- 
Bosko Milekic
bmilekic@unixdaemons.com
bmilekic@FreeBSD.org