Date:      Wed, 27 Feb 2002 14:33:30 -0500
From:      Bosko Milekic <bmilekic@unixdaemons.com>
To:        Terry Lambert <tlambert2@mindspring.com>
Cc:        Jeff Roberson <jroberson@chesapeake.net>, arch@FreeBSD.ORG
Subject:   Re: Slab allocator
Message-ID:  <20020227143330.A34054@unixdaemons.com>
In-Reply-To: <3C7D1E31.B13915E7@mindspring.com>; from tlambert2@mindspring.com on Wed, Feb 27, 2002 at 09:58:09AM -0800
References:  <20020227005915.C17591-100000@mail.chesapeake.net> <3C7D1E31.B13915E7@mindspring.com>

On Wed, Feb 27, 2002 at 09:58:09AM -0800, Terry Lambert wrote:
> First, let me say OUTSTANDING WORK!
> 
> Jeff Roberson wrote:
> There are also per cpu queues of items, with a per cpu lock.  This allows
> for very efficient allocation, and it also provides near linear
> performance as the number of cpus increases.  I do still depend on Giant
> to talk to the back end page supplier (kmem_alloc, etc.).  Once the VM is
> locked, the allocator will not require Giant at all.
> 
> What is the per-CPU lock required for?  I think it can be
> gotten rid of, or at least taken out of the critical path,
> with more information.

  Per-CPU caches. They reduce lock contention and thrash the caches less
often.
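
  For illustration, a minimal sketch of such a per-CPU cache in
kernel-style C.  The names here (struct percpu_cache, struct bucket,
cache_alloc) are hypothetical, not the actual UMA structures:

    struct bucket {
            void    *items[128];    /* cached free items */
            int      count;         /* number of valid entries */
    };

    struct percpu_cache {
            struct mtx       lock;  /* per-CPU lock; rarely contended,
                                       so it stays cheap to acquire */
            struct bucket   *alloc; /* bucket to allocate from */
    };

    /* Fast path: pop an item off this CPU's bucket. */
    void *
    cache_alloc(struct percpu_cache *pcpu)
    {
            void *item = NULL;

            mtx_lock(&pcpu->lock);
            if (pcpu->alloc->count > 0)
                    item = pcpu->alloc->items[--pcpu->alloc->count];
            mtx_unlock(&pcpu->lock);
            return (item);  /* NULL: caller refills from the zone */
    }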
 
> > I would eventually like to pull other allocators into uma (The slab
> > allocator).  We could get rid of some of the kernel submaps and provide a
> > much more dynamic amount of various resources.  Something I had in mind
> > were pbufs and mbufs, which could easily come from uma.  This gives us the
> > ability to redistribute memory to wherever it is needed, and not lock it
> > in a particular place once it's there.
> 
> How do you handle interrupt-time allocation of mbufs in
> this case?  zalloci() handles this by pre-creating the
> PTEs for the page mapping in the KVA, and then only has
> to deal with grabbing free physical pages to back them,
> which is a non-blocking operation that can occur at
> interrupt time and which, if it fails, is not fatal (i.e.
> it's handled).  I've considered doing the same for the
> page mapping and PTEs, but that would make the time-to-run
> far less deterministic.

  Terry, how long will you keep thinking that mbufs come through the
zone allocator? :-) For G*d's sake man, we've been over this before!
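
  For reference, the pattern Terry is describing looks roughly like
this.  Everything below (intr_alloc, backing_page_grab,
map_into_prewired_kva) is a hypothetical sketch of the idea, not
zalloci() itself:

    void *
    intr_alloc(struct zone *z)
    {
            vm_page_t pg;

            /* The KVA and PTEs were created up front, so the
               interrupt path only needs a physical page. */
            pg = backing_page_grab(z);      /* may fail; never sleeps */
            if (pg == NULL)
                    return (NULL);          /* not fatal: caller copes */
            return (map_into_prewired_kva(z, pg));
    }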

> > There are a few things that need to be fixed right now.  For one, the zone
> > statistics don't reflect the items that are in the per cpu queues.  I'm
> > thinking about clean ways to collect this without locking every zone and
> > per cpu queue when some one calls sysctl.
> 
> The easy way around this is to say that these values are
> snapshots.  So you maintain the figures of merit on a per
> CPU basis in the context of the CPU doing the allocations
> and deallocations, and treat it as read-only for the
> purposes of statistics reporting.  This means that you
> don't need locks to get the statistics.  For debugging,
> you could provide a rigid locked interface (e.g. by only
> enabling locking for the statistics gathering via a sysctl
> that defaults to "off").

  Yes, this is exactly what we did with mb_alloc.

  This is also what I was trying to say in my last email.
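
  In code, the snapshot approach looks roughly like this (the stats
layout and handler name are hypothetical; the point is the pattern:
each CPU writes only its own slot, and the reader sums without locks):

    struct zone_stats {
            u_long  allocs;
            u_long  frees;
    } stats[MAXCPU];            /* one slot per CPU, writer-private */

    static int
    sysctl_zone_stats(SYSCTL_HANDLER_ARGS)
    {
            u_long total = 0;
            int i;

            /* Lockless read: the result is a snapshot and may be
               slightly stale, which is fine for statistics. */
            for (i = 0; i < mp_ncpus; i++)
                    total += stats[i].allocs - stats[i].frees;
            return (sysctl_handle_long(oidp, &total, 0, req));
    }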
  
> > The other problem is with the per cpu buckets.  They are a
> > fixed size right now.  I need to define several zones for
> > the buckets to come from and a way to manage growing/shrinking
> > the buckets.
> 
> I built a "chain" allocator that dealt with this issue, and
> also the object granularity issue.  Basically, it calculated
> the LCM of the object size rounded to a MAX(sizeof(long),8)
> boundary for processor alignment sensitivity reasons, and
> the page size (also for processor sensitivity reasons), and
> then allocated a contiguous region from which it obtained
> objects of that type.  All in all, it meant zero unnecessary
> space wastage (for 1,000,000 TCP connections, the savings
> were 1/4 of a Gigabyte for one zone alone).

  That's great, until you run out of pre-allocated contiguous space.
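
  The sizing rule Terry describes works out to something like the
following (the helper names are made up; the arithmetic is the point):

    /* lcm(a, b) = a / gcd(a, b) * b */
    static size_t
    gcd(size_t a, size_t b)
    {
            while (b != 0) {
                    size_t t = b;

                    b = a % b;
                    a = t;
            }
            return (a);
    }

    static size_t
    chain_chunk_size(size_t objsize)
    {
            /* Round up to MAX(sizeof(long), 8) for alignment. */
            size_t align = sizeof(long) > 8 ? sizeof(long) : 8;
            size_t rounded = (objsize + align - 1) & ~(align - 1);

            /* Smallest contiguous region holding a whole number of
               objects and a whole number of pages: no tail waste. */
            return (rounded / gcd(rounded, PAGE_SIZE) * PAGE_SIZE);
    }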

[...]
> And thanks again for the most excellent work!
> 
> -- Terry

-- 
Bosko Milekic
bmilekic@unixdaemons.com
bmilekic@FreeBSD.org

