Date: Wed, 27 Feb 2002 14:33:30 -0500
From: Bosko Milekic <bmilekic@unixdaemons.com>
To: Terry Lambert <tlambert2@mindspring.com>
Cc: Jeff Roberson <jroberson@chesapeake.net>, arch@FreeBSD.ORG
Subject: Re: Slab allocator
Message-ID: <20020227143330.A34054@unixdaemons.com>
In-Reply-To: <3C7D1E31.B13915E7@mindspring.com>; from tlambert2@mindspring.com on Wed, Feb 27, 2002 at 09:58:09AM -0800
References: <20020227005915.C17591-100000@mail.chesapeake.net> <3C7D1E31.B13915E7@mindspring.com>
On Wed, Feb 27, 2002 at 09:58:09AM -0800, Terry Lambert wrote:
> First, let me say OUTSTANDING WORK!
>
> Jeff Roberson wrote:
> > There are also per cpu queues of items, with a per cpu lock.  This allows
> > for very efficient allocation, and also it provides near linear
> > performance as the number of cpus increase.  I do still depend on giant to
> > talk to the back end page supplier (kmem_alloc, etc.).  Once the VM is
> > locked the allocator will not require giant at all.
>
> What is the per-CPU lock required for?  I think it can be
> gotten rid of, or at least taken out of the critical path,
> with more information.

  Per-CPU caches.  Reduces lock contention and trashes caches less often.

> > I would eventually like to pull other allocators into uma (The slab
> > allocator).  We could get rid of some of the kernel submaps and provide a
> > much more dynamic amount of various resources.  Something I had in mind
> > were pbufs and mbufs, which could easily come from uma.  This gives us the
> > ability to redistribute memory to wherever it is needed, and not lock it
> > in a particular place once it's there.
>
> How do you handle interrupt-time allocation of mbufs, in
> this case?  The zalloci() handles this by pre-creation of
> the PTE's for the page mapping in the KVA, and then only
> has to deal with grabbing free physical pages to back them,
> which is a non-blocking operation that can occur at interrupt,
> and which, if it fails, is not fatal (i.e. it's handled; I've
> considered doing the same for the page mapping and PTE's, but
> that would make the time-to-run far less deterministic).

  Terry, how long will you keep thinking that mbufs come through the zone
allocator? :-)  For G*d's sake man, we've been over this before!

> > There are a few things that need to be fixed right now.  For one, the zone
> > statistics don't reflect the items that are in the per cpu queues.  I'm
> > thinking about clean ways to collect this without locking every zone and
> > per cpu queue when someone calls sysctl.
>
> The easy way around this is to say that these values are
> snapshots.  So you maintain the figures of merit on a per
> CPU basis in the context of the CPU doing the allocations
> and deallocations, and treat it as read-only for the
> purposes of statistics reporting.  This means that you
> don't need locks to get the statistics.  For debugging,
> you could provide a rigid locked interface (e.g. by only
> enabling locking for the statistics gathering via a sysctl
> that defaults to "off").

  Yes, this is exactly what we did with mb_alloc.  This is also what I
was trying to say in my last email.

> > The other problem is with the per cpu buckets.  They are a
> > fixed size right now.  I need to define several zones for
> > the buckets to come from and a way to manage growing/shrinking
> > the buckets.
>
> I built a "chain" allocator that dealt with this issue, and
> also the object granularity issue.  Basically, it calculated
> the LCM of the object size rounded to a MAX(sizeof(long),8)
> boundary for processor alignment sensitivity reasons, and
> the page size (also for processor sensitivity reasons), and
> then allocated a contiguous region from which it obtained
> objects of that type.  All in all, it meant zero unnecessary
> space wastage (for 1,000,000 TCP connections, the savings
> were 1/4 of a Gigabyte for one zone alone).

  That's great, until you run out of pre-allocated contiguous space.

[...]

> And thanks again for the most excellent work!
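
  For concreteness, here is roughly what a per-CPU cache fast path like the
one Jeff describes could look like.  This is only a sketch, not UMA's actual
code; all the structure and function names (zone, pcpu_cache, bucket,
zone_alloc_slow) are invented for illustration.  The point is that the common
case takes only the current CPU's lock and touches only the current CPU's
bucket, so CPUs do not contend with each other:

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/pcpu.h>

#define	BUCKET_SIZE	128

struct bucket {
	int	count;			/* items currently cached */
	void	*items[BUCKET_SIZE];
};

struct pcpu_cache {
	struct mtx	lock;		/* per-CPU lock, almost never contended */
	struct bucket	alloc;		/* bucket we allocate from */
	u_long		nalloc;		/* per-CPU allocation count (statistics) */
};

struct zone {
	struct pcpu_cache cache[MAXCPU];
	/* ... global zone state, protected by a zone lock ... */
};

void *zone_alloc_slow(struct zone *);	/* refill path, takes the zone lock */

void *
zone_alloc(struct zone *z)
{
	struct pcpu_cache *pc = &z->cache[PCPU_GET(cpuid)];
	void *item = NULL;

	mtx_lock(&pc->lock);
	if (pc->alloc.count > 0) {
		item = pc->alloc.items[--pc->alloc.count];
		pc->nalloc++;
	}
	mtx_unlock(&pc->lock);

	if (item == NULL)
		item = zone_alloc_slow(z);	/* refill from the global zone */
	return (item);
}

  The slow path (refilling the bucket from the global zone) is where Giant,
and eventually the VM locks, come into the picture; the fast path never goes
near them.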
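
  The statistics-as-snapshots idea works out to something like the following,
continuing with the same made-up names as the sketch above.  Each CPU bumps
its own nalloc counter in its allocation path, and the sysctl handler just
sums the per-CPU counters without taking the per-CPU locks; the total can be
slightly stale, but gathering it never stalls an allocation:

#include <sys/param.h>
#include <sys/sysctl.h>
#include <sys/smp.h>

static int
sysctl_zone_nalloc(SYSCTL_HANDLER_ARGS)
{
	struct zone *z = arg1;		/* zone registered with this oid */
	long total = 0;
	int cpu;

	for (cpu = 0; cpu < mp_ncpus; cpu++)
		total += z->cache[cpu].nalloc;	/* unlocked, racy read */

	return (sysctl_handle_long(oidp, &total, 0, req));
}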
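
  For reference, the LCM sizing Terry describes can be written down in a few
lines.  This is just my reading of his description, with invented helper
names: round the object size up to the alignment boundary, then take the LCM
of that and the page size, so that a contiguous region is both a whole number
of pages and a whole number of objects:

#include <sys/param.h>

#define	OBJ_ALIGN	MAX(sizeof(long), 8)

static size_t
gcd(size_t a, size_t b)
{
	size_t t;

	while (b != 0) {
		t = a % b;
		a = b;
		b = t;
	}
	return (a);
}

/*
 * Size of one contiguous region for objects of size objsize:
 * lcm(roundup(objsize, OBJ_ALIGN), PAGE_SIZE), so there is no slack
 * at the end of the region.
 */
static size_t
chain_region_size(size_t objsize)
{
	size_t rounded = roundup(objsize, OBJ_ALIGN);

	return (rounded / gcd(rounded, PAGE_SIZE) * PAGE_SIZE);
}

  For example, 192-byte objects on 4K pages give gcd(192, 4096) = 64, so the
region is 3 pages holding exactly 64 objects with nothing wasted at the end.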
>
> -- Terry

-- 
Bosko Milekic
bmilekic@unixdaemons.com
bmilekic@FreeBSD.org


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message