Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 7 Nov 2002 15:40:33 -0800
From:      David Schultz <dschultz@uclink.Berkeley.EDU>
To:        Gary Thorpe <gathorpe79@yahoo.com>
Cc:        freebsd-arch@FreeBSD.ORG
Subject:   Re: malloc(9) performance
Message-ID:  <20021107234033.GA67211@HAL9000.homeunix.com>
In-Reply-To: <20021107143721.74192.qmail@web11201.mail.yahoo.com>
References:  <20021107055331.GA63812@HAL9000.homeunix.com> <20021107143721.74192.qmail@web11201.mail.yahoo.com>

next in thread | previous in thread | raw e-mail | index | archive | help
Thus spake Gary Thorpe <gathorpe79@yahoo.com>:
> > When you make a single small allocation with malloc(3), you're
> > effectively making a page allocation and then fiddling with a
> > bitmap of free space within that page, so it takes longer than
> > allocating full pages.  The effort required is inversely
> > proportional to the size of the allocation for sub-pagesize
> > allocations.  The routine could be micro-optimized, e.g. by
> > replacing 'while (i >>= 1) j++;' with 'ffs(i & i-1);' and by
> > adding a hint to speed up searching of the bitmap, but I'm
> > skeptical that these tweaks would be worthwhile.  I don't know
> > anything about malloc(9), but I suspect it is orthogonal.
> 
> Shouldn't page allocations be minimized to the situation where there is
> not enough memory in partially free pages to satisfy the request? If
> this is the case, then the first small malloc() would have to allocate
> a page and set up bitmaps, but subsequent malloc()'s would only have to
> manipulate the bitmap as long as they can hold in the same page (slower
> than page-aligned malloc()'s when a new page is needed but it should be
> otherwise faster). Since the kernel also does internal management of
> its address space, would the same logic apply?

Remember that I was talking about malloc(3), which doesn't know
anything about memory pressure on the VM system.  For purposes
relevant to this discussion, it's allocating from an infinite pool
of virtual memory, which just happens to become backed by physical
memory when the program steps on it.  It does not easily run out
of pages that it can allocate; rather, it runs out of space among
the pages it has already requested from the operating system.

Specifically, it keeps lazily-allocated pools of memory in sizes
that are powers of two.  You're suggesting that it should
preallocate bitmaps for the entire address space (but not fault
them in, hopefully).  While this approach would reduce copying and
mmaping, there are a number of practical problems with it.  First
of all, you can't know ahead of time whether to make bitmaps for
512 byte allocations or for 2048 byte allocations, or some other
size.  Second, the application can manipulate the address space in
ways that are not under the control of malloc(), e.g. via mmap()
or sbrk().

I don't know enough about the kernel allocator to answer your last
question, but certainly the problem of not knowing in advance what
allocations will be used for and how big they will be is present.
Maybe you could do something clever with a radix tree bitmap, but
I'm not sure how worthwhile that would be.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20021107234033.GA67211>