Date:      Mon, 31 Dec 2001 10:43:16 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Anjali Kulkarni <anjali@indranetworks.com>
Cc:        osa@freebsd.org.ru, freebsd-hackers@freebsd.org
Subject:   Re: Kernel Memory Limit
Message-ID:  <3C30B1C4.A492E138@mindspring.com>
References:  <006e01c1905f$25fd1380$0a00a8c0@indranet> <20011229164322.A73212@freebsd.org.ru> <002801c191c2$9c497fb0$0a00a8c0@indranet>

Anjali Kulkarni wrote:
> I have tried this too; it makes absolutely no difference at all. My mallocs
> fail after a certain number of runs of my code (and there is no memory leak),
> and increasing MAXDSIZ/DFLDSIZ made no difference.

You were asking how to increase your VM space.

What you should have been asking is how to keep your mallocs from
failing.

Most kernel memory allocations are type-stable.  That means that
once something is allocated for one purpose, it can't be reallocated
for another purpose, even if that original purpose represented a
burst usage condition.

This is good and bad.  It's good, in that it makes it very easy to
debug certain classes of problems.  It's bad, in that most of those
classes of problems have already been debugged (I think I found one
of the last of these when I found the credential reference count
overflow, more than six months ago).

Most of the allocations are also intrinsically type-stable, in that
when you allocate a page for an object type, it gets apportioned
out, and unless all objects in that memory region are given back to
the system, it can't be coalesced back to a whole page, and then
given back to the system.
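
A minimal userland sketch of that behaviour, assuming a page carved
into fixed-size slots for one object type. The names (typed_page,
page_alloc, page_free, page_reclaimable) and the slot count are
hypothetical and are not FreeBSD's allocator; the point is only that
a single live object pins the whole page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OBJS_PER_PAGE 8   /* hypothetical: objects carved from one page */

/* One page dedicated to a single object type. */
struct typed_page {
    bool in_use[OBJS_PER_PAGE];  /* per-slot allocation state */
    int  live;                   /* number of slots currently allocated */
};

/* Hand out the first free slot index, or -1 if the page is full. */
static int page_alloc(struct typed_page *p)
{
    for (int i = 0; i < OBJS_PER_PAGE; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = true;
            p->live++;
            return i;
        }
    }
    return -1;
}

/* Return a slot to the page's own pool (not to the system). */
static void page_free(struct typed_page *p, int slot)
{
    assert(slot >= 0 && slot < OBJS_PER_PAGE && p->in_use[slot]);
    p->in_use[slot] = false;
    p->live--;
}

/* The page can go back to the system only when no object remains. */
static bool page_reclaimable(const struct typed_page *p)
{
    return p->live == 0;
}
```

After a burst that touches many pages, freeing all but one object per
page leaves every one of those pages unreclaimable in this model.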

It gets very tempting sometimes, to go to a "handle" based system,
like the Macintosh, so that the kernel can move memory allocations
around out from under the programs doing the allocation, without
then leaving invalid pointer references all over the place because
of hidden pointer cloning.
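
For illustration, a handle scheme can be sketched in userland C as a
pointer to a master pointer that the allocator owns. Everything here
(handle_t, struct block, handle_new, handle_relocate, handle_free) is
a hypothetical name, malloc() stands in for the real allocator, and
this is not how the Macintosh or any kernel actually implements it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A handle is a pointer to a master pointer owned by the allocator. */
typedef void **handle_t;

struct block {
    void  *master;   /* current location of the data; may change */
    size_t size;     /* size of the data in bytes */
};

/* Allocate the data and return a handle to it, or NULL on failure. */
static handle_t handle_new(struct block *b, size_t size)
{
    b->master = malloc(size);
    b->size = size;
    return b->master != NULL ? &b->master : NULL;
}

/* Move the data elsewhere; holders of the handle are unaffected. */
static int handle_relocate(struct block *b)
{
    assert(b->master != NULL);
    void *moved = malloc(b->size);
    if (moved == NULL)
        return -1;
    memcpy(moved, b->master, b->size);
    free(b->master);
    b->master = moved;   /* only the master pointer is updated */
    return 0;
}

/* Release the data and invalidate the master pointer. */
static void handle_free(struct block *b)
{
    free(b->master);
    b->master = NULL;
}
```

The scheme only works if callers dereference the handle on every use.
A caller that copies *h into a plain pointer and keeps it across a
relocation has cloned the pointer, which is exactly the case that
makes moving memory unsafe without handles.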

In any case, what this means is that:

1)	The steady state of kernel memory is page fragmented

	This means that you cannot expect large allocations
	to work, except at boot time, for any allocations that
	require contiguous physical pages.

	It's possible to work around this, but we would have
	to add ELF attribution of pages in addition to the
	tiny number of bits we support (ELF supports 32), to
	identify code in the paging (and then also relocation)
	path, since that would let us move around physical
	memory (in page chunks) transparently to the other
	kernel subsystems' usage of memory.

2)	The steady state of pages is fragmented

	This means that you really need to get your pages at
	the earliest possible opportunity, or you won't be
	able to reclaim them.

	The exception to this rule is clean pages acting as
	buffer for FS data pages (clean backing objects), and
	dirty swap pages, which could be forced to swap.

Within these limits, it's possible to do larger allocations of
pages, discontiguous in the kernel virtual address space.
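
To make the fragmentation point concrete, here is a small userland
sketch that models physical page frames as an array of booleans. The
function names and the model are assumptions for illustration only;
they do not correspond to anything in the FreeBSD VM code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Count free page frames; used[i] is true when frame i is allocated. */
static size_t free_frames(const bool *used, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (!used[i])
            count++;
    return count;
}

/* Length of the longest run of adjacent free frames. */
static size_t longest_free_run(const bool *used, size_t n)
{
    size_t best = 0, run = 0;
    for (size_t i = 0; i < n; i++) {
        run = used[i] ? 0 : run + 1;
        if (run > best)
            best = run;
    }
    return best;
}

/* A contiguous request succeeds only if one run is long enough. */
static bool can_alloc_contig(const bool *used, size_t n, size_t want)
{
    assert(want > 0);
    return longest_free_run(used, n) >= want;
}
```

With every other frame in use, half of memory is free, yet no request
for two contiguous frames can be satisfied. A request for the same
number of discontiguous frames would still succeed.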


One common approach, if the only problem you need to solve is a
contiguous region of kernel virtual memory space, is to allocate
the page maps up front, but not to fill them in with backing
pages until later (this is how the mbuf, socket, inpcb, tcpcb,
and similar allocations based on the maximum number of files
allowed are typically handled currently).

To do this, allocate a zone using ZALLOCI() (a zone in which the
allocations are permitted to occur at interrupt time, by way of
page fault handling alone, rather than by way of allocation of
KVA space, which might involve forced swapping, which can not be
done at interrupt time).  This must be done at startup.
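
The "reserve the address range first, back it with pages later" idea
has a rough userland analogue, sketched below with mmap() and
mprotect(). This is only an analogy under stated assumptions: it is
not the kernel zone allocator, and reserve_region() and commit_range()
are hypothetical helper names.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANON on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Reserve n bytes of contiguous address space with no access rights.
 * No backing pages are touched at this point.
 */
static void *reserve_region(size_t n)
{
    void *p = mmap(NULL, n, PROT_NONE, MAP_PRIVATE | MAP_ANON, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Make part of the reservation usable. off and len must be multiples
 * of the page size; pages are faulted in on first touch.
 */
static int commit_range(void *base, size_t off, size_t len)
{
    assert(base != NULL);
    return mprotect((char *)base + off, len, PROT_READ | PROT_WRITE);
}
```

A caller would reserve the maximum size once at startup, then call
commit_range() as demand grows; the addresses stay contiguous even
though the backing pages arrive later and in any order.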

Another approach is to use boot time allocation, where a region
that is physically contiguous is allocated in the startup code in
/sys/i386/i386/machdep.c; this is difficult to do, unless you
understand the code in great detail (it is one of the tutorials
in my "FreeBSD memory management at startup" article -- not yet
published).

I recommend *against* this approach, for your usage, since it
requires static allocation of a chunk of memory which can then
not be reused for any other purpose.  This is most useful in a
dedicated purpose embedded system product (for example).

Most likely, what you want is a defragger of code and data not in
the paging path.  Mike Smith and Matt Dillon probably could point
you in the right direction on this, but you will need some compiler
work (#pragmas for setting section attribution) to make it work,
and to identify code in FreeBSD's paging path and your defragger
code itself.

-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
