Date:      Wed, 20 Jun 2001 10:11:29 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Bosko Milekic <bmilekic@technokratis.com>
Cc:        Terry Lambert <tlambert@primenet.com>, freebsd-alpha@FreeBSD.ORG
Subject:   Re: vx, lge, nge, dc, rl, sf, sis, sk, vr, wb users please TEST
Message-ID:  <3B30D941.6AE93443@mindspring.com>
References:  <20010619191602.A28591@technokratis.com> <200106200224.TAA24251@usr05.primenet.com> <20010619232624.A29829@technokratis.com> <3B304ADF.C5131399@mindspring.com> <20010620123029.A34452@technokratis.com>

Bosko Milekic wrote:
> > I wouldn't exactly say "absolutely", unless you are actually
> > measuring significant faulting as a result; I would say that
> > they are harmless.
> 
>         Oh, I think I wasn't too clear above. When I said
> "page fault" I meant "fatal page fault" (i.e. the address
> space isn't even mapped, it may even sit in another submap
> of kmem_map). It's not a problem right now because the mbuf
> allocator allocates "sequentially" from the base of mb_map.

VALLOC() is a lot of protection, given the base of
mb_map.  /sys/i386/i386/machdep.c protects you pretty
well, no problem.


> So let's say we have a cluster in a page X, where X >
> mb_map_base. Then, we're guaranteed that all Y, where
> X > Y > mb_map_base consists of already wired-down pages,
> so we're safe. Actually, now that I think about it, if
> we just so happen to hit the mbuf on the first page in
> mb_map and the space before that is not
> mapped, then we may see this too - but it's very unlikely.
> As I mentioned previously, instead of just having `mb_map'
> where clusters and mbufs are allocated from, in the new
> allocator I have a `mbuf_map' and a `clust_map.' So, when
> the data being bcopy()'d comes from the first cluster of
> clust_map, and the previous space is unmapped, the
> fault is a sure thing and we'll end up crashing.

I can see where that would be a problem with the new
allocator.
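
Just to make the failure mode concrete, here is a sketch of
the sort of copy that trips it (clust_map_start is a made-up
name for the bottom of clust_map):

	/*
	 * A word-at-a-time copy rounds the source pointer down
	 * to a word boundary before reading.  If `src' is the
	 * very first cluster in clust_map and the page below it
	 * is unmapped, that first read faults fatally.
	 */
	uintptr_t aligned_src = (uintptr_t)src & ~(sizeof(long) - 1);

	if (aligned_src < clust_map_start) {
		/* the read dips below the map -- fatal fault */
	}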

May I offer a suggestion?  The purpose of having the
clusters be a fixed size, alongside a much larger number
of mbufs, has never really been clear to me, given that
mbufs are allocated to act as cluster headers only to
make things a tiny bit (but not substantially) easier
when it comes to freeing chains, etc.

It seems to me that what you really want to do is
allocate _different sizes_ of mbufs, and have the
deallocator sort them out on free.

This could result in substantial space savings, since
the majority of the mbuf used as a cluster header is
not used for anything useful.
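
One way "sort them out on free" could look (the names and
bucket layout here are made up, just to sketch the idea):
keep a small header per page-aligned bucket recording the
element size, so the free path can recover the size class
from nothing but the pointer:

	/* Hypothetical per-bucket header, at the start of the page. */
	struct mb_bucket {
		int	mb_size;	/* element size in this bucket */
	};

	void
	mb_free(void *p)
	{
		struct mb_bucket *b;

		/*
		 * Mask the pointer down to its page to find the
		 * bucket header; mb_size says which free list
		 * the object goes back onto.
		 */
		b = (struct mb_bucket *)((uintptr_t)p & ~(uintptr_t)(PAGE_SIZE - 1));
		mb_put_on_freelist(p, b->mb_size);	/* made-up helper */
	}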

Do you have any interest in generalizing your allocator?

Eventually, you will probably want to do allocations of
things other than mbufs.

Also, the buckets should probably be permitted to be
some multiple of the page size, to allow the allocation
of odd-sized structures and let them span page
boundaries, if you went ahead with a more general
approach.
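
For the multi-page buckets, something along these lines
(purely illustrative, including the waste threshold and
the cap) would let odd-sized elements pack across page
boundaries:

	/*
	 * Pick a bucket size that is a multiple of PAGE_SIZE,
	 * growing it until the space left over after packing
	 * whole elements is small (under 1/8 of an element here).
	 */
	static size_t
	mb_bucket_size(size_t elem_size)
	{
		size_t npages = howmany(elem_size, PAGE_SIZE);

		while (npages < 8 &&
		    (npages * PAGE_SIZE) % elem_size > elem_size / 8)
			npages++;
		return (npages * PAGE_SIZE);
	}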

I guess the next thing to think about after that would
be allocation at interrupt time.  I think this can be
done using the ziniti() approach; but you would then
reserve the KVA space for use by the allocator at page
fault time, instead.
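
Very roughly, and with every name below invented, that
could be structured as: set aside the KVA for the whole
map at boot, keep a small reserve of wired pages so
interrupt-time allocations never fault, and wire the
rest lazily from contexts that are allowed to sleep:

	void
	mb_map_setup(void)
	{
		/* Reserve KVA only; no physical pages yet. */
		mb_base = kva_reserve(MB_MAP_SIZE);
		/* Pre-wire a few pages for interrupt-time use. */
		mb_reserve = wire_pages(mb_base, MB_RESERVE_PAGES);
	}

	caddr_t
	mb_page_alloc(int how)
	{
		if (how & M_NOWAIT)
			return (take_from_reserve());	/* never faults */
		return (wire_next_page());		/* may fault or sleep */
	}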

I have some other methods of getting around faulting,
when you have sufficient backing store.  Now that there
are systems capable of having as much RAM as there is
KVA space (i.e. the KVA space is really no longer
sparse), a number of optimizations become pretty
obvious.

That's not true for the Alpha, yet, but the Intel KVA
space has definitely matured to the point where physical
RAM equals the possible KVA + user space addressable
memory.

NB: The windowed access to more than 4G (e.g. the AMD
16G processor that uses "megasegments" to access the
extra memory) has never struck me as being useful,
unless you can partition your working set data enough
to deal with the window-flipping overhead that would
result, so I'm not too concerned about trying to
support them -- they seem more useful for VMWare
type applications, where you context switch at a very
low level in order to run multiple instances of a
kernel, or similar pig tricks.  The IA64 and AMD
"Sledgehammer" will make us think about these things
again, as soon as someone comes up with motherboards
and memory sticks that beat the 4G limit without
melting down.

-- Terry
