Date:      Thu, 30 Jan 2003 15:28:46 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Julian Elischer <julian@elischer.org>
Cc:        David Schultz <dschultz@uclink.berkeley.edu>, "Andrew R. Reiter" <arr@watson.org>, Scott Long <scott_long@btc.adaptec.com>, arch@FreeBSD.ORG
Subject:   Re: PAE (was Re: bus_dmamem_alloc_size())
Message-ID:  <3E39B52E.E46AF9EA@mindspring.com>
References:  <Pine.BSF.4.21.0301301059310.35796-100000@InterJet.elischer.org>

Julian Elischer wrote:
> The reason for PAE is simple.
> 
> Disk caches need not be in mapped memory. Physical memory will do.
> If you want to cache more than  4GB, then PAE is an effective answer.
> 
> (Assuming I have my TLAs the right way around..)

Using the memory by declaring a small copy window in the kernel
that's accessed via PAE, without really supporting PAE at all, can
make this work...
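
Roughly what that copy window looks like, as a sketch only
(pae_window_va, pae_window_map(), pae_pte_store() and
pae_copy_page_down() are made-up names, not an existing FreeBSD
interface):

    #include <sys/param.h>
    #include <sys/systm.h>          /* bcopy() */
    #include <machine/cpufunc.h>    /* invlpg() */
    #include <machine/pmap.h>       /* PG_V, PG_RW */

    /*
     * Sketch of a "copy window" over physical memory above 4GB: one
     * reserved kernel VA whose PTE gets repointed before each copy.
     */
    static vm_offset_t pae_window_va;       /* one reserved kernel page */

    /* hypothetical: store a 64 bit PTE for the given kernel VA */
    static void pae_pte_store(vm_offset_t va, uint64_t pte);

    static void
    pae_window_map(uint64_t pa)             /* pa may be above 4GB */
    {
            /* repoint the window's PTE at the new physical page... */
            pae_pte_store(pae_window_va, pa | PG_V | PG_RW);
            /* ...and kill the stale TLB entry before touching it */
            invlpg(pae_window_va);
    }

    /* copy one page of cached disk data down into ordinary memory */
    static void
    pae_copy_page_down(uint64_t src_pa, void *dst)
    {
            pae_window_map(src_pa);
            bcopy((void *)pae_window_va, dst, PAGE_SIZE);
    }

Everything that wants the cached data has to funnel through that one
window (or a small set of them), which is where the rest of the
problems below come from.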

BUT... it will only work with 64 bit disk controllers, on
motherboards with chipsets that support the full 64 bit DMA
address path to memory.  Many nominally "64 bit" systems are
64 bit only on the data path, not the address path, and it is
nearly impossible to determine whether a given card supports
64 bit addressing from the attributes in the driver's device
control block, since most of the drivers in FreeBSD are not
written to capture that information or make it available.
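
For what it's worth, the one place a driver does declare its DMA
addressing reach is the lowaddr argument to bus_dma_tag_create();
the trouble is that nearly everything just asks for the 32 bit
range, and nothing exports the answer in a way you can query.  A
sketch of the distinction (the exact argument list varies between
FreeBSD versions, and struct mydrv_softc is made up):

    #include <sys/param.h>
    #include <sys/bus.h>
    #include <machine/bus.h>

    struct mydrv_softc {
            bus_dma_tag_t dma_tag;
    };

    /*
     * A controller that can really DMA above 4GB creates its tag with
     * lowaddr = BUS_SPACE_MAXADDR; most drivers pass
     * BUS_SPACE_MAXADDR_32BIT instead, which forces any page above
     * 4GB to be bounced.
     */
    static int
    mydrv_dma_init(struct mydrv_softc *sc)
    {
            return (bus_dma_tag_create(
                NULL,                   /* parent tag */
                1, 0,                   /* alignment, boundary */
                BUS_SPACE_MAXADDR,      /* lowaddr: full 64 bit reach */
                BUS_SPACE_MAXADDR,      /* highaddr */
                NULL, NULL,             /* filter, filterarg */
                MAXBSIZE, 1, MAXBSIZE,  /* maxsize, nsegments, maxsegsz */
                0,                      /* flags */
                &sc->dma_tag));
    }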

In addition, you cannot use it for mbufs for scatter/gather DMA
without spreading the bank selection code throughout the kernel,
and accessing the mbufs through a window you remap in order to
get received mbuf data from the "36 bit" address space up into
32 bit processes.  You *might* be able to deal with this for
writes to the wire, but you are talking about adding copies to
get the data there, or about some serious stack modifications to
define a "PAE" or "PSE36" external mbuf type.

The access penalty for memory, because of clock multipliers that
are way in the hell up there (133MHz memory on 2.1GHz systems, or
roughly 16 CPU cycles per memory-bus cycle), means that the cost
of bank selection is very expensive relative to just doing bus
transfers.

It *may* be useful for something like an NFS server, but it's *not*
useful for a proxy cache, and it's not useful for most other
applications that need shared access to the same data, rather than
their own copies of data.  And it doesn't help with databases with
more than a single access session, unless the sessions have a high
locality of reference with each other.  For something like a big
Oracle server at a credit processing center, you aren't ever going
to get the necessary locality.

What this boils down to is stall barriers: if you are doing any
work on the data whatsoever, the CPU has to finish processing the
data in one mapped window before it can remap the window and
process the data that shows up there instead.
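
In code, the stall barrier is the forced serialization on every
chunk; using the same hypothetical window as above (process_page()
is a stand-in for whatever work is being done):

    /* hypothetical per-page worker */
    static void process_page(void *p, size_t len);

    /*
     * Work through a buffer living above 4GB one window at a time.
     * Each pass has to wait for the remap and the TLB invalidation
     * before it can touch the next page, so none of the work overlaps
     * the way a straight linear scan of mapped memory would.
     */
    static void
    pae_process_buffer(uint64_t buf_pa, size_t total)
    {
            size_t off;

            for (off = 0; off < total; off += PAGE_SIZE) {
                    pae_window_map(buf_pa + off);   /* the stall barrier */
                    process_page((void *)pae_window_va, PAGE_SIZE);
            }
    }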

Add to this the fact that you can buy 64 bit systems today, down
at Fry's and online from other vendors, and that FreeBSD 5.0 runs
on these systems in native 64 bit mode, and there's really no
reason to try to cram more than 4G of RAM into a 32 bit system,
grab another 4 address bits' worth of RAM through bank selection,
and eat the stall and processing barriers.

Personally, I don't really see the point; in the best case, the
intrusions into the VM and other subsystems will be done in such
a way that there are macros to make the code go away.  It would
be really painful to deal with 2M (PAE) vs. 4M (PSE36) pages, and
it would be really painful to have to eat overhead that buys you
nothing, and adds complexity, for the common case of 32 bit
machines with memory less than or equal to what can be addressed
with 32 bits.
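
The "macros to make the code go away" part would presumably look
something like a conditionally widened physical address and PTE
type, so the common 32 bit configuration pays nothing; a sketch of
the shape of it, not the actual patch (the real tree may spell the
names differently):

    #include <sys/types.h>

    #ifdef PAE
    typedef uint64_t  vm_paddr_t;   /* 36 bit physical addresses fit */
    typedef uint64_t  pt_entry_t;   /* 8 byte PTEs, 3-level tables */
    #else
    typedef uint32_t  vm_paddr_t;   /* plain 32 bit physical addresses */
    typedef uint32_t  pt_entry_t;   /* 4 byte PTEs, 2-level tables */
    #endif

    /* PTE accessors hide the width difference from the rest of pmap */
    #define pte_store(ptep, pte)    (*(ptep) = (pte))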


-- Terry
