Date: Thu, 30 Jan 2003 17:22:11 -0800
From: Terry Lambert <tlambert2@mindspring.com>
To: Scott Long <scott_long@btc.adaptec.com>
Cc: Julian Elischer <julian@elischer.org>, David Schultz <dschultz@uclink.berkeley.edu>, "Andrew R. Reiter" <arr@watson.org>, arch@freebsd.org
Subject: Re: PAE (was Re: bus_dmamem_alloc_size())
Message-ID: <3E39CFC3.9EF4A67E@mindspring.com>
References: <Pine.BSF.4.21.0301301059310.35796-100000@InterJet.elischer.org> <3E39B52E.E46AF9EA@mindspring.com> <3E39C764.3070500@btc.adaptec.com>
Scott Long wrote:
> > Using the memory by declaring a small copy window that's accessed
> > via PAE in the kernel, and not really supporting PAE at all, can
> > make this work...
> > [...]
>
> This troll is totally unnecessary. Making peripheral devices work with
> PAE is a matter handled between the device driver and the busdma system.
> Drivers that cannot pass 64 bit bus addresses to their hardware will
> have the data bounced by busdma, just like what happens in the ISA
> world. The whole point of the busdma push that Robert and Maxime
> started a few months ago is to prepare drivers for the possible coming
> of PAE.

This is a great idea, until you realize that scatter/gather for network
cards won't work very well in the context of the mbuf system, as it
exists today, unless you are willing to split incoming and outgoing
mbufs into two different pools, or you're willing to add a copy
operation to everything.

> Honestly, though, if you're going to spend the money on a PAE-capable
> motherboard and all the memory to go along with it, are you really going
> to put a Realtek nic and an Advansys scsi card into it?

I'm going to have whatever the manufacturer put on the motherboard,
most likely, which may or may not be 64-bit capable. If I'm spending
all the money building it up from "to spec" components in the first
place, I'm more likely to just buy a 64-bit machine instead. My
biggest cost is going to end up going to third-party 64-bit capable
cards, and RAM, anyway.

> Also, the PAE work that might happen is not going to affect the vast
> majority of FreeBSD/i386 users at all; I can only imagine that it will
> be a config(8) option that will most likely default to 'off'.

If true, this would result in potentially significant duplicate
sections of code in the VM system, separated by #ifdefs, unless all
the VM references that needed to switch between 32 and 36 bits were
macroized, and certain parts rewritten from scratch. That's always
possible, I suppose.
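As a rough illustration of the two points above (bouncing data for
devices that cannot take 64-bit bus addresses, and a physical-address
type that changes width under a config option), here is a minimal
user-space sketch. The names SKETCH_PAE, sketch_paddr_t,
SKETCH_MAXADDR_32BIT and sketch_needs_bounce() are made up for this
example; they are not the FreeBSD busdma or VM interfaces, and the real
code would have more to consider (alignment, segment boundaries).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical build option standing in for a config(8) knob.
 * It is set to 1 here so that the sketch exercises the 36-bit case.
 */
#define SKETCH_PAE 1

#if SKETCH_PAE
typedef uint64_t sketch_paddr_t;	/* holds a 36-bit physical address */
#else
typedef uint32_t sketch_paddr_t;	/* classic 32-bit physical address */
#endif

/* Highest address a 32-bit-only device can be handed (assumed name). */
#define SKETCH_MAXADDR_32BIT ((uint64_t)0xFFFFFFFFu)

/*
 * Return 1 if any byte of the buffer [pa, pa + len) lies above the
 * highest address the device can reach, meaning the data would have
 * to be copied ("bounced") through memory the device can address.
 */
static int
sketch_needs_bounce(sketch_paddr_t pa, size_t len, uint64_t dev_maxaddr)
{
	uint64_t last;

	assert(len > 0);
	last = (uint64_t)pa + (uint64_t)len - 1;
	return (last > dev_maxaddr);
}
```

Code that only ever touches sketch_paddr_t compiles unchanged for either
width; the duplication worry applies to code that assumes a physical
address fits in a pointer-sized integer.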
> There is nothing to bikeshed here. Please respect that there are people
> who need PAE, understand PAE, and will happily accept PAE. Those who do
> not need, understand, or accept it can go along with their lives
> blissfully happy with it turned off.

Realize that I've personally built a system with 4G of memory, based
on FreeBSD, that could handle 1.6M simultaneous connections, for a
proxy caching company. We had a lot of reason to look into PAE,
because the number of simultaneous connections and the number of mbufs
available for caching data are inversely proportional (obviously).
Using PAE was one potential way of taking the "add more RAM" approach:
throwing resources, rather than intelligence, at the problem.

The problem with using PAE for this application is that the mbuf
chains cannot be simultaneously available in the inbound and outbound
space without copying.

Now it doesn't matter whether the inbound space is from a network
card, or from a disk controller: if there is host processing that has
to take place, then you have to span several of these PAE pages
simultaneously.

As Peter rightly points out, the regions are large enough to be
problematic for paging. Effectively, you have to disassociate the VM
and buffer cache, or find some way of supporting paging of
much-larger-than-4K units.

I have yet to see someone suggest a real application for PAE that
wasn't tantamount to an L3 cache and/or a RAMdisk. It does not
increase your UVA or KVA size above the 4G limit: all the pointers
your compiler generates are still 32 bits, and you are still limited
to 4G.

What *would* have been useful is if the Intel guys had gone 64-bit,
like the AMD folks did, so that the UVA or KVA or both could be made
larger than 4G.

Frankly, the most useful thing that might come out of this is a change
to the copyin/copyout/copyinstr/etc. code to separate the UVA and KVA
spaces, making them both 4G. At *that* point, it could be useful to
make programs larger.
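The arithmetic behind the 4G point can be sketched as follows, assuming
32-bit pointers and 36-bit PAE physical addresses. The helper names
sketch_addressable() and sketch_window_remaps() are illustrative only,
and the window size is a free parameter, not a figure from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes reachable with an address of the given width (must be < 64). */
static uint64_t
sketch_addressable(unsigned bits)
{
	assert(bits < 64);
	return ((uint64_t)1 << bits);
}

/*
 * Number of window positions needed to touch len bytes of physical
 * memory through a mapping window of win_bytes, rounding up. Every
 * position beyond the first implies a remap (or a copy) before the
 * data is visible through a 32-bit pointer.
 */
static uint64_t
sketch_window_remaps(uint64_t len, uint64_t win_bytes)
{
	assert(win_bytes > 0);
	return (len / win_bytes + (len % win_bytes != 0));
}
```

With these widths the physical space is 16 times the size of the
virtual space, so at most one sixteenth of it can be mapped at any
moment; everything else has to be reached through a window.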
But you could have that *without* PAE, and with PAE, you would *still*
need to split the VM and buffer cache apart to create a copy boundary
for kernel vs. user data.

At least with an explicit coherency requirement, and the code to
implement it, we could expect FS stacking to start working like it was
designed to work, ten years ago.

-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3E39CFC3.9EF4A67E>