Date: Tue, 08 Oct 2002 08:17:42 -0700
From: Terry Lambert <tlambert2@mindspring.com>
To: "Vladimir B. Grebenschikov" <vova@sw.ru>
Cc: Mikhail Teterin <mi+mx@aldan.algebra.com>, arch@FreeBSD.org
Subject: Re: using mem above 4Gb was: swapon some regular file
Message-ID: <3DA2F716.C5B69C7C@mindspring.com>
References: <200210071630.42512.mi+mx@aldan.algebra.com> <1034074876.917.23.camel@vbook.express.ru>
"Vladimir B. Grebenschikov" wrote: > May be we need add new type to md device, like "highmem", to access > memory above 4G as memory disk, and as consequence use it as swap-device > or as fast /tmp/ partition or whatever ? > > In this case we will be able to use more than 3Gb of RAM. We can use 4G now. But: KVA + UVA + window <= 4G ...so by adding a window in which to access the extra memory above 4G, you actually *reduce* the amount of RAM available to either each process, or the kernel, or both. A RAM-disk is probably not worth doing this; the place most people are bumping their head is the UVA (data space for the process itself) or KVA (data space for mbufs, mappings for pages, etc.). For example, if you have 4G of RAM, to support a large number of network connections, you have to spend ~2G on mbufs, which means spending 1G on mappings and other kernel structures, leaving only 1G for UVA. That means that, in order to get your RAM disk, you have to either firther reduce the size of your server processes, or you have to reduce the number of connections you will be able to support simultaneously. Example: 64k simultaneous connections * 32k window per connection = 2G of mbufs ...say you overcommit this memory by a factor of 4; you are still only talking a quarter of a million connections. If you hack all your kernel allocations to use the minimum amount possible, and pare down your structures to get rid of kevent pointers that you don't use, and other things you don't use, you can steal some of the 1G KVA for more mbufs. Then, if you hack the TCP stack window management code rather signficantly (e.g. drop the average window to 4k), then you can push 1,000,000 connections. That leaves you about 512b of context per connection in the user space applicaition. The best I've ever done is 1.6 million simultaneous connections; to do that, I had to drop space out of a lot of structures (64 bytes for 1,000,000 connections is 64M of RAM -- not insignificant). So whatever connections you are getting now... halve that, or less, to get a window for your RAM disk (you will need KVA for mappings for all the memory that *can* be in the window, etc.). It's not really worth using it directly. On the other hand, if you could allocate pools of memory for per processor use, you basically gain most of that overhead back -- though, without TCP/IP stack changes and interrupt processing changes, you can't use the regained memory for, e.g., mbufs, because the way things stand now, mbufs have to be visible at: o DMA o IRQ o NETISR o Application ...and you can't guarantee a nice clean division, because you don't route interrupts to a particular CPU, or have connections to sockets in a particular CPU's address space, or have your applications running on a particular CPU, so that the CPU can have a seperate address space, so you don't have to worry about migration, etc.. So for an extremely high capacity box, you will have to do tricks, like logically splitting the box into seperate virtual machines, and seperating out the code path from the network card, all the way to the application. Just like we were discussing earlier. -- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message