Date:      Tue, 08 Oct 2002 08:17:42 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        "Vladimir B. Grebenschikov" <vova@sw.ru>
Cc:        Mikhail Teterin <mi+mx@aldan.algebra.com>, arch@FreeBSD.org
Subject:   Re: using mem above 4Gb was: swapon some regular file
Message-ID:  <3DA2F716.C5B69C7C@mindspring.com>
References:  <200210071630.42512.mi+mx@aldan.algebra.com> <1034074876.917.23.camel@vbook.express.ru>

"Vladimir B. Grebenschikov" wrote:
> Maybe we need to add a new type to the md device, like "highmem",
> to access memory above 4G as a memory disk, and as a consequence
> use it as a swap device, a fast /tmp partition, or whatever?
> 
> In that case we would be able to use more than 3Gb of RAM.

We can use 4G now.  But:

KVA + UVA + window <= 4G

...so by adding a window in which to access the extra memory
above 4G, you actually *reduce* the amount of address space
available to each process, the kernel, or both.
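
To put rough numbers on that constraint (the 1G kernel / 3G user
split below is purely hypothetical, with the window paid for out
of the UVA; it is not any particular kernel configuration):

	#include <assert.h>
	#include <stdio.h>

	/* Illustrative 32-bit virtual address space budget (made-up split). */
	int
	main(void)
	{
		const unsigned long long GB = 1ULL << 30;
		unsigned long long kva    = 1 * GB;	/* kernel virtual space   */
		unsigned long long uva    = 3 * GB;	/* per-process user space */
		unsigned long long window = GB / 2;	/* 512M high-memory window*/

		uva -= window;				/* somebody has to pay    */
		assert(kva + uva + window <= 4 * GB);
		printf("KVA %lluM, UVA %lluM, window %lluM\n",
		    kva >> 20, uva >> 20, window >> 20);
		return (0);
	}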

A RAM disk is probably not worth doing this for; the place most
people bump their heads is the UVA (data space for the process
itself) or the KVA (data space for mbufs, mappings for pages,
etc.).

For example, if you have 4G of RAM and want to support a large
number of network connections, you have to spend ~2G on mbufs,
which means spending 1G on mappings and other kernel structures,
leaving only 1G for the UVA.

That means that, in order to get your RAM disk, you have to
either further reduce the size of your server processes, or
reduce the number of connections you will be able to support
simultaneously.

Example:

	64k simultaneous connections
	*
	32k window per connection
	= 2G of mbufs

...say you overcommit this memory by a factor of 4; you are still
only talking about a quarter of a million connections.
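
A quick back-of-the-envelope calculator for that arithmetic,
using the same example numbers (change them to match your own
workload):

	#include <stdio.h>

	/* How many connections fit in a given mbuf budget? (example numbers) */
	int
	main(void)
	{
		unsigned long long budget     = 2ULL << 30;	/* ~2G of KVA for mbufs */
		unsigned long long window     = 32ULL << 10;	/* 32k window per conn  */
		unsigned long long overcommit = 4;		/* 4:1 overcommit       */

		printf("%llu connections\n",
		    (budget / window) * overcommit);		/* 262144, ~1/4 million */
		return (0);
	}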

If you hack all your kernel allocations to use the minimum amount
possible, and pare down your structures to get rid of kevent
pointers that you don't use, and other things you don't use, you
can steal some of the 1G KVA for more mbufs.

Then, if you hack the TCP stack window management code rather
significantly (e.g. drop the average window to 4k), you can
push 1,000,000 connections.

That leaves you about 512 bytes of context per connection in the
user space application.

The best I've ever done is 1.6 million simultaneous connections;
to do that, I had to drop space out of a lot of structures (64
bytes for 1,000,000 connections is 64M of RAM -- not insignificant).

So whatever connections you are getting now... halve that, or less,
to get a window for your RAM disk (you will need KVA for mappings
for all the memory that *can* be in the window, etc.).
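
Those mappings are not free, either; the 64 bytes of per-page
bookkeeping below is an assumed figure for illustration (in the
spirit of the 64-bytes-per-connection arithmetic above), not a
measured structure size:

	#include <stdio.h>

	/* Rough KVA cost of tracking pages for 4G of memory above the 4G line. */
	int
	main(void)
	{
		unsigned long long highmem  = 4ULL << 30;	/* RAM above 4G         */
		unsigned long long pagesize = 4096;		/* i386 page size       */
		unsigned long long per_page = 64;		/* assumed bookkeeping  */

		printf("%lluM of KVA just to track the pages\n",
		    (highmem / pagesize) * per_page >> 20);	/* 64M                  */
		return (0);
	}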

It's not really worth using it directly.

On the other hand, if you could allocate pools of memory for per
processor use, you basically gain most of that overhead back --
though, without TCP/IP stack changes and interrupt processing
changes, you can't use the regained memory for, e.g., mbufs,
because the way things stand now, mbufs have to be visible at:

o	DMA
o	IRQ
o	NETISR
o	Application

...and you can't guarantee a nice clean division, because you
don't route interrupts to a particular CPU, or have connections
to sockets in a particular CPU's address space, or have your
applications running on a particular CPU, so that each CPU could
have a separate address space and you wouldn't have to worry
about migration, etc.
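
A very rough userland sketch of the per-CPU pool idea; the CPU
count, pool size, bump allocator, and passing the CPU number in
by hand (instead of asking which CPU you are actually on) are
all simplifications for illustration, not how any existing
allocator works:

	#include <stddef.h>
	#include <stdlib.h>

	#define	NCPU		4		/* hypothetical CPU count       */
	#define	POOL_SIZE	(16UL << 20)	/* 16M reserved per CPU         */

	struct cpu_pool {
		char	*base;			/* memory owned by this CPU     */
		size_t	 used;			/* simple bump-allocator cursor */
	};

	static struct cpu_pool pools[NCPU];

	/* One-time setup: give each CPU its own private arena. */
	static int
	pools_init(void)
	{
		for (int i = 0; i < NCPU; i++) {
			pools[i].base = malloc(POOL_SIZE);
			if (pools[i].base == NULL)
				return (-1);
			pools[i].used = 0;
		}
		return (0);
	}

	/*
	 * Hand out memory from the pool belonging to the given CPU.  As long
	 * as a connection's interrupt, NETISR and application work all stay
	 * on that one CPU, no other CPU ever has to see this memory.
	 */
	static void *
	cpu_alloc(int cpu, size_t len)
	{
		struct cpu_pool *p = &pools[cpu];

		if (p->used + len > POOL_SIZE)
			return (NULL);
		p->used += len;
		return (p->base + p->used - len);
	}

	int
	main(void)
	{
		if (pools_init() == -1)
			return (1);
		return (cpu_alloc(0, 2048) == NULL);	/* one cluster-sized chunk */
	}

The win only shows up if the whole code path for a connection
really is pinned to one CPU, which is exactly the division the
stack doesn't give you today.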

So for an extremely high capacity box, you will have to do
tricks, like logically splitting the box into separate virtual
machines, and separating out the code path from the network
card all the way to the application.

Just like we were discussing earlier.

-- Terry
