Date:      Thu, 07 Dec 2000 18:21:04 +0900
From:      Seigo Tanimura <tanimura@r.dl.itc.u-tokyo.ac.jp>
To:        arch@freebsd.org
Cc:        Seigo Tanimura <tanimura@r.dl.itc.u-tokyo.ac.jp>
Subject:   Even 1GB KVA is not enough, but we have no more space
Message-ID:  <vmsno08u4f.wl@rina.r.dl.itc.u-tokyo.ac.jp>

As you may know, we now have a KVA space of 1GB. Some parts of our
kernel, however, assume that they can scale the amount of memory they
allocate in the KVA in proportion to the amount of physical memory.
The result is, once again, a shortage of KVA space, but we cannot
extend our KVA any further. (I understand that 1GB is the upper limit
of KVA on i386; am I right?)

The following is a mail I sent to Matt Dillon a few hours ago.

On Thu, 07 Dec 2000 14:56:32 +0900,
  Seigo Tanimura <tanimura> said:

Seigo> I recently bought a Dell PowerEdge 6400/700 with 3GB of RAM for my
Seigo> lab. The box runs -current quite well, except that it panics upon
Seigo> swapping out data pages.

Seigo> Here is how the PowerEdge dies. swap_zone in vm/swap_pager.c is not
Seigo> initialized because zinit() attempts to allocate a region of about
Seigo> 250MB for the swblock entries, which does not fit in any free entry
Seigo> of kernel_map. The pagedaemon eventually calls zalloc(swap_pages) in
Seigo> swp_pager_meta_build() to build swap metadata, leading to a NULL
Seigo> pointer dereference. Another box of mine at home with 256MB of RAM
Seigo> also runs -current, and there the swap pager works fine.

Seigo> Attached is a patch to adjust the number of swap metadata entries so
Seigo> that the metadata fits in the KVA. The number of entries is halved
Seigo> until zinit() succeeds. If the initial value of n in
Seigo> swap_pager_swap_init() (which is cnt.v_page_count * 2) is too big, or
Seigo> zinit() does not succeed at all (hopefully unlikely), you will see a
Seigo> note or a warning. zlist is cleaned up if zinitna() fails, so that
Seigo> vmstat -z does not get confused.
(patch moved to the bottom of this mail)
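
The halving strategy described in the quoted mail can be sketched in
a few lines of C. This is only an illustration, not the patch itself:
zone_fits() is a hypothetical stand-in for zinit() (whose real
signature differs), and shrink_until_fits(), entry_size and free_kva
are names made up for this example. entry_size is assumed to be
nonzero.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for zinit(): reports whether a zone of
 * `nentries` entries of `entry_size` bytes each would fit in
 * `free_kva` bytes.  The real zinit() takes different arguments.
 */
int
zone_fits(size_t nentries, size_t entry_size, size_t free_kva)
{
    return (nentries != 0 && nentries <= free_kva / entry_size);
}

/*
 * Halve the requested entry count until the allocation fits.
 * Returns the count that fit, or 0 if not even one entry fits.
 */
size_t
shrink_until_fits(size_t wanted, size_t entry_size, size_t free_kva)
{
    size_t n;

    for (n = wanted; n > 0; n /= 2) {
        if (zone_fits(n, entry_size, free_kva))
            return (n);
    }
    return (0);
}
```

A caller would compare the returned count with the count it asked for
and print a note when the two differ, or a warning when the result is
0.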

At first I looked only at the size of the swap metadata, but that was
shortsighted. After I fixed the allocation of swap metadata, my kernel
died in ffs_vget() when kernel_map held only one free page. I then
estimated how large the swap metadata grows with respect to the amount
of physical memory. Assume that the amount of swap metadata is
proportional to the amount of physical memory, and that swap metadata
takes 8% of physical memory (according to my measurement). The results
are shown below.


Physical memory    Swap metadata
256M               20.5M
512M               41.0M
1G                 81.9M
2G                 163.8M
3G                 245.8M
4G                 327.7M
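
The figures in the table follow directly from the 8% assumption. As a
quick check, the arithmetic can be written in C as below.
swap_meta_tenths_mb() is a hypothetical helper written for this
illustration; it works in tenths of a megabyte so that no floating
point is needed.

```c
/*
 * Estimate swap metadata as 8% of physical memory.
 * Input is physical memory in MB; the result is in tenths of a MB,
 * rounded to nearest (e.g. 205 means 20.5MB).
 *
 * 8% of phys_mb is phys_mb * 8 / 100 MB, i.e. phys_mb * 8 / 10
 * tenths of a MB; adding 5 before the division rounds to nearest.
 */
unsigned long
swap_meta_tenths_mb(unsigned long phys_mb)
{
    return ((phys_mb * 8 + 5) / 10);
}
```

For example, swap_meta_tenths_mb(3072) gives 2458, which matches the
245.8M shown for 3G.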


So, on my PowerEdge, the kernel first attempts to allocate about 1/4
of the KVA for swap metadata. Although my patch reduces the size of
the swap metadata to around 64MB, the remaining free entry in
kernel_map is only about 120MB.

The solution I have is to ignore any physical memory beyond the size
of our KVA, or 1GB, when estimating how much KVA space to allocate in
kernel_map. The kernel would then allocate the same amount of memory
for swap metadata and the like on machines with 1GB, 2GB, 3GB and 4GB
of RAM. This solution might degrade the performance of our kernel,
but the only other option is to switch to alpha or ia64 in order to
expand the KVA.
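
The proposed cap amounts to clamping the memory size used for scaling
decisions. A minimal sketch follows, assuming sizes are expressed in
MB; KVA_SIZE_MB and scaling_mem_mb() are hypothetical names for this
illustration and do not correspond to existing kernel symbols.

```c
/* Size of the KVA on i386 as discussed in this mail: 1GB. */
#define KVA_SIZE_MB 1024UL

/*
 * Return the amount of memory (in MB) that KVA-consuming subsystems
 * should use when sizing their allocations: physical memory, but
 * never more than the size of the KVA itself.
 */
unsigned long
scaling_mem_mb(unsigned long phys_mb)
{
    return (phys_mb > KVA_SIZE_MB ? KVA_SIZE_MB : phys_mb);
}
```

With this clamp, a subsystem that sizes its tables from
scaling_mem_mb() sees the same value on a 1GB box and on a 4GB box,
while machines with less than 1GB are unaffected.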

Thanks, and any comments, flames or whatever are welcome.

-- 
Seigo Tanimura <tanimura@r.dl.itc.u-tokyo.ac.jp> <tanimura@FreeBSD.org>

