Date: Wed, 29 Aug 2012 00:25:14 -0500
From: Alan Cox <alc@rice.edu>
To: "Jayachandran C." <c.jayachandran@gmail.com>
Cc: mips@freebsd.org
Subject: Re: mips pmap patch
Message-ID: <503DA7BA.3030102@rice.edu>
In-Reply-To: <CA%2B7sy7AbGvwu8UMhtOO-vX1b2gdhQQpb3wfmOuci3UNEZ8Z7EQ@mail.gmail.com>
References: <50228F5C.1000408@rice.edu> <CA%2B7sy7DxqhGhJt%2BwE3WW2-j4SxnPweULjYS5GQ=NgMYSrwJHtw@mail.gmail.com> <50269AD4.9050804@rice.edu> <CA%2B7sy7AZ-s2H6COfvz60N=kxw%2BWUiUC9diVfWg9aOzWSZKGWRQ@mail.gmail.com> <5029635A.4050209@rice.edu> <CA%2B7sy7Cnsy7Ag1iG=_Kj04gEXeYp7kZnpACQpD8THvkp0VKdcA@mail.gmail.com> <502D2271.6080105@rice.edu> <CA%2B7sy7CK=EXu88XKYYXDV1uf3U7eebq3e6rfwgHRhQyFTMv7dQ@mail.gmail.com> <50325DC3.3090201@rice.edu> <CA%2B7sy7AbGvwu8UMhtOO-vX1b2gdhQQpb3wfmOuci3UNEZ8Z7EQ@mail.gmail.com>
On 08/27/2012 10:24, Jayachandran C. wrote:
> On Mon, Aug 20, 2012 at 9:24 PM, Alan Cox <alc@rice.edu> wrote:
>> On 08/20/2012 05:36, Jayachandran C. wrote:
>>> On Thu, Aug 16, 2012 at 10:10 PM, Alan Cox <alc@rice.edu> wrote:
>>>> On 08/15/2012 17:21, Jayachandran C. wrote:
>>>>> On Tue, Aug 14, 2012 at 1:58 AM, Alan Cox <alc@rice.edu> wrote:
>>>>>> On 08/13/2012 11:37, Jayachandran C. wrote:
>>> [...]
>>>>>>> I could not test for more than an hour on 32-bit due to another
>>>>>>> problem (freelist 1, containing direct-mapped pages, runs out of
>>>>>>> pages after about an hour of compile testing). This issue has
>>>>>>> been there for a long time; I am planning to look at it when I
>>>>>>> get a chance.
>>>>>>>
>>>>>> What exactly happens? A panic? A deadlock?
>>>>> The build slows down to a crawl and hangs when it runs out of
>>>>> pages in the freelist.
>>>>
>>>> I'd like to see the output of "sysctl vm.phys_segs" and "sysctl
>>>> vm.phys_free" from this machine. Even better would be running
>>>> "sysctl vm.phys_free" every 60 seconds during the buildworld.
>>>> Finally, I'd like to know whether or not either "ps" or "top"
>>>> shows any threads blocked on the "swwrt" wait channel once things
>>>> slow to a crawl.
>>> I spent some time looking at this issue. I use a very large kernel
>>> image with a built-in root filesystem, and this takes about 120 MB
>>> out of the direct-mapped area. The remaining pages (~64 MB) are not
>>> enough for the build process. If I increase free memory in this
>>> area, either by reducing the rootfs size or by adding a few more
>>> memory segments to this area, the build goes through fine.
>>
>> I'm still curious to see what "sysctl vm.phys_segs" says. It sounds
>> like roughly half of the direct map region is going to DRAM and half
>> to memory-mapped I/O devices. Is that correct?
> Yes, about half of the direct-mapped region in 32-bit is taken by
> flash, PCIe and other memory-mapped I/O.
> I also made the problem even worse by not reclaiming some bootloader
> areas in the direct-mapped region, which reduced the available
> direct-mapped memory.
>
> Here's the output of the sysctls:
>
> root@testboard:/root # sysctl vm.phys_segs
> vm.phys_segs:
> SEGMENT 0:
>
> start:     0x887e000
> end:       0xc000000
> domain:    0
> free list: 0x887a407c
>
> SEGMENT 1:
>
> start:     0x1d000000
> end:       0x1fc00000
> domain:    0
> free list: 0x887a407c
>
> SEGMENT 2:
>
> start:     0x20000000
> end:       0xbc0b3000
> domain:    0
> free list: 0x887a3f38
>
> SEGMENT 3:
>
> start:     0xe0000000
> end:       0xfffff000
> domain:    0
> free list: 0x887a3f38
>
> root@testboard:/root # sysctl vm.phys_free
> vm.phys_free:
> FREE LIST 0:
>
>  ORDER (SIZE) |           NUMBER
>               | POOL 0 | POOL 1 | POOL 2
>  -------------+--------+--------+--------
>   8 ( 1024K)  |   2877 |      0 |      0
>   7 (  512K)  |      0 |      1 |      0
>   6 (  256K)  |      1 |      0 |      0
>   5 (  128K)  |      0 |      1 |      0
>   4 (   64K)  |      0 |      1 |      0
>   3 (   32K)  |      0 |      1 |      0
>   2 (   16K)  |      0 |      1 |      0
>   1 (    8K)  |      0 |      0 |      0
>   0 (    4K)  |      0 |      0 |      0
>
> FREE LIST 1:
>
>  ORDER (SIZE) |           NUMBER
>               | POOL 0 | POOL 1 | POOL 2
>  -------------+--------+--------+--------
>   8 ( 1024K)  |     66 |      0 |      0
>   7 (  512K)  |      1 |      1 |      0
>   6 (  256K)  |      0 |      0 |      0
>   5 (  128K)  |      0 |      0 |      0
>   4 (   64K)  |      0 |      1 |      0
>   3 (   32K)  |      0 |      0 |      0
>   2 (   16K)  |      0 |      0 |      0
>   1 (    8K)  |      1 |      1 |      0
>   0 (    4K)  |      0 |      1 |      0
>
>>> I also found that when the build slows down, most of the pages
>>> taken from freelist 1 are allocated by the UMA subsystem, which
>>> seems to keep quite a few pages allocated.
>>
>> At worst, it may be necessary to disable the use of
>> uma_small_alloc() for this machine configuration. At best,
>> uma_small_alloc() could be revised to opportunistically use pages in
>> the direct map region, but have the ability to fall back to pages
>> that have to be mapped.
> I think this is probably not a bug, but a configuration problem (we
> cannot have such a huge built-in root filesystem when the
> direct-mapped area is this small).
> Anyway, I have checked in code to recover more areas from the
> bootloader, and this mostly solves the issue for me. The output above
> was taken before the check-in.

I'm afraid that exhaustion of freelist 1 is still highly likely to
occur under some workloads that require the allocation of a lot of
small objects in the kernel's heap.

Alan