From owner-freebsd-stable  Sat Aug 31 14: 6:27 2002
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 5721637B400; Sat, 31 Aug 2002 14:06:21 -0700 (PDT)
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 9F3DA43E42; Sat, 31 Aug 2002 14:06:20 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: from apollo.backplane.com (localhost [127.0.0.1])
	by apollo.backplane.com (8.12.5/8.12.4) with ESMTP id g7VL6KPQ002377;
	Sat, 31 Aug 2002 14:06:20 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.12.5/8.12.4/Submit) id g7VL6JEU002376;
	Sat, 31 Aug 2002 14:06:19 -0700 (PDT)
	(envelope-from dillon)
Date: Sat, 31 Aug 2002 14:06:19 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <200208312106.g7VL6JEU002376@apollo.backplane.com>
To: Arnvid Karstad <arnvid@karstad.org>
Cc: bmah@FreeBSD.ORG, freebsd-stable@FreeBSD.ORG
Subject: Re: Problems with FreeBSD - causing zalloc to return 0 ?!
References: <20020830094151.41DC.ARNVID@karstad.org> <200208301652.g7UGq3Ud059184@intruder.bmah.org> <20020830190849.8B8A.ARNVID@karstad.org>
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-stable.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-stable>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-stable>
X-Loop: FreeBSD.ORG


:> almost identical.
:
:Just for an interesting side note....
:
:With option INVARIANTS we get no problems and vmstat shows new highs:
:
:root@irc:/usr# vmstat -z | grep VNODE
:VNODE:           192,        0, 170782,     90,   170782
:
:Without it, it dies horribly when the number reaches around 44000-45000.
:
:Arnvid

    Ok, I've examined the kernel core dump.  I'm still not sure why
    INVARIANTS made any difference, but the kernel is definitely running
    out of KVM, and the main culprit appears to be the number of
    mbufs and mbuf clusters configured.  It looks like they were
    raised manually.

    (kgdb) printf "%08x\n", kernel_vm_end
    ffc00000				<<<< indicates kernel ran out of KVM
    (kgdb) print nmbclusters
    $9 = 129536				<<<< this is huge.  autoconfigure
					does not do this, you must be 
					overriding it.
    (kgdb) print nmbufs
    $10 = 518144
    (kgdb) print nmbufs * 256 + nmbclusters * 2048
    $11 = 397934592			<<<< too much.  397MB reserved!

    (kgdb) print clean_map->header.end - clean_map->header.start
    $21 = 186744832			<<<< (mainly buffer cache)
    (kgdb) print mb_map->header.end - mb_map->header.start
    $22 = 397934592			<<<< KVM reservation for MBUFs
    (kgdb) print maxswzone
    $4 = 73400320			<<<< maxswzone (used to manage
					     swap)
    (kgdb) printf "%d\n", zone_kmem_kvaspace
    214933504				<<<< zones eating 214MB
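
    The figures above can be tallied to see how much of the kernel
    address space the three largest reservations claim.  This is only
    a rough sketch: the 0xc0000000 kernel base is an assumption (it is
    not printed in the dump, though the kernel addresses in the session
    are consistent with it), and maxswzone is left out because it may
    overlap with the zone figure.

```python
# Values copied from the kgdb session above (all in bytes).
nmbclusters = 129536
nmbufs = 518144
MSIZE = 256        # bytes per mbuf, as used in the kgdb expression
MCLBYTES = 2048    # bytes per mbuf cluster, as used in the kgdb expression

mb_map = nmbufs * MSIZE + nmbclusters * MCLBYTES
clean_map = 186744832
zone_kva = 214933504

# Assumption: the kernel address space starts at 0xc0000000.
KERNBASE = 0xC0000000
kernel_vm_end = 0xFFC00000
kvm_span = kernel_vm_end - KERNBASE

reserved = {"mb_map": mb_map, "clean_map": clean_map, "zones": zone_kva}
total = sum(reserved.values())

# Print the largest reservation first, then the combined share.
for name, size in sorted(reserved.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {size:>12d}  {100.0 * size / kvm_span:5.1f}% of span")
print(f"{'total':10s} {total:>12d}  {100.0 * total / kvm_span:5.1f}% of span")
```

    Under those assumptions the three reservations alone account for
    roughly three quarters of the span, before the kernel image, swap
    metadata, and everything else are counted.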

define zlist
    set $zp = zlist
    while ($zp != 0)
        set $initmem = $zp->zmax * $zp->zsize
        set $addmem = $zp->ztotal * $zp->zsize
        printf "%p\t%-15s\t%8d init + %8d dyn = %8d\n", $zp, $zp->zname, $initmem, $addmem, $initmem + $addmem
        set $zp = $zp->znext
    end
    set $initmem = zone_kmem_kvaspace
    set $addmem = (zone_kmem_pages + zone_kern_pages ) * 0x1000
    printf "TOTAL ZONE KMEM RESERVED: %d init + %d dynamic = %d\n", $initmem, $addmem, $initmem + $addmem
end

(kgdb) zlist
0xda1c4e80      PIPE                   0 init +    16320 dyn =    16320
0xda15e780      SWAPMETA        51381120 init +        0 dyn = 51381120
0xda0d3100      ripcb           24870912 init +     4032 dyn = 24874944
0xda0d3180      syncache         2457440 init +     4000 dyn =  2461440
0xda0d3200      tcpcb           70467584 init +     8160 dyn = 70475744
0xda0d3280      udpcb           24870912 init +     8064 dyn = 24878976
0xda0d3300      unpcb                  0 init +     8000 dyn =     8000
0xda0d3380      socket          24870912 init +     8064 dyn = 24878976
0xda0d3400      DIRHASH                0 init +   729088 dyn =   729088
0xda0d3480      KNOTE                  0 init +     8192 dyn =     8192
0xda011e80      VNODE                  0 init +  8120448 dyn =  8120448
0xda011f00      NAMEI                  0 init +    16384 dyn =    16384
0xc2436900      VMSPACE                0 init +    12288 dyn =    12288
0xc2436a00      PROC                   0 init +    20384 dyn =    20384
0xc02b7a40      DP fakepg              0 init +        0 dyn =        0
0xc02c8700      PV ENTRY        21327600 init +  9174200 dyn = 30501800
0xc02b7be0      MAP ENTRY              0 init +    22464 dyn =    22464
0xc02b7b80      KMAP ENTRY       3859728 init +    10224 dyn =  3869952
0xc02b7c40      MAP                    0 init +     1080 dyn =     1080
0xc02bb320      VM OBJECT              0 init +  4080768 dyn =  4080768
TOTAL ZONE KMEM RESERVED: 214933504 init + 13156352 dynamic = 228089856
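
    To see which zones dominate, the zlist output can be parsed and
    ranked.  The sketch below is an illustration only: the regular
    expression and the choice to group ripcb, tcpcb, udpcb and socket
    as "socket-related" are mine, not part of the kgdb macro.

```python
import re

ZLIST_OUTPUT = """\
0xda1c4e80      PIPE                   0 init +    16320 dyn =    16320
0xda15e780      SWAPMETA        51381120 init +        0 dyn = 51381120
0xda0d3100      ripcb           24870912 init +     4032 dyn = 24874944
0xda0d3180      syncache         2457440 init +     4000 dyn =  2461440
0xda0d3200      tcpcb           70467584 init +     8160 dyn = 70475744
0xda0d3280      udpcb           24870912 init +     8064 dyn = 24878976
0xda0d3300      unpcb                  0 init +     8000 dyn =     8000
0xda0d3380      socket          24870912 init +     8064 dyn = 24878976
0xda0d3400      DIRHASH                0 init +   729088 dyn =   729088
0xda0d3480      KNOTE                  0 init +     8192 dyn =     8192
0xda011e80      VNODE                  0 init +  8120448 dyn =  8120448
0xda011f00      NAMEI                  0 init +    16384 dyn =    16384
0xc2436900      VMSPACE                0 init +    12288 dyn =    12288
0xc2436a00      PROC                   0 init +    20384 dyn =    20384
0xc02b7a40      DP fakepg              0 init +        0 dyn =        0
0xc02c8700      PV ENTRY        21327600 init +  9174200 dyn = 30501800
0xc02b7be0      MAP ENTRY              0 init +    22464 dyn =    22464
0xc02b7b80      KMAP ENTRY       3859728 init +    10224 dyn =  3869952
0xc02b7c40      MAP                    0 init +     1080 dyn =     1080
0xc02bb320      VM OBJECT              0 init +  4080768 dyn =  4080768
"""

# Zone names may contain spaces ("PV ENTRY"), so match the name lazily.
ROW = re.compile(r"^0x[0-9a-f]+\s+(.+?)\s+(\d+) init \+\s+(\d+) dyn =\s+(\d+)$")

zones = {}
for line in ZLIST_OUTPUT.splitlines():
    m = ROW.match(line)
    if m:
        name, init, dyn, total = m.group(1), *map(int, m.groups()[1:])
        zones[name] = (init, dyn, total)

# Rank zones by total reservation, largest first.
ranked = sorted(zones.items(), key=lambda kv: -kv[1][2])
for name, (init, dyn, total) in ranked[:5]:
    print(f"{name:12s} {total:>10d}")

# Initial reservation of the zones sized from the socket limit.
socket_related = ("ripcb", "tcpcb", "udpcb", "socket")
socket_init = sum(zones[z][0] for z in socket_related)
tcp_udp_init = zones["tcpcb"][0] + zones["udpcb"][0]
print("tcpcb + udpcb init:", tcp_udp_init)
print("socket-related init:", socket_init)
```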

    You've run out of KVM.  The main cause appears to be raising the
    number of mbufs beyond the autoconfigured value.  You've also
    massively increased maxsockets, so much so that the zone allocator
    is reserving about 95 MB just to hold tcpcb and udpcb allocations
    (about 145 MB once ripcb and socket are counted).  Those zone
    memory reservations are huge!

    There are a couple of things you can do.  I recommend setting the 
    following kernel boot variables in /boot/loader.conf:

	kern.maxswzone="32m"
	kern.ipc.maxsockets="30000"

	(How many active sockets do you normally have?  Either you set
	maxsockets to 129536 or the system autoconfig did.)
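
    As a rough estimate of what the lower maxsockets value buys back,
    the per-entry size of each socket-related zone can be derived from
    the zlist output.  This sketch assumes each of these zones was sized
    for exactly 129536 entries and that the reservation scales linearly
    with the limit; the real allocator may round differently.

```python
# Initial reservations (bytes) taken from the zlist output.
init_bytes = {
    "ripcb": 24870912,
    "tcpcb": 70467584,
    "udpcb": 24870912,
    "socket": 24870912,
}

current_max = 129536   # assumption: the entry count these zones were sized for
proposed_max = 30000   # kern.ipc.maxsockets="30000"

# Derive bytes per entry; a non-zero remainder would mean the
# assumption about current_max does not hold for that zone.
per_entry = {}
for name, size in init_bytes.items():
    quotient, remainder = divmod(size, current_max)
    if remainder:
        raise ValueError(f"{name}: reservation is not a multiple of {current_max}")
    per_entry[name] = quotient

before = sum(init_bytes.values())
after = sum(size * proposed_max for size in per_entry.values())

print("bytes per entry:", per_entry)
print("reserved now:     ", before)
print("reserved at 30000:", after)
print("KVM freed:        ", before - after)
```

    On these assumptions the four zones drop from about 145 MB to about
    34 MB of reserved KVM.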

    In your kernel config:

	NSWAPDEV="2"

    Additionally, I strongly recommend reducing the number of mbufs in
    the system.  You almost certainly have an NMBCLUSTERS option in your
    kernel config or a kern.ipc.nmbclusters in your /boot/loader.conf
    to get a number so high (yours is set to 129536).  I recommend:

	kern.ipc.nmbclusters="70000"
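
    The effect of that setting on the mbuf KVM reservation can be
    estimated with the same formula used in the kgdb session.  The
    sketch assumes nmbufs stays at four times nmbclusters, which is the
    ratio seen in the dump (518144 / 129536); the helper name is
    hypothetical.

```python
MSIZE = 256       # bytes per mbuf, from the kgdb expression
MCLBYTES = 2048   # bytes per mbuf cluster, from the kgdb expression


def mb_map_bytes(nmbclusters, mbufs_per_cluster=4):
    """Estimate the mb_map reservation for a given cluster count.

    mbufs_per_cluster=4 is an assumption taken from the ratio in the dump.
    """
    nmbufs = nmbclusters * mbufs_per_cluster
    return nmbufs * MSIZE + nmbclusters * MCLBYTES


current = mb_map_bytes(129536)
proposed = mb_map_bytes(70000)

print("current reservation: ", current)
print("proposed reservation:", proposed)
print("KVM freed:           ", current - proposed)
```

    With that ratio the reservation falls from about 398 MB to about
    215 MB.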

    If you are running out of buffer space I recommend reducing 
    net.inet.tcp.recvspace and net.inet.tcp.sendspace in /etc/sysctl.conf.
    Currently you have them set at:

    (kgdb) print tcp_recvspace
    $3 = 57344
    (kgdb) print tcp_sendspace
    $4 = 32768

    Try reducing net.inet.tcp.sendspace to 24576 and
    net.inet.tcp.recvspace to 32768.
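
    A back-of-the-envelope view of why smaller socket buffers help: the
    sketch below computes the worst-case buffering per TCP connection
    (both buffers full) and how many such connections a pool of 70000
    clusters could hold.  It is an upper bound under simplifying
    assumptions: all data sits in 2048-byte clusters, and mbuf headers
    and other consumers of the pool are ignored.

```python
MCLBYTES = 2048          # bytes per mbuf cluster
NMBCLUSTERS = 70000      # the value suggested for kern.ipc.nmbclusters


def worst_case(recvspace, sendspace):
    """Return (bytes, clusters) needed when both socket buffers are full."""
    total = recvspace + sendspace
    clusters = -(-total // MCLBYTES)   # ceiling division
    return total, clusters


cur_bytes, cur_clusters = worst_case(57344, 32768)   # values from the dump
new_bytes, new_clusters = worst_case(32768, 24576)   # suggested values

print("current:  ", cur_bytes, "bytes,", cur_clusters, "clusters per connection")
print("suggested:", new_bytes, "bytes,", new_clusters, "clusters per connection")
print("full connections at current sizes:  ", NMBCLUSTERS // cur_clusters)
print("full connections at suggested sizes:", NMBCLUSTERS // new_clusters)
```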

    --

    I think that for large-memory machines I am still reserving too much
    KVM space for swap meta structures.  I am going to cut that down even
    more for this release.  It's obviously been partly responsible for a lot
    of the KVM exhaustion problems people have reported on large-memory
    machines.

    However, it looks like the primary issue here is that you made
    the resource settings so high there was no room left for anything else
    in KVM.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>
