From owner-freebsd-stable Sat Aug 31 14: 6:27 2002
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 5721637B400; Sat, 31 Aug 2002 14:06:21 -0700 (PDT)
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 9F3DA43E42; Sat, 31 Aug 2002 14:06:20 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: from apollo.backplane.com (localhost [127.0.0.1])
	by apollo.backplane.com (8.12.5/8.12.4) with ESMTP id g7VL6KPQ002377;
	Sat, 31 Aug 2002 14:06:20 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.12.5/8.12.4/Submit) id g7VL6JEU002376;
	Sat, 31 Aug 2002 14:06:19 -0700 (PDT)
	(envelope-from dillon)
Date: Sat, 31 Aug 2002 14:06:19 -0700 (PDT)
From: Matthew Dillon
Message-Id: <200208312106.g7VL6JEU002376@apollo.backplane.com>
To: Arnvid Karstad
Cc: bmah@FreeBSD.ORG, freebsd-stable@FreeBSD.ORG
Subject: Re: Problems with FreeBSD - causing zalloc to return 0 ?!
References: <20020830094151.41DC.ARNVID@karstad.org> <200208301652.g7UGq3Ud059184@intruder.bmah.org> <20020830190849.8B8A.ARNVID@karstad.org>
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

:> almost identical.
:
:Just for an intresting side note....
:
:With option INVARIANTS we get no problems and vmstat's shows new highs'
:
:root@irc:/usr# vmstat -z | grep VNODE
:VNODE:           192,        0, 170782,     90,   170782
:
:With out.. it dies horribly when the number reaches around 44000-45000.
:
:Arnvid

    Ok, I've examined the kernel core dump.  I'm still not sure why
    INVARIANTS made any difference, but the kernel is definitely running
    out of KVM and it looks like the main culprit is the number of mbufs
    and mbuf clusters configured.  It looks like they were manually
    configured up.

    (kgdb) printf "%08x\n", kernel_vm_end
    ffc00000                    <<<< indicates kernel ran out of KVM
    (kgdb) print nmbclusters
    $9 = 129536                 <<<< this is huge.  autoconfigure does not
                                     do this, you must be overriding it.
    (kgdb) print nmbufs
    $10 = 518144
    (kgdb) print nmbufs * 256 + nmbclusters * 2048
    $11 = 397934592             <<<< too much.  397MB reserved!
    (kgdb) print clean_map->header.end - clean_map->header.start
    $21 = 186744832             <<<< (mainly buffer cache)
    (kgdb) print mb_map->header.end - mb_map->header.start
    $22 = 397934592             <<<< KVM reservation for MBUFs
    (kgdb) print maxswzone
    $4 = 73400320               <<<< maxswzone (used to manage swap)
    (kgdb) printf "%d\n", zone_kmem_kvaspace
    214933504                   <<<< zones eating 214MB

    define zlist
        set $zp = zlist
        while ($zp != 0)
            set $initmem = $zp->zmax * $zp->zsize
            set $addmem = $zp->ztotal * $zp->zsize
            printf "%p\t%-15s\t%8d init + %8d dyn = %8d\n", $zp, $zp->zname, $initmem, $addmem, $initmem + $addmem
            set $zp = $zp->znext
        end
        set $initmem = zone_kmem_kvaspace
        set $addmem = (zone_kmem_pages + zone_kern_pages) * 0x1000
        printf "TOTAL ZONE KMEM RESERVED: %d init + %d dynamic = %d\n", $initmem, $addmem, $initmem + $addmem
    end

    (kgdb) zlist
    0xda1c4e80  PIPE                   0 init +    16320 dyn =    16320
    0xda15e780  SWAPMETA        51381120 init +        0 dyn = 51381120
    0xda0d3100  ripcb           24870912 init +     4032 dyn = 24874944
    0xda0d3180  syncache         2457440 init +     4000 dyn =  2461440
    0xda0d3200  tcpcb           70467584 init +     8160 dyn = 70475744
    0xda0d3280  udpcb           24870912 init +     8064 dyn = 24878976
    0xda0d3300  unpcb                  0 init +     8000 dyn =     8000
    0xda0d3380  socket          24870912 init +     8064 dyn = 24878976
    0xda0d3400  DIRHASH                0 init +   729088 dyn =   729088
    0xda0d3480  KNOTE                  0 init +     8192 dyn =     8192
    0xda011e80  VNODE                  0 init +  8120448 dyn =  8120448
    0xda011f00  NAMEI                  0 init +    16384 dyn =    16384
    0xc2436900  VMSPACE                0 init +    12288 dyn =    12288
    0xc2436a00  PROC                   0 init +    20384 dyn =    20384
    0xc02b7a40  DP fakepg              0 init +        0 dyn =        0
    0xc02c8700  PV ENTRY        21327600 init +  9174200 dyn = 30501800
    0xc02b7be0  MAP ENTRY              0 init +    22464 dyn =    22464
    0xc02b7b80  KMAP ENTRY       3859728 init +    10224 dyn =  3869952
    0xc02b7c40  MAP                    0 init +     1080 dyn =     1080
    0xc02bb320  VM OBJECT              0 init +  4080768 dyn =  4080768
    TOTAL ZONE KMEM RESERVED: 214933504 init + 13156352 dynamic = 228089856

    You've run out of KVM.  It looks like this is mainly due to increasing
    the number of mbufs in the system beyond the autoconfig value, and
    you've also massively increased maxsockets, so much so that the zone
    allocator is reserving over 110 MB just to hold tcpcb and udpcb
    allocations.  The tcpcb and udpcb zone memory reservations are huge!

    There are a couple of things you can do.  I recommend setting the
    following kernel boot variables in /boot/loader.conf:

	kern.maxswzone="32m"
	kern.ipc.maxsockets="30000"

    (How many active sockets do you actually normally have?  Either you
    set your maxsockets to 129536 or the system autoconfig did it.)

    In your kernel config:

	NSWAPDEV="2"

    Additionally, I strongly recommend reducing the number of mbufs in the
    system.  You almost certainly have an NMBCLUSTERS option in your kernel
    config or a kern.ipc.nmbclusters setting in your /boot/loader.conf to
    get a number so high (yours is set to 129536).  I recommend:

	kern.ipc.nmbclusters="70000"

    If you are running out of buffer space, I recommend reducing
    net.inet.tcp.recvspace and net.inet.tcp.sendspace in /etc/sysctl.conf.
    Currently you have them set at:

    (kgdb) print tcp_recvspace
    $3 = 57344
    (kgdb) print tcp_sendspace
    $4 = 32768

    Try reducing tcp_sendspace to 24576 and tcp_recvspace to 32768.

					--

    I think that for large-memory machines I am still reserving too much
    KVM space for swap meta structures.  I am going to cut that down even
    more for this release.  It has obviously been partly responsible for a
    lot of the KVM exhaustion problems people have reported on large-memory
    machines.  However, it looks like the primary issue here is that you
    made the resource settings so high that there was no room left for
    anything else in KVM.

						-Matt
						Matthew Dillon
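
    For anyone who wants to redo the mbuf arithmetic from the kgdb session,
    here is a minimal sketch in Python.  It only repeats the expression
    "nmbufs * 256 + nmbclusters * 2048" shown above.  The helper names are
    made up for illustration, and the 4:1 ratio of nmbufs to nmbclusters is
    an assumption read off this dump (518144 / 129536); check what your own
    kernel actually uses before relying on the second figure.

```python
# Sketch only: repeats the mb_map arithmetic from the kgdb session.
# The per-object sizes are the ones used in the kgdb expression.
MBUF_SIZE = 256        # bytes reserved per mbuf
CLUSTER_SIZE = 2048    # bytes reserved per mbuf cluster


def mbuf_kvm_bytes(nmbufs: int, nmbclusters: int) -> int:
    """KVM reserved for mbufs plus mbuf clusters, in bytes."""
    return nmbufs * MBUF_SIZE + nmbclusters * CLUSTER_SIZE


def nmbufs_for(nmbclusters: int, ratio: int = 4) -> int:
    """ASSUMPTION: nmbufs scales as ratio * nmbclusters (4x matches the dump)."""
    return nmbclusters * ratio


# Values taken from the core dump.
current = mbuf_kvm_bytes(518144, 129536)

# Recommended kern.ipc.nmbclusters="70000", nmbufs derived under the
# 4:1 assumption above.
proposed = mbuf_kvm_bytes(nmbufs_for(70000), 70000)

print("current  reservation:", current, "bytes")
print("proposed reservation:", proposed, "bytes")
print("KVM freed           :", current - proposed, "bytes")
```

    Under that assumption the reservation drops from 397934592 bytes to
    215040000 bytes, which is the kind of headroom the other KVM consumers
    need.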