From owner-freebsd-stable Thu Sep 5 19:59:55 2002
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 9B71C37B400
	for ; Thu, 5 Sep 2002 19:59:47 -0700 (PDT)
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 00A3D43E3B
	for ; Thu, 5 Sep 2002 19:59:47 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: from apollo.backplane.com (localhost [127.0.0.1])
	by apollo.backplane.com (8.12.5/8.12.4) with ESMTP id g862xkPQ084693;
	Thu, 5 Sep 2002 19:59:46 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.12.5/8.12.4/Submit) id g862xkqJ084692;
	Thu, 5 Sep 2002 19:59:46 -0700 (PDT)
	(envelope-from dillon)
Date: Thu, 5 Sep 2002 19:59:46 -0700 (PDT)
From: Matthew Dillon
Message-Id: <200209060259.g862xkqJ084692@apollo.backplane.com>
To: "Marc G. Fournier"
Cc: stable@freebsd.org
Subject: Re: [JUPITER] Fatal trap 12: page fault while in kernel mode (Was: Re: Woo hoo ... it crashed!! )
References: <20020905221431.B15209-100000@hub.org>
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

(adding the general list back in!)

:
:On Thu, 5 Sep 2002, Matthew Dillon wrote:
:
:>     Whew!  These are big! :-).  I've got jupiter's files downloaded, now
:>     it's working on venus.
:
:Ya, both are 4gig servers :)  That was why netdump was so critical, cause
:they are also both production servers, so choking it back at 2gig *really*
:hurt ;)
:

Ok, you are running out of KVM!  Yes indeed, that is what is happening.
It must be happening quickly or that while loop diagnostic you did
would have caught it.

    (kgdb) print kernel_vm_end
    $77 = 0xff400000

I'm not entirely sure but I believe SMP boxes reserve more page table
pages than non-SMP boxes (e.g.
an extra segment or two, and each segment represents 4MB of VM).  So
this could be hitting the limit.

I'm going to dump a bunch of statistics first, then I'll analyze them:

    (kgdb) zlist
    0xc943e780 NFSNODE             0 init +  56109152 dyn =  56109152
    0xc943e800 NFSMOUNT            0 init +     83776 dyn =     83776
    0xc92db580 PIPE                0 init +    799680 dyn =    799680
    0xc92b4a80 SWAPMETA     37282560 init +   1044480 dyn =  38327040
    0xc92b4f80 unpcb               0 init +    400000 dyn =    400000
    0xc9254000 ripcb         2949120 init +      8064 dyn =   2957184
    0xc9254080 syncache      2457440 init +     16320 dyn =   2473760
    0xc9254100 tcpcb         8355840 init +   1114112 dyn =   9469952
    0xc9254180 udpcb         2949120 init +     81792 dyn =   3030912
    0xc9254200 socket        2949120 init +    794496 dyn =   3743616
    0xc9254280 DIRHASH             0 init +   2007040 dyn =   2007040
    0xc9254300 KNOTE               0 init +     12288 dyn =     12288
    0xc9032d00 VNODE               0 init +  45344256 dyn =  45344256
    0xc9032d80 NAMEI               0 init +    139264 dyn =    139264
    0xc6302a80 VMSPACE             0 init +    700416 dyn =    700416
    0xc6302b00 PROC                0 init +   1528800 dyn =   1528800
    0xc0228e40 DP fakepg           0 init +         0 dyn =         0
    0xc0239b40 PV ENTRY     92320648 init +  28901124 dyn = 121221772
    0xc0228fc0 MAP ENTRY           0 init +   5334624 dyn =   5334624
    0xc0228f60 KMAP ENTRY   12180000 init +    673776 dyn =  12853776
    0xc0229020 MAP                 0 init +      1080 dyn =      1080
    0xc022c700 VM OBJECT           0 init +  26998656 dyn =  26998656
    TOTAL ZONE KMEM RESERVED: 132546560 init + 147156992 dynamic = 279703552

    Memory statistics by type
                                                                     Type  Kern
    Type                  InUse  MemUse HighUse    Limit   Requests Limit Limit  Size(s)
    linux                     7      1K      1K  102400K          7     0     0  32
    NFS hash                  1    512K    512K  102400K          1     0     0  512K
    NQNFS Lease               1      1K      1K  102400K          1     0     0  1K
    NFSV3 srvdesc            28      1K      4K  102400K  314484627     0     0  16,256
    NFSV3 diroff            145     73K    355K  102400K      41817     0     0  512
    NFS daemon               69      7K      7K  102400K         69     0     0  64,256,512
    NFS req                   1      1K      3K  102400K  157377573     0     0  64
    NFS srvsock               1      1K      1K  102400K          1     0     0  256
    atkbddev                  2      1K      1K  102400K          2     0     0  32
    memdesc                   1      4K      4K  102400K          1     0     0  4K
    mbuf                      1     24K     24K  102400K          1     0     0  32K
    isadev                    4      1K      1K  102400K          4     0     0  64
    ZONE                     16      2K      2K  102400K         16     0     0  128
    VM pgdata                 1    128K    128K  102400K          1     0     0  128K
    file desc              3174    913K   1280K  102400K     503534     0     0  256,512,1K,2K,4K,8K
    UFS dirhash            1593    636K   1158K  102400K     223440     0     0  16,32,64,128,256,512,1K,4K,8K
    UFS mount                15     59K     59K  102400K         15     0     0  512,2K,8K,32K
    UFS ihash                 1    512K    512K  102400K          1     0     0  512K
    FFS node             210654  52664K  58024K  102400K  122698088     0     0  256
    dirrem                  130      5K    952K  102400K     480890     0     0  32
    mkdir                     0      0K      7K  102400K       1612     0     0  32
    diradd                  130      5K    101K  102400K     418137     0     0  32
    freefile                 62      2K    727K  102400K     248861     0     0  32
    freeblks                 74     10K   2493K  102400K     213310     0     0  128
    freefrag                  6      1K     23K  102400K     142018     0     0  32
    allocindir                1      1K    289K  102400K     441331     0     0  64
    indirdep                  2      1K     81K  102400K      13249     0     0  32,8K
    allocdirect              24      2K    124K  102400K     338310     0     0  64
    bmsafemap                26      1K      5K  102400K     204068     0     0  32
    newblk                    1      1K      1K  102400K     779642     0     0  32,256
    inodedep                262    545K   4055K  102400K     601135     0     0  128,512K
    pagedep                 175    139K    277K  102400K     298594     0     0  64,128K
    p1003.1b                  1      1K      1K  102400K          1     0     0  16
    syncache                  1      8K      8K  102400K          1     0     0  8K
    tseg_qent                 0      0K      1K  102400K       7034     0     0  32
    IpFw/IpAcct               1      1K      1K  102400K          1     0     0  256
    in_multi                  2      1K      1K  102400K          2     0     0  32
    routetbl               1064    160K   2119K  102400K     755905     0     0  16,32,64,128,256
    ether_multi               7      1K      1K  102400K          7     0     0  16,32,64
    ifaddr                  166     41K     41K  102400K        168     0     0  32,64,256,2K
    BPF                       7     65K    129K  102400K      15918     0     0  16,32,128,32K
    vnodes                  114      7K      7K  102400K        355     0     0  16,32,64,128,256
    mount                   284    142K    144K  102400K        290     0     0  16,128,512
    cluster_save buffer       0      0K      1K  102400K      38874     0     0  32,64
    vfscache             254807  17001K  17642K  102400K  228931259     0     0  64,128,256,512K
    BIO buffer             1270   1294K   4610K  102400K    2299198     0     0  512,1K,2K
    dev_t                  1209    152K    152K  102400K       1209     0     0  128
    pcb                     321     10K     24K  102400K     728734     0     0  16,32,64,2K
    soname                 1211    106K    125K  102400K  168093621     0     0  16,32,64,128
    timecounter               5      1K      1K  102400K          5     0     0  128
    ptys                     19     10K     10K  102400K         19     0     0  512
    ttys                    753     95K    104K  102400K       2118     0     0  128,256
    shm                     121    123K    138K  102400K      32447     0     0  16,1K,16K
    sem                       3    324K    324K  102400K          3     0     0  4K,128K,256K
    msg                       4     25K     25K  102400K          4     0     0  512,4K,16K
    rman                     31      2K      2K  102400K        382     0     0  16,64
    iov                       0      0K      1K  102400K       3813     0     0  128
    ioctlops                  0      0K      1K  102400K         10     0     0  512,1K
    taskqueue                 1      1K      1K  102400K          1     0     0  32
    SWAP                      2    345K    417K  102400K          4     0     0  32,128K,512K
    eventhandler             14      1K      1K  102400K         15     0     0  32,64
    bus                     307     31K     33K  102400K        715     0     0  16,32,64,128,256,512,1K,2K,4K
    sysctloid                20      1K      1K  102400K         20     0     0  16,64
    sysctl                    0      0K      1K  102400K     106018     0     0  16,32
    uidinfo                  13      2K      2K  102400K       1895     0     0  32,1K
    cred                   2879    360K    377K  102400K   12385093     0     0  128
    subproc                5025    416K    587K  102400K     940754     0     0  32,64,256
    proc                      2      8K      8K  102400K          2     0     0  4K
    session                 971     61K     75K  102400K      75835     0     0  64
    pgrp                    989     31K     38K  102400K      76258     0     0  32
    kld                      40    614K    619K  102400K        155     0     0  16,32,128,1K,2K,4K,8K,16K,32K,64K,128K,512K
    temp                     97    117K   1338K  102400K   40654648     0     0  16,32,64,128,256,512,1K,2K,4K,8K,128K
    devbuf                   71    133K    133K  102400K        134     0     0  16,32,64,128,256,512,2K,4K,16K,32K
    lockf                   503     32K     39K  102400K    3599530     0     0  64
    prison                  137     69K     70K  102400K        142     0     0  512
    atexit                    1      1K      1K  102400K          1     0     0  16
    zombie                    1      1K     27K  102400K     411208     0     0  128
    proc-args              1679     87K    145K  102400K    5861534     0     0  16,32,64,128,256
    kqueue                  352    352K    439K  102400K     579340     0     0  256,1K
    sigio                    29      1K      3K  102400K       2170     0     0  32
    file                  18689   1169K   1351K  102400K   44809454     0     0  64

    Memory Totals:  In Use    Free    Requests
                    79619K  16815K  1109926660

    (kgdb) printf "%d\n", kmemlimit - kmembase
    256901120

OK, so zalloc'd space is eating 280MB, the kernel malloc area is eating
80MB, the clean_map (contains the buffer cache) is eating 251MB.
kmem_map is around 256MB.  Total == 867MB.  The remainder is eaten up
by various other reserved areas.

zalloc'd space uses kernel_map, malloc'd space uses kmem_map.  There's
plenty of KVM free in the malloc space / kmem_map but it looks like
zalloc() blew out the kernel_map.

Analysis:  There are a huge number of vnodes, nfsnodes, and VM objects
(which go hand-in-hand with vnodes).  Normally this would not be a
problem, but you also have a huge number of processes!

    ps -M vmcore.jupiter.00 -N kernel.jupiter -ax | wc -l
    2071

The huge number of processes are eating a huge number of PV entries
for page table mappings.  Over 121 megabytes, in fact!

I have one word to say about all of this:  Wow.

I believe that you can solve the problem by reducing kern.maxvnodes.
It is currently set at 259786.  Try reducing it to 150000 (this must be
done at boot time in /etc/sysctl.conf or /etc/rc.local, or manually
just after the machine boots up):

    sysctl -w kern.maxvnodes=150000

It may also be possible to reduce the number of PV entries being
allocated, but I'd have to look into that.

If necessary you can also reduce the size of the buffer cache with a
/boot/loader.conf variable, though this should be a last resort, and
you can completely turn off swap (but that will only save 32MB of KVM
and swap is useful... you are using 80MB of your swap to get rid of
idle system daemons' memory).

					-Matt
					Matthew Dillon

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message
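
For anyone who wants to re-check the KVM arithmetic in this message, here is a small Python sketch. It only re-adds numbers already quoted from the kgdb output; nothing here queries a kernel. Two things are assumptions on my part: that "MB" in the message means decimal megabytes (10**6 bytes), which is what makes 279703552 bytes come out as roughly 280MB, and the names to_mb and STATED_MB, which are just illustrative. The 251MB clean_map figure is taken from the text as stated, since none of the dumps shown produce it.

```python
# Figures copied from the kgdb output quoted in the message.
ZONE_TOTAL_BYTES = 279_703_552   # "TOTAL ZONE KMEM RESERVED"
PV_ENTRY_BYTES = 121_221_772     # "PV ENTRY" zone total
KMEM_MAP_BYTES = 256_901_120     # kmemlimit - kmembase

# Rounded per-area figures exactly as stated in the message (MB).
# clean_map's 251MB is quoted from the text; no dump shown yields it.
STATED_MB = {"zalloc": 280, "malloc": 80, "clean_map": 251, "kmem_map": 256}


def to_mb(nbytes: int) -> float:
    """Bytes to decimal megabytes (assumption: 1 MB = 10**6 bytes)."""
    return nbytes / 1_000_000


total_mb = sum(STATED_MB.values())
pv_share = PV_ENTRY_BYTES / ZONE_TOTAL_BYTES

print(f"zone total : {to_mb(ZONE_TOTAL_BYTES):.1f} MB")
print(f"PV entries : {to_mb(PV_ENTRY_BYTES):.1f} MB ({pv_share:.0%} of zone space)")
print(f"kmem_map   : {to_mb(KMEM_MAP_BYTES):.1f} MB")
print(f"stated sum : {total_mb} MB")
```

The point of the PV share line is that PV entries alone account for a large fraction of the zalloc'd space, which is why the process count matters as much as the vnode count here.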