Date: Thu, 14 Jan 2010 10:10:20 -0600
From: "Doug Poland"
To: "Ivan Voras"
Cc: freebsd-questions@freebsd.org
Subject: Re: 8.0-R-p2 ZFS: unixbench causing kmem exhaustion panic

On Thu, January 14, 2010 08:50, Ivan Voras wrote:
> 2010/1/14 Doug Poland:
>>>>
>>>> kstat.zfs.misc.arcstats.size
>>>>
>>>> seemed to fluctuate between about 164,000,000 and 180,000,000 bytes
>>>> during this last run
>>>
>>> Is that with or without panicking?
>>>
>> with a panic
>>
>>
>>> If the system did panic then it looks like the problem is a memory
>>> leak somewhere else in the kernel, which you could confirm by
>>> monitoring vmstat -z.
>>>
>> I'll give that a try.  Am I looking for specific items in vmstat
>> -z? arc*, zil*, zfs*, zio*? Please advise.
>
> You should look for whatever is allocating all your memory between 180
> MB (which is your ARC size) and 1.2 GB (which is your kmem size).
>

OK, another run, this time back to vfs.zfs.arc_max=512M in
/boot/loader.conf, and a panic:

panic: kmem malloc(131072): kmem map too small: 1294258176 total allocated

I admit I do not fully understand what metrics are important to proper
analysis of this issue.
In this case, I was watching the following within 1 second of
the panic:

sysctl kstat.zfs.misc.arcstats.size: 41739944
sysctl vfs.numvnodes: 678
sysctl vfs.zfs.arc_max: 536870912
sysctl vfs.zfs.arc_meta_limit: 134217728
sysctl vfs.zfs.arc_meta_used: 7228584
sysctl vfs.zfs.arc_min: 67108864
sysctl vfs.zfs.cache_flush_disable: 0
sysctl vfs.zfs.debug: 0
sysctl vfs.zfs.mdcomp_disable: 0
sysctl vfs.zfs.prefetch_disable: 1
sysctl vfs.zfs.recover: 0
sysctl vfs.zfs.scrub_limit: 10
sysctl vfs.zfs.super_owner: 0
sysctl vfs.zfs.txg.synctime: 5
sysctl vfs.zfs.txg.timeout: 30
sysctl vfs.zfs.vdev.aggregation_limit: 131072
sysctl vfs.zfs.vdev.cache.bshift: 16
sysctl vfs.zfs.vdev.cache.max: 16384
sysctl vfs.zfs.vdev.cache.size: 10485760
sysctl vfs.zfs.vdev.max_pending: 35
sysctl vfs.zfs.vdev.min_pending: 4
sysctl vfs.zfs.vdev.ramp_rate: 2
sysctl vfs.zfs.vdev.time_shift: 6
sysctl vfs.zfs.version.acl: 1
sysctl vfs.zfs.version.dmu_backup_header: 2
sysctl vfs.zfs.version.dmu_backup_stream: 1
sysctl vfs.zfs.version.spa: 13
sysctl vfs.zfs.version.vdev_boot: 1
sysctl vfs.zfs.version.zpl: 3
sysctl vfs.zfs.zfetch.array_rd_sz: 1048576
sysctl vfs.zfs.zfetch.block_cap: 256
sysctl vfs.zfs.zfetch.max_streams: 8
sysctl vfs.zfs.zfetch.min_sec_reap: 2
sysctl vfs.zfs.zil_disable: 0
sysctl vm.kmem_size: 1327202304
sysctl vm.kmem_size_max: 329853485875
sysctl vm.kmem_size_min: 0
sysctl vm.kmem_size_scale: 3

vmstat -z | egrep -i 'zfs|zil|arc|zio|files'
ITEM                 SIZE   LIMIT    USED    FREE  REQUESTS
Files:                 80,      0,    116,    199,   850713
zio_cache:            720,      0,  53562,     98, 86386955
arc_buf_hdr_t:        208,      0,   1193,     31,    11990
arc_buf_t:             72,      0,   1180,    120,    11990
zil_lwb_cache:        200,      0,  11580,   2594,    62407
zfs_znode_cache:      376,      0,    605,     55,      654

vmstat -m |grep solaris|sed 's/K//'|awk '{print "vm.solaris:", $3*1024}'
solaris: 1285068800

The value I see as the culprit is vmstat -m | grep solaris.  This
value fluctuates wildly during the run and is always near kmem_size at
the time of the panic.
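
To set the vmstat -z figures next to the solaris total, each zone's
footprint can be estimated as SIZE times USED.  The following is only
a rough sketch: the function name zone_bytes is made up for
illustration, and it assumes the comma-separated "name: size, limit,
used, free, requests" layout shown above.  The sample input is copied
from the output in this message.

```shell
#!/bin/sh
# zone_bytes (hypothetical helper): read vmstat -z style lines on stdin
# and print an estimated byte count per zone, computed as SIZE * USED.
# Lines without the "name: size, limit, used, ..." layout (such as the
# column header) are skipped.
zone_bytes() {
    awk -F'[:,]' '
        NF >= 6 && $2 + 0 > 0 {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim padding around the name
            bytes = $2 * $4                     # SIZE * USED
            total += bytes
            printf "%-20s %12d\n", name, bytes
        }
        END { printf "%-20s %12d\n", "TOTAL", total }'
}

# Sample input: the vmstat -z lines quoted in this message.
zone_bytes <<'EOF'
ITEM SIZE LIMIT USED FREE REQUESTS
Files: 80, 0, 116, 199, 850713
zio_cache: 720, 0, 53562, 98, 86386955
arc_buf_hdr_t: 208, 0, 1193, 31, 11990
arc_buf_t: 72, 0, 1180, 120, 11990
zil_lwb_cache: 200, 0, 11580, 2594, 62407
zfs_znode_cache: 376, 0, 605, 55, 654
EOF
```

On a live system the same function could presumably be fed with
vmstat -z | zone_bytes.  For the sample above the listed zones add up
to about 41 MB, a small fraction of the 1285068800 bytes reported for
solaris, which vmstat -m tracks as a malloc type separately from these
zones.  SIZE times USED ignores free items and allocator overhead, so
it is best read as a lower bound.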

Again, I'm not sure what to look for here, and you are patiently
helping me along in this process.  If you have any tips or can point
me to docs on how to easily monitor these values, I will endeavor to
do so.

-- 
Regards,
Doug