From owner-freebsd-fs@FreeBSD.ORG Thu Jan 14 17:07:32 2010
Date: Thu, 14 Jan 2010 10:50:44 -0600 (CST)
From: Doug Poland <djp@polands.org>
To: Ivan Voras
Cc: freebsd-fs@freebsd.org
Subject: Re: 8.0-R-p2 ZFS: unixbench causing kmem exhaustion panic
Message-ID: <20100114165034.GA99127@polands.org>
In-Reply-To: <9bbcef731001140815h5ee1d672je58c8ec91382e8d4@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
User-Agent: Mutt/1.5.20 (2009-06-14)
List-Id: Filesystems

Hello,

I am posting the end of this thread,

  http://lists.freebsd.org/pipermail/freebsd-questions/2010-January/210963.html

concerning ZFS panics on an amd64 box with 4 GB RAM and 6 SCSI drives in
RAIDZ1.  Ivan Voras suggested I forward it to this list for further
analysis.

On Thu, Jan 14, 2010 at 05:15:47PM +0100, Ivan Voras wrote:
> 2010/1/14 Doug Poland:
> >>>>>
> >>>>> kstat.zfs.misc.arcstats.size
> >>>>>
> >>>>> seemed to fluctuate between about 164,000,000 and 180,000,000
> >>>>> bytes during this last run
> >>>>
> >>>> Is that with or without panicking?
> >>>>
> >>> With a panic.
> >>>
> >>>> If the system did panic then it looks like the problem is a
> >>>> memory leak somewhere else in the kernel, which you could confirm
> >>>> by monitoring vmstat -z.
> >>>>
> >>> I'll give that a try.  Am I looking for specific items in
> >>> vmstat -z: arc*, zil*, zfs*, zio*?  Please advise.
> >>
> >> You should look for whatever is allocating all your memory between
> >> 180 MB (which is your ARC size) and 1.2 GB (which is your kmem
> >> size).
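As a rough way to act on that suggestion, the per-zone footprint in
vmstat -z output can be estimated as SIZE * (USED + FREE).  A minimal
sketch (using a captured sample line in place of live vmstat output, so
the numbers below are illustrative, not a fresh measurement):

```shell
# Estimate bytes held by a UMA zone from one "vmstat -z" line.
# Column layout: ITEM SIZE LIMIT USED FREE REQUESTS
# SIZE * (USED + FREE) approximates allocated plus cached items;
# a captured sample line stands in for live "vmstat -z" output here.
printf 'zio_cache: 720, 0, 53562, 98, 86386955,\n' |
awk -F'[ ,]+' '{ printf "%s %d bytes\n", $1, $2 * ($4 + $5) }'
```

Running the real pipeline (vmstat -z | egrep -i 'zfs|zil|arc|zio' piped
into the awk above) once a second and diffing the totals should show
which zone grows toward kmem_size before the panic.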
> >>
> >
> > OK, another run, this time back to vfs.zfs.arc_max=512M in
> > /boot/loader.conf, and a panic:
> >
> > panic: kmem malloc(131072): kmem map too small: 1294258176 total
> > allocated
> >
> > I admit I do not fully understand what metrics are important to a
> > proper analysis of this issue.  In this case, I was watching the
> > following within 1 second of the panic:
> >
> > sysctl kstat.zfs.misc.arcstats.size: 41739944
> > sysctl vfs.numvnodes: 678
> > sysctl vfs.zfs.arc_max: 536870912
> > sysctl vfs.zfs.arc_meta_limit: 134217728
> > sysctl vfs.zfs.arc_meta_used: 7228584
> > sysctl vfs.zfs.arc_min: 67108864
> > sysctl vfs.zfs.cache_flush_disable: 0
> > sysctl vfs.zfs.debug: 0
> > sysctl vfs.zfs.mdcomp_disable: 0
> > sysctl vfs.zfs.prefetch_disable: 1
> > sysctl vfs.zfs.recover: 0
> > sysctl vfs.zfs.scrub_limit: 10
> > sysctl vfs.zfs.super_owner: 0
> > sysctl vfs.zfs.txg.synctime: 5
> > sysctl vfs.zfs.txg.timeout: 30
> > sysctl vfs.zfs.vdev.aggregation_limit: 131072
> > sysctl vfs.zfs.vdev.cache.bshift: 16
> > sysctl vfs.zfs.vdev.cache.max: 16384
> > sysctl vfs.zfs.vdev.cache.size: 10485760
> > sysctl vfs.zfs.vdev.max_pending: 35
> > sysctl vfs.zfs.vdev.min_pending: 4
> > sysctl vfs.zfs.vdev.ramp_rate: 2
> > sysctl vfs.zfs.vdev.time_shift: 6
> > sysctl vfs.zfs.version.acl: 1
> > sysctl vfs.zfs.version.dmu_backup_header: 2
> > sysctl vfs.zfs.version.dmu_backup_stream: 1
> > sysctl vfs.zfs.version.spa: 13
> > sysctl vfs.zfs.version.vdev_boot: 1
> > sysctl vfs.zfs.version.zpl: 3
> > sysctl vfs.zfs.zfetch.array_rd_sz: 1048576
> > sysctl vfs.zfs.zfetch.block_cap: 256
> > sysctl vfs.zfs.zfetch.max_streams: 8
> > sysctl vfs.zfs.zfetch.min_sec_reap: 2
> > sysctl vfs.zfs.zil_disable: 0
> > sysctl vm.kmem_size: 1327202304
> > sysctl vm.kmem_size_max: 329853485875
> > sysctl vm.kmem_size_min: 0
> > sysctl vm.kmem_size_scale: 3
> >
> > vmstat -z | egrep -i 'zfs|zil|arc|zio|files'
> > ITEM             SIZE  LIMIT   USED   FREE  REQUESTS
> > zio_cache:       720,     0,  53562,   98,  86386955,
> > arc_buf_hdr_t:   208,     0,   1193,   31,     11990,
> > arc_buf_t:        72,     0,   1180,  120,     11990,
> > zil_lwb_cache:   200,     0,  11580, 2594,     62407,
> > zfs_znode_cache: 376,     0,    605,   55,       654,
> >
> > vmstat -m | grep solaris | sed 's/K//' | awk '{print "vm.solaris:", $3*1024}'
> >
> > solaris: 1285068800
> >
> > The value I see as the culprit is vmstat -m | grep solaris.  This
> > value fluctuates wildly during the run and is always near kmem_size
> > at the time of the panic.
> >
> > Again, I'm not sure what to look for here, and you are patiently
> > helping me along in this process.  If you have any tips or can
> > point me to docs on how to easily monitor these values, I will
> > endeavor to do so.
>
> The only really important ones should be kstat.zfs.misc.arcstats.size
> (which you very rarely print) and vm.kmem_size.  The "solaris" entry
> above should be near kstat.zfs.misc.arcstats.size in all cases.
>
> But I don't have any more ideas here.  Try taking this post (also
> include kstat.zfs.misc.arcstats.size) to the freebsd-fs@ mailing list.
>

Thank you for your help; I will take this over to the freebsd-fs list.

-- 
Regards,
Doug
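For reference, the K-to-bytes conversion in the vmstat -m one-liner
quoted above can be sanity-checked against a captured sample line; the
InUse and Requests columns below are illustrative stand-ins, only the
1254950K MemUse figure matters for the arithmetic:

```shell
# vmstat -m columns: Type InUse MemUse Requests; MemUse carries a "K"
# suffix, so strip it and multiply by 1024 to get bytes, exactly as the
# one-liner in the thread does.  A sample line replaces live output.
printf 'solaris 53562 1254950K 86386955\n' |
sed 's/K//' | awk '{print "vm.solaris:", $3*1024}'
```

1254950K * 1024 = 1285068800 bytes, matching the "solaris: 1285068800"
figure reported above, i.e. the "solaris" malloc type was holding
roughly the entire 1.2 GB kmem map at panic time.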