From owner-freebsd-fs@FreeBSD.ORG Wed Feb 15 10:21:32 2012
From: George Kontostanos
To: Pavlo
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS and mem management
Date: Wed, 15 Feb 2012 12:21:31 +0200
In-Reply-To: <15861.1329298812.1414986334451204096@ffe12.ukr.net>
References: <15861.1329298812.1414986334451204096@ffe12.ukr.net>

2012/2/15 Pavlo:
>
> Hello.
>
> We have an issue with memory management on FreeBSD and I suspect it is
> related to FS.
> We are using ZFS, here are some quick stats:
>
> zpool status
>   pool: disk1
>  state: ONLINE
>   scan: resilvered 657G in 8h30m with 0 errors on Tue Feb 14 21:17:37 2012
> config:
>
>         NAME            STATE     READ WRITE CKSUM
>         disk1           ONLINE       0     0     0
>           mirror-0      ONLINE       0     0     0
>             gpt/disk0   ONLINE       0     0     0
>             gpt/disk1   ONLINE       0     0     0
>           gpt/disk2     ONLINE       0     0     0
>           gpt/disk4     ONLINE       0     0     0
>           gpt/disk6     ONLINE       0     0     0
>           gpt/disk8     ONLINE       0     0     0
>           gpt/disk10    ONLINE       0     0     0
>           gpt/disk12    ONLINE       0     0     0
>           mirror-7      ONLINE       0     0     0
>             gpt/disk14  ONLINE       0     0     0
>             gpt/disk15  ONLINE       0     0     0
>
> errors: No known data errors
>
>   pool: zroot
>  state: ONLINE
>   scan: resilvered 34.9G in 0h11m with 0 errors on Tue Feb 14 12:57:52 2012
> config:
>
>         NAME          STATE     READ WRITE CKSUM
>         zroot         ONLINE       0     0     0
>           mirror-0    ONLINE       0     0     0
>             gpt/sys0  ONLINE       0     0     0
>             gpt/sys1  ONLINE       0     0     0
>
> errors: No known data errors
>
> ------------------------------------------------------------------------
>
> System Memory:
>
>         0.95%   75.61   MiB Active,     0.24%   19.02   MiB Inact
>         18.25%  1.41    GiB Wired,      0.01%   480.00  KiB Cache
>         80.54%  6.24    GiB Free,       0.01%   604.00  KiB Gap
>
>         Real Installed:                 8.00    GiB
>         Real Available:         99.84%  7.99    GiB
>         Real Managed:           96.96%  7.74    GiB
>
>         Logical Total:                  8.00    GiB
>         Logical Used:           21.79%  1.74    GiB
>         Logical Free:           78.21%  6.26    GiB
>
> Kernel Memory:                          1.18    GiB
>         Data:                   99.05%  1.17    GiB
>         Text:                   0.95%   11.50   MiB
>
> Kernel Memory Map:                      4.39    GiB
>         Size:                   23.32%  1.02    GiB
>         Free:                   76.68%  3.37    GiB
>
> ------------------------------------------------------------------------
>
> ------------------------------------------------------------------------
> ZFS Subsystem Report                            Wed Feb 15 10:53:03 2012
> ------------------------------------------------------------------------
>
> System Information:
>
>         Kernel Version:                         802516 (osreldate)
>         Hardware Platform:                      amd64
>         Processor Architecture:                 amd64
>
>         ZFS Storage pool Version:               28
>         ZFS Filesystem Version:                 5
>
> FreeBSD 8.2-STABLE #12: Thu Feb 9 11:35:23 EET 2012 root
> 10:53AM  up 56 mins, 6 users, load averages: 0.00, 0.00, 0.00
>
> ------------------------------------------------------------------------
>
> Background:
> We are using a tool that indexes some data and then pushes it into a
> database (currently bdb-5.2). Instances of the indexer run continuously,
> one after another. Indexing time for one instance may vary between
> 2 seconds and 30 minutes, but mostly it is below one minute. There is
> nothing else running on the machine except system stuff and daemons.
> After several hours of indexing I can see a lot of active memory, which
> is OK. Then I check the number of vnodes, and it's really huge: 300k+,
> even though nobody has that many open files. Reading docs and googling,
> I figured that's because of cached pages that reside in memory
> (unmounting the disk causes the whole memory to be freed). Also I
> figured that happens only when I am accessing files via mmap().
>
> Looks like pretty legit behaviour, but the issue is:
> This spectacle continues (for approximately 12 hours) until indexers
> begin to be killed out of swap. As I wrote above, I observe a lot of used
> vnodes and about 7GB of active memory. I made a tool that allocates
> memory using malloc() to check the limit of available memory that can be
> allocated. It is several megabytes, sometimes more. Unless that tool gets
> killed out of swap as well. So here is how I see the issue: for some
> reason, after a process has exited normally, its mapped pages don't get
> freed. I read about it and I agree that this is reasonable behaviour if
> we have spare memory. But following this logic, these pages can be
> flushed back to the file at any time when the system is under stress
> conditions. So when I ask for a piece of RAM, the OS should do that trick
> and give me what I ask for. But that never happens. Those pages are like
> frozen. Until I unmount the disk. Even when there is not a single
> instance of the indexer running.
>
> I believe all this is caused by mmap() for sure: BDB uses mmap() for
> accessing databases, and we tested indexing without pushing data to the
> DB. Worked shiny. You may suggest that something is wrong with BDB. But
> we have some more tools of ours that use mmap() as well, and the
> behaviour is exactly the same.
>
> Thank you. Paul, Ukraine.
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

Hi Paul,

Are you using dedup anywhere on that pool?

Also, could you please post the full zfs-stats -a output?

--
George Kontostanos
Aicom telecoms ltd
http://www.aisecure.net