From owner-freebsd-fs@FreeBSD.ORG Tue Jul 16 22:47:27 2013
From: Palle Girgensohn
Date: Wed, 17 Jul 2013 00:47:22 +0200
To: Kirk McKusick
Cc: freebsd-fs@freebsd.org, Jeff Roberson, Julian Akehurst
Subject: Re: leaking lots of unreferenced inodes (pg_xlog files?)
Message-ID: <51E5CD7A.2020109@FreeBSD.org>
In-Reply-To: <201307151932.r6FJWSxM087108@chez.mckusick.com>

Kirk McKusick wrote:
>> Date: Mon, 15 Jul 2013 10:51:10 +0100
>> From: Dan Thomas
>> To: Kirk McKusick
>> Cc: Palle Girgensohn, freebsd-fs@freebsd.org, Jeff Roberson, Julian Akehurst
>> Subject: Re: leaking lots of unreferenced inodes (pg_xlog files?)
>>
>> On 11 June 2013 01:17, Kirk McKusick wrote:
>>> OK, good to have it narrowed down. I will look to devise some
>>> additional diagnostics that hopefully will help tease out the bug.
>>> I'll hopefully get back to you soon.
>>
>> Hi,
>>
>> Is there any news on this issue? We're still running several servers
>> that are exhibiting this problem (most recently, one that seems to be
>> leaking around 10 GB/hour), and it's getting to the point where we're
>> looking at moving to a different OS until it's resolved.
>>
>> We have access to several production systems with this problem and
>> (at least from time to time) will have systems with a significant
>> leak on them that we can experiment with. Is there any way we can
>> assist with tracking this down? Any diagnostics or testing that would
>> be useful?
>>
>> Thanks,
>> Dan
>
> Hi Dan (and Palle),
>
> Sorry for the long delay with no help or news. I have gotten
> side-tracked on several projects and have had little time to devise
> tests that would help find the cause of the lost space. It is almost
> certainly a one-line fix (a missing vput or vrele, probably in some
> error path), but finding where it goes is the hard part :-)
>
> I have had little success in inserting code that tracks reference
> counts (too many false positives), so I am going to need some help
> from you to narrow it down. My belief is that there is some set of
> filesystem operations (system calls) that leads to the problem.
> Notably, a file is created, data is put into it, and then the file
> is deleted (either before or after being closed). Somehow a reference
> to that file persists even though there is no longer any valid
> reference to it. Hence the filesystem thinks the file is still live
> and does not delete it. When you do the forcible unmount, these files
> get cleared and the space shows back up.
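For concreteness, a minimal sketch of that sequence (not from the
thread; the mount point, file name, and size are made up here). On a
healthy UFS filesystem the blocks come back as soon as the last
reference disappears at the final close:

    cd /mnt/pgdata                        # hypothetical path on the affected filesystem
    exec 3> leak-test.tmp                 # create the file and keep a descriptor open
    dd if=/dev/zero bs=1m count=16 >&3    # put some data into it
    rm leak-test.tmp                      # delete it while the descriptor is still open
    exec 3>&-                             # close: the last reference goes away here
    df -h .                               # the freed space should show up again here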
> What I need to devise is a small test program that performs the set
> of system calls that cause this to happen. The way I would like to
> get it is to have you run `ktrace -i' on your application and then
> run the application just long enough to create at least one of these
> lost files. The goal is to minimize the amount of ktrace data through
> which we need to sift.
>
> In preparation for this test you need a kernel compiled with
> `options DIAGNOSTIC', or, if you prefer, just add `#define DIAGNOSTIC 1'
> to the top of sys/kern/vfs_subr.c. You will know you have at least
> one offending file when you try to unmount the affected filesystem
> and find it busy. Before doing the `umount -f', enable busy printing
> with `sysctl debug.busyprt=1'. Then capture the console output, which
> will show the details of all the vnodes that had to be forcibly
> flushed. Hopefully we will then be able to correlate them back to the
> files (NAMI records in the ktrace output) with which they were
> associated. We may need to augment the NAMI data with the inode
> number of the associated file to make the association with the
> busyprt output. Anyway, once we have that, we can look at all the
> system calls done on those files and create a small test program that
> exhibits the problem. Given a small test program, Jeff or I can track
> down the offending system call path and nail this pernicious bug once
> and for all.
>
> Kirk McKusick

Hi,

I have run ktrace -i on pg_ctl (which forks off all the PostgreSQL
processes), and I got two "busy" files that were "lost" after a few
hours. dmesg reveals this:

    vflush: busy vnode
    0xfffffe067cdde960: tag ufs, type VREG
        usecount 1, writecount 0, refcount 2 mountedhere 0
        flags (VI(0x200)) VI_LOCKed
        v_object 0xfffffe0335922000 ref 0 pages 0
        lock type ufs: EXCL by thread 0xfffffe01600eb8e0 (pid 56723)
        ino 11047146, on dev da2s1d
    vflush: busy vnode
    0xfffffe039f35bb40: tag ufs, type VREG
        usecount 1, writecount 0, refcount 3 mountedhere 0
        flags (VI(0x200)) VI_LOCKed
        v_object 0xfffffe03352701d0 ref 0 pages 0
        lock type ufs: EXCL by thread 0xfffffe01600eb8e0 (pid 56723)
        ino 11045961, on dev da2s1d

I had to umount -f, so they were "lost".

So, now I have 55 GB of ktrace output... ;) Is there anything I can do
to filter it, or shall I compress it and put it on a web server for you
to fetch as it is?

Palle
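One possible way to cut the trace down before shipping it (a sketch,
not from the thread; it assumes the raw dump is in the default
ktrace.out file and that only the name-lookup records are needed for
the correlation Kirk describes):

    # decode the raw trace and keep only the NAMI (name lookup) records
    kdump -f ktrace.out | grep -w NAMI | gzip > nami-only.txt.gz

    # quick look at which paths turn up most often
    kdump -f ktrace.out | grep -w NAMI | sort | uniq -c | sort -rn | head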