Date: Wed, 17 Jul 2013 00:47:22 +0200
From: Palle Girgensohn <girgen@FreeBSD.org>
To: Kirk McKusick <mckusick@mckusick.com>
Cc: freebsd-fs@freebsd.org, Jeff Roberson <jroberson@jroberson.net>,
    Julian Akehurst <julian@pingpong.se>
Subject: Re: leaking lots of unreferenced inodes (pg_xlog files?)
Message-ID: <51E5CD7A.2020109@FreeBSD.org>
In-Reply-To: <201307151932.r6FJWSxM087108@chez.mckusick.com>
References: <201307151932.r6FJWSxM087108@chez.mckusick.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Kirk McKusick wrote:
>> Date: Mon, 15 Jul 2013 10:51:10 +0100
>> Subject: Re: leaking lots of unreferenced inodes (pg_xlog files?)
>> From: Dan Thomas <godders@gmail.com>
>> To: Kirk McKusick <mckusick@mckusick.com>
>> Cc: Palle Girgensohn <girgen@freebsd.org>, freebsd-fs@freebsd.org,
>>     Jeff Roberson <jroberson@jroberson.net>,
>>     Julian Akehurst <julian@pingpong.se>
>> X-ASK-Info: Message Queued (2013/07/15 02:51:22)
>> X-ASK-Info: Confirmed by User (2013/07/15 02:55:04)
>>
>> On 11 June 2013 01:17, Kirk McKusick <mckusick@mckusick.com> wrote:
>>> OK, good to have it narrowed down. I will look to devise some
>>> additional diagnostics that hopefully will help tease out the
>>> bug. I'll hopefully get back to you soon.
>>
>> Hi,
>>
>> Is there any news on this issue? We're still running several
>> servers that exhibit this problem (most recently, one that seems
>> to be leaking around 10 GB/hour), and it's getting to the point
>> where we're looking at moving to a different OS until it's
>> resolved.
>>
>> We have access to several production systems with this problem
>> and (at least from time to time) will have systems with a
>> significant leak on them that we can experiment with. Is there
>> any way we can assist with tracking this down? Any diagnostics
>> or testing that would be useful?
>>
>> Thanks,
>> Dan
>
> Hi Dan (and Palle),
>
> Sorry for the long delay with no help / news. I have gotten
> side-tracked on several projects and have had little time to try
> to devise some tests that would help find the cause of the lost
> space. It is almost certainly a one-line fix (a missing vput or
> vrele, probably in some error path), but finding where it goes is
> the hard part :-)
>
> I have had little success inserting code that tracks reference
> counts (too many false positives), so I am going to need some help
> from you to narrow it down. My belief is that some set of
> filesystem operations (system calls) is leading to the problem.
> Notably, a file is being created, data is put into it, and then
> the file is deleted (either before or after being closed). Somehow
> a reference to that file persists despite there being no valid
> reference to it. Hence the filesystem thinks it is still live and
> does not delete it. When you do the forcible unmount, these files
> get cleared and the space shows back up.
>
> What I need to devise is a small test program doing the set of
> system calls that causes this to happen. The way I would like to
> get it is to have you `ktrace -i' your application and then run
> your application just long enough to create at least one of these
> lost files. The goal is to minimize the amount of ktrace data
> through which we need to sift.
>
> In preparation for this test you need a kernel compiled with
> `options DIAGNOSTIC' or, if you prefer, just add `#define
> DIAGNOSTIC 1' to the top of sys/kern/vfs_subr.c. You will know you
> have at least one offending file when you try to unmount the
> affected filesystem and find it busy. Before doing the `umount -f',
> enable busy printing using `sysctl debug.busyprt=1'. Then capture
> the console output, which will show the details of all the vnodes
> that had to be forcibly flushed. Hopefully we will then be able to
> correlate them back to the files (NAMI in the ktrace output) with
> which they were associated. We may need to augment the NAMI data
> with the inode number of the associated file to make the
> association with the busyprt output. Anyway, once we have that, we
> can look at all the system calls done on those files and create a
> small test program that exhibits the problem. Given a small test
> program, Jeff or I can track down the offending system call path
> and nail this pernicious bug once and for all.
>
> 	Kirk McKusick

Hi,

I have run ktrace -i on pg_ctl (which forks off all the postgresql
processes), and I got two "busy" files that were "lost" after a few
hours. dmesg reveals this:

vflush: busy vnode
0xfffffe067cdde960: tag ufs, type VREG
    usecount 1, writecount 0, refcount 2 mountedhere 0
    flags (VI(0x200)) VI_LOCKed
    v_object 0xfffffe0335922000 ref 0 pages 0
    lock type ufs: EXCL by thread 0xfffffe01600eb8e0 (pid 56723)
    ino 11047146, on dev da2s1d
vflush: busy vnode
0xfffffe039f35bb40: tag ufs, type VREG
    usecount 1, writecount 0, refcount 3 mountedhere 0
    flags (VI(0x200)) VI_LOCKed
    v_object 0xfffffe03352701d0 ref 0 pages 0
    lock type ufs: EXCL by thread 0xfffffe01600eb8e0 (pid 56723)
    ino 11045961, on dev da2s1d

I had to umount -f, so they were "lost".
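By the way, just to check that I understand what you are after: the
small test program would in the end boil down to something like the
sketch below, with the real sequence filled in from the trace? This
is only my guess at the shape of it; the file names, the write size
and the two orderings are invented, not taken from the trace.

/*
 * Hypothetical sketch of the suspected pattern: create a file, write
 * data into it, and delete it, once before and once after close(2).
 * Names and sizes are made up; the actual offending sequence is what
 * the ktrace output has to tell us.
 */
#include <err.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	char buf[8192];
	int fd;

	memset(buf, 'x', sizeof(buf));

	/* Case 1: unlink while the file is still open. */
	fd = open("testfile1", O_CREAT | O_RDWR | O_TRUNC, 0644);
	if (fd == -1)
		err(1, "open testfile1");
	if (write(fd, buf, sizeof(buf)) == -1)
		err(1, "write testfile1");
	if (unlink("testfile1") == -1)
		err(1, "unlink testfile1");
	if (close(fd) == -1)
		err(1, "close testfile1");

	/* Case 2: close first, then unlink. */
	fd = open("testfile2", O_CREAT | O_RDWR | O_TRUNC, 0644);
	if (fd == -1)
		err(1, "open testfile2");
	if (write(fd, buf, sizeof(buf)) == -1)
		err(1, "write testfile2");
	if (close(fd) == -1)
		err(1, "close testfile2");
	if (unlink("testfile2") == -1)
		err(1, "unlink testfile2");

	return (0);
}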
So, now I have 55 GB of ktrace output... ;)  Is there anything I can
do to filter it, or shall I compress it and put it on a web server
for you to fetch as it is?
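For instance, assuming kdump(1)'s -f and -t options do what the man
page describes, I could pre-filter it down to just the system calls
and the NAMI name translations before shipping it:

    kdump -f ktrace.out -t cn | gzip > trace.txt.gz

I have not tried filtering by pid, since the pid in the busyprt
output above is presumably the unmounting thread rather than the
postgres backend that created the files.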
Palle

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.17 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJR5c16AAoJEIhV+7FrxBJDK0AH/RLG1QLdyQhwNC6USlqO2+2B
6HXmYwbmDCMIlUQZAaG4h0x6QPzWjXWYMa1KDdpk/BtRhfL7z8tFPdWjTzqBPuK1
aEEQjv/Cp5IgI6FqVbc2agW3GfUwomtjEL3lUk2zmKdPImEWte6ZkLzOFgQpqQao
QAxFnN0I8/g+ynQNQIavGOo0foze89wAuOaNvoy9z1wa7tFbjlH2lsVK1xGU6eNj
AQn4RJw+tMPMGkNMy6Xjy7B/WMXfxutz1f4O9B1KBwLRZ/cgKxhmppoZdF3N4JsK
GNiQvcRbYR9GhBiK+Er87UXKBcj2NS+QQsdSqIb5Ik1ahp78hjxq3raHuOLCTLw=
=8+W4
-----END PGP SIGNATURE-----