Date:      Mon, 10 Jun 2013 11:55:03 +0200
From:      Palle Girgensohn <girgen@FreeBSD.org>
To:        Kirk McKusick <mckusick@mckusick.com>
Cc:        freebsd-fs@freebsd.org, Dan Thomas <godders@gmail.com>, Jeff Roberson <jroberson@jroberson.net>, Julian Akehurst <julian@pingpong.se>
Subject:   Re: leaking lots of unreferenced inodes (pg_xlog files?)
Message-ID:  <51B5A277.2060904@FreeBSD.org>
In-Reply-To: <201306022101.r52L19vg033389@chez.mckusick.com>
References:  <201306022101.r52L19vg033389@chez.mckusick.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Kirk McKusick wrote:
>> Date: Sun, 02 Jun 2013 22:35:23 +0200 From: Palle Girgensohn
>> <girgen@freebsd.org> To: Kirk McKusick <mckusick@mckusick.com> 
>> Subject: Re: leaking lots of unreferenced inodes (pg_xlog files?) 
>> Cc: freebsd-fs@freebsd.org, Dan Thomas <godders@gmail.com>, Jeff
>> Roberson <jroberson@jroberson.net>, Julian Akehurst
>> <julian@pingpong.se>
>> 
>> --On 31 May 2013 11.25.40 -0700 Kirk McKusick
>> <mckusick@mckusick.com> wrote:
>> 
>>> Your results are very enlightening. Especially the fact that you
>>> have to do a forcible unmount of the filesystem. What that tells
>>> me is that somehow we are getting vnodes that have phantom
>>> references. That is, there is some system call where we get a
>>> reference on a vnode (vref, vget, or similar) that does not
>>> ultimately have a corresponding drop of the reference (vrele,
>>> vput, or similar). The net effect is that the file is held open
>>> despite the fact that there are no longer any connections to it.
>>> When you do the forcible unmount, the kernel walks the list of
>>> vnodes associated with the filesystem and does a vgone on each of
>>> them. That causes each to be inactivated which then triggers the
>>> release of their associated disk space. The reason that the
>>> unmount takes 20 seconds is to process all the releasing of the
>>> space. My guess is that there is an error path in some system
>>> call that is missing the vrele or vput.
>>> 
>>> Assuming that you are able to run some more tests on your test
>>> machine, the next step in narrowing down the set of code to look
>>> at is to try running your system with soft updates disabled. The
>>> idea is to find out whether the mismatched references are in
>>> the soft updates code or are in one of the filesystem system
>>> calls themselves. To disable soft updates run the command `tunefs
>>> -n disable /pgsql' on the unmounted /pgsql filesystem. If the
>>> system then runs without the problem, I will know to search the
>>> soft updates code. If the problem persists, then I'll know to
>>> look in the system calls themselves. You may want to do some 
>>> preliminary tests to see how quickly the problem manifests
>>> itself. You can do this by running it for a short time (10
>>> minutes say) and then checking to see if you need to do a
>>> forcible unmount of the filesystem. Once you establish how long
>>> you have to run before you reliably have to do a forcible
>>> unmount, you will know how long to run the test with soft updates
>>> turned off. If you find that running with soft updates turned off
>>> makes your application run too slowly you can mount your
>>> filesystem asynchronously. Note however, that you should not run
>>> asynchronously if the data on the filesystem is critical as you
>>> may end up with an unrecoverable filesystem after a power
>>> failure or system crash. So only run asynchronously if you can
>>> afford to lose your filesystem.
>>> 
>>> Finally, it would be helpful if you could add two more commands
>>> to your diskspacecheck.sh script:
>>> 
>>> sysctl -a | egrep vnode
>>> mount -v
>>> 
>>> The first shows the vnode usage and the second shows the
>>> operational state of your filesystems.
>>> 
>>> Kirk McKusick
>> OK, I have now turned off soft updates. This is on the test server.
>> It is not as busy as the production machine, but I'll keep an eye
>> on it and will mail new results as soon as I see evidence either
>> that soft updates is the culprit or that it is not.
>> 
>> FWIW, I attach the script from this remount process as well, which
>> includes
>> 
>> sysctl -a | grep vnode ; mount -v.
>> 
>> Note that it is all in one script file this time.
>> 
>> Cheers, Palle
> 
> This looks good. Keep me posted.

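After running for a number of days without soft updates, it seems to me
that the culprit is indeed in the soft updates code. df and du now agree
to within about 35 MB, so I see no sign of space being held by
unreferenced inodes: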

# df -k /pgsql; du -sk /pgsql
Filesystem  1024-blocks     Used    Avail Capacity  Mounted on
/dev/da2s1d   134763348 86339044 37643238    70%    /pgsql
86303252	/pgsql
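
For anyone who wants to track that gap over time, here is a minimal
sketch of the comparison. The function names are made up for
illustration and are not part of diskspacecheck.sh, and it assumes that
df -k prints the filesystem on a single line (a long device name can
make df wrap, which this does not handle).

```shell
#!/bin/sh
# Hypothetical helpers, not taken from diskspacecheck.sh.

# unaccounted_kb USED_KB DU_KB
# Print the space (in 1024-byte blocks) that df counts as used
# but that du cannot find under the mount point.
unaccounted_kb() {
    echo $(( $1 - $2 ))
}

# Read `df -k <mountpoint>` output on stdin and print the "Used" column.
# Assumes one header line followed by one data line.
df_used_kb() {
    awk 'NR > 1 { used = $3 } END { print used }'
}

# Read `du -sk <mountpoint>` output on stdin and print the total.
du_total_kb() {
    awk '{ print $1; exit }'
}

# On a live system (du may need root to see every file):
#   used=$(df -k /pgsql | df_used_kb)
#   total=$(du -sk /pgsql | du_total_kb)
#   unaccounted_kb "$used" "$total"
```

With the figures above, unaccounted_kb 86339044 86303252 gives 35792.
A small gap like this is expected, since df also counts filesystem
metadata that du does not see.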

Palle

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.17 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJRtaJ3AAoJEIhV+7FrxBJD+IkH/3FOoZ95VGE0fOWSuFIwVn8I
jvHiJ6qTx0zh17pZNnc+G0UpU5fHxCazD1yT6yCwfkWebWKXELXtfQMeZUMGi0AX
e94P0HJ2O4RQSMHC1rlWSLUidAB6m1ZtAtpXzgziB9P/Jonk78uFqRcTmZyMycsy
pxPFHsbywsjJm9FLF4ZuhiSPX57tbAKLQM3HYDMFQ/rHPJiBlkx7VVeON6svtmMO
bRZWnQTUXUAAMT1NDUEL8opGAO2S72+hFBiCjJsgS22SSq7KIMzAlJqq01L2svhH
o7KNAkN6lIMuJS9B2idjJWLVXG/vNQ1QBOha0VY80fIQYSYeZt25EGlXf3rYL6Y=
=Zmu2
-----END PGP SIGNATURE-----


