Date:      Thu, 30 Sep 2010 19:11:32 +1000
From:      Danny Carroll <fbsd@dannysplace.net>
To:        freebsd-fs@freebsd.org
Subject:   Re: Strange ZFS problem, filesystem claims to be full when clearly not full
Message-ID:  <4CA45444.6070002@dannysplace.net>
In-Reply-To: <20100930103647.62193lbkp9yqx5k4@webmail.leidinger.net>
References:  <AANLkTimRmBi=th1oia5ZuKcEtLR%2BYjK04KNYeZhu931A@mail.gmail.com>	<20100929192534.GA97031@icarus.home.lan>	<AANLkTi=q6adZv57mwNZVivOwLsfXBjVHki7tzP6-jD0G@mail.gmail.com>	<AANLkTikqX1Y3qbcdr-2six%2BPwv61k-Exwh142w5FFqbS@mail.gmail.com>	<20100929221549.GA343@icarus.home.lan> <20100930103647.62193lbkp9yqx5k4@webmail.leidinger.net>

 On 30/09/2010 6:36 PM, Alexander Leidinger wrote:
>
> Quoting Jeremy Chadwick <freebsd@jdc.parodius.com> (from Wed, 29 Sep
> 2010 15:15:49 -0700):
>
>> On Thu, Sep 30, 2010 at 12:11:09AM +0200, Torbjorn Kristoffersen wrote:
>>> I'm at a complete loss here. I shut down the jail completely, and I am
>>> watching the jail's ZFS filesystem grow as we speak.  No process is using
>>> it.  It only grows in "df" and "zfs list", I can't find any files that are
>>> growing.  I have to re-set the quota to be higher and higher to accommodate
>>> the space.
>>>
>>> On Wed, Sep 29, 2010 at 10:46 PM, Torbjorn Kristoffersen <
>>> torbjoern@gmail.com> wrote:
>>>
>>> > Hi Jeremy.
>>> >
>>> > 1) I checked now, and found nothing extraordinary. Just processes that
>>> > have been running for a long while, such as screen, cron, sshd, bash,
>>> > irssi, syslogd, etc.
>>> >
>>> > 2) No compression used on this zfs filesystem (or any of the others).
>>> >
>>> > I completely stopped the jail now, and removed some of the directories
>>> > with the most data in them, but to no avail.
>>> >
>>> >
>>> > On Wed, Sep 29, 2010 at 9:25 PM, Jeremy Chadwick
>>> > <freebsd@jdc.parodius.com> wrote:
>>> >
>>> >> On Wed, Sep 29, 2010 at 08:46:38PM +0200, Torbjorn Kristoffersen wrote:
>>> >> > I have a ZFS "tank" called tpool, the server runs a couple of jails
>>> >> > (each with a zfs filesystem).  There is a problem with one of these
>>> >> > filesystems.
>>> >> > First, its disk usage as shown in ``df -h'':
>>> >> > ...
>>> >> > tpool/rb.org      100G     95G    4.6G    95%    /jails/rb.org
>>> >> > ...
>>> >> >
>>> >> > The command ``zfs list'' shows the same:
>>> >> > ..
>>> >> > tpool/rb.org    95.4G  4.56G  95.4G  /jails/rb.org
>>> >> > ..
>>> >> >
>>> >> > However, there is a very mysterious problem somewhere.
>>> >> > Something inside this jail is eating disk space, but we can't find any
>>> >> > directory that is actually taking the disk space. We first suspected
>>> >> > either fetchmail or spamassassin of causing a lot of space to be used,
>>> >> > since some of their directories were huge. (These were later deleted,
>>> >> > which is why you see that 4.6GB is now available; before that, 0GB was
>>> >> > available.)
>>> >> >
>>> >> > However, we can't find *any trace* of an actual directory or file that
>>> >> > is taking all the space.
>>> >> >
>>> >> > Take this for instance:
>>> >> >
>>> >> > outsidejail# du -sh rb.org
>>> >> >  43G    rb.org
>>> >> >
>>> >> > How can this be?  df and zfs are showing that the entire drive is
>>> >> > nearly full, yet I can't find any directory that is actually taking
>>> >> > all this space.  I've carefully looked through every single directory
>>> >> > within the jail trying to find something that's taking all that space,
>>> >> > but to no avail.
>>> >> >
>>> >> > ----
>>> >> > My system stats:
>>> >> > # uname -a
>>> >> > FreeBSD grim 8.1-RELEASE FreeBSD 8.1-RELEASE #0: Mon Jul 19 02:36:49 UTC 2010     root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64
>>> >> > # zpool get version tpool
>>> >> > NAME   PROPERTY  VALUE    SOURCE
>>> >> > tpool  version   14       default
>>> >> > # zpool status
>>> >> >   pool: tpool
>>> >> >  state: ONLINE
>>> >> >  scrub: none requested
>>> >> > config:
>>> >> >
>>> >> >         NAME        STATE     READ WRITE CKSUM
>>> >> >         tpool       ONLINE       0     0     0
>>> >> >           mirror    ONLINE       0     0     0
>>> >> >             ad4s1d  ONLINE       0     0     0
>>> >> >             ad6s1d  ONLINE       0     0     0
>>> >> >
>>> >> > errors: No known data errors
>>> >> >
>>> >> > [ Note that I've also done a scrub recently ]
>>> >>
>>> >> 1) Have you checked using fstat to ensure that no file descriptors
>>> >> remain open on any of your ZFS filesystems (not pools)?
>>> >>
>>> >> 2) Are you using compression on any of your ZFS filesystems?
>>
>> Andriy and Pawel,
>>
>> Do either of you have ideas as to what could cause the issue Torbjorn's
>> experiencing?  I swear I remember some bug or quirk that got fixed with
>> regards to free space on ZFS, but as has been proven time and time again
>> my memory is horrible.  His kernel's 8.1-RELEASE dated July 19th.
>
> IIRC the commit you talk about was by Martin (CCed). I do not know if
> it is (already) MFCed.
>
> I'm not sure the bug you talk about is related to what Torbjorn is
> talking about. The fact that the free space is going down while the
> jail is shutdown (and I assume jls does not show his JID anymore, so
> all of its processes are really gone) points more to some other
> process (outside of the jail) which is filling some (maybe already
> deleted, so not visible anymore with du) file.
>

It certainly smells like a process still writing to a file that is unlinked.
I wonder if it would show up with lsof.
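
To make the suspected failure mode concrete, here is a minimal sketch in
Python. It is not taken from the affected machine; the names demo,
visible_bytes and growing.log are made up for illustration, and it assumes a
POSIX system where an open file can be unlinked. It writes to a file, unlinks
it, keeps writing through the still-open descriptor, and then compares what a
du-style directory walk can see with what the descriptor itself reports.

```python
import os
import tempfile


def visible_bytes(directory):
    """Sum the sizes of files reachable by walking the tree (roughly du's view)."""
    total = 0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            total += os.lstat(os.path.join(root, name)).st_size
    return total


def demo(payload_size=1024 * 1024):
    """Write to a file, unlink it, keep writing, and report both views."""
    with tempfile.TemporaryDirectory() as workdir:
        # "growing.log" is a hypothetical name for the runaway file.
        path = os.path.join(workdir, "growing.log")
        with open(path, "wb") as handle:
            handle.write(b"x" * payload_size)
            handle.flush()
            os.unlink(path)  # the name is gone, the descriptor stays open
            handle.write(b"y" * payload_size)  # the writer carries on
            handle.flush()
            info = os.fstat(handle.fileno())
            return {
                "visible_bytes": visible_bytes(workdir),  # nothing left to walk
                "held_bytes": info.st_size,  # data still held by the open file
                "link_count": info.st_nlink,  # typically 0 once unlinked
            }


if __name__ == "__main__":
    print(demo())
```

The directory walk finds nothing while the descriptor still accounts for
every byte written, which would match du reporting far less than df. On a
live system, lsof's +L1 option (open files with a link count below 1) is one
way to look for such files.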

If dtrace is enabled on that machine then I think it should be easy to
see which process is performing write operations.

-D


