Date:      Thu, 30 Sep 2010 15:28:25 +0200
From:      Torbjorn Kristoffersen <torbjoern@gmail.com>
To:        freebsd-fs@freebsd.org
Subject:   Re: Strange ZFS problem, filesystem claims to be full when clearly not full
Message-ID:  <AANLkTi=x5irhAM8uhiZJLztE230=Q9CAMDeja=Bo4fVL@mail.gmail.com>
In-Reply-To: <4CA45444.6070002@dannysplace.net>
References:  <AANLkTimRmBi=th1oia5ZuKcEtLR%2BYjK04KNYeZhu931A@mail.gmail.com> <20100929192534.GA97031@icarus.home.lan> <AANLkTi=q6adZv57mwNZVivOwLsfXBjVHki7tzP6-jD0G@mail.gmail.com> <AANLkTikqX1Y3qbcdr-2six%2BPwv61k-Exwh142w5FFqbS@mail.gmail.com> <20100929221549.GA343@icarus.home.lan> <20100930103647.62193lbkp9yqx5k4@webmail.leidinger.net> <4CA45444.6070002@dannysplace.net>

On Thu, Sep 30, 2010 at 11:11 AM, Danny Carroll <fbsd@dannysplace.net> wrote:
>
> On 30/09/2010 6:36 PM, Alexander Leidinger wrote:
> >
> > Quoting Jeremy Chadwick <freebsd@jdc.parodius.com> (from Wed, 29 Sep
> > 2010 15:15:49 -0700):
> >
> >> On Thu, Sep 30, 2010 at 12:11:09AM +0200, Torbjorn Kristoffersen wrote:
> >>> I'm at a complete loss here. I shut down the jail completely, and I am
> >>> watching the jail's ZFS filesystem grow as we speak.  No process is
> >>> using it.  It only grows in "df" and "zfs list", I can't find any files
> >>> that are growing.  I have to re-set the quota to be higher and higher to
> >>> accommodate the space.
> >>>
> >>> On Wed, Sep 29, 2010 at 10:46 PM, Torbjorn Kristoffersen <
> >>> torbjoern@gmail.com> wrote:
> >>>
> >>> > Hi Jeremy.
> >>> >
> >>> > 1) I checked now, and found nothing extraordinary. Just processes that
> >>> > have been running for a long while, such as screen, cron, sshd, bash,
> >>> > irssi, syslogd, etc.
> >>> >
> >>> > 2) No compression used on this zfs filesystem (or any of the others).
> >>> >
> >>> > I completely stopped the jail now, and removed some of the directories
> >>> > with the most data in them, but to no avail.
> >>> >
> >>> >
> >>> > On Wed, Sep 29, 2010 at 9:25 PM, Jeremy Chadwick <freebsd@jdc.parodius.com> wrote:
> >>> >
> >>> >> On Wed, Sep 29, 2010 at 08:46:38PM +0200, Torbjorn Kristoffersen wrote:
> >>> >> > I have a ZFS "tank" called tpool, the server runs a couple of jails
> >>> >> > (each with a zfs filesystem).  There is a problem with one of these
> >>> >> > filesystems.
> >>> >> > First, its disk usage as shown in ``df -h'':
> >>> >> > ...
> >>> >> > tpool/rb.org      100G     95G    4.6G    95%    /jails/rb.org
> >>> >> > ...
> >>> >> >
> >>> >> > The command ``zfs list'' shows the same:
> >>> >> > ..
> >>> >> > tpool/rb.org    95.4G  4.56G  95.4G  /jails/rb.org
> >>> >> > ..
> >>> >> >
> >>> >> > However, there is a very mysterious problem somewhere.
> >>> >> > Something inside this jail is eating diskspace, but we can't find any
> >>> >> > directory that is actually taking the diskspace. We first suspected
> >>> >> > either fetchmail or spamassassin of causing a lot of space to be used,
> >>> >> > since some of their directories were huge. (These were later deleted,
> >>> >> > which is why you see that 4.6GB is now available; before that, 0GB was
> >>> >> > available.)
> >>> >> >
> >>> >> > However, we can't find *any trace* of an actual directory or file that
> >>> >> > is taking all the space.
> >>> >> >
> >>> >> > Take this for instance:
> >>> >> >
> >>> >> > outsidejail# du -sh rb.org
> >>> >> >  43G    rb.org
> >>> >> >
> >>> >> > How can this be?  df and zfs are showing that the entire drive is nearly
> >>> >> > full, yet I can't find any directory that is actually taking all this
> >>> >> > space.  I've carefully looked through every single directory within the
> >>> >> > jail trying to find something that's taking all that space, but to no
> >>> >> > avail.
> >>> >> > ----
> >>> >> > My system stats:
> >>> >> > # uname -a
> >>> >> > FreeBSD grim 8.1-RELEASE FreeBSD 8.1-RELEASE #0: Mon Jul 19 02:36:49 UTC 2010 root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64
> >>> >> > # zpool get version tpool
> >>> >> > NAME   PROPERTY  VALUE    SOURCE
> >>> >> > tpool  version   14       default
> >>> >> > # zpool status
> >>> >> >   pool: tpool
> >>> >> >  state: ONLINE
> >>> >> >  scrub: none requested
> >>> >> > config:
> >>> >> >
> >>> >> >         NAME        STATE     READ WRITE CKSUM
> >>> >> >         tpool       ONLINE       0     0     0
> >>> >> >           mirror    ONLINE       0     0     0
> >>> >> >             ad4s1d  ONLINE       0     0     0
> >>> >> >             ad6s1d  ONLINE       0     0     0
> >>> >> >
> >>> >> > errors: No known data errors
> >>> >> >
> >>> >> > [ Note that I've also done a scrub recently ]
> >>> >>
> >>> >> 1) Have you checked using fstat to ensure that no file descriptors
> >>> >> remain open on any of your ZFS filesystems (not pools)?
> >>> >>
> >>> >> 2) Are you using compression on any of your ZFS filesystems?
> >>
> >> Andriy and Pawel,
> >>
> >> Do either of you have ideas as to what could cause the issue Torbjorn's
> >> experiencing?  I swear I remember some bug or quirk that got fixed with
> >> regards to free space on ZFS, but as has been proven time and time again
> >> my memory is horrible.  His kernel's 8.1-RELEASE dated July 19th.
> >
> > IIRC the commit you talk about was by Martin (CCed). I do not know if
> > it is (already) MFCed.
> >
> > I'm not sure the bug you talk about is related to what Torbjorn is
> > talking about. The fact that the free space is going down while the
> > jail is shutdown (and I assume jls does not show his JID anymore, so
> > all of its processes are really gone) points more to some other
> > process (outside of the jail) which is filling some (maybe already
> > deleted, so not visible anymore with du) file.
> >
>
> It certainly smells like a process still writing to a file that is unlinked.
> I wonder if it would show up with lsof.
>
> If dtrace is enabled on that machine then I think it should be easy to
> see which process is performing write operations.
>

That could very well be.  Interestingly, dtrace is not installed and
doesn't even load.  When I do
kldload dtraceall it says:

    kldload: can't load dtraceall: Exec format error

Perhaps I should recompile the kernel on this server and build DTrace
into it.  Perhaps I should first update to FreeBSD-STABLE, as it is
more cutting edge?

Actually, I'll first do a complete backup of this jail, remove the zfs
filesystem, then re-create it, put the files back, and see what
happens.  The unfortunate thing is that I will be ruining a chance to
find out what really happened.
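
For reference, the "unlinked but still open" theory suggested above is easy
to reproduce on a small scale.  The following is only a sketch of the
general mechanism, not a diagnosis of this pool: the scratch directory, the
file name growing.log and the 1 MiB size are all made up for illustration.
It opens a file, unlinks it, keeps writing through the open descriptor, and
then shows that the directory looks empty while the data is still there.

```shell
#!/bin/sh
# Sketch: data written to an unlinked file stays allocated while a
# descriptor is open, even though no directory entry (and so no du
# entry) points at it.  All names below are hypothetical.

dir=$(mktemp -d "${TMPDIR:-/tmp}/unlinked.XXXXXX")

exec 3>"$dir/growing.log"    # fd 3: the "runaway writer"
exec 4<"$dir/growing.log"    # fd 4: a reader on the same file
rm "$dir/growing.log"        # unlink; both descriptors stay valid

# Keep writing through fd 3 after the unlink (1024 blocks of 1 KiB).
dd if=/dev/zero bs=1024 count=1024 >&3 2>/dev/null

# Number of directory entries a du/ls walk would see.
visible=$(ls -A "$dir" | wc -l | tr -d ' ')
# Number of bytes still reachable through the open reader.
held=$(wc -c <&4 | tr -d ' ')

echo "entries visible in directory: $visible"
echo "bytes still held by open descriptor: $held"

# Closing the last descriptors is what finally releases the space.
exec 3>&- 4<&-
rmdir "$dir"
```

The space is only given back once the last descriptor is closed, which is
why finding the process that holds the file open (with fstat or lsof)
matters more than searching the directory tree.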


