From: Danny Carroll
Date: Thu, 30 Sep 2010 19:11:32 +1000
To: freebsd-fs@freebsd.org
Message-ID: <4CA45444.6070002@dannysplace.net>
In-Reply-To: <20100930103647.62193lbkp9yqx5k4@webmail.leidinger.net>
Subject: Re: Strange ZFS problem, filesystem claims to be full when clearly not full

On 30/09/2010 6:36 PM, Alexander Leidinger wrote:
>
> Quoting Jeremy Chadwick (from Wed, 29 Sep 2010 15:15:49 -0700):
>
>> On Thu, Sep 30, 2010 at 12:11:09AM +0200, Torbjorn Kristoffersen wrote:
>>> I'm at a complete loss here. I shut down the jail completely, and I am
>>> watching the jail's ZFS filesystem grow as we speak. No process is using
>>> it. It only grows in "df" and "zfs list"; I can't find any files that are
>>> growing. I have to reset the quota to be higher and higher to accommodate
>>> the space.
>>>
>>> On Wed, Sep 29, 2010 at 10:46 PM, Torbjorn Kristoffersen <torbjoern@gmail.com> wrote:
>>>
>>> > Hi Jeremy.
>>> >
>>> > 1) I checked now, and found nothing extraordinary. Just processes that have
>>> > been running for a long while, such as screen, cron, sshd, bash, irssi,
>>> > syslogd, etc.
>>> >
>>> > 2) No compression used on this zfs filesystem (or any of the others).
>>> >
>>> > I completely stopped the jail now, and removed some of the directories
>>> > with the most data in them, but to no avail.
>>> >
>>> >
>>> > On Wed, Sep 29, 2010 at 9:25 PM, Jeremy Chadwick wrote:
>>> >
>>> >> On Wed, Sep 29, 2010 at 08:46:38PM +0200, Torbjorn Kristoffersen wrote:
>>> >> > I have a ZFS "tank" called tpool, the server runs a couple of jails (each
>>> >> > with a zfs filesystem). There is a problem with one of these filesystems.
>>> >> > First, its disk usage as shown in ``df -h'':
>>> >> > ...
>>> >> > tpool/rb.org    100G    95G    4.6G    95%    /jails/rb.org
>>> >> > ...
>>> >> >
>>> >> > The command ``zfs list'' shows the same:
>>> >> > ..
>>> >> > tpool/rb.org    95.4G    4.56G    95.4G    /jails/rb.org
>>> >> > ..
>>> >> >
>>> >> > However, there is a very mysterious problem somewhere.
>>> >> > Something inside this jail is eating disk space, but we can't find any
>>> >> > directory that is actually taking the disk space. We first suspected either
>>> >> > fetchmail or spamassassin of causing a lot of space to be used, since some
>>> >> > of their directories were huge. (These were later deleted, which is why
>>> >> > you see that 4.6GB is now available; before that, 0GB was available.)
>>> >> >
>>> >> > However, we can't find *any trace* of an actual directory or file that is
>>> >> > taking all the space.
>>> >> >
>>> >> > Take this for instance:
>>> >> >
>>> >> > outsidejail# du -sh rb.org
>>> >> > 43G    rb.org
>>> >> >
>>> >> > How can this be? df and zfs are showing that the entire drive is nearly
>>> >> > full, yet I can't find any directory that is actually taking all this space.
>>> >> > I've carefully looked through every single directory within the jail trying
>>> >> > to find something that's taking all that space, but to no avail.
>>> >> >
>>> >> > ----
>>> >> > My system stats:
>>> >> > # uname -a
>>> >> > FreeBSD grim 8.1-RELEASE FreeBSD 8.1-RELEASE #0: Mon Jul 19 02:36:49 UTC 2010
>>> >> >     root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
>>> >> > # zpool get version tpool
>>> >> > NAME   PROPERTY  VALUE  SOURCE
>>> >> > tpool  version   14     default
>>> >> > # zpool status
>>> >> >   pool: tpool
>>> >> >  state: ONLINE
>>> >> >  scrub: none requested
>>> >> > config:
>>> >> >
>>> >> >         NAME        STATE     READ WRITE CKSUM
>>> >> >         tpool       ONLINE       0     0     0
>>> >> >           mirror    ONLINE       0     0     0
>>> >> >             ad4s1d  ONLINE       0     0     0
>>> >> >             ad6s1d  ONLINE       0     0     0
>>> >> >
>>> >> > errors: No known data errors
>>> >> >
>>> >> > [ Note that I've also done a scrub recently ]
>>> >>
>>> >> 1) Have you checked using fstat to ensure that no file descriptors
>>> >> remain open on any of your ZFS filesystems (not pools)?
>>> >>
>>> >> 2) Are you using compression on any of your ZFS filesystems?
>>
>> Andriy and Pawel,
>>
>> Do either of you have ideas as to what could cause the issue Torbjorn's
>> experiencing? I swear I remember some bug or quirk that got fixed with
>> regards to free space on ZFS, but as has been proven time and time again
>> my memory is horrible. His kernel's 8.1-RELEASE dated July 19th.
>
> IIRC the commit you talk about was by Martin (CCed). I do not know if
> it is (already) MFCed.
>
> I'm not sure the bug you talk about is related to what Torbjorn is
> talking about. The fact that the free space is going down while the
> jail is shutdown (and I assume jls does not show his JID anymore, so
> all of its processes are really gone) points more to some other
> process (outside of the jail) which is filling some (maybe already
> deleted, so not visible anymore with du) file.
>

It certainly smells like a process still writing to a file that is
unlinked. I wonder if it would show up with lsof.
If DTrace is enabled on that machine, I think it should be easy to see
which process is performing write operations.

-D
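
For anyone unfamiliar with the unlinked-file effect mentioned above, here is a minimal sketch in Python. It assumes a POSIX system and a local filesystem; the function name `unlinked_file_demo` and the file name `growing.log` are made up for illustration. It only shows the general mechanism (a file with no directory entry still grows while a descriptor is open, so du misses it while df counts it) and does not diagnose Torbjorn's machine.

```python
import os
import tempfile


def unlinked_file_demo(payload_size=1024 * 1024):
    """Write to a file after unlinking it and report what the kernel still sees."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "growing.log")  # hypothetical file name
    handle = open(path, "wb")
    try:
        os.unlink(path)                     # the name is gone, so du cannot find it
        handle.write(b"\0" * payload_size)  # the open descriptor still accepts data
        handle.flush()
        info = os.fstat(handle.fileno())
        return {
            "visible_entries": os.listdir(directory),  # what a directory walk sees
            "link_count": info.st_nlink,               # 0 means no name remains
            "size_bytes": info.st_size,                # the file still has this size
        }
    finally:
        handle.close()       # the space is released only when the last descriptor closes
        os.rmdir(directory)  # the directory is empty because the file was unlinked


if __name__ == "__main__":
    print(unlinked_file_demo())
```

The directory listing comes back empty while the file still has a nonzero size, which is the same kind of disagreement between du and df described in the thread. The space comes back only when the writing process closes the descriptor or exits.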