From: Torbjorn Kristoffersen
To: freebsd-fs@freebsd.org
Date: Thu, 30 Sep 2010 15:28:25 +0200
In-Reply-To: <4CA45444.6070002@dannysplace.net>
Subject: Re: Strange ZFS problem, filesystem claims to be full when clearly not full

On Thu, Sep 30, 2010 at 11:11 AM, Danny Carroll wrote:
>
> On 30/09/2010 6:36 PM, Alexander Leidinger wrote:
> >
> > Quoting Jeremy Chadwick (from Wed, 29 Sep 2010 15:15:49 -0700):
> >
> >> On Thu, Sep 30, 2010 at 12:11:09AM +0200, Torbjorn Kristoffersen wrote:
> >>> I'm at a complete loss here. I shut down the jail completely, and I am
> >>> watching the jail's ZFS filesystem grow as we speak. No process is using
> >>> it. It only grows in "df" and "zfs list"; I can't find any files that are
> >>> growing. I have to re-set the quota to be higher and higher to accommodate
> >>> the space.
> >>>
> >>> On Wed, Sep 29, 2010 at 10:46 PM, Torbjorn Kristoffersen <torbjoern@gmail.com> wrote:
> >>>
> >>> > Hi Jeremy.
> >>> >
> >>> > 1) I checked now, and found nothing extraordinary. Just processes that have
> >>> > been running for a long while, such as screen, cron, sshd, bash, irssi,
> >>> > syslogd, etc.
> >>> >
> >>> > 2) No compression used on this zfs filesystem (or any of the others).
> >>> >
> >>> > I completely stopped the jail now, and removed some of the directories
> >>> > with the most data in them, but to no avail.
> >>> >
> >>> > On Wed, Sep 29, 2010 at 9:25 PM, Jeremy Chadwick wrote:
> >>> >
> >>> >> On Wed, Sep 29, 2010 at 08:46:38PM +0200, Torbjorn Kristoffersen wrote:
> >>> >> > I have a ZFS "tank" called tpool, the server runs a couple of jails (each
> >>> >> > with a zfs filesystem). There is a problem with one of these filesystems.
> >>> >> > First, its disk usage as shown in ``df -h'':
> >>> >> > ...
> >>> >> > tpool/rb.org      100G     95G    4.6G    95%    /jails/rb.org
> >>> >> > ...
> >>> >> >
> >>> >> > The command ``zfs list'' shows the same:
> >>> >> > ..
> >>> >> > tpool/rb.org    95.4G  4.56G  95.4G  /jails/rb.org
> >>> >> > ..
> >>> >> >
> >>> >> > However, there is a very mysterious problem somewhere.
> >>> >> > Something inside this jail is eating diskspace, but we can't find any
> >>> >> > directory that is actually taking the diskspace. We first suspected either
> >>> >> > fetchmail or spamassassin of causing a lot of space to be used, since some
> >>> >> > of their directories were huge. (These were later deleted, which is why
> >>> >> > you see that 4.6GB is now available; before that 0GB was available.)
> >>> >> >
> >>> >> > However, we can't find *any trace* of an actual directory or file that is
> >>> >> > taking all the space.
> >>> >> >
> >>> >> > Take this for instance:
> >>> >> >
> >>> >> > outsidejail# du -sh rb.org
> >>> >> >  43G    rb.org
> >>> >> >
> >>> >> > How can this be? df and zfs are showing that the entire drive is nearly
> >>> >> > full, yet I can't find any directory that is actually taking all this space.
> >>> >> > I've carefully looked through every single directory within the jail trying
> >>> >> > to find something that's taking all that space, but to no avail.
> >>> >> >
> >>> >> > ----
> >>> >> > My system stats:
> >>> >> > # uname -a
> >>> >> > FreeBSD grim 8.1-RELEASE FreeBSD 8.1-RELEASE #0: Mon Jul 19 02:36:49 UTC
> >>> >> > 2010 root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64
> >>> >> > # zpool get version tpool
> >>> >> > NAME   PROPERTY  VALUE    SOURCE
> >>> >> > tpool  version   14       default
> >>> >> > # zpool status
> >>> >> >   pool: tpool
> >>> >> >  state: ONLINE
> >>> >> >  scrub: none requested
> >>> >> > config:
> >>> >> >
> >>> >> >         NAME        STATE     READ WRITE CKSUM
> >>> >> >         tpool       ONLINE       0     0     0
> >>> >> >           mirror    ONLINE       0     0     0
> >>> >> >             ad4s1d  ONLINE       0     0     0
> >>> >> >             ad6s1d  ONLINE       0     0     0
> >>> >> >
> >>> >> > errors: No known data errors
> >>> >> >
> >>> >> > [ Note that I've also done a scrub recently ]
> >>> >>
> >>> >> 1) Have you checked using fstat to ensure that no file descriptors
> >>> >> remain open on any of your ZFS filesystems (not pools)?
> >>> >>
> >>> >> 2) Are you using compression on any of your ZFS filesystems?
> >>
> >> Andriy and Pawel,
> >>
> >> Do either of you have ideas as to what could cause the issue Torbjorn's
> >> experiencing? I swear I remember some bug or quirk that got fixed with
> >> regards to free space on ZFS, but as has been proven time and time again
> >> my memory is horrible. His kernel's 8.1-RELEASE dated July 19th.
> >
> > IIRC the commit you talk about was by Martin (CCed). I do not know if
> > it is (already) MFCed.
> >
> > I'm not sure the bug you talk about is related to what Torbjorn is
> > talking about.
> > The fact that the free space is going down while the
> > jail is shut down (and I assume jls does not show his JID anymore, so
> > all of its processes are really gone) points more to some other
> > process (outside of the jail) which is filling some (maybe already
> > deleted, so not visible anymore with du) file.
> >
>
> It certainly smells like a process still writing to a file that is unlinked.
> I wonder if it would show up with lsof.
>
> If dtrace is enabled on that machine then I think it should be easy to
> see which process is performing write operations.
>

That could very well be. Interestingly, dtrace is not installed and
doesn't even load. When I do kldload dtraceall it says:

kldload: can't load dtraceall: Exec format error

Perhaps I should recompile the kernel on this server and build DTrace
into it. Or perhaps I should first update to FreeBSD-STABLE, as it is
more cutting edge?

Actually, I'll first do a complete backup of this jail, remove the zfs
filesystem, then re-create it, put the files back, and see what happens.
The unfortunate thing is that I will be ruining a chance to find out
what really happened.
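
For anyone following the thread, the "process still writing to an
unlinked file" theory can be reproduced in miniature. The sketch below
is only an illustration under stated assumptions: it needs a POSIX
system and Python, and the function name demo_unlinked_file and the
file name growing.log are made up for the example. It does not touch
ZFS, and it is not a diagnosis of what happened on this server. It just
shows that a file removed from its directory keeps growing as long as
some process holds it open, while a directory walk (which is what du
does) no longer sees it.

```python
import os
import tempfile


def demo_unlinked_file(chunk=4096):
    """Write to a file, unlink it while it is open, then keep writing.

    Returns (visible_names, size_after_unlink, size_after_more_writes).
    All names here are hypothetical; nothing is taken from the server
    discussed in the thread.
    """
    workdir = tempfile.mkdtemp()                  # scratch directory standing in for the jail
    path = os.path.join(workdir, "growing.log")   # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"x" * chunk)
        f.flush()
        os.unlink(path)                           # the name is gone, the open descriptor is not
        visible = os.listdir(workdir)             # what a du-style directory walk can see
        size_after_unlink = os.fstat(f.fileno()).st_size
        f.write(b"y" * chunk)                     # the writer can still grow the file
        f.flush()
        size_after_more = os.fstat(f.fileno()).st_size
    os.rmdir(workdir)                             # the file's space is released once it is closed
    return visible, size_after_unlink, size_after_more


if __name__ == "__main__":
    print(demo_unlinked_file())
```

If that is what is going on here, the space would only come back once
the writing process closes the file or exits, which is why the earlier
suggestions in this thread to look for open descriptors with fstat or
lsof are worth following up before the filesystem is destroyed.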