From owner-freebsd-fs@FreeBSD.ORG Tue Jun 23 18:22:58 2009
Date: Tue, 23 Jun 2009 14:07:11 -0400 (EDT)
From: Mike Andrews <mandrews@bit0.com>
To: fs@freebsd.org
Subject: Re: weird problem w/ ZFS not reclaiming freed space

On Fri, 19 Jun 2009, Mike Andrews wrote:

> Somehow I've managed to get ZFS on one of my machines into a state where it
> won't reclaim all space after deleting files AND snapshots off of it:
> (this is with 7.2-STABLE amd64, compiled June 10)
>
> # ls -la /weird
> total 4
> drwxr-x---   2 mysql  mysql     2 Jun 19 02:42 .
> drwxr-xr-x  29 root   wheel  1024 Jun 19 02:44 ..
>
> # df /weird
> Filesystem   1K-blocks      Used     Avail Capacity  Mounted on
> scotch/weird 282201472 109151232 173050240      39%  /weird
>
> # zfs list scotch/weird
> NAME           USED  AVAIL  REFER  MOUNTPOINT
> scotch/weird   104G   164G   104G  /weird
>
> # zfs list -t snapshot | grep scotch/weird
>
> # zfs get all scotch/weird
> NAME          PROPERTY              VALUE                  SOURCE
> scotch/weird  type                  filesystem             -
> scotch/weird  creation              Wed Jun 17  1:20 2009  -
> scotch/weird  used                  104G                   -
> scotch/weird  available             159G                   -
> scotch/weird  referenced            104G                   -
> scotch/weird  compressratio         1.00x                  -
> scotch/weird  mounted               yes                    -
> scotch/weird  quota                 none                   default
> scotch/weird  reservation           none                   default
> scotch/weird  recordsize            128K                   default
> scotch/weird  mountpoint            /weird                 local
> scotch/weird  sharenfs              off                    default
> scotch/weird  checksum              on                     default
> scotch/weird  compression           off                    default
> scotch/weird  atime                 off                    local
> scotch/weird  devices               on                     default
> scotch/weird  exec                  off                    local
> scotch/weird  setuid                off                    local
> scotch/weird  readonly              off                    default
> scotch/weird  jailed                off                    default
> scotch/weird  snapdir               hidden                 default
> scotch/weird  aclmode               groupmask              default
> scotch/weird  aclinherit            restricted             default
> scotch/weird  canmount              on                     default
> scotch/weird  shareiscsi            off                    default
> scotch/weird  xattr                 off                    temporary
> scotch/weird  copies                1                      default
> scotch/weird  version               3                      -
> scotch/weird  utf8only              off                    -
> scotch/weird  normalization         none                   -
> scotch/weird  casesensitivity       sensitive              -
> scotch/weird  vscan                 off                    default
> scotch/weird  nbmand                off                    default
> scotch/weird  sharesmb              off                    default
> scotch/weird  refquota              none                   default
> scotch/weird  refreservation        none                   default
> scotch/weird  primarycache          all                    default
> scotch/weird  secondarycache        all                    default
> scotch/weird  usedbysnapshots       0                      -
> scotch/weird  usedbydataset         104G                   -
> scotch/weird  usedbychildren        0                      -
> scotch/weird  usedbyrefreservation  0                      -
>
> If I then rsync stuff to it, space usage initially looks OK, but if I keep
> rsyncing to it every few hours, the used space keeps growing even when no
> snapshots are being taken.  If I do take snapshots, change things, and then
> delete the snapshots, the snapshot space does appear to be reclaimed.  Also,
> if I 'zfs destroy' the filesystem, the space is correctly reclaimed, but once
> I create a new filesystem and repeat the process, the problem reappears.
>
> I have not had any luck reproducing this on another machine yet, but
> admittedly I haven't tried very hard yet.
>
> Scrubbing the zpool returns no errors.
>
> I'm guessing zdb is my only hope for debugging this, but since I've never
> used it before and it seems to dump core whenever I try running it, can
> someone suggest what I should check or look for with it?
>
> I also had a panic a few days ago that, based on the text, might be related:
>
> panic: solaris assert: P2PHASE(start, 1ULL << sm->sm_shift) == 0, file:
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c,
> line: 146
>
> ...for which I have a vmdump and a core.txt if anyone wants to look at it.

Just to update this and move it over to the -fs mailing list: removing the
"--sparse" flag from rsync makes the problem go away, so the bug has something
to do with how ZFS handles sparse files.
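
In case it helps anyone try to reproduce this, here's a rough sketch of the
kind of sequence involved.  The /tmp/sparsefile name and the 1g size are just
placeholders made up for illustration; the dataset is the scotch/weird one
from above.

  (make a large, mostly-empty sparse file somewhere outside the pool)
  # truncate -s 1g /tmp/sparsefile

  (copy it onto the ZFS filesystem with rsync's sparse handling enabled)
  # rsync --sparse /tmp/sparsefile /weird/
  # zfs list scotch/weird      (note USED)

  (delete it and see whether the space comes back)
  # rm /weird/sparsefile
  # zfs list scotch/weird      (on the affected box USED stays high; with
                                --sparse dropped from the rsync it falls
                                back as expected)

Repeating the rsync/rm cycle every few hours is what makes USED keep creeping
up here.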