From: Mike Andrews <mandrews@bit0.com>
Date: Fri, 19 Jun 2009 03:30:33 -0400 (EDT)
To: stable@freebsd.org
Subject: weird problem w/ ZFS not reclaiming freed space

Somehow I've managed to get ZFS on one of my machines into a state where
it won't reclaim all space after deleting files AND snapshots off of it.
(This is with 7.2-STABLE amd64, compiled June 10.)

# ls -la /weird
total 4
drwxr-x---   2 mysql  mysql     2 Jun 19 02:42 .
drwxr-xr-x  29 root   wheel  1024 Jun 19 02:44 ..
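(Side note: snapdir is hidden on this dataset, but the .zfs tree is still
reachable by explicit path, so besides the 'zfs list -t snapshot' check shown
below, the hidden snapshot directory can be eyeballed directly -- with no
snapshots it should simply come back empty:)

# ls /weird/.zfs/snapshot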
# df /weird
Filesystem    1K-blocks      Used     Avail Capacity  Mounted on
scotch/weird  282201472 109151232 173050240    39%    /weird

# zfs list scotch/weird
NAME           USED  AVAIL  REFER  MOUNTPOINT
scotch/weird   104G   164G   104G  /weird

# zfs list -t snapshot | grep scotch/weird
#

# zfs get all scotch/weird
NAME          PROPERTY              VALUE                  SOURCE
scotch/weird  type                  filesystem             -
scotch/weird  creation              Wed Jun 17  1:20 2009  -
scotch/weird  used                  104G                   -
scotch/weird  available             159G                   -
scotch/weird  referenced            104G                   -
scotch/weird  compressratio         1.00x                  -
scotch/weird  mounted               yes                    -
scotch/weird  quota                 none                   default
scotch/weird  reservation           none                   default
scotch/weird  recordsize            128K                   default
scotch/weird  mountpoint            /weird                 local
scotch/weird  sharenfs              off                    default
scotch/weird  checksum              on                     default
scotch/weird  compression           off                    default
scotch/weird  atime                 off                    local
scotch/weird  devices               on                     default
scotch/weird  exec                  off                    local
scotch/weird  setuid                off                    local
scotch/weird  readonly              off                    default
scotch/weird  jailed                off                    default
scotch/weird  snapdir               hidden                 default
scotch/weird  aclmode               groupmask              default
scotch/weird  aclinherit            restricted             default
scotch/weird  canmount              on                     default
scotch/weird  shareiscsi            off                    default
scotch/weird  xattr                 off                    temporary
scotch/weird  copies                1                      default
scotch/weird  version               3                      -
scotch/weird  utf8only              off                    -
scotch/weird  normalization         none                   -
scotch/weird  casesensitivity       sensitive              -
scotch/weird  vscan                 off                    default
scotch/weird  nbmand                off                    default
scotch/weird  sharesmb              off                    default
scotch/weird  refquota              none                   default
scotch/weird  refreservation        none                   default
scotch/weird  primarycache          all                    default
scotch/weird  secondarycache        all                    default
scotch/weird  usedbysnapshots       0                      -
scotch/weird  usedbydataset         104G                   -
scotch/weird  usedbychildren        0                      -
scotch/weird  usedbyrefreservation  0                      -

(Note that df and zfs agree -- 109151232 1K-blocks is about 104 GiB -- and the
usedby* breakdown says the dataset itself is holding all of it: usedbysnapshots
and usedbychildren are both 0, even though the directory is empty.)

If I then rsync stuff to it, the space initially looks OK, but if I keep
rsyncing to it every few hours, the used space keeps growing, even when no
snapshots are being taken.  If I do take snapshots, then change stuff, then
delete the snapshots, the snapshot space does appear to be reclaimed.  Also,
if I 'zfs destroy' the filesystem, the space is correctly reclaimed, but once
I create a new one and repeat the process, the problem reappears.

I have not had any luck reproducing this on another machine yet, but
admittedly I haven't tried very hard.  Scrubbing the zpool returns no errors.

I'm guessing zdb is my only hope at debugging this, but as I've never used it
before, and as it seems to dump core whenever I try running it, can someone
suggest what I need to check or look for with it?

I did also have a panic a few days ago that, based on the text, might be
related:

panic: solaris assert: P2PHASE(start, 1ULL << sm->sm_shift) == 0, file:
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c,
line: 146

...for which I have a vmdump and a core.txt if anyone wants to look at it.
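(The core.txt is what crashinfo(8) spat out after the reboot, I believe.  If
anyone would rather dig into the dump itself, I assume the usual kgdb recipe
applies -- modulo the actual dump file name/number and a kernel with matching
debug symbols:

# kgdb /boot/kernel/kernel /var/crash/vmcore.0
(kgdb) bt

...and I can post the backtrace from that if it's useful.)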
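P.S. For the archives, the kind of zdb runs I've been attempting (all of
which dump core here) look roughly like the below.  I'm going off the zdb
man page, so treat the choice of flags as my best guess rather than
known-good advice:

# zdb -b scotch          # traverse the pool, checking for leaked blocks
# zdb -m scotch          # dump metaslab/space map info (the assert above is in space_map.c)
# zdb -dd scotch/weird   # dump dataset and object info for the affected filesystem

If someone who actually knows zdb can say which of these (or something else
entirely) is worth capturing, I'll happily post whatever output I can get.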