From owner-freebsd-fs@FreeBSD.ORG Tue Jun 23 18:22:58 2009
Date: Tue, 23 Jun 2009 14:07:11 -0400 (EDT)
From: Mike Andrews <mandrews@bit0.com>
To: fs@freebsd.org
Subject: Re: weird problem w/ ZFS not reclaiming freed space

On Fri, 19 Jun 2009, Mike Andrews wrote:

> Somehow I've managed to get ZFS on one of my machines into a state where it
> won't reclaim all space after deleting files AND snapshots off of it:
> (this is with 7.2-STABLE amd64, compiled June 10)
>
> # ls -la /weird
> total 4
> drwxr-x---   2 mysql  mysql     2 Jun 19 02:42 .
> drwxr-xr-x  29 root   wheel  1024 Jun 19 02:44 ..
>
> # df /weird
> Filesystem   1K-blocks      Used     Avail Capacity  Mounted on
> scotch/weird 282201472 109151232 173050240      39%  /weird
>
> # zfs list scotch/weird
> NAME           USED  AVAIL  REFER  MOUNTPOINT
> scotch/weird   104G   164G   104G  /weird
>
> # zfs list -t snapshot | grep scotch/weird
>
> # zfs get all scotch/weird
> NAME          PROPERTY              VALUE                  SOURCE
> scotch/weird  type                  filesystem             -
> scotch/weird  creation              Wed Jun 17  1:20 2009  -
> scotch/weird  used                  104G                   -
> scotch/weird  available             159G                   -
> scotch/weird  referenced            104G                   -
> scotch/weird  compressratio         1.00x                  -
> scotch/weird  mounted               yes                    -
> scotch/weird  quota                 none                   default
> scotch/weird  reservation           none                   default
> scotch/weird  recordsize            128K                   default
> scotch/weird  mountpoint            /weird                 local
> scotch/weird  sharenfs              off                    default
> scotch/weird  checksum              on                     default
> scotch/weird  compression           off                    default
> scotch/weird  atime                 off                    local
> scotch/weird  devices               on                     default
> scotch/weird  exec                  off                    local
> scotch/weird  setuid                off                    local
> scotch/weird  readonly              off                    default
> scotch/weird  jailed                off                    default
> scotch/weird  snapdir               hidden                 default
> scotch/weird  aclmode               groupmask              default
> scotch/weird  aclinherit            restricted             default
> scotch/weird  canmount              on                     default
> scotch/weird  shareiscsi            off                    default
> scotch/weird  xattr                 off                    temporary
> scotch/weird  copies                1                      default
> scotch/weird  version               3                      -
> scotch/weird  utf8only              off                    -
> scotch/weird  normalization         none                   -
> scotch/weird  casesensitivity       sensitive              -
> scotch/weird  vscan                 off                    default
> scotch/weird  nbmand                off                    default
> scotch/weird  sharesmb              off                    default
> scotch/weird  refquota              none                   default
> scotch/weird  refreservation        none                   default
> scotch/weird  primarycache          all                    default
> scotch/weird  secondarycache        all                    default
> scotch/weird  usedbysnapshots       0                      -
> scotch/weird  usedbydataset         104G                   -
> scotch/weird  usedbychildren        0                      -
> scotch/weird  usedbyrefreservation  0                      -
>
> If I then rsync stuff to it, space usage initially looks OK, but if I keep
> rsyncing to it every few hours, the used space keeps growing even when no
> snapshots are being taken.  If I do take snapshots, change things, and then
> delete the snapshots, the snapshot space does appear to be reclaimed.  Also,
> if I 'zfs destroy' the filesystem, the space is correctly reclaimed, but once
> I create a new filesystem and repeat the process, the problem reappears.
>
> I have not had any luck reproducing this on another machine yet, but
> admittedly I haven't tried very hard yet.
>
> Scrubbing the zpool returns no errors.
>
> I'm guessing zdb is my only hope for debugging this, but since I've never
> used it before and it seems to dump core whenever I try running it, can
> someone suggest what I should check or look for with it?
>
> I also had a panic a few days ago that, based on the text, might be related:
>
> panic: solaris assert: P2PHASE(start, 1ULL << sm->sm_shift) == 0, file:
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/space_map.c,
> line: 146
>
> ...for which I have a vmdump and a core.txt if anyone wants to look at it.

Just to update this and move it over to the -fs mailing list: removing the
"--sparse" flag from rsync makes the problem go away, so the bug has something
to do with how ZFS handles sparse files.
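
In case it helps anyone try to reproduce this, here's a rough sketch of the
kind of sequence involved.  The /tmp/sparsefile name and the 1g size are just
placeholders made up for illustration; the dataset is the scotch/weird one
from above.

  (make a large, mostly-empty sparse file somewhere outside the pool)
  # truncate -s 1g /tmp/sparsefile

  (copy it onto the ZFS filesystem with rsync's sparse handling enabled)
  # rsync --sparse /tmp/sparsefile /weird/
  # zfs list scotch/weird      (note USED)

  (delete it and see whether the space comes back)
  # rm /weird/sparsefile
  # zfs list scotch/weird      (on the affected box USED stays high; with
                                --sparse dropped from the rsync it falls
                                back as expected)

Repeating the rsync/rm cycle every few hours is what makes USED keep creeping
up here.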