From owner-freebsd-fs@FreeBSD.ORG Thu Aug  4 12:40:12 2011
Date: Thu, 4 Aug 2011 12:40:11 GMT
Message-Id: <201108041240.p74CeBxw008141@freefall.freebsd.org>
To: freebsd-fs@FreeBSD.org
From: Martin Matuska
Subject: Re: kern/157728: [zfs] zfs (v28) incremental receive may leave behind temporary clones
Reply-To: Martin Matuska
List-Id: Filesystems

The following reply was made to PR kern/157728; it has been noted by GNATS.

From: Martin Matuska
To: Borja Marcos
Cc: bug-followup@FreeBSD.org, Pawel Jakub Dawidek
Subject: Re: kern/157728: [zfs] zfs (v28) incremental receive may leave behind temporary clones
Date: Thu, 04 Aug 2011 14:33:45 +0200

This is a multi-part message in MIME format.
--------------090908080609040403050308
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

That is not a solution, we want hidden datasets :)

A workaround patch is attached that does not prefetch hidden datasets
in zfs (btw. why should we do that at all).
It doesn't cure the source of the problem, only the symptoms: to
reproduce the problem you now have to run zfs list or get directly on
the invisible temporary clone.

Please test.

On 04.08.2011 12:39, Borja Marcos wrote:
> I have a clue. I've tried a partial fix and so far it seems to work.
> Now I have a loop doing zfs sends of a dataset with a make buildworld
> running, every 30 seconds, and receiving them onto a different pool,
> on which I have a while ( 1 ) ; zfs list ; end loop running.
>
> So far I haven't had issues. The only side effect is that temporary
> datasets can appear in the zfs list output.
>
> Read below for the explanation.
>
> After reading Martin's analysis, it seemed quite clear to me that the
> scenario was due to the necessity of getting a consistent snapshot of
> the state of a complex data structure. In this case, I imagined that
> the "list" service would traverse the data structures holding the
> dataset descriptions, and that it would place temporary locks on the
> elements in order to prevent them from being altered while the
> structure is being traversed.
>
> So, a generic "list" service in a fine-grained locking environment
> and rendering a consistent response would be something like this:
>
> - traverse data structure, building a list.
>   (each time we get an element, a temporary lock is placed on it)
> - get next element, etc.
>
> - With the complete and consistent list ready, prepare the response.
>
> - Once the response has been built, traverse the grabbed results and
>   release the locks.
>
>
> So, where's the problem? In the special treatment of the "hidden"
> datasets.
>
> Looking at
> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c,
> at the function zfs_ioc_dataset_list_next(zfs_cmd_t *zc)
>
> I see something resembling this idea:
>
>         while (error == 0 && dataset_name_hidden(zc->zc_name) &&
>             !(zc->zc_iflags & FKIOCTL));
>         dmu_objset_rele(os, FTAG);
>
> So, wondering if the problem is this special treatment of the hidden
> datasets, I've edited the dataset_name_hidden() function so that it
> no longer hides the "%" datasets.
>
> boolean_t
> dataset_name_hidden(const char *name)
> {
>         /*
>          * Skip over datasets that are not visible in this zone,
>          * internal datasets (which have a $ in their name), and
>          * temporary datasets (which have a % in their name).
>          */
>         if (strchr(name, '$') != NULL)
>                 return (B_TRUE);
>         /* if (strchr(name, '%') != NULL)
>                 return (B_TRUE); */
>         if (!INGLOBALZONE(curthread) && !zone_dataset_visible(name, NULL))
>                 return (B_TRUE);
>         return (B_FALSE);
> }
>
>
> I was expecting just one side effect: a "zfs list" would list the
> "%" datasets.
>
> Having done this, I compiled the kernel, started the test again, and,
> voila! It works.
>
> Of course, now I see the "%" datasets while the zfs receive is running,
>
> pruebazfs3# zfs list -t all
> NAME                            USED  AVAIL  REFER  MOUNTPOINT
> rpool                          1.22G  6.61G  41.3K  /rpool
> rpool/newsrc                   1.22G  6.61G   565M  /rpool/newsrc
> rpool/newsrc@anteshidden        149M      -   973M  -
> rpool/newsrc@parcheteoria1     1.09M      -   973M  -
> rpool/newsrc@20110804_113700       0      -   565M  -
> rpool/newsrc/%20110804_113730  1.31M  6.61G   566M  /rpool/newsrc/%20110804_113730
>
>
> but after zfs receive finishes they are correctly cleaned up
>
> NAME                           USED  AVAIL  REFER  MOUNTPOINT
> rpool                         1.22G  6.61G  41.3K  /rpool
> rpool/newsrc                  1.22G  6.61G   566M  /rpool/newsrc
> rpool/newsrc@anteshidden       149M      -   973M  -
> rpool/newsrc@parcheteoria1    1.09M      -   973M  -
> rpool/newsrc@20110804_113730      0      -   566M  -
>
>
> So: It seems to me that these datasets are a sort of afterthought.
> The ioctl "list" service should not discard them when building the
> dataset list.
> Instead it should not "print" them, so to speak.
>
> I'm sure this temporary fix can be refined, and I'm wondering if a
> similar issue is lurking somewhere else....

-- 
Martin Matuska
FreeBSD committer
http://blog.vx.sk

--------------090908080609040403050308
Content-Type: text/x-patch;
 name="zfs_ioctl.c.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="zfs_ioctl.c.patch"

Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
===================================================================
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c	(revision 224648)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c	(working copy)
@@ -1963,8 +1963,13 @@ zfs_ioc_dataset_list_next()
 		uint64_t cookie = 0;
 		int len = sizeof (zc->zc_name) - (p - zc->zc_name);
 
-		while (dmu_dir_list_next(os, len, p, NULL, &cookie) == 0)
-			(void) dmu_objset_prefetch(zc->zc_name, NULL);
+		while (dmu_dir_list_next(os, len, p, NULL,
+		    &cookie) == 0) {
+			if (dataset_name_hidden(zc->zc_name) == B_FALSE) {
+				(void) dmu_objset_prefetch(zc->zc_name,
+				    NULL);
+			}
+		}
 	}
 
 	do {

--------------090908080609040403050308--
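
For readers who want to see the idea of the workaround outside the
kernel, here is a minimal userland sketch in C. It is an illustration
under stated assumptions, not the kernel code: name_hidden() and
prefetch_visible() are hypothetical stand-ins for dataset_name_hidden()
and the prefetch loop in zfs_ioc_dataset_list_next(), the zone
visibility check is left out, and the prefetch itself is reduced to an
optional callback.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for dataset_name_hidden(): a name counts as
 * hidden if it contains '$' (internal) or '%' (temporary). The zone
 * visibility check of the real function is omitted here.
 */
static bool
name_hidden(const char *name)
{
	return (strchr(name, '$') != NULL || strchr(name, '%') != NULL);
}

/*
 * Hypothetical stand-in for the prefetch loop: walk the child names
 * and call prefetch() only for names that are not hidden. Returns
 * the number of names for which a prefetch would be issued.
 */
static size_t
prefetch_visible(const char *const *names, size_t count,
    void (*prefetch)(const char *))
{
	size_t issued = 0;

	for (size_t i = 0; i < count; i++) {
		if (name_hidden(names[i]))
			continue;	/* leave temporary clones untouched */
		if (prefetch != NULL)
			prefetch(names[i]);
		issued++;
	}
	return (issued);
}
```

With the names rpool/newsrc and rpool/newsrc/%20110804_113730 as input,
only the first would be prefetched. As noted in the message, this only
avoids touching the temporary clone during listing; it does not address
the underlying cause.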