Date:      Thu, 19 Dec 2013 20:08:22 +0100 (CET)
From:      krichy@tvnetwork.hu
To:        Gerrit Kühn <gerrit.kuehn@aei.mpg.de>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: ZFS snapshot renames failing after upgrade to 9.2
Message-ID:  <alpine.DEB.2.10.1312192007100.30378@krichy.tvnetwork.hu>
In-Reply-To: <20131219174054.91ac617a.gerrit.kuehn@aei.mpg.de>
References:  <0C9FD4E1-0549-4849-BFC5-D8C5D4A34D64@msqr.us> <54D3B3C002184A52BEC9B1543854B87F@multiplay.co.uk> <333D57C6A4544067880D9CFC04F02312@multiplay.co.uk> <26053_1387447492_52B2C4C4_26053_331_1_20131219105503.3a8d1df3.gerrit.kuehn@aei.mpg.de> <20131219165549.9f2ca709.gerrit.kuehn@aei.mpg.de> <alpine.DEB.2.10.1312191718330.12885@krichy.tvnetwork.hu> <20131219174054.91ac617a.gerrit.kuehn@aei.mpg.de>

Dear Gerrit,

I am testing my issues on 10, but will apply the results to 9.2 as well,
since my stable system runs that.

So a simple renaming can cause your system to hang?

Regards,
Kojedzinszky Richard
Euronet Magyarorszag Informatikai Zrt.

On Thu, 19 Dec 2013, Gerrit Kühn wrote:

> Date: Thu, 19 Dec 2013 17:40:54 +0100
> From: Gerrit Kühn <gerrit.kuehn@aei.mpg.de>
> To: krichy@tvnetwork.hu
> Cc: freebsd-fs@freebsd.org
> Subject: Re: ZFS snapshot renames failing after upgrade to 9.2
> 
> On Thu, 19 Dec 2013 17:21:04 +0100 (CET) krichy@tvnetwork.hu wrote about
> Re: ZFS snapshot renames failing after upgrade to 9.2:
>
> Dear Richard,
>
> KH> I have some similar issues with snapshot handling, and I was told to
> KH> use that patch, but unfortunately that did not solve my issues. As I
> KH> tracked down things, my issues have nothing to do with snapshot
> KH> sending.
>
> That does not sound too promising. I do not send snapshots either; I just
> need to rotate (rename) them, like the original poster.
> I rebooted the machine, which made the issue go away for now. Snapshots and
> subsequent backups will run tonight; I'm curious how it looks
> tomorrow morning.
> Unusable snapshot renaming would be very bad, as my backups and several other
> things rely on it. Do you still see the issue on your system? Are you
> using 9.2 or 10?
>
>
> cu
>  Gerrit
>
From owner-freebsd-fs@FreeBSD.ORG  Thu Dec 19 23:49:06 2013
Date: Thu, 19 Dec 2013 18:49:04 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Jason Keltz <jas@cse.yorku.ca>
Message-ID: <1717165737.33441446.1387496944120.JavaMail.root@uoguelph.ca>
In-Reply-To: <52A7E53D.8000002@cse.yorku.ca>
Subject: Re: mount ZFS snapshot on Linux system
Cc: FreeBSD Filesystems <freebsd-fs@freebsd.org>,
 Steve Dickson <SteveD@redhat.com>

Jason Keltz wrote:
> On 10/12/2013 7:21 PM, Rick Macklem wrote:
> > Jason Keltz wrote:
> >> I'm running FreeBSD 9.2 with various ZFS datasets.
> >> I export a dataset to a Linux system (RHEL64), and mount it.  It
> >> works fine...
> >> When I try to access the ZFS snapshot directory on the Linux NFS
> >> client, things go weird.
> >>
For NFSv4, I found two problems w.r.t. handling ZFS snapshots.
1 - Since a NFSv4 Readdir skips "." and "..", the check for
    VFS_VGET() returning ENOTSUPP wasn't happening, so it wasn't
    switching to use VOP_LOOKUP(). { As I understand it, that
    means that VFS_VGET() gets bogus stuff if the auto pseudo-mount
    of the snapshot hasn't happened. }
2 - Since the pseudo-mount of a snapshot doesn't set v_mountedhere
    in the mounted on vnode, I needed to add a check for a different
    vp->v_mount to recognize the "mount point" crossing.
I think this patch fixes both problems:
  http://people.freebsd.org/~rmacklem/nfsv4-zfs-snapshot.patch
Thanks go to Jason for helping with testing this.
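
To make the two checks above easier to follow, here is a toy model in
Python rather than kernel C. It is only a sketch, not the patch and not
the FreeBSD VFS API: Mount, Vnode, VgetNotSupported, entry_vnode() and
crosses_mount() are hypothetical names invented for illustration, and
the model assumes the fallback simply means "look the entry up by name
when lookup by file number is not supported".

```python
class VgetNotSupported(Exception):
    """Stands in for a lookup-by-fileno call reporting 'not supported'."""


class Mount:
    """Hypothetical stand-in for a mounted file system."""

    def __init__(self, name, by_fileno=None):
        self.name = name
        # None models a mount that cannot look objects up by file number.
        self.by_fileno = by_fileno

    def vget(self, fileno):
        if self.by_fileno is None:
            raise VgetNotSupported(self.name)
        return self.by_fileno[fileno]


class Vnode:
    """Hypothetical stand-in for a vnode."""

    def __init__(self, name, mount, mountedhere=None):
        self.name = name
        self.mount = mount              # plays the role of vp->v_mount
        self.mountedhere = mountedhere  # plays the role of vp->v_mountedhere
        self.children = {}

    def lookup(self, name):
        return self.children[name]


def entry_vnode(dvp, name, fileno):
    """Check 1: try lookup by file number, fall back to lookup by name."""
    try:
        return dvp.mount.vget(fileno)
    except VgetNotSupported:
        return dvp.lookup(name)


def crosses_mount(dvp, vp):
    """Check 2: a crossing is either a covered vnode or a different mount."""
    return vp.mountedhere is not None or vp.mount is not dvp.mount


parent = Mount("parent")    # no lookup by file number in this toy model
snap = Mount("snapshot-20131203", by_fileno={})
snapdir = Vnode("snapshot", parent)
snapdir.children["20131203"] = Vnode("20131203", snap)  # mountedhere unset

vp = entry_vnode(snapdir, "20131203", 4)
print(vp.name, crosses_mount(snapdir, vp))
```

In this model the snapshot entry has no mountedhere set, so only the
comparison of the two mounts reveals the crossing.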

W.r.t. NFSv3, access to the snapshots is somewhat bogus, in that
NFSv3 is never supposed to cross mount point boundaries. However,
I don't know how an auto pseudo-mount could be safely exported and
mounted as a separate volume, so all I can think of doing is documenting
(in man nfsd(8)?) that it doesn't quite work. The breakage will
depend on how the NFSv3 client handles st_dev. { The FreeBSD client
sets st_dev to the client NFS mount's fsid and doesn't use the fsid
returned by the server, so it doesn't change. As such, for FreeBSD,
it will see one file system, but with duplicated filenos. For example,
fts(3) might complain about a loop in the directory structure. }
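
The duplicated-fileno effect can be illustrated with a small Python
sketch. It uses the fsid and fileid values Jason captured (quoted
further down in this message). The duplicate check keyed on
(st_dev, st_ino) is a simplification and not how fts(3) is actually
implemented; find_collisions() and the client_dev value of 1 are
hypothetical.

```python
def find_collisions(objects, client_dev=None):
    """Return pairs of paths that share a (st_dev, st_ino) key.

    If client_dev is given, it replaces the server's fsid for every
    object, modelling a client that reports one st_dev for the whole
    NFS mount.
    """
    seen = {}
    collisions = []
    for path, fsid, fileid in objects:
        dev = client_dev if client_dev is not None else fsid
        key = (dev, fileid)
        if key in seen:
            collisions.append((seen[key], path))
        else:
            seen[key] = path
    return collisions


# (path, fsid, fileid) as captured over NFSv3
objects = [
    ("/mnt/.zfs/snapshot/20131203", 0x56358b58, 4),
    ("/mnt/.zfs/snapshot/20131205", 0x6e07b1f2, 4),
    ("/mnt/jas", 0x23a3f246, 144),
    ("/mnt/jas1", 0x23a3f246, 338),
]

print(find_collisions(objects))                # keyed on the server's fsid
print(find_collisions(objects, client_dev=1))  # one st_dev for the mount
```

With the server's fsid in the key there are no collisions; with a single
st_dev the two snapshot directories collide on fileid 4.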

I hope to commit the above patch to head soon, once I get it
reviewed and tested, rick

> >> With NFSv4:
> >>
> >> [jas@archive /]# cd /mnt/.zfs/snapshot
> >> [jas@archive snapshot]# ls
> >> 20131203  20131205  20131206  20131207  20131208  20131209  20131210
> >> [jas@archive snapshot]# cd 20131210
> >> 20131210: Not a directory.
> >>
> >> huh?
> >>
> >> [jas@archive snapshot]# ls -al
> >> total 77
> >> dr-xr-xr-x   9 root root   9 Dec 10 11:20 .
> >> dr-xr-xr-x   4 root root   4 Nov 28 15:42 ..
> >> drwxr-xr-x 380 root root 380 Dec  2 15:56 20131203
> >> drwxr-xr-x 381 root root 381 Dec  3 11:24 20131205
> >> drwxr-xr-x 381 root root 381 Dec  3 11:24 20131206
> >> drwxr-xr-x 381 root root 381 Dec  3 11:24 20131207
> >> drwxr-xr-x 381 root root 381 Dec  3 11:24 20131208
> >> drwxr-xr-x 381 root root 381 Dec  3 11:24 20131209
> >> drwxr-xr-x 381 root root 381 Dec  3 11:24 20131210
> >> [jas@archive snapshot]# stat *
> >> [jas@archive snapshot]# ls -al
> >> total 292
> >> dr-xr-xr-x 9 root      root         9 Dec 10 11:20 .
> >> dr-xr-xr-x 4 root      root         4 Nov 28 15:42 ..
> >> -rw-r--r-- 1 uax    guest   137647 Mar 17  2010 20131203
> >> -rw-r--r-- 1 uax    guest         865 Jul 31  2009 20131205
> >> -rw-r--r-- 1 uax    guest   137647 Mar 17  2010 20131206
> >> -rw-r--r-- 1 uax    guest         771 Jul 31  2009 20131207
> >> -rw-r--r-- 1 uax    guest         778 Jul 31  2009 20131208
> >> -rw-r--r-- 1 uax     guest       5281 Jul 31  2009 20131209
> >> -rw------- 1 btx      faculty      893 Jul 13 20:21 20131210
> >>
> >> But it gets even more fun..
> >>
> >> # ls -ali
> >> total 205
> >>     2 dr-xr-xr-x   9 root      root       9 Dec 10 11:20 .
> >>     1 dr-xr-xr-x   4 root      root       4 Nov 28 15:42 ..
> >> 863 -rw-r--r--   1 uax     guest 137647 Mar 17  2010 20131203
> >>     4 drwxr-xr-x 381 root      root     381 Dec  3 11:24 20131205
> >>     4 drwxr-xr-x 381 root      root     381 Dec  3 11:24 20131206
> >>     4 drwxr-xr-x 381 root      root     381 Dec  3 11:24 20131207
> >>     4 drwxr-xr-x 381 root      root     381 Dec  3 11:24 20131208
> >>     4 drwxr-xr-x 381 root      root     381 Dec  3 11:24 20131209
> >>     4 drwxr-xr-x 381 root      root     381 Dec  3 11:24 20131210
> >>
> >> This is not a user id mapping issue because all the files in /mnt
> >> have the proper owner/groups, and I can access them there fine.
> >>
> >> I also tried explicitly exporting .zfs/snapshot.  The result isn't
> >> any different.
> >>
> >> If I use nfs v3 it "works", but I'm seeing a whole lot of errors
> >> like these in syslog:
> >>
> >> Dec 10 12:32:28 jungle mountd[49579]: can't delete exports for /local/backup/home9/.zfs/snapshot/20131203: Invalid argument
> >> Dec 10 12:32:28 jungle mountd[49579]: can't delete exports for /local/backup/home9/.zfs/snapshot/20131209: Invalid argument
> >> Dec 10 12:32:28 jungle mountd[49579]: can't delete exports for /local/backup/home9/.zfs/snapshot/20131210: Invalid argument
> >> Dec 10 12:32:28 jungle mountd[49579]: can't delete exports for /local/backup/home9/.zfs/snapshot/20131207: Invalid argument
> >>
> >> It's not clear to me why this doesn't just "work".
> >>
> >> Can anyone provide any advice on debugging this?
> >>
> > As I think you already know, I know nothing about ZFS and never
> > use it.
> Yup! :)
> > Having said that, I suspect that there are filenos (i-node #s)
> > that are the same in the snapshot as in the parent file system
> > tree.
> >
> > The basic assumptions are:
> > - within a file system, all i-node# are unique (represent one file
> >    object only) and all file objects have the same fsid
> > - when the fsid changes, that indicates a file system boundary and
> >    fileno (i-node#s) can be reused in the subtree with a different
> >    fsid
> >
> > For NFSv3, the server should export single volumes only (all
> > objects have the same fsid and the filenos are unique). This is
> > indicated to the VFS by the use of the NOCROSSMOUNT flag on
> > VOP_LOOKUP() and friends.
> >
> > For NFSv4, the server does export multiple volumes and the boundary
> > is indicated by a change in fsid value.
> >
> > I suspect ZFS snapshots don't obey the above in some way, but that
> > is just a hunch.
> >
> > Now, how to narrow this down...
> > - Do the above tests (both NFSv4 and NFSv3) and capture the
> >    packets, then look at them in wireshark. In particular, look at
> >    the fileid numbers and fsid values for the various directories
> >    under .zfs.
> 
> I gave this a shot, but I haven't used wireshark to capture NFS
> traffic before, so if I need to provide additional details, let me
> know..
> 
> NFSv4:
> 
> For /mnt/.zfs/snapshot/20131203:
> fileid=4
> fsid4.major=1446349656
> fsid4.minor=222
> 
> For /mnt/.zfs/snapshot/20131205:
> fileid=4
> fsid4.major=1845998066
> fsid4.minor=222
> 
> For /mnt/jas:
> fileid=144
> fsid4.major=597946950
> fsid4.minor=222
> 
> For /mnt/jas1:
> fileid=338
> fsid4.major=597946950
> fsid4.minor=222
> 
> So fsid is the same for all the different "data" directories, which
> is what I would expect given what you said.  I guess each snapshot is
> seen as a unique filesystem...  but then a repeating inode in
> different filesystems shouldn't be a problem...
> 
> NFSv3:
> 
> For /mnt/.zfs/snapshot/20131203:
> fileid=4
> fsid=0x0000000056358b58
> 
> For /mnt/.zfs/snapshot/20131205:
> fileid=4
> fsid=0x000000006e07b1f2
> 
> For /mnt/jas:
> fileid=144
> fsid=0x0000000023a3f246
> 
> For /mnt/jas1:
> fileid=338
> fsid=0x0000000023a3f246
> 
> Here, it seems it's the same, even though it's NFSv3... hmm.
> 
> 
> > - Try mounting the individual snapshot directory, like
> >     .zfs/snapshot/20131209 and see if that works (for both NFSv3
> >     and NFSv4).
> 
> Hmm .. I tried this:
> 
> /local/backup/home9/.zfs/snapshot/20131203 -ro archive-mrpriv.cs.yorku.ca
> V4: /
> 
> ... but syslog reports:
> 
> Dec 10 22:28:22 jungle mountd[85405]: can't export /local/backup/home9/.zfs/snapshot/20131203
> 
> ... and of course I can't mount from either v3/v4.
> 
> On the other hand, I kept it as:
> 
> /local/backup/home9 -ro archive-mrpriv.cs.yorku.ca
> V4:/
> 
> ... and was able to NFSv4 mount
> /local/backup/home9/.zfs/snapshot/20131203, and this does indeed
> work.
> 
> > - Try doing the mounts with a FreeBSD client and see if you get the
> >    same behaviour?
> I found this:
> http://forums.freenas.org/threads/mounting-snapshot-directory-using-nfs-from-linux-broken.6060/
> .. implies it will work from FreeBSD/Nexenta, just not Linux.
> Found this as well:
> https://groups.google.com/a/zfsonlinux.org/forum/#!topic/zfs-discuss/lKyfYsjPMNM
> 
> Jason.
> 
> 
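
The NFSv4 capture above reports fsid4.major in decimal and the NFSv3
capture reports fsid in hex, which makes them hard to compare by eye.
A short Python check over the values exactly as quoted (the variable
names captures and by_fsid are just for illustration) shows that they
are the same numbers, and groups the paths by fsid:

```python
# path -> (NFSv4 fsid4.major in decimal, NFSv3 fsid in hex), as quoted above
captures = {
    "/mnt/.zfs/snapshot/20131203": (1446349656, 0x0000000056358b58),
    "/mnt/.zfs/snapshot/20131205": (1845998066, 0x000000006e07b1f2),
    "/mnt/jas": (597946950, 0x0000000023a3f246),
    "/mnt/jas1": (597946950, 0x0000000023a3f246),
}

by_fsid = {}
for path, (v4_major, v3_fsid) in captures.items():
    # Same value in both captures means both protocol versions
    # report the same file system id for this path.
    print(path, hex(v4_major), v4_major == v3_fsid)
    by_fsid.setdefault(v3_fsid, []).append(path)

for fsid, paths in by_fsid.items():
    print(hex(fsid), paths)
```

Every comparison comes out equal, and the four paths fall into three
groups: one per snapshot, plus one shared by /mnt/jas and /mnt/jas1.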


