Date:      Thu, 09 Feb 2012 14:29:49 +0100
From:      Peter Maloney <peter.maloney@brockmann-consult.de>
To:        freebsd-fs@freebsd.org
Subject:   zfs snapdir NFS hang
Message-ID:  <4F33CA4D.6060400@brockmann-consult.de>

So, I have an issue where, after some point (an arbitrary number of
snapshots? a specific snapshot? gremlins?), exporting a directory that
contains a .zfs directory and then listing that directory from the NFS
client causes a total hang of the dataset and of some commands such as
"zdb -d poolname". This happens whether or not snapdir=hidden is set.
I don't know the root cause, so I don't know how to reproduce it or
file a PR.

e.g.
# echo /tank/dataset -maproot=root 10.10.10.10 >> /etc/exports
# kill -HUP `cat /var/run/mountd.pid`
# tail /var/log/messages

Feb  8 15:47:54 bcnas1 mountd[46760]: can't delete exports for /tank/dataset/.zfs/snapshot/replication-20120204134001: Invalid argument
Feb  8 15:47:54 bcnas1 mountd[46760]: can't delete exports for /tank/dataset/.zfs/snapshot/replication-20120208140000: Invalid argument
...

(I would think the above shows that the NFS server does not really
handle this situation, i.e. that it is buggy.)
# ssh 10.10.10.10 "mount bcnas1:/tank/dataset /mountpoint ; ls /mountpoint/.zfs/snapshot"
(This command hangs, and so does anything else after this point that
uses the same dataset.)
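
To collect the snapshot paths that mountd complained about, something
like the following might help. It is only a rough sketch: it assumes the
log lines have exactly the "can't delete exports for ...: Invalid
argument" format quoted above, and the function name
failed_snapshot_exports is made up for illustration.

```shell
# Hypothetical helper: read syslog lines on stdin and print each path
# that mountd failed to un-export, one per line, sorted, no duplicates.
# Assumes the message format seen in the excerpt from /var/log/messages.
failed_snapshot_exports() {
    sed -n "s/.*mountd\[[0-9]*]: can't delete exports for \(.*\): Invalid argument\$/\1/p" |
        sort -u
}
```

Usage would be along the lines of
"failed_snapshot_exports < /var/log/messages", and the resulting list
could then be compared against the snapshots that show up wrongly on the
client.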

There was a point when this would not hang. Instead, many snapshots
showed up as normal directories, while many others (all of the ones
listed in the /var/log/messages errors, and more) appeared as strange
binary files, or as directories with the wrong files in them (I would
see a subdirectory of the snapshot instead of its correct root).

All of these problems happen whether I set snapdir=hidden or
snapdir=visible.

I am currently running 8-STABLE from Sept. 28th.

Today an identical problem happened, and I rebooted to fix it. I don't
know if it was the same cause, but I would like to find out.

Can someone give me ideas on how to track down the problem, or tell me
which source files under /usr/src I should open up and hack apart or add
debugging to, so that I can either:

    * find the root cause of the problem
    * prevent NFS from exporting any .zfs directories

Or does someone know if this has been fixed in the latest 8-STABLE or 9?

The best workaround I can think of is reorganizing all my datasets so
the root directory is empty except one directory, and then share only
that subdirectory which does not contain a .zfs directory (or any other
child datasets). But ideally, NFS clients should be able to view
snapshots to recover files.


