Message-ID: <4F33E952.80102@brockmann-consult.de>
Date: Thu, 09 Feb 2012 16:42:10 +0100
From: Peter Maloney
To: freebsd-fs@freebsd.org
References: <4F33CA4D.6060400@brockmann-consult.de>
In-Reply-To: <4F33CA4D.6060400@brockmann-consult.de>
Subject: Re: zfs snapdir NFS hang

I just tested it with a FreeBSD NFS client, and it all looks correct.
So I guess only the Linux client triggers this strange behavior (wrong
directories, files instead of directories, etc.). But the
/var/log/messages error messages clearly show the server is messed up,
not (only?) the client. And obviously the hang can't be blamed on the
client.

linuxclient # uname -a
Linux peter 2.6.38-12-generic #51-Ubuntu SMP Wed Sep 28 14:27:32 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

freebsdclient # uname -a
FreeBSD bczfsvm1.bc.local 8.2-STABLE-20120104 FreeBSD 8.2-STABLE-20120104 #0: Mon Feb 6 12:10:32 UTC 2012 root@bczfsvm1.bc.local:/usr/obj/usr/src/sys/GENERIC amd64

And for the record, I have tested full scans of the .zfs/snapshot
directory using "find" to see if it "brings the server to its knees" as
is often said, but even that does not cause any problems at all (with
48 GB of memory on this machine). I am currently running another run of
that test, and will tell you the result in 20 or so hours when it is
done.

# time find /tank/bcnasvm1/.zfs/snapshot -type f > /dev/null 2>&1

On 02/09/2012 02:29 PM, Peter Maloney wrote:
> So, I have an issue where after some point (arbitrary number of
> snapshots? a specific snapshot? gremlins?), exporting a directory that
> contains a .zfs directory, whether or not snapdir=hidden is set, then
> listing the directory in the NFS client causes a total hang of the
> dataset and some commands like "zdb -d poolname". I don't know the root
> cause, so I don't know how to reproduce it, or create a PR.
>
> eg.
> |# echo /tank/dataset -maproot=root 10.10.10.10 >> /etc/exports|
> |# kill -HUP `cat /var/run/mountd.pid`|
> |# tail /var/log/messages|
> Code:
>
> Feb 8 15:47:54 bcnas1 mountd[46760]: can't delete exports for /tank/dataset/.zfs/snapshot/replication-20120204134001: Invalid argument
> Feb 8 15:47:54 bcnas1 mountd[46760]: can't delete exports for /tank/dataset/.zfs/snapshot/replication-20120208140000: Invalid argument
> ...
>
> (I would think the above shows that the NFS server is not really
> compatible with this situation / buggy)
> |# ssh 10.10.10.10 "mount bcnas1:/tank/dataset /mountpoint ; ls /mountpoint/.zfs/snapshot"|
> (hang on this command, and anything else after this point using the same
> dataset)
>
> There was a point when this would not hang, but instead just show many
> directories, and then for many other snapshots (all of the ones listed
> in the /var/log/messages errors and more), there would be strange binary
> files, or directories with the wrong files in them (it would show me a
> subdirectory of the correct root of the snapshot).
>
> All of these problems happen whether or not I set snapdir=hidden or
> snapdir=visible.
>
> I am currently running 8-STABLE from Sept. 28th.
>
> Today an identical problem happened, and I rebooted to fix it. I don't
> know if it was the same cause, but I would like to find out.
>
> Can someone give me ideas of how to track the problem, or tell me which
> source files I should open up in /usr/src, hack apart or add debugging
> and either:
>
> * find the root cause of the problem
> * prevent NFS from exporting any .zfs directories
>
> Or does someone know if this has been fixed in the latest 8-STABLE or 9?
>
> The best workaround I can think of is reorganizing all my datasets so
> the root directory is empty except one directory, and then share only
> that subdirectory which does not contain a .zfs directory (or any other
> child datasets). But ideally, NFS clients should be able to view
> snapshots to recover files.

-- 
--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.maloney@brockmann-consult.de
Internet: http://www.brockmann-consult.de
--------------------------------------------