Date: Sat, 09 Jul 2011 21:16:01 +0200
From: Martin Birgmeier <Martin.Birgmeier@aon.at>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: freebsd-fs@FreeBSD.org
Subject: Re: kern/131342: [nfs] mounting/unmounting of disks causes NFS to fail
Message-ID: <4E18A8F1.5060102@aon.at>
In-Reply-To: <1489296886.367033.1310155084168.JavaMail.root@erie.cs.uoguelph.ca>
References: <1489296886.367033.1310155084168.JavaMail.root@erie.cs.uoguelph.ca>
Thank you for looking into this - answers below.

On 07/08/11 21:58, Rick Macklem wrote:
> Martin Birgmeier wrote:
>> The following reply was made to PR kern/131342; it has been noted by
>> GNATS.
>>
>> From: Martin Birgmeier <Martin.Birgmeier@aon.at>
>> To: bug-followup@FreeBSD.org
>> Cc:
>> Subject: Re: kern/131342: [nfs] mounting/unmounting of disks causes
>> NFS to fail
>> Date: Fri, 08 Jul 2011 15:00:03 +0200
>>
>> This is a friendly reminder that some kind soul with knowledge of the
>> relevant kernel parts look into this... the error can easily be
>> reproduced. I just had it on a 7.4 system which was doing heavy
>> reading from an 8.2 server. When I mounted something on the server,
>> the client got a "Permission denied" reply.
>>
>> So, to recap the scenario:
>>
>> 7.4 NFS client
>> 8.2 NFS server
>> client mounts a fs from the server (via IPv4; it might be interesting
>> to look at http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/151681,
>> too, but that is unrelated)
>> client does heavy i/o on the mounted fs
>> server does a mount (on its side, in this case it was from an md
>> device)
>>
>> --> error: the client gets back some NFS error (in this case
>> "Permission denied")
>
> I just made a quick attempt and wasn't able to reproduce this. I
> mounted/unmounted a UFS volume on the server (both to a subdir of the
> exported volume and to a directory outside of the exported volume)
> while a client was accessing an exported fs and didn't get an error.

You'll need to be doing heavy NFS i/o from the client to the server
while mounting/unmounting something on the server in order to
reproduce the problem.

> Could the way you mount the volume on the server somehow end up
> renumbering the exported volumes? If so, the fsid in the file handle
> will no longer be able to vfs_busyfs(fsid); - and then the mount
> point will be broken until remounted by the client.

I am sorry, but I don't know how to check this (# of the exported
volume).
On the other hand, I do not believe it does - see below.

> I don't use anything like geom and don't use ZFS.

I had this problem earlier when I wasn't using ZFS, so it does not
seem to be specific to ZFS. However, now the server is (also) running
ZFS (root is on UFS).

> Since you can reproduce this easily, I'd suggest that you:
> 1 - look to make sure drives (the st_dev value returned by stat(2))
>     aren't being renumbered by the mount. (If they are, then that has
>     to be avoided if an NFS export is to still work.)
> 2 - Try mounting/unmounting something else, to see if it is md
>     specific.

I seem to remember that it is not confined to adding an md-backed
mount on the server; I also had this with CDROM mounts (mounting a CD
on the server would result in a client error). I'd need to check that,
but it might take a while.

> Also, does it only happen when there is a heavy load generated by the
> client, or all the time? (If only under heavy load, it may be a mount
> list locking bug, since that's the only place where a mount of a
> non-exported volume on the server will affect the exported mounts, as
> far as I can see.)

I am quite sure it is mostly under heavy load; see also below.

> I don't mind looking at a packet trace (you can email me the file
> generated by "tcpdump -s 0 -w <file> host <nfs-client>" when run on
> the server), but only if you can reproduce it without the heavy
> client load. (If only reproduced when there is a heavy client load, a
> packet trace would be too big and probably not useful, since the bug
> is likely some race related to the mount list.)

Maybe I can manage to reproduce it and cut it down sufficiently - this
might take a while, though.

> rick
> ps: I assume you are referring to mounts that worked before the
> server mount and not a case where the new mount was supposed to be
> exported.

That's clear.

> Oh, and one more question...
> Is the error persistent (i.e.
> is the client mount unusable until remounted) or does the mount point
> work after the mount/unmount of the other volume has completed?

This seems to be a crucial question: in fact, after the single error
event (which typically halts the heavy NFS i/o and therefore changes
the situation - cf. the question about load above), the mount
continues to work perfectly. So, referring to your question about
renumbering above, I'd guess no, it does not get renumbered.

> If it just happens when the other volume is unmounted/mounted, make
> sure that you aren't using the "soft" option for your client mounts.
> ("soft" implies that an RPC fails after a timeout, and an
> unmount/mount of another volume could delay the RPC for a while,
> until the mount list is unlocked.)

I am not using soft mounts.

> rick

Regards,

Martin

p.s. Sending this also to freebsd-fs, but since I'm currently not
subscribed, this might not make it through.