Date:      Mon, 30 May 2022 14:32:12 +0200
From:      Andreas Kempe <kempe@lysator.liu.se>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject:   Re: FreeBSD 12.3/13.1 NFS client hang
Message-ID:  <YpS5TBH/h1m39uwk@shipon.lysator.liu.se>
In-Reply-To: <YQBPR0101MB9742BDE2175F07CF23A7CD5ADDD89@YQBPR0101MB9742.CANPRD01.PROD.OUTLOOK.COM>
References:  <YpEwxdGCouUUFHiE@shipon.lysator.liu.se> <YQBPR0101MB9742280313FC17543132A61CDDD89@YQBPR0101MB9742.CANPRD01.PROD.OUTLOOK.COM> <YpFM2bSMscG4ekc9@shipon.lysator.liu.se> <YQBPR0101MB9742BDE2175F07CF23A7CD5ADDD89@YQBPR0101MB9742.CANPRD01.PROD.OUTLOOK.COM>

Hello again Rick and thank you for your reply!

On Fri, May 27, 2022 at 11:56:32PM +0000, Rick Macklem wrote:
> Andreas Kempe <kempe@lysator.liu.se> wrote:
> > On Fri, May 27, 2022 at 08:59:57PM +0000, Rick Macklem wrote:
> > > Andreas Kempe <kempe@lysator.liu.se> wrote:
> > > > Hello everyone!
> > > >
> > > > I'm having issues with the NFS clients on FreeBSD 12.3 and 13.1
> > > > systems hanging when using a CentOS 7 server.
> > > First, make sure you are using hard mounts. "soft" or "intr" mounts won't
> > > work and will mess up the session sooner or later. (A messed up session could
> > > result in no free slots on the session and that will wedge threads in
> > > nfsv4_sequencelookup() as you describe.)
> > > (This is briefly described in the BUGS section of "man mount_nfs".)
> > >
> >
> > I had totally missed that soft and interruptible mounts have these
> > issues. I switched the FreeBSD machines to soft and intr on purpose
> > to be able to fix hung mounts without having to restart the machine on
> > NFS hangs. Since they are shared machines, it is an inconvenience for
> > other users if one user causes a hang.
> Usually, a "umount -N <mnt_path>" should dismount a hung mount
> point.  It can take a couple of minutes to complete.
> 

It's been a while since we last ran with hard mounts and I'm afraid I
can't really remember how long I waited for the unmount when the hang
happened. It is possible that I thought the umount hung as well and
rebooted unnecessarily. I'll try again next time for sure.

> > Switching our test machine back to hard mounts did prevent recursive
> > grep from immediately causing the slot type hang again.
> >
> > > Do a:
> > > # nfsstat -m
> > > on the clients and look for "hard".
> > >
> > > Next, is there anything logged on the console for the 13.1 client(s)?
> > > (13.1 has some diagnostics for things like a server replying with the
> > >  wrong session slot#.)
> > >
> >
> > The one thing we have seen logged are messages along the lines of:
> > kernel: newnfs: server 'mail' error: fileid changed. fsid 4240eca6003a052a:0: expected fileid 0x22, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
> It means that the server returned a different fileid number for the same file, although it should never change.
> There's a description in a comment in sys/fs/nfsclient/nfs_clport.c.
> I doubt the broken middleware is anywhere any more. I never knew the
> details, since the guy that told me about it was under NDA to the
> company that sold it. It cached Getattr replies and would sometimes return
> the wrong cached entry. I think it only worked for NFSv3, anyhow.
> 
> However, it does indicate something is seriously wrong, probably on the server end.
> (If you can capture packets when it gets logged, we could look at them in wireshark.)
> --> I'm not sure if a soft mount could somehow cause this?
> 

We use a setup where the exported NFS mount is backed by a Ceph
storage cluster. I guess it is possible that something went wrong
between the NFS server and the cluster, but we haven't seen similar
issues on our Linux clients. Of course, this might just mean they
handle the error better.

> The diagnostics I was referring to would be things like "Wrong session" or "freeing free slot".
> It was these that identified the Amazon EFS bug I mention later.
> 

I'd love to capture some data, but the problem is that our only
reliable reproducer is grep -R on the soft mount. In the hard mount
case, the hangs occurred seemingly at random. The coming weeks are a
bit packed for me, but I'll see about finding a reproducer.
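
For reference, the soft-mount reproducer was essentially just the
following (the export and mount point are placeholders, the options
are the soft/intr ones we were running with):

  # mount -t nfs -o nfsv4,soft,intr server:/export /mnt/test
  # grep -R somestring /mnt/test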

[snip]

> > Yes, I sent this mostly in the hope that it might be something that
> > someone has seen before. I understand that more verbose information is
> > needed to track down the lock contention.
> There is PR#260011. It is similar and he was also using soft mounts, although he is now trying
> hard mounts. Also, we now know that the Amazon EFS server has a serious
> bug where it sometimes replies with the wrong slotid.
> 

I'll have a look at this and your other suggestions as soon as I have
time.

> > I'll switch our machines back to using hard mounts and try to get as
> > much diagnostic information as possible when the next lockup happens.
> >
> > Do you have any good suggestions for tracking down the issue? I've
> > been contemplating enabling WITNESS or building with debug information
> > to be able to hook in the kernel debugger.
> I don't think WITNESS or the kernel debugger will help.
> Beyond what you get from "ps axHl", whatever went wrong has already happened before the hang.

I guess this means you think the error is at the protocol-handling
level and the issues aren't caused by locking bugs in the code? I was
wondering whether the hangs that were not slot related could be due to
some race condition in the locking, since they happen so seemingly at
random.
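
In the meantime, when the next hang happens I'll try to grab thread
state along these lines (procstat flags from memory; -kk should dump
the kernel stacks of all threads in a process):

  # ps axHl
  # procstat -kk <pid-of-stuck-process>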

> If you can reproduce it for a hard mount, you could capture packets via:
> # tcpdump -s 0 -w out.pcap host <nfs-server>
> Tcpdump is useless at decoding NFS, but wireshark can decode the out.pcap
> quite nicely. I can look at the out.pcap or, if you prefer, you can start by
> looking for NFSv4-specific errors yourself.
> --> The client will usually log if it gets one of these. It will be an error # > 10000.
> 

Since we don't know the NFSv4 protocol, we were holding off on even
trying to get Wireshark dumps: we wouldn't know what to look for and
would have to learn the protocol first. Your having a look would be
greatly appreciated! As I wrote above, I'll try to get dumps if we
can find a reproducer.
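
If we do manage a capture, my rough plan (the tshark display-filter
field name is from memory and may need adjusting in Wireshark) is:

  # tcpdump -s 0 -w out.pcap host <nfs-server>
  # tshark -r out.pcap -Y "nfs.nfsstat4 > 10000"

to pick out NFSv4 replies with status codes above 10000, as you
describe.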

> Good luck with it, rick
> > rick
> >

Cordially,
Andreas Kempe


