Date:      Mon, 27 Dec 2021 17:20:04 +0000
From:      bugzilla-noreply@freebsd.org
To:        fs@FreeBSD.org
Subject:   [Bug 260664] ZFS/NFS: Intermittent hangs and crashes after a period of time in: nfscl_hasexpired || dbuf_write_done || zio_execute
Message-ID:  <bug-260664-3630-LT49vB0YGZ@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-260664-3630@https.bugs.freebsd.org/bugzilla/>
References:  <bug-260664-3630@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260664

--- Comment #11 from Rick Macklem <rmacklem@FreeBSD.org> ---
Ok, it does look like the commit in stable/13 would fix this hang.
It happens because read calls nfscl_hasexpired(), which tries to
acquire the exclusive lock (similar to delegation return cases).

However, calling nfscl_hasexpired() should *almost never* happen.
It happens when the client has been partitioned from the NFSv4
server for at least a minute.
For the FreeBSD NFSv4 server (which is what is called a courteous
server), expiry only happens when a conflicting open/lock
request is done by another client or when open/lock resources
become exhausted.
When recovery from "expired" is done, all byte range locks are lost,
so getting the client/server into this state is to be avoided if
at all possible.

If your NFSv4 server is a Linux one, expired will happen when the
client is network partitioned from the server for over 60sec.
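
To make the difference between the two server behaviours concrete, here is
a rough sketch in Python. It is only a toy model of the rules described
above, not the actual server or client code. The names (ClientState,
LEASE_SECONDS, expired_on_strict_server, expired_on_courteous_server) are
made up for illustration, the 60 second lease is taken from the figure
above, and treating the courteous case as "lease lapsed AND a trigger
occurred" is my reading of the description.

```python
from dataclasses import dataclass

# Assumed lease duration, taken from the "over 60sec" figure above.
LEASE_SECONDS = 60


@dataclass
class ClientState:
    """Hypothetical summary of what the server knows about one client."""
    seconds_unreachable: float          # time the client has been partitioned
    conflicting_request: bool = False   # another client made a conflicting open/lock
    resources_exhausted: bool = False   # server ran out of open/lock resources


def lease_lapsed(state: ClientState) -> bool:
    """True once the client has been unreachable for longer than the lease."""
    return state.seconds_unreachable > LEASE_SECONDS


def expired_on_strict_server(state: ClientState) -> bool:
    """Model of a server that expires the client as soon as the lease lapses."""
    return lease_lapsed(state)


def expired_on_courteous_server(state: ClientState) -> bool:
    """Model of a courteous server: a lapsed lease alone is not enough,
    one of the two triggers must also have occurred."""
    return lease_lapsed(state) and (
        state.conflicting_request or state.resources_exhausted
    )


if __name__ == "__main__":
    partitioned = ClientState(seconds_unreachable=90)
    print("strict server expires client:   ", expired_on_strict_server(partitioned))
    print("courteous server expires client:", expired_on_courteous_server(partitioned))
```

In this model a 90 second partition alone is enough for the strict server
to expire the client, while the courteous server keeps the state unless
one of the two triggers has also occurred.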

In other words, I think you have some sort of network connectivity
problem to the NFS server.

As an alternative to upgrading to stable/13, you could switch to
using NFSv3 mounts to avoid the hang.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.