Date:      Thu, 15 Apr 2021 23:09:26 +0200
From:      Juraj Lutter <otis@FreeBSD.org>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        Allan Jude <allanjude@freebsd.org>, "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>, Richard Scheffenegger <rscheff@FreeBSD.org>, Peter Mihalik <peter.mihalik@bonet.sk>
Subject:   Re: NFS issues since upgrading to 13-RELEASE
Message-ID:  <A8212379-79FB-44BF-A7BA-B00FA44901F9@FreeBSD.org>
In-Reply-To: <YQXPR0101MB09681707D3F3DC10814A905BDD4D9@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
References:  <902a3c81-2ce8-49c0-b163-5ffa4b90afe5@www.fastmail.com> <e8f585eb-a2a8-ae9d-7f33-526e412ec462@freebsd.org> <YQXPR0101MB09681707D3F3DC10814A905BDD4D9@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>


> On 15 Apr 2021, at 22:47, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>
> Allan Jude wrote:
>> On 4/15/2021 9:22 AM, Chris Roose wrote:
>>> I posted this in -questions and someone suggested I post here as
>>> well.
>>>
>>> I'm having NFS availability issues between my Proxmox client and
>>> FreeBSD server (10G link) since upgrading to 13-RELEASE. And
>>> unfortunately I upgraded my ZFS pool to v2.0.0 before I noticed the
>>> issue, so I'm kind of stuck.
>>>
>>> Periodically, the NFS server (I've tried both v3 and v4.2 clients)
>>> will go unresponsive for several minutes. I never had this problem on
>>> 12.2, and as far as I can tell it's not a disk or network I/O issue.
>>> I'll get several "nfs: server not responding, still trying" messages
>>> on the client and a few minutes later it usually recovers. It's not
>>> clear to me yet what's causing the block. Restarting nfsd on the
>>> server will resolve the issue if it doesn't clear itself.
>>
> otis@ has run into a problem that sounds similar.
> He sees a growing Recv-Q size on the server for the TCP connection from the client
> when "netstat -a" is done on the server when the "hang" occurs.
> In his case, he is using a Linux client and it does not recover, however other client
> mounts continue to function.

Correct.
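
For anyone who wants to check for the same symptom, here is a minimal
sketch (not something we actually ran) that compares Recv-Q between
successive snapshots of numeric netstat output. It assumes FreeBSD's
"netstat -an -p tcp" column layout (Proto, Recv-Q, Send-Q, Local
Address, Foreign Address, state), addresses written as "host.port", and
the default NFS port 2049. The function names are hypothetical.

```python
def parse_recv_q(netstat_output, port=2049):
    """Map (local, foreign) -> Recv-Q for established TCP sockets
    whose local port matches `port`.

    Assumes numeric FreeBSD-style output, where the port is the last
    dot-separated component of the address (e.g. 10.0.0.1.2049).
    """
    queues = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if not fields[1].isdigit() or fields[5] != "ESTABLISHED":
            continue
        local, foreign = fields[3], fields[4]
        if local.rsplit(".", 1)[-1] != str(port):
            continue
        queues[(local, foreign)] = int(fields[1])
    return queues


def growing_connections(samples):
    """Return connections present in every sample whose Recv-Q never
    shrinks from one sample to the next and ends higher than it began."""
    if len(samples) < 2:
        return []
    suspects = []
    for conn in samples[-1]:
        series = [sample.get(conn) for sample in samples]
        if None in series:
            continue  # connection was not there for the whole period
        never_shrinks = all(b >= a for a, b in zip(series, series[1:]))
        if never_shrinks and series[-1] > series[0]:
            suspects.append(conn)
    return suspects
```

Feed it the text of two or more snapshots taken some seconds apart; a
connection that keeps showing up in the result while the client is hung
matches the growing Recv-Q described above.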

> I suspect the recovery after a few minutes is the client establishing a new TCP
> connection.
>
> He has been running for almost a week with r367492 reverted and has not reported
> seeing the problem again (he had reported that it has taken up to a week to recur, so
> reverting r367492 *might* have fixed the problem and I'd guess we'll know in another
> week?).

We have now been running for 4 days without interruption. Before r367492
was reverted, it was unpredictable when it would lock up. The longest we
went without a lockup was 7 days.

The machine it's running on is definitely not a slow or weak one
(it's a Dell R740xd with 2x CPU, 256GB RAM, and a 22x NVMe data zpool).

otis



