From: Juraj Lutter
Date: Thu, 15 Apr 2021 23:09:26 +0200
Subject: Re: NFS issues since upgrading to 13-RELEASE
To: Rick Macklem
Cc: Allan Jude, "freebsd-current@freebsd.org", Richard Scheffenegger, Peter Mihalik
List-Id: Discussions about the use of FreeBSD-current

> On 15 Apr 2021, at 22:47, Rick Macklem wrote:
>
> Allan Jude wrote:
>> On 4/15/2021 9:22 AM, Chris Roose wrote:
>>> I posted this in -questions and someone suggested I post here as well.
>>>
>>> I'm having NFS availability issues between my Proxmox client and FreeBSD server (10G link) since upgrading to 13-RELEASE. And unfortunately I upgraded my ZFS pool to v2.0.0 before I noticed the issue, so I'm kind of stuck.
>>>
>>> Periodically, the NFS server (I've tried both v3 and v4.2 clients) will go unresponsive for several minutes. I never had this problem on 12.2, and as far as I can tell it's not a disk or network I/O issue. I'll get several "nfs: server not responding, still trying" messages on the client and a few minutes later it usually recovers. It's not clear to me yet what's causing the block.
>>> Restarting nfsd on the server will resolve the issue if it doesn't clear itself.
>>
> otis@ has run into a problem that sounds similar.
> He sees a growing Recv-Q size on the server for the TCP connection from the client
> when "netstat -a" is done on the server when the "hang" occurs.
> In his case, he is using a Linux client and it does not recover, however other client
> mounts continue to function.

Correct.

> I suspect the recovery after a few minutes is the client establishing a new TCP
> connection.
>
> He has been running for almost a week with r367492 reverted and has not reported
> seeing the problem again (he had reported that it has taken up to a week to recur, so
> reverting r367492 *might* have fixed the problem and I'd guess we'll know in another
> week?).

We have now been running for 4 days without interruption. Before r367492 was reverted, it was unpredictable when it would lock up. The best result we achieved was 7 days.

The machine it's running on is definitely not a slow or weak one (it's a Dell R740xd with 2x CPU, 256GB RAM, and a 22x NVMe data zpool).

otis
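
For anyone who wants to watch for the symptom described above (a growing Recv-Q on the client's TCP connection to the NFS server), here is a minimal sketch of one way to spot it from netstat output. It is not something used in this thread. It assumes the usual FreeBSD column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, state) with addresses printed as host.port, and that nfsd listens on the standard port 2049. The function name and the min_recv_q threshold are made up for illustration.

```python
NFS_PORT = "2049"  # standard NFS port; an assumption, adjust if nfsd listens elsewhere


def stuck_nfs_connections(netstat_output, min_recv_q=1):
    """Return (local, remote, recv_q) for TCP sockets on the NFS port
    whose Recv-Q is at least min_recv_q.

    netstat_output is the text printed by something like
    `netstat -an -p tcp` on the server (the -n and -p flags are this
    sketch's choice; the thread only mentions `netstat -a`).
    """
    hits = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if not fields[1].isdigit():
            continue
        recv_q = int(fields[1])
        local, remote = fields[3], fields[4]
        # With -n, FreeBSD prints addresses as host.port, so the port is
        # whatever follows the last dot of the local address.
        local_port = local.rpartition(".")[2]
        if local_port == NFS_PORT and recv_q >= min_recv_q:
            hits.append((local, remote, recv_q))
    return hits
```

Run periodically against fresh netstat output, a Recv-Q value for the same remote address that keeps rising between samples would match the "hang" described above, whereas a value that drains back to zero would just be normal buffering.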