Date:      Thu, 15 Apr 2021 20:47:23 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Allan Jude <allanjude@freebsd.org>, "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
Cc:        Richard Scheffenegger <rscheff@FreeBSD.org>, Juraj Lutter <otis@FreeBSD.org>
Subject:   Re: NFS issues since upgrading to 13-RELEASE
Message-ID:  <YQXPR0101MB09681707D3F3DC10814A905BDD4D9@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <e8f585eb-a2a8-ae9d-7f33-526e412ec462@freebsd.org>
References:  <902a3c81-2ce8-49c0-b163-5ffa4b90afe5@www.fastmail.com>, <e8f585eb-a2a8-ae9d-7f33-526e412ec462@freebsd.org>


[-- Attachment #1 --]
Allan Jude wrote:
>On 4/15/2021 9:22 AM, Chris Roose wrote:
>> I posted this in -questions and someone suggested I post here as well.
>>
>> I'm having NFS availability issues between my Proxmox client and FreeBSD server (10G link) since upgrading to 13-RELEASE. And unfortunately I upgraded my ZFS pool to v2.0.0 before I noticed the issue, so I'm kind of stuck.
>>
>> Periodically, the NFS server (I've tried both v3 and v4.2 clients) will go unresponsive for several minutes. I never had this problem on 12.2, and as far as I can tell it's not a disk or network I/O issue. I'll get several "nfs: server not responding, still trying" messages on the client and a few minutes later it usually recovers. It's not clear to me yet what's causing the block. Restarting nfsd on the server will resolve the issue if it doesn't clear itself.
>
otis@ has run into a problem that sounds similar.
He sees a growing Recv-Q size on the server for the TCP connection from the client
when "netstat -a" is done on the server when the "hang" occurs.
In his case, he is using a Linux client and it does not recover; however, other client
mounts continue to function.
I suspect the recovery after a few minutes is the client establishing a new TCP
connection.
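
A minimal sketch of how the Recv-Q growth could be logged on the server. It
assumes NFS on the default port 2049 and the usual FreeBSD "netstat -an -p tcp"
column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, state);
the function name nfs_recvq and the threshold argument are illustrative only.

```shell
#!/bin/sh
# Read netstat output on stdin and print "Recv-Q foreign-address" for TCP
# connections to the local NFS port whose receive queue exceeds a threshold.
# Assumption: local address is column 4 and ends in ".2049" (FreeBSD style).
nfs_recvq() {
    awk -v min="${1:-0}" '
        $1 ~ /^tcp/ && $4 ~ /\.2049$/ && ($2 + 0) > (min + 0) { print $2, $5 }
    '
}

# Example use on the server, sampling every 5 seconds:
#   while :; do date; netstat -an -p tcp | nfs_recvq 0; sleep 5; done
```

A Recv-Q value that keeps growing for one client across samples would match
the symptom described above.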

He has been running for almost a week with r367492 reverted and has not reported
seeing the problem again (he had reported that it has taken up to a week to recur, so
reverting r367492 *might* have fixed the problem and I'd guess we'll know in another
week?).

- If using svn to revert the patch is inconvenient, I've attached a patch that can be applied
   to revert it.
- Alternatively, you can try rscheff@'s proposed patch at
  https://reviews.freebsd.org/D29690.
  I have not yet had time to test this one, but since I cannot reproduce the hang, I can
  only do testing of it to see that it is "no worse" than reverting r367492 for my
  setup.

Please let us know which you choose and whether or not it fixes your problem.
  
>> Any pointers for troubleshooting this? I've been looking through vmstat, gstat, top, etc. when the problem occurs, but I haven't been able to pinpoint the issue. I can get pcap, but it would be from the hosts, because I don't have a 10G tap or managed switch.
>>
>
>run `nfsstat -d 1` and try to capture a few lines from before, during,
>and after the stall, and that may provide some insight.
>
>Specifically, does the queue length grow, suggesting it is waiting on
>the I/O subsystem, or does it just stop getting traffic altogether?

If the revert of r367492 does not fix the problem, monitor the TCP connection(s)
via "netstat -a" and, if possible, capture packets via
tcpdump -s 0 -w hang.pcap host <nfs-client>
or similar, run on the server.

Ideally the tcpdump would be started before the "hang" occurs, but running
one while the hang is occurring (until after it recovers) could also be useful.
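
Since the hang can take days to recur, one way to have a capture already
running is a rotating set of files. This is only a sketch: the file size and
count (10 files of about 100 million bytes each), the helper name capture_nfs
and the TCPDUMP override variable are assumptions, not part of the suggestion
above. The -C and -W options are standard tcpdump options.

```shell
#!/bin/sh
# Keep a rolling capture so that traffic from before a hang is preserved.
# Assumed values: 10 files of ~100 million bytes each, prefix "hang.pcap".
# TCPDUMP may be overridden, e.g. TCPDUMP=echo for a dry run.
capture_nfs() {
    client="$1"
    # -s 0   capture full packets
    # -C 100 start a new file after about 100 million bytes
    # -W 10  keep at most 10 files, overwriting the oldest
    ${TCPDUMP:-tcpdump} -s 0 -C 100 -W 10 -w hang.pcap host "$client"
}

# Example use on the server (as root), replacing the address with the client's:
#   capture_nfs 192.0.2.10
```

Once the hang has been seen and has recovered, stop the capture and keep the
files written around that time.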

Thanks for reporting this, rick

--
Allan Jude
_______________________________________________
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"


[-- Attachment #2 --]
--- sys/netinet/tcp_stacks/bbr.c.orig	2021-04-09 08:24:05.364212000 -0700
+++ sys/netinet/tcp_stacks/bbr.c	2021-04-09 08:33:49.799902000 -0700
@@ -7876,8 +7876,7 @@ bbr_process_ack(struct mbuf *m, struct tcphdr *th, str
 	acked_amount = min(acked, (int)sbavail(&so->so_snd));
 	tp->snd_wnd -= acked_amount;
 	mfree = sbcut_locked(&so->so_snd, acked_amount);
-	SOCKBUF_UNLOCK(&so->so_snd);
-	tp->t_flags |= TF_WAKESOW;
+	sowwakeup_locked(so);
 	m_freem(mfree);
 	if (SEQ_GT(th->th_ack, tp->snd_una)) {
 		bbr_collapse_rtt(tp, bbr, TCP_REXMTVAL(tp));
@@ -8353,8 +8352,7 @@ bbr_process_data(struct mbuf *m, struct tcphdr *th, st
 				appended =
 #endif
 					sbappendstream_locked(&so->so_rcv, m, 0);
-			SOCKBUF_UNLOCK(&so->so_rcv);
-			tp->t_flags |= TF_WAKESOR;
+			sorwakeup_locked(so);
 #ifdef NETFLIX_SB_LIMITS
 			if (so->so_rcv.sb_shlim && appended != mcnt)
 				counter_fo_release(so->so_rcv.sb_shlim,
@@ -8414,8 +8412,6 @@ bbr_process_data(struct mbuf *m, struct tcphdr *th, st
 	if (thflags & TH_FIN) {
 		if (TCPS_HAVERCVDFIN(tp->t_state) == 0) {
 			socantrcvmore(so);
-			/* The socket upcall is handled by socantrcvmore. */
-			tp->t_flags &= ~TF_WAKESOR;
 			/*
 			 * If connection is half-synchronized (ie NEEDSYN
 			 * flag on) then delay ACK, so it may be piggybacked
@@ -8606,8 +8602,7 @@ bbr_do_fastnewdata(struct mbuf *m, struct tcphdr *th, 
 			sbappendstream_locked(&so->so_rcv, m, 0);
 		ctf_calc_rwin(so, tp);
 	}
-	SOCKBUF_UNLOCK(&so->so_rcv);
-	tp->t_flags |= TF_WAKESOR;
+	sorwakeup_locked(so);
 #ifdef NETFLIX_SB_LIMITS
 	if (so->so_rcv.sb_shlim && mcnt != appended)
 		counter_fo_release(so->so_rcv.sb_shlim, mcnt - appended);
@@ -8798,7 +8793,7 @@ bbr_fastack(struct mbuf *m, struct tcphdr *th, struct 
 		    &tcp_savetcp, 0);
 #endif
 	/* Wake up the socket if we have room to write more */
-	tp->t_flags |= TF_WAKESOW;
+	sowwakeup(so);
 	if (tp->snd_una == tp->snd_max) {
 		/* Nothing left outstanding */
 		bbr_log_progress_event(bbr, tp, ticks, PROGRESS_CLEAR, __LINE__);
@@ -11754,10 +11749,8 @@ bbr_do_segment(struct mbuf *m, struct tcphdr *th, stru
 	}
 	retval = bbr_do_segment_nounlock(m, th, so, tp,
 					 drop_hdrlen, tlen, iptos, 0, &tv);
-	if (retval == 0) {
-		tcp_handle_wakeup(tp, so);
+	if (retval == 0)
 		INP_WUNLOCK(tp->t_inpcb);
-	}
 }
 
 /*
--- sys/netinet/tcp_stacks/rack.c.orig	2021-04-09 08:36:23.622821000 -0700
+++ sys/netinet/tcp_stacks/rack.c	2021-04-09 08:41:24.096687000 -0700
@@ -8344,8 +8344,7 @@ rack_process_ack(struct mbuf *m, struct tcphdr *th, st
 		 */
 		ourfinisacked = 1;
 	}
-	SOCKBUF_UNLOCK(&so->so_snd);
-	tp->t_flags |= TF_WAKESOW;
+	sowwakeup_locked(so);
 	m_freem(mfree);
 	if (rack->r_ctl.rc_early_recovery == 0) {
 		if (IN_RECOVERY(tp->t_flags)) {
@@ -8665,8 +8664,7 @@ rack_process_data(struct mbuf *m, struct tcphdr *th, s
 				appended =
 #endif
 					sbappendstream_locked(&so->so_rcv, m, 0);
-			SOCKBUF_UNLOCK(&so->so_rcv);
-			tp->t_flags |= TF_WAKESOR;
+			sorwakeup_locked(so);
 #ifdef NETFLIX_SB_LIMITS
 			if (so->so_rcv.sb_shlim && appended != mcnt)
 				counter_fo_release(so->so_rcv.sb_shlim,
@@ -8731,8 +8729,6 @@ rack_process_data(struct mbuf *m, struct tcphdr *th, s
 	if (thflags & TH_FIN) {
 		if (TCPS_HAVERCVDFIN(tp->t_state) == 0) {
 			socantrcvmore(so);
-			/* The socket upcall is handled by socantrcvmore. */
-			tp->t_flags &= ~TF_WAKESOR;
 			/*
 			 * If connection is half-synchronized (ie NEEDSYN
 			 * flag on) then delay ACK, so it may be piggybacked
@@ -8924,8 +8920,7 @@ rack_do_fastnewdata(struct mbuf *m, struct tcphdr *th,
 			sbappendstream_locked(&so->so_rcv, m, 0);
 		ctf_calc_rwin(so, tp);
 	}
-	SOCKBUF_UNLOCK(&so->so_rcv);
-	tp->t_flags |= TF_WAKESOR;
+	sorwakeup_locked(so);
 #ifdef NETFLIX_SB_LIMITS
 	if (so->so_rcv.sb_shlim && mcnt != appended)
 		counter_fo_release(so->so_rcv.sb_shlim, mcnt - appended);
@@ -9142,7 +9137,7 @@ rack_fastack(struct mbuf *m, struct tcphdr *th, struct
 		rack_timer_cancel(tp, rack, rack->r_ctl.rc_rcvtime, __LINE__);
 	}
 	/* Wake up the socket if we have room to write more */
-	tp->t_flags |= TF_WAKESOW;
+	sowwakeup(so);
 	if (sbavail(&so->so_snd)) {
 		rack->r_wanted_output = 1;
 	}
@@ -11205,10 +11200,8 @@ rack_do_segment(struct mbuf *m, struct tcphdr *th, str
 		tcp_get_usecs(&tv);
 	}
 	if(rack_do_segment_nounlock(m, th, so, tp,
-				    drop_hdrlen, tlen, iptos, 0, &tv) == 0) {
-		tcp_handle_wakeup(tp, so);
+				    drop_hdrlen, tlen, iptos, 0, &tv) == 0)
 		INP_WUNLOCK(tp->t_inpcb);
-	}
 }
 
 struct rack_sendmap *
--- sys/netinet/tcp_stacks/rack_bbr_common.c.orig	2021-04-09 08:45:26.721521000 -0700
+++ sys/netinet/tcp_stacks/rack_bbr_common.c	2021-04-09 08:46:58.580234000 -0700
@@ -458,7 +458,6 @@ ctf_do_queued_segments(struct socket *so, struct tcpcb
 			/* We lost the tcpcb (maybe a RST came in)? */
 			return(1);
 		}
-		tcp_handle_wakeup(tp, so);
 	}
 	return (0);
 }
--- sys/netinet/tcp_input.c.orig	2021-04-05 01:07:00.342559000 -0700
+++ sys/netinet/tcp_input.c	2021-04-09 07:58:03.262815000 -0700
@@ -1472,29 +1472,6 @@ tcp_autorcvbuf(struct mbuf *m, struct tcphdr *th, stru
 }
 
 void
-tcp_handle_wakeup(struct tcpcb *tp, struct socket *so)
-{
-	/*
-	 * Since tp might be gone if the session entered
-	 * the TIME_WAIT state before coming here, we need
-	 * to check if the socket is still connected.
-	 */
-	if ((so->so_state & SS_ISCONNECTED) == 0)
-		return;
-	INP_LOCK_ASSERT(tp->t_inpcb);
-	if (tp->t_flags & TF_WAKESOR) {
-		tp->t_flags &= ~TF_WAKESOR;
-		SOCKBUF_UNLOCK_ASSERT(&so->so_rcv);
-		sorwakeup(so);
-	}
-	if (tp->t_flags & TF_WAKESOW) {
-		tp->t_flags &= ~TF_WAKESOW;
-		SOCKBUF_UNLOCK_ASSERT(&so->so_snd);
-		sowwakeup(so);
-	}
-}
-
-void
 tcp_do_segment(struct mbuf *m, struct tcphdr *th, struct socket *so,
     struct tcpcb *tp, int drop_hdrlen, int tlen, uint8_t iptos)
 {
@@ -1863,7 +1840,7 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th, stru
 				else if (!tcp_timer_active(tp, TT_PERSIST))
 					tcp_timer_activate(tp, TT_REXMT,
 						      tp->t_rxtcur);
-				tp->t_flags |= TF_WAKESOW;
+				sowwakeup(so);
 				if (sbavail(&so->so_snd))
 					(void) tp->t_fb->tfb_tcp_output(tp);
 				goto check_delack;
@@ -1928,8 +1905,8 @@ tcp_do_segment(struct mbuf *m, struct tcphdr *th, stru
 				m_adj(m, drop_hdrlen);	/* delayed header drop */
 				sbappendstream_locked(&so->so_rcv, m, 0);
 			}
-			SOCKBUF_UNLOCK(&so->so_rcv);
-			tp->t_flags |= TF_WAKESOR;
+			/* NB: sorwakeup_locked() does an implicit unlock. */
+			sorwakeup_locked(so);
 			if (DELAY_ACK(tp, tlen)) {
 				tp->t_flags |= TF_DELACK;
 			} else {
@@ -2925,8 +2902,8 @@ process_ACK:
 				tp->snd_wnd = 0;
 			ourfinisacked = 0;
 		}
-		SOCKBUF_UNLOCK(&so->so_snd);
-		tp->t_flags |= TF_WAKESOW;
+		/* NB: sowwakeup_locked() does an implicit unlock. */
+		sowwakeup_locked(so);
 		m_freem(mfree);
 		/* Detect una wraparound. */
 		if (!IN_RECOVERY(tp->t_flags) &&
@@ -3147,8 +3124,8 @@ dodata:							/* XXX */
 				m_freem(m);
 			else
 				sbappendstream_locked(&so->so_rcv, m, 0);
-			SOCKBUF_UNLOCK(&so->so_rcv);
-			tp->t_flags |= TF_WAKESOR;
+			/* NB: sorwakeup_locked() does an implicit unlock. */
+			sorwakeup_locked(so);
 		} else {
 			/*
 			 * XXX: Due to the header drop above "th" is
@@ -3215,8 +3192,6 @@ dodata:							/* XXX */
 	if (thflags & TH_FIN) {
 		if (TCPS_HAVERCVDFIN(tp->t_state) == 0) {
 			socantrcvmore(so);
-			/* The socket upcall is handled by socantrcvmore. */
-			tp->t_flags &= ~TF_WAKESOR;
 			/*
 			 * If connection is half-synchronized
 			 * (ie NEEDSYN flag on) then delay ACK,
@@ -3280,7 +3255,6 @@ check_delack:
 		tp->t_flags &= ~TF_DELACK;
 		tcp_timer_activate(tp, TT_DELACK, tcp_delacktime);
 	}
-	tcp_handle_wakeup(tp, so);
 	INP_WUNLOCK(tp->t_inpcb);
 	return;
 
@@ -3314,7 +3288,6 @@ dropafterack:
 	TCP_PROBE3(debug__input, tp, th, m);
 	tp->t_flags |= TF_ACKNOW;
 	(void) tp->t_fb->tfb_tcp_output(tp);
-	tcp_handle_wakeup(tp, so);
 	INP_WUNLOCK(tp->t_inpcb);
 	m_freem(m);
 	return;
@@ -3322,7 +3295,6 @@ dropafterack:
 dropwithreset:
 	if (tp != NULL) {
 		tcp_dropwithreset(m, th, tp, tlen, rstreason);
-		tcp_handle_wakeup(tp, so);
 		INP_WUNLOCK(tp->t_inpcb);
 	} else
 		tcp_dropwithreset(m, th, NULL, tlen, rstreason);
@@ -3338,10 +3310,8 @@ drop:
 			  &tcp_savetcp, 0);
 #endif
 	TCP_PROBE3(debug__input, tp, th, m);
-	if (tp != NULL) {
-		tcp_handle_wakeup(tp, so);
+	if (tp != NULL)
 		INP_WUNLOCK(tp->t_inpcb);
-	}
 	m_freem(m);
 }
 
--- sys/netinet/tcp_reass.c.orig	2021-04-09 08:18:10.599092000 -0700
+++ sys/netinet/tcp_reass.c	2021-04-09 08:19:54.912378000 -0700
@@ -959,8 +959,7 @@ new_entry:
 		} else {
 			sbappendstream_locked(&so->so_rcv, m, 0);
 		}
-		SOCKBUF_UNLOCK(&so->so_rcv);
-		tp->t_flags |= TF_WAKESOR;
+		sorwakeup_locked(so);
 		return (flags);
 	}
 	if (tcp_new_limits) {
@@ -1108,7 +1107,6 @@ present:
 #ifdef TCP_REASS_LOGGING
 	tcp_reass_log_dump(tp);
 #endif
-	SOCKBUF_UNLOCK(&so->so_rcv);
-	tp->t_flags |= TF_WAKESOR;
+	sorwakeup_locked(so);
 	return (flags);
 }
