Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 03 Sep 2019 14:08:18 -0000
From:      "Bjoern A. Zeeb" <bz@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   svn commit: r346595 - head/sys/netinet
Message-ID:  <201904231012.x3NACYWI075436@repo.freebsd.org>

next in thread | raw e-mail | index | archive | help
Author: bz
Date: Tue Apr 23 10:12:33 2019
New Revision: 346595
URL: https://svnweb.freebsd.org/changeset/base/346595

Log:
  iFix udp_output() lock inconsistency.
  
  In r297225 the initial INP_RLOCK() was replaced by an early
  acquisition of an r- or w-lock depending on input variables
  possibly extending the write locked area for reasons not entirely
  clear but possibly to avoid a later case of unlock and relock
  leading to a possible race condition and possibly in order to
  allow the route cache to work for connected sockets.
  
  Unfortunately the conditions were not 1:1 replicated (probably
  because of the route cache needs). While this would not be a
  problem the legacy IP code compared to IPv6 has an extra case
  when dealing with IP_SENDSRCADDR. In a particular case we were
  holding an exclusive inp lock and acquired the shared udbinfo
  lock (now epoch).
  When then running into an error case, the locking assertions
  on release fired as the udpinfo and inp lock levels did not match.
  
  Break up the special case and in that particular case acquire
  and udpinfo lock depending on the exclusitivity of the inp lock.
  
  MFC After:	9 days
  Reported-by:	syzbot+1f5c6800e4f99bdb1a48@syzkaller.appspotmail.com
  Reviewed by:	tuexen
  Differential Revision:	https://reviews.freebsd.org/D19594

Modified:
  head/sys/netinet/udp_usrreq.c

Modified: head/sys/netinet/udp_usrreq.c
==============================================================================
--- head/sys/netinet/udp_usrreq.c	Tue Apr 23 07:46:38 2019	(r346594)
+++ head/sys/netinet/udp_usrreq.c	Tue Apr 23 10:12:33 2019	(r346595)
@@ -1258,20 +1258,22 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct s
 	}
 
 	/*
-	 * Depending on whether or not the application has bound or connected
-	 * the socket, we may have to do varying levels of work.  The optimal
-	 * case is for a connected UDP socket, as a global lock isn't
-	 * required at all.
-	 *
-	 * In order to decide which we need, we require stability of the
-	 * inpcb binding, which we ensure by acquiring a read lock on the
-	 * inpcb.  This doesn't strictly follow the lock order, so we play
-	 * the trylock and retry game; note that we may end up with more
-	 * conservative locks than required the second time around, so later
-	 * assertions have to accept that.  Further analysis of the number of
-	 * misses under contention is required.
-	 *
-	 * XXXRW: Check that hash locking update here is correct.
+	 * In the old days, depending on whether or not the application had
+	 * bound or connected the socket, we had to do varying levels of work.
+	 * The optimal case was for a connected UDP socket, as a global lock
+	 * wasn't required at all.
+	 * In order to decide which we need, we required stability of the
+	 * inpcb binding, which we ensured by acquiring a read lock on the
+	 * inpcb.  This didn't strictly follow the lock order, so we played
+	 * the trylock and retry game.
+	 * With the re-introduction of the route-cache in some cases, we started
+	 * to acquire an early inp wlock and a possible race during re-lock
+	 * went away.  With the introduction of epoch(9) some read locking
+	 * became epoch(9) and the lock-order issues also went away.
+	 * Due to route-cache we may now hold more conservative locks than
+	 * otherwise required and have split up the 2nd case in case 2 and 3
+	 * in order to keep the udpinfo lock level in sync with the inp one
+	 * for the IP_SENDSRCADDR case below.
 	 */
 	pr = inp->inp_socket->so_proto->pr_protocol;
 	pcbinfo = udp_get_inpcbinfo(pr);
@@ -1279,14 +1281,21 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct s
 	    (inp->inp_laddr.s_addr == INADDR_ANY && inp->inp_lport == 0)) {
 		INP_HASH_WLOCK(pcbinfo);
 		unlock_udbinfo = UH_WLOCKED;
-	} else if ((sin != NULL &&
-		(sin->sin_addr.s_addr == INADDR_ANY ||
-		sin->sin_addr.s_addr == INADDR_BROADCAST ||
-		inp->inp_laddr.s_addr == INADDR_ANY ||
-		inp->inp_lport == 0)) ||
-	    src.sin_family == AF_INET) {
+	} else if (sin != NULL &&
+	    (sin->sin_addr.s_addr == INADDR_ANY ||
+	    sin->sin_addr.s_addr == INADDR_BROADCAST ||
+	    inp->inp_laddr.s_addr == INADDR_ANY ||
+	    inp->inp_lport == 0)) {
 		INP_HASH_RLOCK_ET(pcbinfo, et);
 		unlock_udbinfo = UH_RLOCKED;
+	} else if (src.sin_family == AF_INET) {
+		if (unlock_inp == UH_WLOCKED) {
+			INP_HASH_WLOCK(pcbinfo);
+			unlock_udbinfo = UH_WLOCKED;
+		} else {
+			INP_HASH_RLOCK_ET(pcbinfo, et);
+			unlock_udbinfo = UH_RLOCKED;
+		}
 	} else
 		unlock_udbinfo = UH_UNLOCKED;
 





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201904231012.x3NACYWI075436>