From owner-svn-src-all@freebsd.org Tue Apr 23 10:12:35 2019 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2621F159540E; Tue, 23 Apr 2019 10:12:35 +0000 (UTC) (envelope-from bz@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C08F66A9B2; Tue, 23 Apr 2019 10:12:34 +0000 (UTC) (envelope-from bz@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 7A94C230A9; Tue, 23 Apr 2019 10:12:34 +0000 (UTC) (envelope-from bz@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id x3NACYhn075437; Tue, 23 Apr 2019 10:12:34 GMT (envelope-from bz@FreeBSD.org) Received: (from bz@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id x3NACYWI075436; Tue, 23 Apr 2019 10:12:34 GMT (envelope-from bz@FreeBSD.org) Message-Id: <201904231012.x3NACYWI075436@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: bz set sender to bz@FreeBSD.org using -f From: "Bjoern A. Zeeb" Date: Tue, 23 Apr 2019 10:12:34 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r346595 - head/sys/netinet X-SVN-Group: head X-SVN-Commit-Author: bz X-SVN-Commit-Paths: head/sys/netinet X-SVN-Commit-Revision: 346595 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Rspamd-Queue-Id: C08F66A9B2 X-Spamd-Bar: -- Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [-2.96 / 15.00]; local_wl_from(0.00)[FreeBSD.org]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; NEURAL_HAM_SHORT(-0.96)[-0.960,0]; ASN(0.00)[asn:11403, ipnet:2610:1c1:1::/48, country:US]; NEURAL_HAM_LONG(-1.00)[-1.000,0] X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Apr 2019 10:12:35 -0000 Author: bz Date: Tue Apr 23 10:12:33 2019 New Revision: 346595 URL: https://svnweb.freebsd.org/changeset/base/346595 Log: iFix udp_output() lock inconsistency. In r297225 the initial INP_RLOCK() was replaced by an early acquisition of an r- or w-lock depending on input variables possibly extending the write locked area for reasons not entirely clear but possibly to avoid a later case of unlock and relock leading to a possible race condition and possibly in order to allow the route cache to work for connected sockets. Unfortunately the conditions were not 1:1 replicated (probably because of the route cache needs). While this would not be a problem the legacy IP code compared to IPv6 has an extra case when dealing with IP_SENDSRCADDR. In a particular case we were holding an exclusive inp lock and acquired the shared udbinfo lock (now epoch). When then running into an error case, the locking assertions on release fired as the udpinfo and inp lock levels did not match. Break up the special case and in that particular case acquire and udpinfo lock depending on the exclusitivity of the inp lock. MFC After: 9 days Reported-by: syzbot+1f5c6800e4f99bdb1a48@syzkaller.appspotmail.com Reviewed by: tuexen Differential Revision: https://reviews.freebsd.org/D19594 Modified: head/sys/netinet/udp_usrreq.c Modified: head/sys/netinet/udp_usrreq.c ============================================================================== --- head/sys/netinet/udp_usrreq.c Tue Apr 23 07:46:38 2019 (r346594) +++ head/sys/netinet/udp_usrreq.c Tue Apr 23 10:12:33 2019 (r346595) @@ -1258,20 +1258,22 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct s } /* - * Depending on whether or not the application has bound or connected - * the socket, we may have to do varying levels of work. The optimal - * case is for a connected UDP socket, as a global lock isn't - * required at all. - * - * In order to decide which we need, we require stability of the - * inpcb binding, which we ensure by acquiring a read lock on the - * inpcb. This doesn't strictly follow the lock order, so we play - * the trylock and retry game; note that we may end up with more - * conservative locks than required the second time around, so later - * assertions have to accept that. Further analysis of the number of - * misses under contention is required. - * - * XXXRW: Check that hash locking update here is correct. + * In the old days, depending on whether or not the application had + * bound or connected the socket, we had to do varying levels of work. + * The optimal case was for a connected UDP socket, as a global lock + * wasn't required at all. + * In order to decide which we need, we required stability of the + * inpcb binding, which we ensured by acquiring a read lock on the + * inpcb. This didn't strictly follow the lock order, so we played + * the trylock and retry game. + * With the re-introduction of the route-cache in some cases, we started + * to acquire an early inp wlock and a possible race during re-lock + * went away. With the introduction of epoch(9) some read locking + * became epoch(9) and the lock-order issues also went away. + * Due to route-cache we may now hold more conservative locks than + * otherwise required and have split up the 2nd case in case 2 and 3 + * in order to keep the udpinfo lock level in sync with the inp one + * for the IP_SENDSRCADDR case below. */ pr = inp->inp_socket->so_proto->pr_protocol; pcbinfo = udp_get_inpcbinfo(pr); @@ -1279,14 +1281,21 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct s (inp->inp_laddr.s_addr == INADDR_ANY && inp->inp_lport == 0)) { INP_HASH_WLOCK(pcbinfo); unlock_udbinfo = UH_WLOCKED; - } else if ((sin != NULL && - (sin->sin_addr.s_addr == INADDR_ANY || - sin->sin_addr.s_addr == INADDR_BROADCAST || - inp->inp_laddr.s_addr == INADDR_ANY || - inp->inp_lport == 0)) || - src.sin_family == AF_INET) { + } else if (sin != NULL && + (sin->sin_addr.s_addr == INADDR_ANY || + sin->sin_addr.s_addr == INADDR_BROADCAST || + inp->inp_laddr.s_addr == INADDR_ANY || + inp->inp_lport == 0)) { INP_HASH_RLOCK_ET(pcbinfo, et); unlock_udbinfo = UH_RLOCKED; + } else if (src.sin_family == AF_INET) { + if (unlock_inp == UH_WLOCKED) { + INP_HASH_WLOCK(pcbinfo); + unlock_udbinfo = UH_WLOCKED; + } else { + INP_HASH_RLOCK_ET(pcbinfo, et); + unlock_udbinfo = UH_RLOCKED; + } } else unlock_udbinfo = UH_UNLOCKED;