From owner-svn-src-head@FreeBSD.ORG Sat Jun 4 14:30:48 2011 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A1101106564A; Sat, 4 Jun 2011 14:30:48 +0000 (UTC) (envelope-from kristof@sigsegv.be) Received: from mercury.codepro.be (mercury.codepro.be [IPv6:2001:4b98:dc0:51:216:3eff:feb7:3147]) by mx1.freebsd.org (Postfix) with ESMTP id A385A8FC15; Sat, 4 Jun 2011 14:30:47 +0000 (UTC) Received: from triton.sigsegv.be (triton.neptune.sigsegv.be [IPv6:2001:470:c8f4:0:200:ff:fe00:7]) by mercury.codepro.be (Postfix) with ESMTP id 68409DA8; Sat, 4 Jun 2011 16:30:45 +0200 (CEST) Received: from nereid (nereid.neptune.sigsegv.be [IPv6:2001:470:c8f4:0:200:ff:fe00:8]) by triton.sigsegv.be (Postfix) with SMTP id EFC9E1C1C7; Sat, 4 Jun 2011 16:30:43 +0200 (CEST) Received: by nereid (sSMTP sendmail emulation); Sat, 04 Jun 2011 16:30:43 +0200 Date: Sat, 4 Jun 2011 16:30:43 +0200 From: Kristof Provost To: Robert Watson Message-ID: <20110604143043.GE17764@nereid> References: <201105300943.p4U9htjI070096@svn.freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201105300943.p4U9htjI070096@svn.freebsd.org> X-PGP-Fingerprint: E114 D9EA 909E D469 8F57 17A5 7D15 91C6 9EFA F286 User-Agent: Mutt/1.5.20 (2009-06-14) Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r222488 - in head/sys: contrib/pf/net netinet netinet/ipfw netinet6 X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Jun 2011 14:30:48 -0000 Hi, I'm seeing a panic when I start natd, which I suspect is related to this commit: panic: Lock pcbinfohash not exclusively locked @ /usr/src/sys/netinet/in_pcb.c:323 Backtrace: panic() ... _rw_assert() at _rw_assert+0x18d in_pcbbind() at in_pcbbind+0xf5 div_bind() at div_bind+0xb9 sobind() at sobind()+0x79 kern_bind() at kern_bind+0xde bind() at bind()+0x41 syscallenter() at syscallenter+0x1cb syscall() at syscall+0x4c Xfast_syscall() at Xfast_syscall+0xdd div_bind probably also needs to surround the call to in_pcbbind with INP_HASHW(UN)LOCK(...) I'm currently running 222680. I've only now seen the issue, but I've also just now activated INVARIANTS. Regards, Kristof On 2011-05-30 09:43:55 (+0000), Robert Watson wrote: > Author: rwatson > Date: Mon May 30 09:43:55 2011 > New Revision: 222488 > URL: http://svn.freebsd.org/changeset/base/222488 > > Log: > Decompose the current single inpcbinfo lock into two locks: > > - The existing ipi_lock continues to protect the global inpcb list and > inpcb counter. This lock is now relegated to a small number of > allocation and free operations, and occasional operations that walk > all connections (including, awkwardly, certain UDP multicast receive > operations -- something to revisit). > > - A new ipi_hash_lock protects the two inpcbinfo hash tables for > looking up connections and bound sockets, manipulated using new > INP_HASH_*() macros. This lock, combined with inpcb locks, protects > the 4-tuple address space. > > Unlike the current ipi_lock, ipi_hash_lock follows the individual inpcb > connection locks, so may be acquired while manipulating a connection on > which a lock is already held, avoiding the need to acquire the inpcbinfo > lock preemptively when a binding change might later be required. As a > result, however, lookup operations necessarily go through a reference > acquire while holding the lookup lock, later acquiring an inpcb lock -- > if required. > > A new function in_pcblookup() looks up connections, and accepts flags > indicating how to return the inpcb. Due to lock order changes, callers > no longer need acquire locks before performing a lookup: the lookup > routine will acquire the ipi_hash_lock as needed. In the future, it will > also be able to use alternative lookup and locking strategies > transparently to callers, such as pcbgroup lookup. New lookup flags are, > supplementing the existing INPLOOKUP_WILDCARD flag: > > INPLOOKUP_RLOCKPCB - Acquire a read lock on the returned inpcb > INPLOOKUP_WLOCKPCB - Acquire a write lock on the returned inpcb > > Callers must pass exactly one of these flags (for the time being). > > Some notes: > > - All protocols are updated to work within the new regime; especially, > TCP, UDPv4, and UDPv6. pcbinfo ipi_lock acquisitions are largely > eliminated, and global hash lock hold times are dramatically reduced > compared to previous locking. > - The TCP syncache still relies on the pcbinfo lock, something that we > may want to revisit. > - Support for reverting to the FreeBSD 7.x locking strategy in TCP input > is no longer available -- hash lookup locks are now held only very > briefly during inpcb lookup, rather than for potentially extended > periods. However, the pcbinfo ipi_lock will still be acquired if a > connection state might change such that a connection is added or > removed. > - Raw IP sockets continue to use the pcbinfo ipi_lock for protection, > due to maintaining their own hash tables. > - The interface in6_pcblookup_hash_locked() is maintained, which allows > callers to acquire hash locks and perform one or more lookups atomically > with 4-tuple allocation: this is required only for TCPv6, as there is no > in6_pcbconnect_setup(), which there should be. > - UDPv6 locking remains significantly more conservative than UDPv4 > locking, which relates to source address selection. This needs > attention, as it likely significantly reduces parallelism in this code > for multithreaded socket use (such as in BIND). > - In the UDPv4 and UDPv6 multicast cases, we need to revisit locking > somewhat, as they relied on ipi_lock to stablise 4-tuple matches, which > is no longer sufficient. A second check once the inpcb lock is held > should do the trick, keeping the general case from requiring the inpcb > lock for every inpcb visited. > - This work reminds us that we need to revisit locking of the v4/v6 flags, > which may be accessed lock-free both before and after this change. > - Right now, a single lock name is used for the pcbhash lock -- this is > undesirable, and probably another argument is required to take care of > this (or a char array name field in the pcbinfo?). > > This is not an MFC candidate for 8.x due to its impact on lookup and > locking semantics. It's possible some of these issues could be worked > around with compatibility wrappers, if necessary. > > Reviewed by: bz > Sponsored by: Juniper Networks, Inc. > > Modified: > head/sys/contrib/pf/net/pf.c > head/sys/netinet/in_pcb.c > head/sys/netinet/in_pcb.h > head/sys/netinet/ip_divert.c > head/sys/netinet/ipfw/ip_fw2.c > head/sys/netinet/raw_ip.c > head/sys/netinet/siftr.c > head/sys/netinet/tcp_input.c > head/sys/netinet/tcp_subr.c > head/sys/netinet/tcp_syncache.c > head/sys/netinet/tcp_timer.c > head/sys/netinet/tcp_usrreq.c > head/sys/netinet/udp_usrreq.c > head/sys/netinet6/in6_pcb.c > head/sys/netinet6/in6_pcb.h > head/sys/netinet6/in6_src.c > head/sys/netinet6/udp6_usrreq.c > > Modified: head/sys/contrib/pf/net/pf.c > ============================================================================== > --- head/sys/contrib/pf/net/pf.c Mon May 30 09:41:38 2011 (r222487) > +++ head/sys/contrib/pf/net/pf.c Mon May 30 09:43:55 2011 (r222488) > @@ -3034,16 +3034,14 @@ pf_socket_lookup(int direction, struct p > #ifdef INET > case AF_INET: > #ifdef __FreeBSD__ > - INP_INFO_RLOCK(pi); /* XXX LOR */ > - inp = in_pcblookup_hash(pi, saddr->v4, sport, daddr->v4, > - dport, 0, NULL); > + inp = in_pcblookup(pi, saddr->v4, sport, daddr->v4, > + dport, INPLOOKUP_RLOCKPCB, NULL); > if (inp == NULL) { > - inp = in_pcblookup_hash(pi, saddr->v4, sport, > - daddr->v4, dport, INPLOOKUP_WILDCARD, NULL); > - if(inp == NULL) { > - INP_INFO_RUNLOCK(pi); > + inp = in_pcblookup(pi, saddr->v4, sport, > + daddr->v4, dport, INPLOOKUP_WILDCARD | > + INPLOOKUP_RLOCKPCB, NULL); > + if (inp == NULL) > return (-1); > - } > } > #else > inp = in_pcbhashlookup(tb, saddr->v4, sport, daddr->v4, dport); > @@ -3058,16 +3056,14 @@ pf_socket_lookup(int direction, struct p > #ifdef INET6 > case AF_INET6: > #ifdef __FreeBSD__ > - INP_INFO_RLOCK(pi); > - inp = in6_pcblookup_hash(pi, &saddr->v6, sport, > - &daddr->v6, dport, 0, NULL); > + inp = in6_pcblookup(pi, &saddr->v6, sport, > + &daddr->v6, dport, INPLOOKUP_RLOCKPCB, NULL); > if (inp == NULL) { > - inp = in6_pcblookup_hash(pi, &saddr->v6, sport, > - &daddr->v6, dport, INPLOOKUP_WILDCARD, NULL); > - if (inp == NULL) { > - INP_INFO_RUNLOCK(pi); > + inp = in6_pcblookup(pi, &saddr->v6, sport, > + &daddr->v6, dport, INPLOOKUP_WILDCARD | > + INPLOOKUP_RLOCKPCB, NULL); > + if (inp == NULL) > return (-1); > - } > } > #else > inp = in6_pcbhashlookup(tb, &saddr->v6, sport, &daddr->v6, > @@ -3085,9 +3081,10 @@ pf_socket_lookup(int direction, struct p > return (-1); > } > #ifdef __FreeBSD__ > + INP_RLOCK_ASSERT(inp); > pd->lookup.uid = inp->inp_cred->cr_uid; > pd->lookup.gid = inp->inp_cred->cr_groups[0]; > - INP_INFO_RUNLOCK(pi); > + INP_RUNLOCK(inp); > #else > pd->lookup.uid = inp->inp_socket->so_euid; > pd->lookup.gid = inp->inp_socket->so_egid; > > Modified: head/sys/netinet/in_pcb.c > ============================================================================== > --- head/sys/netinet/in_pcb.c Mon May 30 09:41:38 2011 (r222487) > +++ head/sys/netinet/in_pcb.c Mon May 30 09:43:55 2011 (r222488) > @@ -127,6 +127,10 @@ static VNET_DEFINE(int, ipport_tcplastco > > #define V_ipport_tcplastcount VNET(ipport_tcplastcount) > > +static struct inpcb *in_pcblookup_hash_locked(struct inpcbinfo *pcbinfo, > + struct in_addr faddr, u_int fport_arg, > + struct in_addr laddr, u_int lport_arg, > + int lookupflags, struct ifnet *ifp); > static void in_pcbremlists(struct inpcb *inp); > > #ifdef INET > @@ -212,11 +216,13 @@ in_pcbinfo_init(struct inpcbinfo *pcbinf > { > > INP_INFO_LOCK_INIT(pcbinfo, name); > + INP_HASH_LOCK_INIT(pcbinfo, "pcbinfohash"); /* XXXRW: argument? */ > #ifdef VIMAGE > pcbinfo->ipi_vnet = curvnet; > #endif > pcbinfo->ipi_listhead = listhead; > LIST_INIT(pcbinfo->ipi_listhead); > + pcbinfo->ipi_count = 0; > pcbinfo->ipi_hashbase = hashinit(hash_nelements, M_PCB, > &pcbinfo->ipi_hashmask); > pcbinfo->ipi_porthashbase = hashinit(porthash_nelements, M_PCB, > @@ -234,10 +240,14 @@ void > in_pcbinfo_destroy(struct inpcbinfo *pcbinfo) > { > > + KASSERT(pcbinfo->ipi_count == 0, > + ("%s: ipi_count = %u", __func__, pcbinfo->ipi_count)); > + > hashdestroy(pcbinfo->ipi_hashbase, M_PCB, pcbinfo->ipi_hashmask); > hashdestroy(pcbinfo->ipi_porthashbase, M_PCB, > pcbinfo->ipi_porthashmask); > uma_zdestroy(pcbinfo->ipi_zone); > + INP_HASH_LOCK_DESTROY(pcbinfo); > INP_INFO_LOCK_DESTROY(pcbinfo); > } > > @@ -309,8 +319,8 @@ in_pcbbind(struct inpcb *inp, struct soc > { > int anonport, error; > > - INP_INFO_WLOCK_ASSERT(inp->inp_pcbinfo); > INP_WLOCK_ASSERT(inp); > + INP_HASH_WLOCK_ASSERT(inp->inp_pcbinfo); > > if (inp->inp_lport != 0 || inp->inp_laddr.s_addr != INADDR_ANY) > return (EINVAL); > @@ -351,8 +361,8 @@ in_pcb_lport(struct inpcb *inp, struct i > * Because no actual state changes occur here, a global write lock on > * the pcbinfo isn't required. > */ > - INP_INFO_LOCK_ASSERT(pcbinfo); > INP_LOCK_ASSERT(inp); > + INP_HASH_LOCK_ASSERT(pcbinfo); > > if (inp->inp_flags & INP_HIGHPORT) { > first = V_ipport_hifirstauto; /* sysctl */ > @@ -473,11 +483,10 @@ in_pcbbind_setup(struct inpcb *inp, stru > int error; > > /* > - * Because no actual state changes occur here, a global write lock on > - * the pcbinfo isn't required. > + * No state changes, so read locks are sufficient here. > */ > - INP_INFO_LOCK_ASSERT(pcbinfo); > INP_LOCK_ASSERT(inp); > + INP_HASH_LOCK_ASSERT(pcbinfo); > > if (TAILQ_EMPTY(&V_in_ifaddrhead)) /* XXX broken! */ > return (EADDRNOTAVAIL); > @@ -618,8 +627,8 @@ in_pcbconnect(struct inpcb *inp, struct > in_addr_t laddr, faddr; > int anonport, error; > > - INP_INFO_WLOCK_ASSERT(inp->inp_pcbinfo); > INP_WLOCK_ASSERT(inp); > + INP_HASH_WLOCK_ASSERT(inp->inp_pcbinfo); > > lport = inp->inp_lport; > laddr = inp->inp_laddr.s_addr; > @@ -907,8 +916,8 @@ in_pcbconnect_setup(struct inpcb *inp, s > * Because a global state change doesn't actually occur here, a read > * lock is sufficient. > */ > - INP_INFO_LOCK_ASSERT(inp->inp_pcbinfo); > INP_LOCK_ASSERT(inp); > + INP_HASH_LOCK_ASSERT(inp->inp_pcbinfo); > > if (oinpp != NULL) > *oinpp = NULL; > @@ -983,8 +992,8 @@ in_pcbconnect_setup(struct inpcb *inp, s > if (error) > return (error); > } > - oinp = in_pcblookup_hash(inp->inp_pcbinfo, faddr, fport, laddr, lport, > - 0, NULL); > + oinp = in_pcblookup_hash_locked(inp->inp_pcbinfo, faddr, fport, > + laddr, lport, 0, NULL); > if (oinp != NULL) { > if (oinpp != NULL) > *oinpp = oinp; > @@ -1007,8 +1016,8 @@ void > in_pcbdisconnect(struct inpcb *inp) > { > > - INP_INFO_WLOCK_ASSERT(inp->inp_pcbinfo); > INP_WLOCK_ASSERT(inp); > + INP_HASH_WLOCK_ASSERT(inp->inp_pcbinfo); > > inp->inp_faddr.s_addr = INADDR_ANY; > inp->inp_fport = 0; > @@ -1187,19 +1196,24 @@ void > in_pcbdrop(struct inpcb *inp) > { > > - INP_INFO_WLOCK_ASSERT(inp->inp_pcbinfo); > INP_WLOCK_ASSERT(inp); > > + /* > + * XXXRW: Possibly we should protect the setting of INP_DROPPED with > + * the hash lock...? > + */ > inp->inp_flags |= INP_DROPPED; > if (inp->inp_flags & INP_INHASHLIST) { > struct inpcbport *phd = inp->inp_phd; > > + INP_HASH_WLOCK(inp->inp_pcbinfo); > LIST_REMOVE(inp, inp_hash); > LIST_REMOVE(inp, inp_portlist); > if (LIST_FIRST(&phd->phd_pcblist) == NULL) { > LIST_REMOVE(phd, phd_hash); > free(phd, M_PCB); > } > + INP_HASH_WUNLOCK(inp->inp_pcbinfo); > inp->inp_flags &= ~INP_INHASHLIST; > } > } > @@ -1328,7 +1342,8 @@ in_pcbpurgeif0(struct inpcbinfo *pcbinfo > } > > /* > - * Lookup a PCB based on the local address and port. > + * Lookup a PCB based on the local address and port. Caller must hold the > + * hash lock. No inpcb locks or references are acquired. > */ > #define INP_LOOKUP_MAPPED_PCB_COST 3 > struct inpcb * > @@ -1346,7 +1361,7 @@ in_pcblookup_local(struct inpcbinfo *pcb > KASSERT((lookupflags & ~(INPLOOKUP_WILDCARD)) == 0, > ("%s: invalid lookup flags %d", __func__, lookupflags)); > > - INP_INFO_LOCK_ASSERT(pcbinfo); > + INP_HASH_LOCK_ASSERT(pcbinfo); > > if ((lookupflags & INPLOOKUP_WILDCARD) == 0) { > struct inpcbhead *head; > @@ -1450,10 +1465,12 @@ in_pcblookup_local(struct inpcbinfo *pcb > #undef INP_LOOKUP_MAPPED_PCB_COST > > /* > - * Lookup PCB in hash list. > + * Lookup PCB in hash list, using pcbinfo tables. This variation assumes > + * that the caller has locked the hash list, and will not perform any further > + * locking or reference operations on either the hash list or the connection. > */ > -struct inpcb * > -in_pcblookup_hash(struct inpcbinfo *pcbinfo, struct in_addr faddr, > +static struct inpcb * > +in_pcblookup_hash_locked(struct inpcbinfo *pcbinfo, struct in_addr faddr, > u_int fport_arg, struct in_addr laddr, u_int lport_arg, int lookupflags, > struct ifnet *ifp) > { > @@ -1464,7 +1481,7 @@ in_pcblookup_hash(struct inpcbinfo *pcbi > KASSERT((lookupflags & ~(INPLOOKUP_WILDCARD)) == 0, > ("%s: invalid lookup flags %d", __func__, lookupflags)); > > - INP_INFO_LOCK_ASSERT(pcbinfo); > + INP_HASH_LOCK_ASSERT(pcbinfo); > > /* > * First look for an exact match. > @@ -1574,6 +1591,56 @@ in_pcblookup_hash(struct inpcbinfo *pcbi > > return (NULL); > } > + > +/* > + * Lookup PCB in hash list, using pcbinfo tables. This variation locks the > + * hash list lock, and will return the inpcb locked (i.e., requires > + * INPLOOKUP_LOCKPCB). > + */ > +static struct inpcb * > +in_pcblookup_hash(struct inpcbinfo *pcbinfo, struct in_addr faddr, > + u_int fport, struct in_addr laddr, u_int lport, int lookupflags, > + struct ifnet *ifp) > +{ > + struct inpcb *inp; > + > + INP_HASH_RLOCK(pcbinfo); > + inp = in_pcblookup_hash_locked(pcbinfo, faddr, fport, laddr, lport, > + (lookupflags & ~(INPLOOKUP_RLOCKPCB | INPLOOKUP_WLOCKPCB)), ifp); > + if (inp != NULL) { > + in_pcbref(inp); > + INP_HASH_RUNLOCK(pcbinfo); > + if (lookupflags & INPLOOKUP_WLOCKPCB) { > + INP_WLOCK(inp); > + if (in_pcbrele_wlocked(inp)) > + return (NULL); > + } else if (lookupflags & INPLOOKUP_RLOCKPCB) { > + INP_RLOCK(inp); > + if (in_pcbrele_rlocked(inp)) > + return (NULL); > + } else > + panic("%s: locking bug", __func__); > + } else > + INP_HASH_RUNLOCK(pcbinfo); > + return (inp); > +} > + > +/* > + * Public inpcb lookup routines, accepting a 4-tuple. > + */ > +struct inpcb * > +in_pcblookup(struct inpcbinfo *pcbinfo, struct in_addr faddr, u_int fport, > + struct in_addr laddr, u_int lport, int lookupflags, struct ifnet *ifp) > +{ > + > + KASSERT((lookupflags & ~INPLOOKUP_MASK) == 0, > + ("%s: invalid lookup flags %d", __func__, lookupflags)); > + KASSERT((lookupflags & (INPLOOKUP_RLOCKPCB | INPLOOKUP_WLOCKPCB)) != 0, > + ("%s: LOCKPCB not set", __func__)); > + > + return (in_pcblookup_hash(pcbinfo, faddr, fport, laddr, lport, > + lookupflags, ifp)); > +} > #endif /* INET */ > > /* > @@ -1588,8 +1655,9 @@ in_pcbinshash(struct inpcb *inp) > struct inpcbport *phd; > u_int32_t hashkey_faddr; > > - INP_INFO_WLOCK_ASSERT(pcbinfo); > INP_WLOCK_ASSERT(inp); > + INP_HASH_WLOCK_ASSERT(pcbinfo); > + > KASSERT((inp->inp_flags & INP_INHASHLIST) == 0, > ("in_pcbinshash: INP_INHASHLIST")); > > @@ -1645,8 +1713,9 @@ in_pcbrehash(struct inpcb *inp) > struct inpcbhead *head; > u_int32_t hashkey_faddr; > > - INP_INFO_WLOCK_ASSERT(pcbinfo); > INP_WLOCK_ASSERT(inp); > + INP_HASH_WLOCK_ASSERT(pcbinfo); > + > KASSERT(inp->inp_flags & INP_INHASHLIST, > ("in_pcbrehash: !INP_INHASHLIST")); > > @@ -1679,12 +1748,14 @@ in_pcbremlists(struct inpcb *inp) > if (inp->inp_flags & INP_INHASHLIST) { > struct inpcbport *phd = inp->inp_phd; > > + INP_HASH_WLOCK(pcbinfo); > LIST_REMOVE(inp, inp_hash); > LIST_REMOVE(inp, inp_portlist); > if (LIST_FIRST(&phd->phd_pcblist) == NULL) { > LIST_REMOVE(phd, phd_hash); > free(phd, M_PCB); > } > + INP_HASH_WUNLOCK(pcbinfo); > inp->inp_flags &= ~INP_INHASHLIST; > } > LIST_REMOVE(inp, inp_list); > > Modified: head/sys/netinet/in_pcb.h > ============================================================================== > --- head/sys/netinet/in_pcb.h Mon May 30 09:41:38 2011 (r222487) > +++ head/sys/netinet/in_pcb.h Mon May 30 09:43:55 2011 (r222488) > @@ -268,22 +268,22 @@ struct inpcbport { > * Global data structure for each high-level protocol (UDP, TCP, ...) in both > * IPv4 and IPv6. Holds inpcb lists and information for managing them. > * > - * Each pcbinfo is protected by ipi_lock, covering mutable global fields (such > - * as the global pcb list) and hashed lookup tables. The lock order is: > + * Each pcbinfo is protected by two locks: ipi_lock and ipi_hash_lock, > + * the former covering mutable global fields (such as the global pcb list), > + * and the latter covering the hashed lookup tables. The lock order is: > * > - * ipi_lock (before) inpcb locks > + * ipi_lock (before) inpcb locks (before) ipi_hash_lock > * > * Locking key: > * > * (c) Constant or nearly constant after initialisation > * (g) Locked by ipi_lock > - * (h) Read using either ipi_lock or inpcb lock; write requires both. > + * (h) Read using either ipi_hash_lock or inpcb lock; write requires both. > * (x) Synchronisation properties poorly defined > */ > struct inpcbinfo { > /* > - * Global lock protecting global inpcb list, inpcb count, hash tables, > - * etc. > + * Global lock protecting global inpcb list, inpcb count, etc. > */ > struct rwlock ipi_lock; > > @@ -312,17 +312,22 @@ struct inpcbinfo { > struct uma_zone *ipi_zone; /* (c) */ > > /* > + * Global lock protecting hash lookup tables. > + */ > + struct rwlock ipi_hash_lock; > + > + /* > * Global hash of inpcbs, hashed by local and foreign addresses and > * port numbers. > */ > - struct inpcbhead *ipi_hashbase; /* (g) */ > - u_long ipi_hashmask; /* (g) */ > + struct inpcbhead *ipi_hashbase; /* (h) */ > + u_long ipi_hashmask; /* (h) */ > > /* > * Global hash of inpcbs, hashed by only local port number. > */ > - struct inpcbporthead *ipi_porthashbase; /* (g) */ > - u_long ipi_porthashmask; /* (g) */ > + struct inpcbporthead *ipi_porthashbase; /* (h) */ > + u_long ipi_porthashmask; /* (h) */ > > /* > * Pointer to network stack instance > @@ -406,6 +411,18 @@ void inp_4tuple_get(struct inpcb *inp, > #define INP_INFO_WLOCK_ASSERT(ipi) rw_assert(&(ipi)->ipi_lock, RA_WLOCKED) > #define INP_INFO_UNLOCK_ASSERT(ipi) rw_assert(&(ipi)->ipi_lock, RA_UNLOCKED) > > +#define INP_HASH_LOCK_INIT(ipi, d) \ > + rw_init_flags(&(ipi)->ipi_hash_lock, (d), 0) > +#define INP_HASH_LOCK_DESTROY(ipi) rw_destroy(&(ipi)->ipi_hash_lock) > +#define INP_HASH_RLOCK(ipi) rw_rlock(&(ipi)->ipi_hash_lock) > +#define INP_HASH_WLOCK(ipi) rw_wlock(&(ipi)->ipi_hash_lock) > +#define INP_HASH_RUNLOCK(ipi) rw_runlock(&(ipi)->ipi_hash_lock) > +#define INP_HASH_WUNLOCK(ipi) rw_wunlock(&(ipi)->ipi_hash_lock) > +#define INP_HASH_LOCK_ASSERT(ipi) rw_assert(&(ipi)->ipi_hash_lock, \ > + RA_LOCKED) > +#define INP_HASH_WLOCK_ASSERT(ipi) rw_assert(&(ipi)->ipi_hash_lock, \ > + RA_WLOCKED) > + > #define INP_PCBHASH(faddr, lport, fport, mask) \ > (((faddr) ^ ((faddr) >> 16) ^ ntohs((lport) ^ (fport))) & (mask)) > #define INP_PCBPORTHASH(lport, mask) \ > @@ -466,7 +483,16 @@ void inp_4tuple_get(struct inpcb *inp, > #define INP_LLE_VALID 0x00000001 /* cached lle is valid */ > #define INP_RT_VALID 0x00000002 /* cached rtentry is valid */ > > -#define INPLOOKUP_WILDCARD 1 > +/* > + * Flags passed to in_pcblookup*() functions. > + */ > +#define INPLOOKUP_WILDCARD 0x00000001 /* Allow wildcard sockets. */ > +#define INPLOOKUP_RLOCKPCB 0x00000002 /* Return inpcb read-locked. */ > +#define INPLOOKUP_WLOCKPCB 0x00000004 /* Return inpcb write-locked. */ > + > +#define INPLOOKUP_MASK (INPLOOKUP_WILDCARD | INPLOOKUP_RLOCKPCB | \ > + INPLOOKUP_WLOCKPCB) > + > #define sotoinpcb(so) ((struct inpcb *)(so)->so_pcb) > #define sotoin6pcb(so) sotoinpcb(so) /* for KAME src sync over BSD*'s */ > > @@ -527,7 +553,7 @@ struct inpcb * > in_pcblookup_local(struct inpcbinfo *, > struct in_addr, u_short, int, struct ucred *); > struct inpcb * > - in_pcblookup_hash(struct inpcbinfo *, struct in_addr, u_int, > + in_pcblookup(struct inpcbinfo *, struct in_addr, u_int, > struct in_addr, u_int, int, struct ifnet *); > void in_pcbnotifyall(struct inpcbinfo *pcbinfo, struct in_addr, > int, struct inpcb *(*)(struct inpcb *, int)); > > Modified: head/sys/netinet/ip_divert.c > ============================================================================== > --- head/sys/netinet/ip_divert.c Mon May 30 09:41:38 2011 (r222487) > +++ head/sys/netinet/ip_divert.c Mon May 30 09:43:55 2011 (r222488) > @@ -659,9 +659,9 @@ div_pcblist(SYSCTL_HANDLER_ARGS) > INP_INFO_WLOCK(&V_divcbinfo); > for (i = 0; i < n; i++) { > inp = inp_list[i]; > - INP_WLOCK(inp); > - if (!in_pcbrele(inp)) > - INP_WUNLOCK(inp); > + INP_RLOCK(inp); > + if (!in_pcbrele_rlocked(inp)) > + INP_RUNLOCK(inp); > } > INP_INFO_WUNLOCK(&V_divcbinfo); > > > Modified: head/sys/netinet/ipfw/ip_fw2.c > ============================================================================== > --- head/sys/netinet/ipfw/ip_fw2.c Mon May 30 09:41:38 2011 (r222487) > +++ head/sys/netinet/ipfw/ip_fw2.c Mon May 30 09:43:55 2011 (r222488) > @@ -657,7 +657,7 @@ check_uidgid(ipfw_insn_u32 *insn, int pr > (struct bsd_ucred *)uc, ugid_lookupp, ((struct mbuf *)inp)->m_skb); > #else /* FreeBSD */ > struct inpcbinfo *pi; > - int wildcard; > + int lookupflags; > struct inpcb *pcb; > int match; > > @@ -682,30 +682,31 @@ check_uidgid(ipfw_insn_u32 *insn, int pr > if (*ugid_lookupp == -1) > return (0); > if (proto == IPPROTO_TCP) { > - wildcard = 0; > + lookupflags = 0; > pi = &V_tcbinfo; > } else if (proto == IPPROTO_UDP) { > - wildcard = INPLOOKUP_WILDCARD; > + lookupflags = INPLOOKUP_WILDCARD; > pi = &V_udbinfo; > } else > return 0; > + lookupflags |= INPLOOKUP_RLOCKPCB; > match = 0; > if (*ugid_lookupp == 0) { > - INP_INFO_RLOCK(pi); > pcb = (oif) ? > - in_pcblookup_hash(pi, > + in_pcblookup(pi, > dst_ip, htons(dst_port), > src_ip, htons(src_port), > - wildcard, oif) : > - in_pcblookup_hash(pi, > + lookupflags, oif) : > + in_pcblookup(pi, > src_ip, htons(src_port), > dst_ip, htons(dst_port), > - wildcard, NULL); > + lookupflags, NULL); > if (pcb != NULL) { > + INP_RLOCK_ASSERT(pcb); > *uc = crhold(pcb->inp_cred); > *ugid_lookupp = 1; > + INP_RUNLOCK(pcb); > } > - INP_INFO_RUNLOCK(pi); > if (*ugid_lookupp == 0) { > /* > * We tried and failed, set the variable to -1 > @@ -1827,21 +1828,32 @@ do { \ > else > break; > > + /* > + * XXXRW: so_user_cookie should almost > + * certainly be inp_user_cookie? > + */ > + > /* For incomming packet, lookup up the > inpcb using the src/dest ip/port tuple */ > if (inp == NULL) { > - INP_INFO_RLOCK(pi); > - inp = in_pcblookup_hash(pi, > + inp = in_pcblookup(pi, > src_ip, htons(src_port), > dst_ip, htons(dst_port), > - 0, NULL); > - INP_INFO_RUNLOCK(pi); > - } > - > - if (inp && inp->inp_socket) { > - tablearg = inp->inp_socket->so_user_cookie; > - if (tablearg) > - match = 1; > + INPLOOKUP_RLOCKPCB, NULL); > + if (inp != NULL) { > + tablearg = > + inp->inp_socket->so_user_cookie; > + if (tablearg) > + match = 1; > + INP_RUNLOCK(inp); > + } > + } else { > + if (inp->inp_socket) { > + tablearg = > + inp->inp_socket->so_user_cookie; > + if (tablearg) > + match = 1; > + } > } > break; > } > > Modified: head/sys/netinet/raw_ip.c > ============================================================================== > --- head/sys/netinet/raw_ip.c Mon May 30 09:41:38 2011 (r222487) > +++ head/sys/netinet/raw_ip.c Mon May 30 09:43:55 2011 (r222488) > @@ -226,7 +226,7 @@ rip_append(struct inpcb *last, struct ip > { > int policyfail = 0; > > - INP_RLOCK_ASSERT(last); > + INP_LOCK_ASSERT(last); > > #ifdef IPSEC > /* check AH/ESP integrity. */ > @@ -834,16 +834,19 @@ rip_detach(struct socket *so) > static void > rip_dodisconnect(struct socket *so, struct inpcb *inp) > { > + struct inpcbinfo *pcbinfo; > > - INP_INFO_WLOCK_ASSERT(inp->inp_pcbinfo); > - INP_WLOCK_ASSERT(inp); > - > + pcbinfo = inp->inp_pcbinfo; > + INP_INFO_WLOCK(pcbinfo); > + INP_WLOCK(inp); > rip_delhash(inp); > inp->inp_faddr.s_addr = INADDR_ANY; > rip_inshash(inp); > SOCK_LOCK(so); > so->so_state &= ~SS_ISCONNECTED; > SOCK_UNLOCK(so); > + INP_WUNLOCK(inp); > + INP_INFO_WUNLOCK(pcbinfo); > } > > static void > @@ -854,11 +857,7 @@ rip_abort(struct socket *so) > inp = sotoinpcb(so); > KASSERT(inp != NULL, ("rip_abort: inp == NULL")); > > - INP_INFO_WLOCK(&V_ripcbinfo); > - INP_WLOCK(inp); > rip_dodisconnect(so, inp); > - INP_WUNLOCK(inp); > - INP_INFO_WUNLOCK(&V_ripcbinfo); > } > > static void > @@ -869,11 +868,7 @@ rip_close(struct socket *so) > inp = sotoinpcb(so); > KASSERT(inp != NULL, ("rip_close: inp == NULL")); > > - INP_INFO_WLOCK(&V_ripcbinfo); > - INP_WLOCK(inp); > rip_dodisconnect(so, inp); > - INP_WUNLOCK(inp); > - INP_INFO_WUNLOCK(&V_ripcbinfo); > } > > static int > @@ -887,11 +882,7 @@ rip_disconnect(struct socket *so) > inp = sotoinpcb(so); > KASSERT(inp != NULL, ("rip_disconnect: inp == NULL")); > > - INP_INFO_WLOCK(&V_ripcbinfo); > - INP_WLOCK(inp); > rip_dodisconnect(so, inp); > - INP_WUNLOCK(inp); > - INP_INFO_WUNLOCK(&V_ripcbinfo); > return (0); > } > > @@ -1077,9 +1068,9 @@ rip_pcblist(SYSCTL_HANDLER_ARGS) > INP_INFO_WLOCK(&V_ripcbinfo); > for (i = 0; i < n; i++) { > inp = inp_list[i]; > - INP_WLOCK(inp); > - if (!in_pcbrele(inp)) > - INP_WUNLOCK(inp); > + INP_RLOCK(inp); > + if (!in_pcbrele_rlocked(inp)) > + INP_RUNLOCK(inp); > } > INP_INFO_WUNLOCK(&V_ripcbinfo); > > > Modified: head/sys/netinet/siftr.c > ============================================================================== > --- head/sys/netinet/siftr.c Mon May 30 09:41:38 2011 (r222487) > +++ head/sys/netinet/siftr.c Mon May 30 09:43:55 2011 (r222488) > @@ -696,17 +696,16 @@ siftr_findinpcb(int ipver, struct ip *ip > > /* We need the tcbinfo lock. */ > INP_INFO_UNLOCK_ASSERT(&V_tcbinfo); > - INP_INFO_RLOCK(&V_tcbinfo); > > if (dir == PFIL_IN) > inp = (ipver == INP_IPV4 ? > - in_pcblookup_hash(&V_tcbinfo, ip->ip_src, sport, ip->ip_dst, > - dport, 0, m->m_pkthdr.rcvif) > + in_pcblookup(&V_tcbinfo, ip->ip_src, sport, ip->ip_dst, > + dport, INPLOOKUP_RLOCKPCB, m->m_pkthdr.rcvif) > : > #ifdef SIFTR_IPV6 > - in6_pcblookup_hash(&V_tcbinfo, > + in6_pcblookup(&V_tcbinfo, > &((struct ip6_hdr *)ip)->ip6_src, sport, > - &((struct ip6_hdr *)ip)->ip6_dst, dport, 0, > + &((struct ip6_hdr *)ip)->ip6_dst, dport, INPLOOKUP_RLOCKPCB, > m->m_pkthdr.rcvif) > #else > NULL > @@ -715,13 +714,13 @@ siftr_findinpcb(int ipver, struct ip *ip > > else > inp = (ipver == INP_IPV4 ? > - in_pcblookup_hash(&V_tcbinfo, ip->ip_dst, dport, ip->ip_src, > - sport, 0, m->m_pkthdr.rcvif) > + in_pcblookup(&V_tcbinfo, ip->ip_dst, dport, ip->ip_src, > + sport, INPLOOKUP_RLOCKPCB, m->m_pkthdr.rcvif) > : > #ifdef SIFTR_IPV6 > - in6_pcblookup_hash(&V_tcbinfo, > + in6_pcblookup(&V_tcbinfo, > &((struct ip6_hdr *)ip)->ip6_dst, dport, > - &((struct ip6_hdr *)ip)->ip6_src, sport, 0, > + &((struct ip6_hdr *)ip)->ip6_src, sport, INPLOOKUP_RLOCKPCB, > m->m_pkthdr.rcvif) > #else > NULL > @@ -734,12 +733,7 @@ siftr_findinpcb(int ipver, struct ip *ip > ss->nskip_in_inpcb++; > else > ss->nskip_out_inpcb++; > - } else { > - /* Acquire the inpcb lock. */ > - INP_UNLOCK_ASSERT(inp); > - INP_RLOCK(inp); > } > - INP_INFO_RUNLOCK(&V_tcbinfo); > > return (inp); > } > > Modified: head/sys/netinet/tcp_input.c > ============================================================================== > --- head/sys/netinet/tcp_input.c Mon May 30 09:41:38 2011 (r222487) > +++ head/sys/netinet/tcp_input.c Mon May 30 09:43:55 2011 (r222488) > @@ -5,6 +5,7 @@ > * Swinburne University of Technology, Melbourne, Australia. > * Copyright (c) 2009-2010 Lawrence Stewart > * Copyright (c) 2010 The FreeBSD Foundation > + * Copyright (c) 2010-2011 Juniper Networks, Inc. > * All rights reserved. > * > * Portions of this software were developed at the Centre for Advanced Internet > @@ -16,6 +17,9 @@ > * Internet Architectures, Swinburne University of Technology, Melbourne, > * Australia by David Hayes under sponsorship from the FreeBSD Foundation. > * > + * Portions of this software were developed by Robert N. M. Watson under > + * contract to Juniper Networks, Inc. > + * > * Redistribution and use in source and binary forms, with or without > * modification, are permitted provided that the following conditions > * are met: > @@ -197,10 +201,6 @@ SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, > &VNET_NAME(tcp_autorcvbuf_max), 0, > "Max size of automatic receive buffer"); > > -int tcp_read_locking = 1; > -SYSCTL_INT(_net_inet_tcp, OID_AUTO, read_locking, CTLFLAG_RW, > - &tcp_read_locking, 0, "Enable read locking strategy"); > - > VNET_DEFINE(struct inpcbhead, tcb); > #define tcb6 tcb /* for KAME src sync over BSD*'s */ > VNET_DEFINE(struct inpcbinfo, tcbinfo); > @@ -591,8 +591,7 @@ tcp_input(struct mbuf *m, int off0) > char *s = NULL; /* address and port logging */ > int ti_locked; > #define TI_UNLOCKED 1 > -#define TI_RLOCKED 2 > -#define TI_WLOCKED 3 > +#define TI_WLOCKED 2 > > #ifdef TCPDEBUG > /* > @@ -756,30 +755,25 @@ tcp_input(struct mbuf *m, int off0) > drop_hdrlen = off0 + off; > > /* > - * Locate pcb for segment, which requires a lock on tcbinfo. > - * Optimisticaly acquire a global read lock rather than a write lock > - * unless header flags necessarily imply a state change. There are > - * two cases where we might discover later we need a write lock > - * despite the flags: ACKs moving a connection out of the syncache, > - * and ACKs for a connection in TIMEWAIT. > + * Locate pcb for segment; if we're likely to add or remove a > + * connection then first acquire pcbinfo lock. There are two cases > + * where we might discover later we need a write lock despite the > + * flags: ACKs moving a connection out of the syncache, and ACKs for > + * a connection in TIMEWAIT. > */ > - if ((thflags & (TH_SYN | TH_FIN | TH_RST)) != 0 || > - tcp_read_locking == 0) { > + if ((thflags & (TH_SYN | TH_FIN | TH_RST)) != 0) { > INP_INFO_WLOCK(&V_tcbinfo); > ti_locked = TI_WLOCKED; > - } else { > - INP_INFO_RLOCK(&V_tcbinfo); > - ti_locked = TI_RLOCKED; > - } > + } else > + ti_locked = TI_UNLOCKED; > > findpcb: > #ifdef INVARIANTS > - if (ti_locked == TI_RLOCKED) > - INP_INFO_RLOCK_ASSERT(&V_tcbinfo); > - else if (ti_locked == TI_WLOCKED) > + if (ti_locked == TI_WLOCKED) { > INP_INFO_WLOCK_ASSERT(&V_tcbinfo); > - else > - panic("%s: findpcb ti_locked %d\n", __func__, ti_locked); > + } else { > + INP_INFO_UNLOCK_ASSERT(&V_tcbinfo); > + } > #endif > > #ifdef INET > @@ -797,20 +791,18 @@ findpcb: > * Transparently forwarded. Pretend to be the destination. > * already got one like this? > */ > - inp = in_pcblookup_hash(&V_tcbinfo, > - ip->ip_src, th->th_sport, > - ip->ip_dst, th->th_dport, > - 0, m->m_pkthdr.rcvif); > + inp = in_pcblookup(&V_tcbinfo, ip->ip_src, th->th_sport, > + ip->ip_dst, th->th_dport, INPLOOKUP_WLOCKPCB, > + m->m_pkthdr.rcvif); > if (!inp) { > - /* It's new. Try to find the ambushing socket. */ > - inp = in_pcblookup_hash(&V_tcbinfo, > - ip->ip_src, th->th_sport, > - next_hop->sin_addr, > - next_hop->sin_port ? > - ntohs(next_hop->sin_port) : > - th->th_dport, > - INPLOOKUP_WILDCARD, > - m->m_pkthdr.rcvif); > + /* > + * It's new. Try to find the ambushing socket. > + */ > + inp = in_pcblookup(&V_tcbinfo, ip->ip_src, > + th->th_sport, next_hop->sin_addr, > + next_hop->sin_port ? ntohs(next_hop->sin_port) : > + th->th_dport, INPLOOKUP_WILDCARD | > + INPLOOKUP_WLOCKPCB, m->m_pkthdr.rcvif); > } > /* Remove the tag from the packet. We don't need it anymore. */ > m_tag_delete(m, fwd_tag); > @@ -820,21 +812,19 @@ findpcb: > { > #ifdef INET6 > if (isipv6) > - inp = in6_pcblookup_hash(&V_tcbinfo, > - &ip6->ip6_src, th->th_sport, > - &ip6->ip6_dst, th->th_dport, > - INPLOOKUP_WILDCARD, > - m->m_pkthdr.rcvif); > + inp = in6_pcblookup(&V_tcbinfo, &ip6->ip6_src, > + th->th_sport, &ip6->ip6_dst, th->th_dport, > + INPLOOKUP_WILDCARD | INPLOOKUP_WLOCKPCB, > + m->m_pkthdr.rcvif); > #endif > #if defined(INET) && defined(INET6) > else > #endif > #ifdef INET > - inp = in_pcblookup_hash(&V_tcbinfo, > - ip->ip_src, th->th_sport, > - ip->ip_dst, th->th_dport, > - INPLOOKUP_WILDCARD, > - m->m_pkthdr.rcvif); > + inp = in_pcblookup(&V_tcbinfo, ip->ip_src, > + th->th_sport, ip->ip_dst, th->th_dport, > + INPLOOKUP_WILDCARD | INPLOOKUP_WLOCKPCB, > + m->m_pkthdr.rcvif); > #endif > } > > @@ -865,7 +855,7 @@ findpcb: > rstreason = BANDLIM_RST_CLOSEDPORT; > goto dropwithreset; > } > - INP_WLOCK(inp); > + INP_WLOCK_ASSERT(inp); > if (!(inp->inp_flags & INP_HW_FLOWID) > && (m->m_flags & M_FLOWID) > && ((inp->inp_socket == NULL) > @@ -906,28 +896,26 @@ findpcb: > * legitimate new connection attempt the old INPCB gets removed and > * we can try again to find a listening socket. > * > - * At this point, due to earlier optimism, we may hold a read lock on > - * the inpcbinfo, rather than a write lock. If so, we need to > - * upgrade, or if that fails, acquire a reference on the inpcb, drop > - * all locks, acquire a global write lock, and then re-acquire the > - * inpcb lock. We may at that point discover that another thread has > - * tried to free the inpcb, in which case we need to loop back and > - * try to find a new inpcb to deliver to. > + * At this point, due to earlier optimism, we may hold only an inpcb > + * lock, and not the inpcbinfo write lock. If so, we need to try to > + * acquire it, or if that fails, acquire a reference on the inpcb, > + * drop all locks, acquire a global write lock, and then re-acquire > + * the inpcb lock. We may at that point discover that another thread > + * has tried to free the inpcb, in which case we need to loop back > + * and try to find a new inpcb to deliver to. > + * > + * XXXRW: It may be time to rethink timewait locking. > */ > relocked: > if (inp->inp_flags & INP_TIMEWAIT) { > - KASSERT(ti_locked == TI_RLOCKED || ti_locked == TI_WLOCKED, > - ("%s: INP_TIMEWAIT ti_locked %d", __func__, ti_locked)); > - > - if (ti_locked == TI_RLOCKED) { > - if (INP_INFO_TRY_UPGRADE(&V_tcbinfo) == 0) { > + if (ti_locked == TI_UNLOCKED) { > + if (INP_INFO_TRY_WLOCK(&V_tcbinfo) == 0) { > in_pcbref(inp); > INP_WUNLOCK(inp); > - INP_INFO_RUNLOCK(&V_tcbinfo); > INP_INFO_WLOCK(&V_tcbinfo); > ti_locked = TI_WLOCKED; > INP_WLOCK(inp); > - if (in_pcbrele(inp)) { > + if (in_pcbrele_wlocked(inp)) { > inp = NULL; > goto findpcb; > } > @@ -975,26 +963,24 @@ relocked: > > /* > * We've identified a valid inpcb, but it could be that we need an > - * inpcbinfo write lock and have only a read lock. In this case, > - * attempt to upgrade/relock using the same strategy as the TIMEWAIT > - * case above. If we relock, we have to jump back to 'relocked' as > - * the connection might now be in TIMEWAIT. > - */ > - if (tp->t_state != TCPS_ESTABLISHED || > - (thflags & (TH_SYN | TH_FIN | TH_RST)) != 0 || > - tcp_read_locking == 0) { > - KASSERT(ti_locked == TI_RLOCKED || ti_locked == TI_WLOCKED, > - ("%s: upgrade check ti_locked %d", __func__, ti_locked)); > - > - if (ti_locked == TI_RLOCKED) { > - if (INP_INFO_TRY_UPGRADE(&V_tcbinfo) == 0) { > + * inpcbinfo write lock but don't hold it. In this case, attempt to > + * acquire using the same strategy as the TIMEWAIT case above. If we > + * relock, we have to jump back to 'relocked' as the connection might > + * now be in TIMEWAIT. > + */ > +#ifdef INVARIANTS > + if ((thflags & (TH_SYN | TH_FIN | TH_RST)) != 0) > + INP_INFO_WLOCK_ASSERT(&V_tcbinfo); > +#endif > + if (tp->t_state != TCPS_ESTABLISHED) { > + if (ti_locked == TI_UNLOCKED) { > + if (INP_INFO_TRY_WLOCK(&V_tcbinfo) == 0) { > in_pcbref(inp); > INP_WUNLOCK(inp); > - INP_INFO_RUNLOCK(&V_tcbinfo); > INP_INFO_WLOCK(&V_tcbinfo); > ti_locked = TI_WLOCKED; > INP_WLOCK(inp); > - if (in_pcbrele(inp)) { > + if (in_pcbrele_wlocked(inp)) { > inp = NULL; > goto findpcb; > } > @@ -1027,13 +1013,16 @@ relocked: > /* > * When the socket is accepting connections (the INPCB is in LISTEN > * state) we look into the SYN cache if this is a new connection > - * attempt or the completion of a previous one. > + * attempt or the completion of a previous one. Because listen > + * sockets are never in TCPS_ESTABLISHED, the V_tcbinfo lock will be > + * held in this case. > */ > if (so->so_options & SO_ACCEPTCONN) { > struct in_conninfo inc; > > KASSERT(tp->t_state == TCPS_LISTEN, ("%s: so accepting but " > "tp not listening", __func__)); > + INP_INFO_WLOCK_ASSERT(&V_tcbinfo); > > bzero(&inc, sizeof(inc)); > #ifdef INET6 > @@ -1371,13 +1360,17 @@ relocked: > return; > > dropwithreset: > - if (ti_locked == TI_RLOCKED) > - INP_INFO_RUNLOCK(&V_tcbinfo); > - else if (ti_locked == TI_WLOCKED) > + if (ti_locked == TI_WLOCKED) { > INP_INFO_WUNLOCK(&V_tcbinfo); > - else > - panic("%s: dropwithreset ti_locked %d", __func__, ti_locked); > - ti_locked = TI_UNLOCKED; > + ti_locked = TI_UNLOCKED; > + } > +#ifdef INVARIANTS > + else { > + KASSERT(ti_locked == TI_UNLOCKED, ("%s: dropwithreset " > + "ti_locked: %d", __func__, ti_locked)); > + INP_INFO_UNLOCK_ASSERT(&V_tcbinfo); > + } > +#endif > > if (inp != NULL) { > tcp_dropwithreset(m, th, tp, tlen, rstreason); > @@ -1388,13 +1381,17 @@ dropwithreset: > goto drop; > > dropunlock: > - if (ti_locked == TI_RLOCKED) > - INP_INFO_RUNLOCK(&V_tcbinfo); > - else if (ti_locked == TI_WLOCKED) > + if (ti_locked == TI_WLOCKED) { > INP_INFO_WUNLOCK(&V_tcbinfo); > - else > - panic("%s: dropunlock ti_locked %d", __func__, ti_locked); > - ti_locked = TI_UNLOCKED; > + ti_locked = TI_UNLOCKED; > + } > +#ifdef INVARIANTS > + else { > + KASSERT(ti_locked == TI_UNLOCKED, ("%s: dropunlock " > + "ti_locked: %d", __func__, ti_locked)); > + INP_INFO_UNLOCK_ASSERT(&V_tcbinfo); > + } > +#endif > > if (inp != NULL) > INP_WUNLOCK(inp); > @@ -1449,13 +1446,13 @@ tcp_do_segment(struct mbuf *m, struct tc > INP_INFO_WLOCK_ASSERT(&V_tcbinfo); > } else { > #ifdef INVARIANTS > - if (ti_locked == TI_RLOCKED) > - INP_INFO_RLOCK_ASSERT(&V_tcbinfo); > > *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** > _______________________________________________ > svn-src-head@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/svn-src-head > To unsubscribe, send any mail to "svn-src-head-unsubscribe@freebsd.org" >