From: Mateusz Guzik
Date: Sun, 31 Dec 2017 05:06:36 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org
Subject: svn commit: r327413 - in stable/11/sys: kern sys
Message-Id: <201712310506.vBV56aBN037695@repo.freebsd.org>

Author: mjg
Date: Sun Dec 31 05:06:35 2017
New Revision: 327413
URL:
https://svnweb.freebsd.org/changeset/base/327413

Log:
  MFC r320561,r323236,r324041,r324314,r324609,r324613,r324778,r324780,r324787,
  r324803,r324836,r325469,r325706,r325917,r325918,r325919,r325920,r325921,
  r325922,r325925,r325963,r326106,r326107,r326110,r326111,r326112,r326194,
  r326195,r326196,r326197,r326198,r326199,r326200,r326237:

  rwlock: perform the typically false td_rw_rlocks check later

  Check if the lock is available first instead.

  =============

  Sprinkle __read_frequently in a few obvious places.

  Note that some of the annotated variables should probably change their
  types to something smaller, preferably bit-sized.

  =============

  mtx: drop the tid argument from _mtx_lock_sleep

  tid must be equal to curthread and the target routine was already reading
  it anyway, which is not a problem. Not passing it as a parameter allows for
  slightly shorter code in callers.

  =============

  locks: partially tidy up waiting on readers

  Spin first instead of instantly re-reading, and don't re-read after
  spinning is finished - the state is already known.

  Note the code is subject to significant changes later.

  =============

  locks: take the number of readers into account when waiting

  Previous code would always spin once before checking the lock. But a lock
  with e.g. 6 readers is not going to become free in the duration of one spin
  even if they start draining immediately.

  Conservatively perform one for each reader.

  Note that the total number of allowed spins is still extremely small and is
  subject to change later.

  =============

  mtx: change MTX_UNOWNED from 4 to 0

  The value is spread all over the kernel and zeroing a register is
  cheaper/shorter than setting it up to an arbitrary value.

  Reduces amd64 GENERIC-NODEBUG .text size by 0.4%.

  =============

  mtx: fix up owner_mtx after r324609

  Now that MTX_UNOWNED is 0 the test was always false.
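As a rough illustration of the owner_mtx fix-up, the sketch below models a lock word in user space. With the unowned value at 0, the owner is whatever remains after masking off the flag bits, and the lock counts as owned only when that remainder is non-NULL, which mirrors the `return (*owner != NULL);` line in the kern_mutex.c diff. DEMO_UNOWNED, DEMO_FLAGMASK, struct demo_thread and demo_owner() are hypothetical stand-ins, not the kernel's definitions, and the flag width is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's lock word encoding. */
#define DEMO_UNOWNED	((uintptr_t)0)		/* no owner, no flags */
#define DEMO_FLAGMASK	((uintptr_t)0x7)	/* low bits carry flags (assumed) */

/* Aligned so the low three bits of its address are always zero. */
struct demo_thread {
	_Alignas(8) int id;
};

/*
 * Extract the owner from a lock word and report whether the lock is held.
 * The owner pointer is the word with the flag bits masked off.
 */
static bool
demo_owner(uintptr_t word, struct demo_thread **owner)
{
	assert(owner != NULL);
	*owner = (struct demo_thread *)(word & ~DEMO_FLAGMASK);
	return (*owner != NULL);
}
```

In this sketch a word that carries only flag bits is reported as unowned, because the test looks at the masked owner rather than at the raw word.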
  =============

  mtx: clean up locking spin mutexes

  1) shorten the fast path by pushing the lockstat probe to the slow path
  2) test for kernel panic only after it turns out we will have to spin,
  in particular test only after we know we are not recursing

  =============

  mtx: stop testing SCHEDULER_STOPPED in kabi funcs for spin mutexes

  There is nothing panic-breaking to do in the unlock case and the lock
  case will fall back to the slow path doing the check already.

  =============

  rwlock: reduce lockstat branches in the slowpath

  =============

  mtx: fix up UP build after r324778

  =============

  mtx: implement thread lock fastpath

  =============

  rwlock: fix up compilation without KDTRACE_HOOKS after r324787

  =============

  rwlock: use fcmpset for setting RW_LOCK_WRITE_SPINNER

  =============

  sx: avoid branches in the slow path if lockstat is disabled

  =============

  rwlock: avoid branches in the slow path if lockstat is disabled

  =============

  locks: pull up PMC_SOFT_CALLs out of slow path loops

  =============

  mtx: unlock before traversing threads to wake up

  This shortens the lock hold time while not affecting correctness.
  All the woken-up threads end up competing and can lose the race against
  a completely unrelated thread getting the lock anyway.

  =============

  rwlock: unlock before traversing threads to wake up

  While here perform a minor cleanup of the unlock path.

  =============

  sx: perform a minor cleanup of the unlock slowpath

  No functional changes.

  =============

  mtx: add missing parts of the diff in r325920

  Fixes build breakage.

  =============

  locks: fix compilation issues without SMP or KDTRACE_HOOKS

  =============

  locks: remove the file + line argument from internal primitives when not used

  The pair is of use only in debug or LOCKPROF kernels, but was passed
  (zeroed) for many locks even in production kernels.

  While here whack the tid argument from wlock hard and xlock hard.

  There is no kbi change of any sort - "external" primitives still accept
  the pair.
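The file + line removal can be pictured with a pair of macros that expand to the extra parameters only in debug builds. The sketch below is a user-space approximation: DEMO_DEBUG, DEMO_FILE_LINE_ARG_DEF, DEMO_FILE_LINE_ARG and the demo_* functions are hypothetical names modelled on how LOCK_FILE_LINE_ARG_DEF and LOCK_FILE_LINE_ARG are used in the diff. The real definitions are not visible in this mail, so nothing here should be read as the kernel's actual header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical switch; the diff uses "#if LOCK_DEBUG > 0" for its prototypes. */
#ifndef DEMO_DEBUG
#define DEMO_DEBUG 0
#endif

#if DEMO_DEBUG > 0
#define DEMO_FILE_LINE_ARG_DEF	, const char *file, int line
#define DEMO_FILE_LINE_ARG	, file, line
#else
#define DEMO_FILE_LINE_ARG_DEF
#define DEMO_FILE_LINE_ARG
#endif

static int demo_slow_calls;

/* Internal primitive: carries file/line only when debugging is enabled. */
static void
demo_lock_hard(int *lock DEMO_FILE_LINE_ARG_DEF)
{
#if DEMO_DEBUG > 0
	fprintf(stderr, "slow path entered at %s:%d\n", file, line);
#endif
	demo_slow_calls++;
	*lock = 1;
}

/* "External" primitive: always accepts the pair, so callers do not change. */
void
demo_lock(int *lock, const char *file, int line)
{
	assert(lock != NULL);
	(void)file;	/* unused when DEMO_DEBUG is 0 */
	(void)line;
	demo_lock_hard(lock DEMO_FILE_LINE_ARG);
}
```

A caller would write `demo_lock(&l, __FILE__, __LINE__);` in either configuration; only the internal call site grows or shrinks.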
  =============

  locks: pass the found lock value to unlock slow path

  This avoids an explicit read later.

  While here whack the cheaply obtainable 'tid' argument.

  =============

  rwlock: don't check for curthread's read lock count in the fast path

  =============

  rwlock: unbreak WITNESS builds after r326110

  =============

  sx: unbreak debug after r326107

  An assertion was modified to use the found value, but it was not updated to
  handle a race where blocked threads appear after the entrance to the func.

  Move the assertion down to the area protected with sleepq lock where the
  lock is read anyway. This does not affect coverage of the assertion and
  is consistent with what rw locks are doing.

  =============

  rwlock: stop re-reading the owner when going to sleep

  =============

  locks: retry turnstile/sleepq loops on failed cmpset

  In order to go to sleep threads set waiter flags, but that can spuriously
  fail e.g. when a new reader arrives. Instead of unlocking everything and
  looping back, re-evaluate the new state while still holding the lock
  necessary to go to sleep.

  =============

  sx: change sunlock to wake waiters up if it locked sleepq

  sleepq is only locked if curthread is the last reader. By the time
  the lock gets acquired new ones could have arrived. The previous code
  would unlock and loop back. This results in spurious relocking of sleepq.

  This is a step towards an xadd-based unlock routine.

  =============

  rwlock: add __rw_try_{r,w}lock_int

  =============

  rwlock: fix up compilation of the previous change

  committed the wrong version of the patch

  =============

  Convert in-kernel thread_lock_flags calls to thread_lock when debug is
  disabled

  The flags argument is not used in this case.

  =============

  Add the missing lockstat check for thread lock.

  =============

  rw: fix runlock_hard when new readers show up

  When waiters/writer spinner flags are set no new readers can show up unless
  they already have a different rw lock read locked.
  The change in r326195 failed to take that into account - in the presence
  of new readers it would spin until they all drain, which would lead to
  trouble if e.g. they go off cpu and cannot get scheduled because of this
  thread.

Modified:
  stable/11/sys/kern/kern_mutex.c
  stable/11/sys/kern/kern_rwlock.c
  stable/11/sys/kern/kern_sx.c
  stable/11/sys/sys/lock.h
  stable/11/sys/sys/mutex.h
  stable/11/sys/sys/rwlock.h
  stable/11/sys/sys/sx.h
Directory Properties:
  stable/11/ (props changed)

Modified: stable/11/sys/kern/kern_mutex.c
==============================================================================
--- stable/11/sys/kern/kern_mutex.c Sun Dec 31 04:09:40 2017 (r327412)
+++ stable/11/sys/kern/kern_mutex.c Sun Dec 31 05:06:35 2017 (r327413)
@@ -217,7 +217,7 @@ owner_mtx(const struct lock_object *lock, struct threa m = (const struct mtx *)lock; x = m->mtx_lock; *owner = (struct thread *)(x & ~MTX_FLAGMASK); - return (x != MTX_UNOWNED); + return (*owner != NULL); } #endif @@ -248,7 +248,7 @@ __mtx_lock_flags(volatile uintptr_t *c, int opts, cons tid = (uintptr_t)curthread; v = MTX_UNOWNED; if (!_mtx_obtain_lock_fetch(m, &v, tid)) - _mtx_lock_sleep(m, v, tid, opts, file, line); + _mtx_lock_sleep(m, v, opts, file, line); else LOCKSTAT_PROFILE_OBTAIN_LOCK_SUCCESS(adaptive__acquire, m, 0, 0, file, line); @@ -277,7 +277,7 @@ __mtx_unlock_flags(volatile uintptr_t *c, int opts, co mtx_assert(m, MA_OWNED); #ifdef LOCK_PROFILING - __mtx_unlock_sleep(c, opts, file, line); + __mtx_unlock_sleep(c, (uintptr_t)curthread, opts, file, line); #else __mtx_unlock(m, curthread, opts, file, line); #endif @@ -289,10 +289,10 @@ __mtx_lock_spin_flags(volatile uintptr_t *c, int opts, int line) { struct mtx *m; +#ifdef SMP + uintptr_t tid, v; +#endif - if (SCHEDULER_STOPPED()) - return; - m = mtxlock2mtx(c); KASSERT(m->mtx_lock != MTX_DESTROYED, @@ -308,7 +308,18 @@ __mtx_lock_spin_flags(volatile uintptr_t *c, int opts, opts &= ~MTX_RECURSE; WITNESS_CHECKORDER(&m->lock_object, opts | LOP_NEWORDER | 
LOP_EXCLUSIVE, file, line, NULL); +#ifdef SMP + spinlock_enter(); + tid = (uintptr_t)curthread; + v = MTX_UNOWNED; + if (!_mtx_obtain_lock_fetch(m, &v, tid)) + _mtx_lock_spin(m, v, opts, file, line); + else + LOCKSTAT_PROFILE_OBTAIN_LOCK_SUCCESS(spin__acquire, + m, 0, 0, file, line); +#else __mtx_lock_spin(m, curthread, opts, file, line); +#endif LOCK_LOG_LOCK("LOCK", &m->lock_object, opts, m->mtx_recurse, file, line); WITNESS_LOCK(&m->lock_object, opts | LOP_EXCLUSIVE, file, line); @@ -348,9 +359,6 @@ __mtx_unlock_spin_flags(volatile uintptr_t *c, int opt { struct mtx *m; - if (SCHEDULER_STOPPED()) - return; - m = mtxlock2mtx(c); KASSERT(m->mtx_lock != MTX_DESTROYED, @@ -372,9 +380,8 @@ __mtx_unlock_spin_flags(volatile uintptr_t *c, int opt * is already owned, it will recursively acquire the lock. */ int -_mtx_trylock_flags_(volatile uintptr_t *c, int opts, const char *file, int line) +_mtx_trylock_flags_int(struct mtx *m, int opts LOCK_FILE_LINE_ARG_DEF) { - struct mtx *m; struct thread *td; uintptr_t tid, v; #ifdef LOCK_PROFILING @@ -389,8 +396,6 @@ _mtx_trylock_flags_(volatile uintptr_t *c, int opts, c if (SCHEDULER_STOPPED_TD(td)) return (1); - m = mtxlock2mtx(c); - KASSERT(kdb_active != 0 || !TD_IS_IDLETHREAD(td), ("mtx_trylock() by idle thread %p on sleep mutex %s @ %s:%d", curthread, m->lock_object.lo_name, file, line)); @@ -435,6 +440,15 @@ _mtx_trylock_flags_(volatile uintptr_t *c, int opts, c return (rval); } +int +_mtx_trylock_flags_(volatile uintptr_t *c, int opts, const char *file, int line) +{ + struct mtx *m; + + m = mtxlock2mtx(c); + return (_mtx_trylock_flags_int(m, opts LOCK_FILE_LINE_ARG)); +} + /* * __mtx_lock_sleep: the tougher part of acquiring an MTX_DEF lock. 
* @@ -443,18 +457,18 @@ _mtx_trylock_flags_(volatile uintptr_t *c, int opts, c */ #if LOCK_DEBUG > 0 void -__mtx_lock_sleep(volatile uintptr_t *c, uintptr_t v, uintptr_t tid, int opts, - const char *file, int line) +__mtx_lock_sleep(volatile uintptr_t *c, uintptr_t v, int opts, const char *file, + int line) #else void -__mtx_lock_sleep(volatile uintptr_t *c, uintptr_t v, uintptr_t tid) +__mtx_lock_sleep(volatile uintptr_t *c, uintptr_t v) #endif { + struct thread *td; struct mtx *m; struct turnstile *ts; -#ifdef ADAPTIVE_MUTEXES - volatile struct thread *owner; -#endif + uintptr_t tid; + struct thread *owner; #ifdef KTR int cont_logged = 0; #endif @@ -473,8 +487,9 @@ __mtx_lock_sleep(volatile uintptr_t *c, uintptr_t v, u #if defined(KDTRACE_HOOKS) || defined(LOCK_PROFILING) int doing_lockprof; #endif - - if (SCHEDULER_STOPPED()) + td = curthread; + tid = (uintptr_t)td; + if (SCHEDULER_STOPPED_TD(td)) return; #if defined(ADAPTIVE_MUTEXES) @@ -486,7 +501,7 @@ __mtx_lock_sleep(volatile uintptr_t *c, uintptr_t v, u if (__predict_false(v == MTX_UNOWNED)) v = MTX_READ_VALUE(m); - if (__predict_false(lv_mtx_owner(v) == (struct thread *)tid)) { + if (__predict_false(lv_mtx_owner(v) == td)) { KASSERT((m->lock_object.lo_flags & LO_RECURSABLE) != 0 || (opts & MTX_RECURSE) != 0, ("_mtx_lock_sleep: recursed on non-recursive mutex %s @ %s:%d\n", @@ -618,7 +633,11 @@ __mtx_lock_sleep(volatile uintptr_t *c, uintptr_t v, u #ifdef KDTRACE_HOOKS sleep_time -= lockstat_nsecs(&m->lock_object); #endif - turnstile_wait(ts, mtx_owner(m), TS_EXCLUSIVE_QUEUE); +#ifndef ADAPTIVE_MUTEXES + owner = mtx_owner(m); +#endif + MPASS(owner == mtx_owner(m)); + turnstile_wait(ts, owner, TS_EXCLUSIVE_QUEUE); #ifdef KDTRACE_HOOKS sleep_time += lockstat_nsecs(&m->lock_object); sleep_cnt++; @@ -679,12 +698,18 @@ _mtx_lock_spin_failed(struct mtx *m) * This is only called if we need to actually spin for the lock. Recursion * is handled inline. 
*/ +#if LOCK_DEBUG > 0 void -_mtx_lock_spin_cookie(volatile uintptr_t *c, uintptr_t v, uintptr_t tid, - int opts, const char *file, int line) +_mtx_lock_spin_cookie(volatile uintptr_t *c, uintptr_t v, int opts, + const char *file, int line) +#else +void +_mtx_lock_spin_cookie(volatile uintptr_t *c, uintptr_t v) +#endif { struct mtx *m; struct lock_delay_arg lda; + uintptr_t tid; #ifdef LOCK_PROFILING int contested = 0; uint64_t waittime = 0; @@ -696,10 +721,7 @@ _mtx_lock_spin_cookie(volatile uintptr_t *c, uintptr_t int doing_lockprof; #endif - if (SCHEDULER_STOPPED()) - return; - - lock_delay_arg_init(&lda, &mtx_spin_delay); + tid = (uintptr_t)curthread; m = mtxlock2mtx(c); if (__predict_false(v == MTX_UNOWNED)) @@ -710,6 +732,11 @@ _mtx_lock_spin_cookie(volatile uintptr_t *c, uintptr_t return; } + if (SCHEDULER_STOPPED()) + return; + + lock_delay_arg_init(&lda, &mtx_spin_delay); + if (LOCK_LOG_TEST(&m->lock_object, opts)) CTR1(KTR_LOCK, "_mtx_lock_spin: %p spinning", m); KTR_STATE1(KTR_SCHED, "thread", sched_tdname((struct thread *)tid), @@ -772,7 +799,74 @@ _mtx_lock_spin_cookie(volatile uintptr_t *c, uintptr_t } #endif /* SMP */ +#ifdef INVARIANTS +static void +thread_lock_validate(struct mtx *m, int opts, const char *file, int line) +{ + + KASSERT(m->mtx_lock != MTX_DESTROYED, + ("thread_lock() of destroyed mutex @ %s:%d", file, line)); + KASSERT(LOCK_CLASS(&m->lock_object) == &lock_class_mtx_spin, + ("thread_lock() of sleep mutex %s @ %s:%d", + m->lock_object.lo_name, file, line)); + if (mtx_owned(m)) + KASSERT((m->lock_object.lo_flags & LO_RECURSABLE) != 0, + ("thread_lock: recursed on non-recursive mutex %s @ %s:%d\n", + m->lock_object.lo_name, file, line)); + WITNESS_CHECKORDER(&m->lock_object, + opts | LOP_NEWORDER | LOP_EXCLUSIVE, file, line, NULL); +} +#else +#define thread_lock_validate(m, opts, file, line) do { } while (0) +#endif + +#ifndef LOCK_PROFILING +#if LOCK_DEBUG > 0 void +_thread_lock(struct thread *td, int opts, const char *file, int line) 
+#else +void +_thread_lock(struct thread *td) +#endif +{ + struct mtx *m; + uintptr_t tid, v; + + tid = (uintptr_t)curthread; + + if (__predict_false(LOCKSTAT_PROFILE_ENABLED(spin__acquire))) + goto slowpath_noirq; + spinlock_enter(); + m = td->td_lock; + thread_lock_validate(m, 0, file, line); + v = MTX_READ_VALUE(m); + if (__predict_true(v == MTX_UNOWNED)) { + if (__predict_false(!_mtx_obtain_lock(m, tid))) + goto slowpath_unlocked; + } else if (v == tid) { + m->mtx_recurse++; + } else + goto slowpath_unlocked; + if (__predict_true(m == td->td_lock)) { + WITNESS_LOCK(&m->lock_object, LOP_EXCLUSIVE, file, line); + return; + } + if (m->mtx_recurse != 0) + m->mtx_recurse--; + else + _mtx_release_lock_quick(m); +slowpath_unlocked: + spinlock_exit(); +slowpath_noirq: +#if LOCK_DEBUG > 0 + thread_lock_flags_(td, opts, file, line); +#else + thread_lock_flags_(td, 0, 0, 0); +#endif +} +#endif + +void thread_lock_flags_(struct thread *td, int opts, const char *file, int line) { struct mtx *m; @@ -815,17 +909,7 @@ retry: v = MTX_UNOWNED; spinlock_enter(); m = td->td_lock; - KASSERT(m->mtx_lock != MTX_DESTROYED, - ("thread_lock() of destroyed mutex @ %s:%d", file, line)); - KASSERT(LOCK_CLASS(&m->lock_object) == &lock_class_mtx_spin, - ("thread_lock() of sleep mutex %s @ %s:%d", - m->lock_object.lo_name, file, line)); - if (mtx_owned(m)) - KASSERT((m->lock_object.lo_flags & LO_RECURSABLE) != 0, - ("thread_lock: recursed on non-recursive mutex %s @ %s:%d\n", - m->lock_object.lo_name, file, line)); - WITNESS_CHECKORDER(&m->lock_object, - opts | LOP_NEWORDER | LOP_EXCLUSIVE, file, line, NULL); + thread_lock_validate(m, opts, file, line); for (;;) { if (_mtx_obtain_lock_fetch(m, &v, tid)) break; @@ -925,24 +1009,27 @@ thread_lock_set(struct thread *td, struct mtx *new) */ #if LOCK_DEBUG > 0 void -__mtx_unlock_sleep(volatile uintptr_t *c, int opts, const char *file, int line) +__mtx_unlock_sleep(volatile uintptr_t *c, uintptr_t v, int opts, + const char *file, int line) #else 
void -__mtx_unlock_sleep(volatile uintptr_t *c) +__mtx_unlock_sleep(volatile uintptr_t *c, uintptr_t v) #endif { struct mtx *m; struct turnstile *ts; - uintptr_t tid, v; + uintptr_t tid; if (SCHEDULER_STOPPED()) return; tid = (uintptr_t)curthread; m = mtxlock2mtx(c); - v = MTX_READ_VALUE(m); - if (v & MTX_RECURSED) { + if (__predict_false(v == tid)) + v = MTX_READ_VALUE(m); + + if (__predict_false(v & MTX_RECURSED)) { if (--(m->mtx_recurse) == 0) atomic_clear_ptr(&m->mtx_lock, MTX_RECURSED); if (LOCK_LOG_TEST(&m->lock_object, opts)) @@ -959,12 +1046,12 @@ __mtx_unlock_sleep(volatile uintptr_t *c) * can be removed from the hash list if it is empty. */ turnstile_chain_lock(&m->lock_object); + _mtx_release_lock_quick(m); ts = turnstile_lookup(&m->lock_object); + MPASS(ts != NULL); if (LOCK_LOG_TEST(&m->lock_object, opts)) CTR1(KTR_LOCK, "_mtx_unlock_sleep: %p contested", m); - MPASS(ts != NULL); turnstile_broadcast(ts, TS_EXCLUSIVE_QUEUE); - _mtx_release_lock_quick(m); /* * This turnstile is now no longer associated with the mutex. 
We can Modified: stable/11/sys/kern/kern_rwlock.c ============================================================================== --- stable/11/sys/kern/kern_rwlock.c Sun Dec 31 04:09:40 2017 (r327412) +++ stable/11/sys/kern/kern_rwlock.c Sun Dec 31 05:06:35 2017 (r327413) @@ -273,7 +273,7 @@ _rw_wlock_cookie(volatile uintptr_t *c, const char *fi tid = (uintptr_t)curthread; v = RW_UNLOCKED; if (!_rw_write_lock_fetch(rw, &v, tid)) - _rw_wlock_hard(rw, v, tid, file, line); + _rw_wlock_hard(rw, v, file, line); else LOCKSTAT_PROFILE_OBTAIN_RWLOCK_SUCCESS(rw__acquire, rw, 0, 0, file, line, LOCKSTAT_WRITER); @@ -284,9 +284,8 @@ _rw_wlock_cookie(volatile uintptr_t *c, const char *fi } int -__rw_try_wlock(volatile uintptr_t *c, const char *file, int line) +__rw_try_wlock_int(struct rwlock *rw LOCK_FILE_LINE_ARG_DEF) { - struct rwlock *rw; struct thread *td; uintptr_t tid, v; int rval; @@ -297,8 +296,6 @@ __rw_try_wlock(volatile uintptr_t *c, const char *file if (SCHEDULER_STOPPED_TD(td)) return (1); - rw = rwlock2rw(c); - KASSERT(kdb_active != 0 || !TD_IS_IDLETHREAD(td), ("rw_try_wlock() by idle thread %p on rwlock %s @ %s:%d", curthread, rw->lock_object.lo_name, file, line)); @@ -334,6 +331,15 @@ __rw_try_wlock(volatile uintptr_t *c, const char *file return (rval); } +int +__rw_try_wlock(volatile uintptr_t *c, const char *file, int line) +{ + struct rwlock *rw; + + rw = rwlock2rw(c); + return (__rw_try_wlock_int(rw LOCK_FILE_LINE_ARG)); +} + void _rw_wunlock_cookie(volatile uintptr_t *c, const char *file, int line) { @@ -364,14 +370,21 @@ _rw_wunlock_cookie(volatile uintptr_t *c, const char * * is unlocked and has no writer waiters or spinners. Failing otherwise * prioritizes writers before readers. 
*/ -#define RW_CAN_READ(td, _rw) \ - (((td)->td_rw_rlocks && (_rw) & RW_LOCK_READ) || ((_rw) & \ - (RW_LOCK_READ | RW_LOCK_WRITE_WAITERS | RW_LOCK_WRITE_SPINNER)) == \ - RW_LOCK_READ) +static bool __always_inline +__rw_can_read(struct thread *td, uintptr_t v, bool fp) +{ + if ((v & (RW_LOCK_READ | RW_LOCK_WRITE_WAITERS | RW_LOCK_WRITE_SPINNER)) + == RW_LOCK_READ) + return (true); + if (!fp && td->td_rw_rlocks && (v & RW_LOCK_READ)) + return (true); + return (false); +} + static bool __always_inline -__rw_rlock_try(struct rwlock *rw, struct thread *td, uintptr_t *vp, - const char *file, int line) +__rw_rlock_try(struct rwlock *rw, struct thread *td, uintptr_t *vp, bool fp + LOCK_FILE_LINE_ARG_DEF) { /* @@ -384,7 +397,7 @@ __rw_rlock_try(struct rwlock *rw, struct thread *td, u * completely unlocked rwlock since such a lock is encoded * as a read lock with no waiters. */ - while (RW_CAN_READ(td, *vp)) { + while (__rw_can_read(td, *vp, fp)) { if (atomic_fcmpset_acq_ptr(&rw->rw_lock, vp, *vp + RW_ONE_READER)) { if (LOCK_LOG_TEST(&rw->lock_object, 0)) @@ -400,13 +413,12 @@ __rw_rlock_try(struct rwlock *rw, struct thread *td, u } static void __noinline -__rw_rlock_hard(volatile uintptr_t *c, struct thread *td, uintptr_t v, - const char *file, int line) +__rw_rlock_hard(struct rwlock *rw, struct thread *td, uintptr_t v + LOCK_FILE_LINE_ARG_DEF) { - struct rwlock *rw; struct turnstile *ts; + struct thread *owner; #ifdef ADAPTIVE_RWLOCKS - volatile struct thread *owner; int spintries = 0; int i; #endif @@ -418,11 +430,14 @@ __rw_rlock_hard(volatile uintptr_t *c, struct thread * struct lock_delay_arg lda; #endif #ifdef KDTRACE_HOOKS - uintptr_t state; u_int sleep_cnt = 0; int64_t sleep_time = 0; int64_t all_time = 0; #endif +#if defined(KDTRACE_HOOKS) || defined(LOCK_PROFILING) + uintptr_t state; + int doing_lockprof; +#endif if (SCHEDULER_STOPPED()) return; @@ -432,25 +447,30 @@ __rw_rlock_hard(volatile uintptr_t *c, struct thread * #elif defined(KDTRACE_HOOKS) 
lock_delay_arg_init(&lda, NULL); #endif - rw = rwlock2rw(c); -#ifdef KDTRACE_HOOKS - all_time -= lockstat_nsecs(&rw->lock_object); +#ifdef HWPMC_HOOKS + PMC_SOFT_CALL( , , lock, failed); #endif -#ifdef KDTRACE_HOOKS + lock_profile_obtain_lock_failed(&rw->lock_object, + &contested, &waittime); + +#ifdef LOCK_PROFILING + doing_lockprof = 1; state = v; +#elif defined(KDTRACE_HOOKS) + doing_lockprof = lockstat_enabled; + if (__predict_false(doing_lockprof)) { + all_time -= lockstat_nsecs(&rw->lock_object); + state = v; + } #endif + for (;;) { - if (__rw_rlock_try(rw, td, &v, file, line)) + if (__rw_rlock_try(rw, td, &v, false LOCK_FILE_LINE_ARG)) break; #ifdef KDTRACE_HOOKS lda.spin_cnt++; #endif -#ifdef HWPMC_HOOKS - PMC_SOFT_CALL( , , lock, failed); -#endif - lock_profile_obtain_lock_failed(&rw->lock_object, - &contested, &waittime); #ifdef ADAPTIVE_RWLOCKS /* @@ -483,12 +503,11 @@ __rw_rlock_hard(volatile uintptr_t *c, struct thread * "spinning", "lockname:\"%s\"", rw->lock_object.lo_name); for (i = 0; i < rowner_loops; i++) { + cpu_spinwait(); v = RW_READ_VALUE(rw); - if ((v & RW_LOCK_READ) == 0 || RW_CAN_READ(td, v)) + if ((v & RW_LOCK_READ) == 0 || __rw_can_read(td, v, false)) break; - cpu_spinwait(); } - v = RW_READ_VALUE(rw); #ifdef KDTRACE_HOOKS lda.spin_cnt += rowner_loops - i; #endif @@ -512,11 +531,14 @@ __rw_rlock_hard(volatile uintptr_t *c, struct thread * * recheck its state and restart the loop if needed. */ v = RW_READ_VALUE(rw); - if (RW_CAN_READ(td, v)) { +retry_ts: + if (__rw_can_read(td, v, false)) { turnstile_cancel(ts); continue; } + owner = lv_rw_wowner(v); + #ifdef ADAPTIVE_RWLOCKS /* * The current lock owner might have started executing @@ -525,8 +547,7 @@ __rw_rlock_hard(volatile uintptr_t *c, struct thread * * chain lock. If so, drop the turnstile lock and try * again. 
*/ - if ((v & RW_LOCK_READ) == 0) { - owner = (struct thread *)RW_OWNER(v); + if (owner != NULL) { if (TD_IS_RUNNING(owner)) { turnstile_cancel(ts); continue; @@ -537,7 +558,7 @@ __rw_rlock_hard(volatile uintptr_t *c, struct thread * /* * The lock is held in write mode or it already has waiters. */ - MPASS(!RW_CAN_READ(td, v)); + MPASS(!__rw_can_read(td, v, false)); /* * If the RW_LOCK_READ_WAITERS flag is already set, then @@ -546,12 +567,9 @@ __rw_rlock_hard(volatile uintptr_t *c, struct thread * * lock and restart the loop. */ if (!(v & RW_LOCK_READ_WAITERS)) { - if (!atomic_cmpset_ptr(&rw->rw_lock, v, - v | RW_LOCK_READ_WAITERS)) { - turnstile_cancel(ts); - v = RW_READ_VALUE(rw); - continue; - } + if (!atomic_fcmpset_ptr(&rw->rw_lock, &v, + v | RW_LOCK_READ_WAITERS)) + goto retry_ts; if (LOCK_LOG_TEST(&rw->lock_object, 0)) CTR2(KTR_LOCK, "%s: %p set read waiters flag", __func__, rw); @@ -567,7 +585,8 @@ __rw_rlock_hard(volatile uintptr_t *c, struct thread * #ifdef KDTRACE_HOOKS sleep_time -= lockstat_nsecs(&rw->lock_object); #endif - turnstile_wait(ts, rw_owner(rw), TS_SHARED_QUEUE); + MPASS(owner == rw_owner(rw)); + turnstile_wait(ts, owner, TS_SHARED_QUEUE); #ifdef KDTRACE_HOOKS sleep_time += lockstat_nsecs(&rw->lock_object); sleep_cnt++; @@ -577,6 +596,10 @@ __rw_rlock_hard(volatile uintptr_t *c, struct thread * __func__, rw); v = RW_READ_VALUE(rw); } +#if defined(KDTRACE_HOOKS) || defined(LOCK_PROFILING) + if (__predict_true(!doing_lockprof)) + return; +#endif #ifdef KDTRACE_HOOKS all_time += lockstat_nsecs(&rw->lock_object); if (sleep_time) @@ -600,14 +623,12 @@ __rw_rlock_hard(volatile uintptr_t *c, struct thread * } void -__rw_rlock(volatile uintptr_t *c, const char *file, int line) +__rw_rlock_int(struct rwlock *rw LOCK_FILE_LINE_ARG_DEF) { - struct rwlock *rw; struct thread *td; uintptr_t v; td = curthread; - rw = rwlock2rw(c); KASSERT(kdb_active != 0 || SCHEDULER_STOPPED_TD(td) || !TD_IS_IDLETHREAD(td), @@ -622,25 +643,31 @@ __rw_rlock(volatile 
uintptr_t *c, const char *file, in v = RW_READ_VALUE(rw); if (__predict_false(LOCKSTAT_OOL_PROFILE_ENABLED(rw__acquire) || - !__rw_rlock_try(rw, td, &v, file, line))) - __rw_rlock_hard(c, td, v, file, line); + !__rw_rlock_try(rw, td, &v, true LOCK_FILE_LINE_ARG))) + __rw_rlock_hard(rw, td, v LOCK_FILE_LINE_ARG); LOCK_LOG_LOCK("RLOCK", &rw->lock_object, 0, 0, file, line); WITNESS_LOCK(&rw->lock_object, 0, file, line); TD_LOCKS_INC(curthread); } -int -__rw_try_rlock(volatile uintptr_t *c, const char *file, int line) +void +__rw_rlock(volatile uintptr_t *c, const char *file, int line) { struct rwlock *rw; + + rw = rwlock2rw(c); + __rw_rlock_int(rw LOCK_FILE_LINE_ARG); +} + +int +__rw_try_rlock_int(struct rwlock *rw LOCK_FILE_LINE_ARG_DEF) +{ uintptr_t x; if (SCHEDULER_STOPPED()) return (1); - rw = rwlock2rw(c); - KASSERT(kdb_active != 0 || !TD_IS_IDLETHREAD(curthread), ("rw_try_rlock() by idle thread %p on rwlock %s @ %s:%d", curthread, rw->lock_object.lo_name, file, line)); @@ -667,6 +694,15 @@ __rw_try_rlock(volatile uintptr_t *c, const char *file return (0); } +int +__rw_try_rlock(volatile uintptr_t *c, const char *file, int line) +{ + struct rwlock *rw; + + rw = rwlock2rw(c); + return (__rw_try_rlock_int(rw LOCK_FILE_LINE_ARG)); +} + static bool __always_inline __rw_runlock_try(struct rwlock *rw, struct thread *td, uintptr_t *vp) { @@ -712,18 +748,15 @@ __rw_runlock_try(struct rwlock *rw, struct thread *td, } static void __noinline -__rw_runlock_hard(volatile uintptr_t *c, struct thread *td, uintptr_t v, - const char *file, int line) +__rw_runlock_hard(struct rwlock *rw, struct thread *td, uintptr_t v + LOCK_FILE_LINE_ARG_DEF) { - struct rwlock *rw; struct turnstile *ts; uintptr_t x, queue; if (SCHEDULER_STOPPED()) return; - rw = rwlock2rw(c); - for (;;) { if (__rw_runlock_try(rw, td, &v)) break; @@ -733,7 +766,14 @@ __rw_runlock_hard(volatile uintptr_t *c, struct thread * last reader, so grab the turnstile lock. 
*/ turnstile_chain_lock(&rw->lock_object); - v = rw->rw_lock & (RW_LOCK_WAITERS | RW_LOCK_WRITE_SPINNER); + v = RW_READ_VALUE(rw); +retry_ts: + if (__predict_false(RW_READERS(v) > 1)) { + turnstile_chain_unlock(&rw->lock_object); + continue; + } + + v &= (RW_LOCK_WAITERS | RW_LOCK_WRITE_SPINNER); MPASS(v & RW_LOCK_WAITERS); /* @@ -758,12 +798,9 @@ __rw_runlock_hard(volatile uintptr_t *c, struct thread x |= (v & RW_LOCK_READ_WAITERS); } else queue = TS_SHARED_QUEUE; - if (!atomic_cmpset_rel_ptr(&rw->rw_lock, RW_READERS_LOCK(1) | v, - x)) { - turnstile_chain_unlock(&rw->lock_object); - v = RW_READ_VALUE(rw); - continue; - } + v |= RW_READERS_LOCK(1); + if (!atomic_fcmpset_rel_ptr(&rw->rw_lock, &v, x)) + goto retry_ts; if (LOCK_LOG_TEST(&rw->lock_object, 0)) CTR2(KTR_LOCK, "%s: %p last succeeded with waiters", __func__, rw); @@ -787,17 +824,14 @@ __rw_runlock_hard(volatile uintptr_t *c, struct thread } void -_rw_runlock_cookie(volatile uintptr_t *c, const char *file, int line) +_rw_runlock_cookie_int(struct rwlock *rw LOCK_FILE_LINE_ARG_DEF) { - struct rwlock *rw; struct thread *td; uintptr_t v; - rw = rwlock2rw(c); - KASSERT(rw->rw_lock != RW_DESTROYED, ("rw_runlock() of destroyed rwlock @ %s:%d", file, line)); - __rw_assert(c, RA_RLOCKED, file, line); + __rw_assert(&rw->rw_lock, RA_RLOCKED, file, line); WITNESS_UNLOCK(&rw->lock_object, 0, file, line); LOCK_LOG_LOCK("RUNLOCK", &rw->lock_object, 0, 0, file, line); @@ -806,24 +840,33 @@ _rw_runlock_cookie(volatile uintptr_t *c, const char * if (__predict_false(LOCKSTAT_OOL_PROFILE_ENABLED(rw__release) || !__rw_runlock_try(rw, td, &v))) - __rw_runlock_hard(c, td, v, file, line); + __rw_runlock_hard(rw, td, v LOCK_FILE_LINE_ARG); TD_LOCKS_DEC(curthread); } +void +_rw_runlock_cookie(volatile uintptr_t *c, const char *file, int line) +{ + struct rwlock *rw; + + rw = rwlock2rw(c); + _rw_runlock_cookie_int(rw LOCK_FILE_LINE_ARG); +} + /* * This function is called when we are unable to obtain a write lock on the * first try. 
This means that at least one other thread holds either a
  * read or write lock.
  */
 void
-__rw_wlock_hard(volatile uintptr_t *c, uintptr_t v, uintptr_t tid,
-    const char *file, int line)
+__rw_wlock_hard(volatile uintptr_t *c, uintptr_t v LOCK_FILE_LINE_ARG_DEF)
 {
+	uintptr_t tid;
 	struct rwlock *rw;
 	struct turnstile *ts;
+	struct thread *owner;
 #ifdef ADAPTIVE_RWLOCKS
-	volatile struct thread *owner;
 	int spintries = 0;
 	int i;
 #endif
@@ -836,12 +879,16 @@ __rw_wlock_hard(volatile uintptr_t *c, uintptr_t v, ui
 	struct lock_delay_arg lda;
 #endif
 #ifdef KDTRACE_HOOKS
-	uintptr_t state;
 	u_int sleep_cnt = 0;
 	int64_t sleep_time = 0;
 	int64_t all_time = 0;
 #endif
+#if defined(KDTRACE_HOOKS) || defined(LOCK_PROFILING)
+	uintptr_t state;
+	int doing_lockprof;
+#endif
 
+	tid = (uintptr_t)curthread;
 	if (SCHEDULER_STOPPED())
 		return;
 
@@ -869,10 +916,23 @@ __rw_wlock_hard(volatile uintptr_t *c, uintptr_t v, ui
 		CTR5(KTR_LOCK, "%s: %s contested (lock=%p) at %s:%d", __func__,
 		    rw->lock_object.lo_name, (void *)rw->rw_lock, file, line);
 
-#ifdef KDTRACE_HOOKS
-	all_time -= lockstat_nsecs(&rw->lock_object);
+#ifdef HWPMC_HOOKS
+	PMC_SOFT_CALL( , , lock, failed);
+#endif
+	lock_profile_obtain_lock_failed(&rw->lock_object,
+	    &contested, &waittime);
+
+#ifdef LOCK_PROFILING
+	doing_lockprof = 1;
 	state = v;
+#elif defined(KDTRACE_HOOKS)
+	doing_lockprof = lockstat_enabled;
+	if (__predict_false(doing_lockprof)) {
+		all_time -= lockstat_nsecs(&rw->lock_object);
+		state = v;
+	}
 #endif
+
 	for (;;) {
 		if (v == RW_UNLOCKED) {
 			if (_rw_write_lock_fetch(rw, &v, tid))
@@ -882,11 +942,7 @@ __rw_wlock_hard(volatile uintptr_t *c, uintptr_t v, ui
 #ifdef KDTRACE_HOOKS
 		lda.spin_cnt++;
 #endif
-#ifdef HWPMC_HOOKS
-		PMC_SOFT_CALL( , , lock, failed);
-#endif
-		lock_profile_obtain_lock_failed(&rw->lock_object,
-		    &contested, &waittime);
+
 #ifdef ADAPTIVE_RWLOCKS
 		/*
 		 * If the lock is write locked and the owner is
@@ -913,9 +969,8 @@ __rw_wlock_hard(volatile uintptr_t *c, uintptr_t v, ui
 		if ((v & RW_LOCK_READ) && RW_READERS(v) &&
 		    spintries < rowner_retries) {
 			if (!(v & RW_LOCK_WRITE_SPINNER)) {
-				if (!atomic_cmpset_ptr(&rw->rw_lock, v,
+				if (!atomic_fcmpset_ptr(&rw->rw_lock, &v,
 				    v | RW_LOCK_WRITE_SPINNER)) {
-					v = RW_READ_VALUE(rw);
 					continue;
 				}
 			}
@@ -924,13 +979,13 @@ __rw_wlock_hard(volatile uintptr_t *c, uintptr_t v, ui
 			    "spinning", "lockname:\"%s\"",
 			    rw->lock_object.lo_name);
 			for (i = 0; i < rowner_loops; i++) {
-				if ((rw->rw_lock & RW_LOCK_WRITE_SPINNER) == 0)
-					break;
 				cpu_spinwait();
+				v = RW_READ_VALUE(rw);
+				if ((v & RW_LOCK_WRITE_SPINNER) == 0)
+					break;
 			}
 			KTR_STATE0(KTR_SCHED, "thread", sched_tdname(curthread),
 			    "running");
-			v = RW_READ_VALUE(rw);
 #ifdef KDTRACE_HOOKS
 			lda.spin_cnt += rowner_loops - i;
 #endif
@@ -940,6 +995,8 @@ __rw_wlock_hard(volatile uintptr_t *c, uintptr_t v, ui
 #endif
 		ts = turnstile_trywait(&rw->lock_object);
 		v = RW_READ_VALUE(rw);
+retry_ts:
+		owner = lv_rw_wowner(v);
 
 #ifdef ADAPTIVE_RWLOCKS
 		/*
@@ -949,8 +1006,7 @@ __rw_wlock_hard(volatile uintptr_t *c, uintptr_t v, ui
 		 * chain lock. If so, drop the turnstile lock and try
 		 * again.
 		 */
-		if (!(v & RW_LOCK_READ)) {
-			owner = (struct thread *)RW_OWNER(v);
+		if (owner != NULL) {
 			if (TD_IS_RUNNING(owner)) {
 				turnstile_cancel(ts);
 				continue;
@@ -967,16 +1023,14 @@ __rw_wlock_hard(volatile uintptr_t *c, uintptr_t v, ui
 		x = v & (RW_LOCK_WAITERS | RW_LOCK_WRITE_SPINNER);
 		if ((v & ~x) == RW_UNLOCKED) {
 			x &= ~RW_LOCK_WRITE_SPINNER;
-			if (atomic_cmpset_acq_ptr(&rw->rw_lock, v, tid | x)) {
+			if (atomic_fcmpset_acq_ptr(&rw->rw_lock, &v, tid | x)) {
 				if (x)
 					turnstile_claim(ts);
 				else
 					turnstile_cancel(ts);
 				break;
 			}
-			turnstile_cancel(ts);
-			v = RW_READ_VALUE(rw);
-			continue;
+			goto retry_ts;
 		}
 		/*
 		 * If the RW_LOCK_WRITE_WAITERS flag isn't set, then try to
@@ -984,12 +1038,9 @@ __rw_wlock_hard(volatile uintptr_t *c, uintptr_t v, ui
 		 * again.
 		 */
 		if (!(v & RW_LOCK_WRITE_WAITERS)) {
-			if (!atomic_cmpset_ptr(&rw->rw_lock, v,
-			    v | RW_LOCK_WRITE_WAITERS)) {
-				turnstile_cancel(ts);
-				v = RW_READ_VALUE(rw);
-				continue;
-			}
+			if (!atomic_fcmpset_ptr(&rw->rw_lock, &v,
+			    v | RW_LOCK_WRITE_WAITERS))
+				goto retry_ts;
 			if (LOCK_LOG_TEST(&rw->lock_object, 0))
 				CTR2(KTR_LOCK, "%s: %p set write waiters flag",
 				    __func__, rw);
@@ -1004,7 +1055,8 @@ __rw_wlock_hard(volatile uintptr_t *c, uintptr_t v, ui
 #ifdef KDTRACE_HOOKS
 		sleep_time -= lockstat_nsecs(&rw->lock_object);
 #endif
-		turnstile_wait(ts, rw_owner(rw), TS_EXCLUSIVE_QUEUE);
+		MPASS(owner == rw_owner(rw));
+		turnstile_wait(ts, owner, TS_EXCLUSIVE_QUEUE);
 #ifdef KDTRACE_HOOKS
 		sleep_time += lockstat_nsecs(&rw->lock_object);
 		sleep_cnt++;
@@ -1017,6 +1069,10 @@ __rw_wlock_hard(volatile uintptr_t *c, uintptr_t v, ui
 #endif
 		v = RW_READ_VALUE(rw);
 	}
+#if defined(KDTRACE_HOOKS) || defined(LOCK_PROFILING)
+	if (__predict_true(!doing_lockprof))
+		return;
+#endif
 #ifdef KDTRACE_HOOKS
 	all_time += lockstat_nsecs(&rw->lock_object);
 	if (sleep_time)
@@ -1041,19 +1097,21 @@ __rw_wlock_hard(volatile uintptr_t *c, uintptr_t v, ui
  * on this lock.
  */
 void
-__rw_wunlock_hard(volatile uintptr_t *c, uintptr_t tid, const char *file,
-    int line)
+__rw_wunlock_hard(volatile uintptr_t *c, uintptr_t v LOCK_FILE_LINE_ARG_DEF)
 {
 	struct rwlock *rw;
 	struct turnstile *ts;
-	uintptr_t v;
+	uintptr_t tid, setv;
 	int queue;
 
+	tid = (uintptr_t)curthread;
 	if (SCHEDULER_STOPPED())
 		return;
 
 	rw = rwlock2rw(c);
-	v = RW_READ_VALUE(rw);
+	if (__predict_false(v == tid))
+		v = RW_READ_VALUE(rw);
+
 	if (v & RW_LOCK_WRITER_RECURSED) {
 		if (--(rw->rw_recurse) == 0)
 			atomic_clear_ptr(&rw->rw_lock, RW_LOCK_WRITER_RECURSED);
@@ -1073,8 +1131,6 @@ __rw_wunlock_hard(volatile uintptr_t *c, uintptr_t tid
 		CTR2(KTR_LOCK, "%s: %p contested", __func__, rw);
 
 	turnstile_chain_lock(&rw->lock_object);
-	ts = turnstile_lookup(&rw->lock_object);
-	MPASS(ts != NULL);
 
 	/*
 	 * Use the same algo as sx locks for now. Prefer waking up shared
@@ -1092,19 +1148,23 @@ __rw_wunlock_hard(volatile uintptr_t *c, uintptr_t tid
 	 * there that could be worked around either by waking both queues
 	 * of waiters or doing some complicated lock handoff gymnastics.
 	 */
-	v = RW_UNLOCKED;
-	if (rw->rw_lock & RW_LOCK_WRITE_WAITERS) {
+	setv = RW_UNLOCKED;
+	v = RW_READ_VALUE(rw);
+	queue = TS_SHARED_QUEUE;
+	if (v & RW_LOCK_WRITE_WAITERS) {
 		queue = TS_EXCLUSIVE_QUEUE;
-		v |= (rw->rw_lock & RW_LOCK_READ_WAITERS);
-	} else
-		queue = TS_SHARED_QUEUE;
+		setv |= (v & RW_LOCK_READ_WAITERS);
+	}
+	atomic_store_rel_ptr(&rw->rw_lock, setv);
 
 	/* Wake up all waiters for the specific queue. */
 	if (LOCK_LOG_TEST(&rw->lock_object, 0))
 		CTR3(KTR_LOCK, "%s: %p waking up %s waiters", __func__, rw,
 		    queue == TS_SHARED_QUEUE ? "read" : "write");
+
+	ts = turnstile_lookup(&rw->lock_object);
+	MPASS(ts != NULL);

*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
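
Editorial note on the cmpset -> fcmpset conversions in the hunks above: the removed code re-read the lock word with RW_READ_VALUE() after every failed atomic_cmpset_ptr(), while the added code passes &v and retries without an explicit reload. The sketch below illustrates that pattern in userland, assuming C11 <stdatomic.h> as a stand-in for the kernel's atomic(9) primitives. The names demo_fcmpset, demo_set_waiters and DEMO_WRITE_WAITERS are hypothetical and do not appear in the commit, and the strong compare-exchange used here is an assumption made for the illustration, not a statement about how the kernel primitive behaves on every architecture.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit, chosen only for this illustration. */
#define DEMO_WRITE_WAITERS	((uintptr_t)0x4)

/*
 * fcmpset-style helper: on failure, *expect is overwritten with the
 * value currently stored in *p, so the caller needs no separate reload.
 */
static inline bool
demo_fcmpset(_Atomic uintptr_t *p, uintptr_t *expect, uintptr_t newv)
{
	return (atomic_compare_exchange_strong(p, expect, newv));
}

/*
 * Set the waiters flag, retrying with the refreshed value after each
 * failed attempt.  Returns the lock word observed just before the flag
 * was set (or the current word if the flag was already set).
 */
static inline uintptr_t
demo_set_waiters(_Atomic uintptr_t *lock)
{
	uintptr_t v;

	assert(lock != NULL);
	v = atomic_load(lock);
	while ((v & DEMO_WRITE_WAITERS) == 0) {
		if (demo_fcmpset(lock, &v, v | DEMO_WRITE_WAITERS))
			break;
		/* v was refreshed by the failed attempt; just loop. */
	}
	return (v);
}
```

A caller holding a stale copy of the lock word gets it refreshed by the failed attempt itself. The "goto retry_ts" paths in the diff appear to rely on this to avoid the former turnstile_cancel() plus re-read sequence.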