From owner-svn-src-all@freebsd.org Tue Jan 2 00:02:37 2018 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 86674EA3C52; Tue, 2 Jan 2018 00:02:37 +0000 (UTC) (envelope-from mjg@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4A74B6424E; Tue, 2 Jan 2018 00:02:37 +0000 (UTC) (envelope-from mjg@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w0202aWi025159; Tue, 2 Jan 2018 00:02:36 GMT (envelope-from mjg@FreeBSD.org) Received: (from mjg@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w0202agg025156; Tue, 2 Jan 2018 00:02:36 GMT (envelope-from mjg@FreeBSD.org) Message-Id: <201801020002.w0202agg025156@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mjg set sender to mjg@FreeBSD.org using -f From: Mateusz Guzik Date: Tue, 2 Jan 2018 00:02:36 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r327478 - in stable/11/sys: kern sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mjg X-SVN-Commit-Paths: in stable/11/sys: kern sys X-SVN-Commit-Revision: 327478 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Jan 2018 00:02:37 -0000 Author: mjg Date: Tue Jan 2 00:02:36 2018 New Revision: 327478 URL: https://svnweb.freebsd.org/changeset/base/327478 Log: MFC r324335,r327393,r327397,r327401,r327402: locks: take the number of readers into account when waiting Previous code would always spin once before checking the lock. But a lock with e.g. 6 readers is not going to become free in the duration of once spin even if they start draining immediately. Conservatively perform one for each reader. Note that the total number of allowed spins is still extremely small and is subject to change later. ============= rwlock: tidy up __rw_runlock_hard similarly to r325921 ============= sx: read the SX_NOADAPTIVE flag and Giant ownership only once These used to be read multiple times when waiting for the lock the become free, which had the potential to issue completely avoidable traffic. ============= locks: re-check the reason to go to sleep after locking sleepq/turnstile In both rw and sx locks we always go to sleep if the lock owner is not running. We do spin for some time if the lock is read-locked. However, if we decide to go to sleep due to the lock owner being off cpu and after sleepq/turnstile gets acquired the lock is read-locked, we should fallback to the aforementioned wait. ============= sx: fix up non-smp compilation after r327397 ============= locks: adjust loop limit check when waiting for readers The check was for the exact value, but since the counter started being incremented by the number of readers it could have jumped over. ============= Return a non-NULL owner only if the lock is exclusively held in owner_sx(). Fix some whitespace bugs while here. Modified: stable/11/sys/kern/kern_rwlock.c stable/11/sys/kern/kern_sx.c stable/11/sys/sys/lock.h Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/kern/kern_rwlock.c ============================================================================== --- stable/11/sys/kern/kern_rwlock.c Mon Jan 1 23:45:09 2018 (r327477) +++ stable/11/sys/kern/kern_rwlock.c Tue Jan 2 00:02:36 2018 (r327478) @@ -420,7 +420,7 @@ __rw_rlock_hard(struct rwlock *rw, struct thread *td, struct thread *owner; #ifdef ADAPTIVE_RWLOCKS int spintries = 0; - int i; + int i, n; #endif #ifdef LOCK_PROFILING uint64_t waittime = 0; @@ -502,8 +502,9 @@ __rw_rlock_hard(struct rwlock *rw, struct thread *td, KTR_STATE1(KTR_SCHED, "thread", sched_tdname(curthread), "spinning", "lockname:\"%s\"", rw->lock_object.lo_name); - for (i = 0; i < rowner_loops; i++) { - cpu_spinwait(); + for (i = 0; i < rowner_loops; i += n) { + n = RW_READERS(v); + lock_delay_spin(n); v = RW_READ_VALUE(rw); if ((v & RW_LOCK_READ) == 0 || __rw_can_read(td, v, false)) break; @@ -513,7 +514,7 @@ __rw_rlock_hard(struct rwlock *rw, struct thread *td, #endif KTR_STATE0(KTR_SCHED, "thread", sched_tdname(curthread), "running"); - if (i != rowner_loops) + if (i < rowner_loops) continue; } #endif @@ -752,7 +753,7 @@ __rw_runlock_hard(struct rwlock *rw, struct thread *td LOCK_FILE_LINE_ARG_DEF) { struct turnstile *ts; - uintptr_t x, queue; + uintptr_t setv, queue; if (SCHEDULER_STOPPED()) return; @@ -792,14 +793,14 @@ retry_ts: * acquired a read lock, so drop the turnstile lock and * restart. */ - x = RW_UNLOCKED; + setv = RW_UNLOCKED; + queue = TS_SHARED_QUEUE; if (v & RW_LOCK_WRITE_WAITERS) { queue = TS_EXCLUSIVE_QUEUE; - x |= (v & RW_LOCK_READ_WAITERS); - } else - queue = TS_SHARED_QUEUE; + setv |= (v & RW_LOCK_READ_WAITERS); + } v |= RW_READERS_LOCK(1); - if (!atomic_fcmpset_rel_ptr(&rw->rw_lock, &v, x)) + if (!atomic_fcmpset_rel_ptr(&rw->rw_lock, &v, setv)) goto retry_ts; if (LOCK_LOG_TEST(&rw->lock_object, 0)) CTR2(KTR_LOCK, "%s: %p last succeeded with waiters", @@ -868,7 +869,8 @@ __rw_wlock_hard(volatile uintptr_t *c, uintptr_t v LOC struct thread *owner; #ifdef ADAPTIVE_RWLOCKS int spintries = 0; - int i; + int i, n; + int sleep_reason = 0; #endif uintptr_t x; #ifdef LOCK_PROFILING @@ -949,6 +951,7 @@ __rw_wlock_hard(volatile uintptr_t *c, uintptr_t v LOC * running on another CPU, spin until the owner stops * running or the state of the lock changes. */ + sleep_reason = 1; owner = lv_rw_wowner(v); if (!(v & RW_LOCK_READ) && TD_IS_RUNNING(owner)) { if (LOCK_LOG_TEST(&rw->lock_object, 0)) @@ -978,8 +981,9 @@ __rw_wlock_hard(volatile uintptr_t *c, uintptr_t v LOC KTR_STATE1(KTR_SCHED, "thread", sched_tdname(curthread), "spinning", "lockname:\"%s\"", rw->lock_object.lo_name); - for (i = 0; i < rowner_loops; i++) { - cpu_spinwait(); + for (i = 0; i < rowner_loops; i += n) { + n = RW_READERS(v); + lock_delay_spin(n); v = RW_READ_VALUE(rw); if ((v & RW_LOCK_WRITE_SPINNER) == 0) break; @@ -989,8 +993,9 @@ __rw_wlock_hard(volatile uintptr_t *c, uintptr_t v LOC #ifdef KDTRACE_HOOKS lda.spin_cnt += rowner_loops - i; #endif - if (i != rowner_loops) + if (i < rowner_loops) continue; + sleep_reason = 2; } #endif ts = turnstile_trywait(&rw->lock_object); @@ -1011,6 +1016,9 @@ retry_ts: turnstile_cancel(ts); continue; } + } else if (RW_READERS(v) > 0 && sleep_reason == 1) { + turnstile_cancel(ts); + continue; } #endif /* Modified: stable/11/sys/kern/kern_sx.c ============================================================================== --- stable/11/sys/kern/kern_sx.c Mon Jan 1 23:45:09 2018 (r327477) +++ stable/11/sys/kern/kern_sx.c Tue Jan 2 00:02:36 2018 (r327478) @@ -89,7 +89,7 @@ PMC_SOFT_DECLARE( , , lock, failed); WITNESS_SAVE_DECL(Giant) \ #define GIANT_SAVE(work) do { \ - if (mtx_owned(&Giant)) { \ + if (__predict_false(mtx_owned(&Giant))) { \ work++; \ WITNESS_SAVE(&Giant.lock_object, Giant); \ while (mtx_owned(&Giant)) { \ @@ -198,12 +198,14 @@ unlock_sx(struct lock_object *lock) int owner_sx(const struct lock_object *lock, struct thread **owner) { - const struct sx *sx = (const struct sx *)lock; - uintptr_t x = sx->sx_lock; + const struct sx *sx; + uintptr_t x; - *owner = (struct thread *)SX_OWNER(x); - return ((x & SX_LOCK_SHARED) != 0 ? (SX_SHARERS(x) != 0) : - (*owner != NULL)); + sx = (const struct sx *)lock; + x = sx->sx_lock; + *owner = NULL; + return ((x & SX_LOCK_SHARED) != 0 ? (SX_SHARERS(x) != 0) : + ((*owner = (struct thread *)SX_OWNER(x)) != NULL)); } #endif @@ -528,7 +530,9 @@ _sx_xlock_hard(struct sx *sx, uintptr_t x, int opts LO uintptr_t tid; #ifdef ADAPTIVE_SX volatile struct thread *owner; - u_int i, spintries = 0; + u_int i, n, spintries = 0; + bool adaptive; + int sleep_reason = 0; #endif #ifdef LOCK_PROFILING uint64_t waittime = 0; @@ -577,6 +581,10 @@ _sx_xlock_hard(struct sx *sx, uintptr_t x, int opts LO CTR5(KTR_LOCK, "%s: %s contested (lock=%p) at %s:%d", __func__, sx->lock_object.lo_name, (void *)sx->sx_lock, file, line); +#ifdef ADAPTIVE_SX + adaptive = ((sx->lock_object.lo_flags & SX_NOADAPTIVE) != 0); +#endif + #ifdef HWPMC_HOOKS PMC_SOFT_CALL( , , lock, failed); #endif @@ -593,6 +601,9 @@ _sx_xlock_hard(struct sx *sx, uintptr_t x, int opts LO state = x; } #endif +#ifndef INVARIANTS + GIANT_SAVE(extra_work); +#endif for (;;) { if (x == SX_LOCK_UNLOCKED) { @@ -600,66 +611,70 @@ _sx_xlock_hard(struct sx *sx, uintptr_t x, int opts LO break; continue; } +#ifdef INVARIANTS + GIANT_SAVE(extra_work); +#endif #ifdef KDTRACE_HOOKS lda.spin_cnt++; #endif #ifdef ADAPTIVE_SX + if (__predict_false(!adaptive)) + goto sleepq; /* * If the lock is write locked and the owner is * running on another CPU, spin until the owner stops * running or the state of the lock changes. */ - if ((sx->lock_object.lo_flags & SX_NOADAPTIVE) == 0) { - if ((x & SX_LOCK_SHARED) == 0) { - owner = lv_sx_owner(x); - if (TD_IS_RUNNING(owner)) { - if (LOCK_LOG_TEST(&sx->lock_object, 0)) - CTR3(KTR_LOCK, - "%s: spinning on %p held by %p", - __func__, sx, owner); - KTR_STATE1(KTR_SCHED, "thread", - sched_tdname(curthread), "spinning", - "lockname:\"%s\"", - sx->lock_object.lo_name); - GIANT_SAVE(extra_work); - do { - lock_delay(&lda); - x = SX_READ_VALUE(sx); - owner = lv_sx_owner(x); - } while (owner != NULL && - TD_IS_RUNNING(owner)); - KTR_STATE0(KTR_SCHED, "thread", - sched_tdname(curthread), "running"); - continue; - } - } else if (SX_SHARERS(x) && spintries < asx_retries) { + if ((x & SX_LOCK_SHARED) == 0) { + owner = lv_sx_owner(x); + if (TD_IS_RUNNING(owner)) { + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR3(KTR_LOCK, + "%s: spinning on %p held by %p", + __func__, sx, owner); KTR_STATE1(KTR_SCHED, "thread", sched_tdname(curthread), "spinning", - "lockname:\"%s\"", sx->lock_object.lo_name); - GIANT_SAVE(extra_work); - spintries++; - for (i = 0; i < asx_loops; i++) { - if (LOCK_LOG_TEST(&sx->lock_object, 0)) - CTR4(KTR_LOCK, - "%s: shared spinning on %p with %u and %u", - __func__, sx, spintries, i); - cpu_spinwait(); + "lockname:\"%s\"", + sx->lock_object.lo_name); + do { + lock_delay(&lda); x = SX_READ_VALUE(sx); - if ((x & SX_LOCK_SHARED) == 0 || - SX_SHARERS(x) == 0) - break; - } -#ifdef KDTRACE_HOOKS - lda.spin_cnt += i; -#endif + owner = lv_sx_owner(x); + } while (owner != NULL && + TD_IS_RUNNING(owner)); KTR_STATE0(KTR_SCHED, "thread", sched_tdname(curthread), "running"); - if (i != asx_loops) - continue; + continue; } + sleep_reason = 1; + } else if (SX_SHARERS(x) && spintries < asx_retries) { + KTR_STATE1(KTR_SCHED, "thread", + sched_tdname(curthread), "spinning", + "lockname:\"%s\"", sx->lock_object.lo_name); + spintries++; + for (i = 0; i < asx_loops; i += n) { + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR4(KTR_LOCK, + "%s: shared spinning on %p with %u and %u", + __func__, sx, spintries, i); + n = SX_SHARERS(x); + lock_delay_spin(n); + x = SX_READ_VALUE(sx); + if ((x & SX_LOCK_SHARED) == 0 || + SX_SHARERS(x) == 0) + break; + } +#ifdef KDTRACE_HOOKS + lda.spin_cnt += i; +#endif + KTR_STATE0(KTR_SCHED, "thread", + sched_tdname(curthread), "running"); + if (i < asx_loops) + continue; + sleep_reason = 2; } +sleepq: #endif - sleepq_lock(&sx->lock_object); x = SX_READ_VALUE(sx); retry_sleepq: @@ -681,10 +696,14 @@ retry_sleepq: * chain lock. If so, drop the sleep queue lock and try * again. */ - if (!(x & SX_LOCK_SHARED) && - (sx->lock_object.lo_flags & SX_NOADAPTIVE) == 0) { - owner = (struct thread *)SX_OWNER(x); - if (TD_IS_RUNNING(owner)) { + if (adaptive) { + if (!(x & SX_LOCK_SHARED)) { + owner = (struct thread *)SX_OWNER(x); + if (TD_IS_RUNNING(owner)) { + sleepq_release(&sx->lock_object); + continue; + } + } else if (SX_SHARERS(x) > 0 && sleep_reason == 1) { sleepq_release(&sx->lock_object); continue; } @@ -737,7 +756,6 @@ retry_sleepq: #ifdef KDTRACE_HOOKS sleep_time -= lockstat_nsecs(&sx->lock_object); #endif - GIANT_SAVE(extra_work); sleepq_add(&sx->lock_object, NULL, sx->lock_object.lo_name, SLEEPQ_SX | ((opts & SX_INTERRUPTIBLE) ? SLEEPQ_INTERRUPTIBLE : 0), SQ_EXCLUSIVE_QUEUE); @@ -887,6 +905,7 @@ _sx_slock_hard(struct sx *sx, int opts, uintptr_t x LO GIANT_DECLARE; #ifdef ADAPTIVE_SX volatile struct thread *owner; + bool adaptive; #endif #ifdef LOCK_PROFILING uint64_t waittime = 0; @@ -915,6 +934,10 @@ _sx_slock_hard(struct sx *sx, int opts, uintptr_t x LO lock_delay_arg_init(&lda, NULL); #endif +#ifdef ADAPTIVE_SX + adaptive = ((sx->lock_object.lo_flags & SX_NOADAPTIVE) != 0); +#endif + #ifdef HWPMC_HOOKS PMC_SOFT_CALL( , , lock, failed); #endif @@ -931,6 +954,9 @@ _sx_slock_hard(struct sx *sx, int opts, uintptr_t x LO state = x; } #endif +#ifndef INVARIANTS + GIANT_SAVE(extra_work); +#endif /* * As with rwlocks, we don't make any attempt to try to block @@ -939,37 +965,40 @@ _sx_slock_hard(struct sx *sx, int opts, uintptr_t x LO for (;;) { if (__sx_slock_try(sx, &x LOCK_FILE_LINE_ARG)) break; +#ifdef INVARIANTS + GIANT_SAVE(extra_work); +#endif #ifdef KDTRACE_HOOKS lda.spin_cnt++; #endif #ifdef ADAPTIVE_SX + if (__predict_false(!adaptive)) + goto sleepq; /* * If the owner is running on another CPU, spin until * the owner stops running or the state of the lock * changes. */ - if ((sx->lock_object.lo_flags & SX_NOADAPTIVE) == 0) { - owner = lv_sx_owner(x); - if (TD_IS_RUNNING(owner)) { - if (LOCK_LOG_TEST(&sx->lock_object, 0)) - CTR3(KTR_LOCK, - "%s: spinning on %p held by %p", - __func__, sx, owner); - KTR_STATE1(KTR_SCHED, "thread", - sched_tdname(curthread), "spinning", - "lockname:\"%s\"", sx->lock_object.lo_name); - GIANT_SAVE(extra_work); - do { - lock_delay(&lda); - x = SX_READ_VALUE(sx); - owner = lv_sx_owner(x); - } while (owner != NULL && TD_IS_RUNNING(owner)); - KTR_STATE0(KTR_SCHED, "thread", - sched_tdname(curthread), "running"); - continue; - } + owner = lv_sx_owner(x); + if (TD_IS_RUNNING(owner)) { + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR3(KTR_LOCK, + "%s: spinning on %p held by %p", + __func__, sx, owner); + KTR_STATE1(KTR_SCHED, "thread", + sched_tdname(curthread), "spinning", + "lockname:\"%s\"", sx->lock_object.lo_name); + do { + lock_delay(&lda); + x = SX_READ_VALUE(sx); + owner = lv_sx_owner(x); + } while (owner != NULL && TD_IS_RUNNING(owner)); + KTR_STATE0(KTR_SCHED, "thread", + sched_tdname(curthread), "running"); + continue; } +sleepq: #endif /* @@ -994,8 +1023,7 @@ retry_sleepq: * the owner stops running or the state of the lock * changes. */ - if (!(x & SX_LOCK_SHARED) && - (sx->lock_object.lo_flags & SX_NOADAPTIVE) == 0) { + if (!(x & SX_LOCK_SHARED) && adaptive) { owner = (struct thread *)SX_OWNER(x); if (TD_IS_RUNNING(owner)) { sleepq_release(&sx->lock_object); @@ -1030,7 +1058,6 @@ retry_sleepq: #ifdef KDTRACE_HOOKS sleep_time -= lockstat_nsecs(&sx->lock_object); #endif - GIANT_SAVE(extra_work); sleepq_add(&sx->lock_object, NULL, sx->lock_object.lo_name, SLEEPQ_SX | ((opts & SX_INTERRUPTIBLE) ? SLEEPQ_INTERRUPTIBLE : 0), SQ_SHARED_QUEUE); Modified: stable/11/sys/sys/lock.h ============================================================================== --- stable/11/sys/sys/lock.h Mon Jan 1 23:45:09 2018 (r327477) +++ stable/11/sys/sys/lock.h Tue Jan 2 00:02:36 2018 (r327478) @@ -230,6 +230,13 @@ lock_delay_arg_init(struct lock_delay_arg *la, struct la->spin_cnt = 0; } +#define lock_delay_spin(n) do { \ + u_int _i; \ + \ + for (_i = (n); _i > 0; _i--) \ + cpu_spinwait(); \ +} while (0) + #define LOCK_DELAY_SYSINIT(func) \ SYSINIT(func##_ld, SI_SUB_LOCK, SI_ORDER_ANY, func, NULL)