From owner-freebsd-smp@FreeBSD.ORG Fri Aug 31 08:13:06 2007 Return-Path: Delivered-To: smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 47E2216A418; Fri, 31 Aug 2007 08:13:06 +0000 (UTC) (envelope-from bright@elvis.mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 38F0813C458; Fri, 31 Aug 2007 08:13:05 +0000 (UTC) (envelope-from bright@elvis.mu.org) Received: by elvis.mu.org (Postfix, from userid 1192) id 288901A4D89; Fri, 31 Aug 2007 00:10:48 -0700 (PDT) Date: Fri, 31 Aug 2007 00:10:48 -0700 From: Alfred Perlstein To: smp@freebsd.org Message-ID: <20070831071048.GF87451@elvis.mu.org> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="PPYy/fEw/8QCHSq3" Content-Disposition: inline User-Agent: Mutt/1.4.2.3i Cc: attilio@freebsd.org Subject: request for review: backport of sx and rwlocks from 7.0 to 6-stable X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 Aug 2007 08:13:06 -0000 --PPYy/fEw/8QCHSq3 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hi guys, Some work here at work was approved for sharing with the community, so I'm posting it here in the hope of a review. We run some pretty good stress testing on our code, so I think it's pretty solid. My only concern is that I've tried my best to preserve the kernel source API, but not binary compatibility, through a few simple #defines. I can make it binary compatible, albeit in a somewhat confusing manner, but that will require some rototilling and weird renaming of calls to the sleepq and turnstile code. In short, I'd rather not, but I will if you think it's something that should be done. There are also a few placeholders for lock profiling, which I will very likely be backporting shortly as well. Patch is attached. Comments/questions? -- - Alfred Perlstein --PPYy/fEw/8QCHSq3 Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="netsmp_rwlock_freebsd6_08312007.diff" Index: conf/NOTES =================================================================== RCS file: /cvs/ncvs/src/sys/conf/NOTES,v retrieving revision 1.1325.2.36 diff -u -r1.1325.2.36 NOTES --- conf/NOTES 8 Jul 2007 15:30:28 -0000 1.1325.2.36 +++ conf/NOTES 31 Aug 2007 00:39:59 -0000 @@ -189,12 +189,26 @@ # to disable it. options NO_ADAPTIVE_MUTEXES +# ADAPTIVE_RWLOCKS changes the behavior of reader/writer locks to spin +# if the thread that currently owns the rwlock is executing on another +# CPU. This behaviour is enabled by default, so this option can be used +# to disable it. +options NO_ADAPTIVE_RWLOCKS + + # ADAPTIVE_GIANT causes the Giant lock to also be made adaptive when # running without NO_ADAPTIVE_MUTEXES. Normally, because Giant is assumed # to be held for extended periods, contention on Giant will cause a thread # to sleep rather than spinning. options ADAPTIVE_GIANT + +# ADAPTIVE_SX changes the behavior of sx locks to spin if the thread +# that currently owns the lock is executing on another CPU. Note that +# in addition to enabling this option, individual sx locks must be +# initialized with the SX_ADAPTIVESPIN flag. +options ADAPTIVE_SX + # MUTEX_NOINLINE forces mutex operations to call functions to perform each # operation rather than inlining the simple cases. 
This can be used to # shrink the size of the kernel text segment. Note that this behavior is @@ -207,6 +221,20 @@ # priority waiter. options MUTEX_WAKE_ALL +# RWLOCK_NOINLINE forces rwlock operations to call functions to perform each +# operation rather than inlining the simple cases. This can be used to +# shrink the size of the kernel text segment. Note that this behavior is +# already implied by the INVARIANT_SUPPORT, INVARIANTS, KTR, LOCK_PROFILING, +# and WITNESS options. +options RWLOCK_NOINLINE + +# SX_NOINLINE forces sx lock operations to call functions to perform each +# operation rather than inlining the simple cases. This can be used to +# shrink the size of the kernel text segment. Note that this behavior is +# already implied by the INVARIANT_SUPPORT, INVARIANTS, KTR, LOCK_PROFILING, +# and WITNESS options. +options SX_NOINLINE + # SMP Debugging Options: # # PREEMPTION allows the threads that are in the kernel to be preempted Index: conf/files =================================================================== RCS file: /cvs/ncvs/src/sys/conf/files,v retrieving revision 1.1031.2.67 diff -u -r1.1031.2.67 files --- conf/files 23 Aug 2007 22:30:14 -0000 1.1031.2.67 +++ conf/files 31 Aug 2007 00:39:59 -0000 @@ -1312,6 +1312,7 @@ kern/kern_proc.c standard kern/kern_prot.c standard kern/kern_resource.c standard +kern/kern_rwlock.c standard kern/kern_sema.c standard kern/kern_shutdown.c standard kern/kern_sig.c standard Index: conf/options =================================================================== RCS file: /cvs/ncvs/src/sys/conf/options,v retrieving revision 1.510.2.21 diff -u -r1.510.2.21 options --- conf/options 8 Jul 2007 15:30:28 -0000 1.510.2.21 +++ conf/options 31 Aug 2007 00:39:59 -0000 @@ -60,7 +60,9 @@ # Miscellaneous options. ADAPTIVE_GIANT opt_adaptive_mutexes.h +ADAPTIVE_SX NO_ADAPTIVE_MUTEXES opt_adaptive_mutexes.h +NO_ADAPTIVE_RWLOCKS ALQ AUDIT opt_global.h CODA_COMPAT_5 opt_coda.h @@ -517,6 +519,8 @@ MSIZE opt_global.h REGRESSION opt_global.h RESTARTABLE_PANICS opt_global.h +RWLOCK_NOINLINE opt_global.h +SX_NOINLINE opt_global.h VFS_BIO_DEBUG opt_global.h # These are VM related options Index: dev/acpica/acpi_ec.c =================================================================== RCS file: /cvs/ncvs/src/sys/dev/acpica/acpi_ec.c,v retrieving revision 1.65.2.2 diff -u -r1.65.2.2 acpi_ec.c --- dev/acpica/acpi_ec.c 11 May 2006 17:41:00 -0000 1.65.2.2 +++ dev/acpica/acpi_ec.c 31 Aug 2007 01:20:08 -0000 @@ -144,6 +144,7 @@ #include #include #include +#include #include #include Index: kern/kern_ktrace.c =================================================================== RCS file: /cvs/ncvs/src/sys/kern/kern_ktrace.c,v retrieving revision 1.101.2.5 diff -u -r1.101.2.5 kern_ktrace.c --- kern/kern_ktrace.c 6 Sep 2006 21:43:59 -0000 1.101.2.5 +++ kern/kern_ktrace.c 31 Aug 2007 00:39:59 -0000 @@ -53,6 +53,7 @@ #include #include #include +#include #include #include #include cvs diff: kern/kern_rwlock.c is a new entry, no comparison available Index: kern/kern_sx.c =================================================================== RCS file: /cvs/ncvs/src/sys/kern/kern_sx.c,v retrieving revision 1.25.2.4 diff -u -r1.25.2.4 kern_sx.c --- kern/kern_sx.c 17 Aug 2006 19:53:06 -0000 1.25.2.4 +++ kern/kern_sx.c 31 Aug 2007 01:48:11 -0000 @@ -1,12 +1,14 @@ /*- - * Copyright (C) 2001 Jason Evans . All rights reserved. + * Copyright (c) 2007 Attilio Rao + * Copyright (c) 2001 Jason Evans + * All rights reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice(s), this list of conditions and the following disclaimer as - * the first lines of this file unmodified other than the possible + * the first lines of this file unmodified other than the possible * addition of one or more copyright notices. * 2. Redistributions in binary form must reproduce the above copyright * notice(s), this list of conditions and the following disclaimer in the @@ -26,39 +28,90 @@ */ /* - * Shared/exclusive locks. This implementation assures deterministic lock - * granting behavior, so that slocks and xlocks are interleaved. + * Shared/exclusive locks. This implementation attempts to ensure + * deterministic lock granting behavior, so that slocks and xlocks are + * interleaved. * * Priority propagation will not generally raise the priority of lock holders, * so should not be relied upon in combination with sx locks. */ -#include -__FBSDID("$FreeBSD: src/sys/kern/kern_sx.c,v 1.25.2.4 2006/08/17 19:53:06 jhb Exp $"); - +#include "opt_adaptive_sx.h" #include "opt_ddb.h" +#include +__FBSDID("$FreeBSD: src/sys/kern/kern_sx.c,v 1.54 2007/07/06 13:20:44 attilio Exp $"); + #include -#include #include -#include -#include #include #include #include +#include #include +#include + +#ifdef ADAPTIVE_SX +#include +#endif #ifdef DDB #include +#endif + +#if !defined(SMP) && defined(ADAPTIVE_SX) +#error "You must have SMP to enable the ADAPTIVE_SX option" +#endif + +CTASSERT(((SX_ADAPTIVESPIN | SX_RECURSE) & LO_CLASSFLAGS) == + (SX_ADAPTIVESPIN | SX_RECURSE)); + +/* Handy macros for sleep queues. */ +#define SQ_EXCLUSIVE_QUEUE 0 +#define SQ_SHARED_QUEUE 1 +/* + * Variations on DROP_GIANT()/PICKUP_GIANT() for use in this file. We + * drop Giant anytime we have to sleep or if we adaptively spin. + */ +#define GIANT_DECLARE \ + int _giantcnt = 0; \ + WITNESS_SAVE_DECL(Giant) \ + +#define GIANT_SAVE() do { \ + if (mtx_owned(&Giant)) { \ + WITNESS_SAVE(&Giant.mtx_object, Giant); \ + while (mtx_owned(&Giant)) { \ + _giantcnt++; \ + mtx_unlock(&Giant); \ + } \ + } \ +} while (0) + +#define GIANT_RESTORE() do { \ + if (_giantcnt > 0) { \ + mtx_assert(&Giant, MA_NOTOWNED); \ + while (_giantcnt--) \ + mtx_lock(&Giant); \ + WITNESS_RESTORE(&Giant.mtx_object, Giant); \ + } \ +} while (0) + +/* + * Returns true if an exclusive lock is recursed. It assumes + * curthread currently has an exclusive lock. 
+ */ +#define sx_recursed(sx) ((sx)->sx_recurse != 0) + +#ifdef DDB static void db_show_sx(struct lock_object *lock); #endif struct lock_class lock_class_sx = { - "sx", - LC_SLEEPLOCK | LC_SLEEPABLE | LC_RECURSABLE | LC_UPGRADABLE, + .lc_name = "sx", + .lc_flags = LC_SLEEPLOCK | LC_SLEEPABLE | LC_RECURSABLE | LC_UPGRADABLE, #ifdef DDB - db_show_sx + .lc_ddb_show = db_show_sx, #endif }; @@ -75,243 +128,724 @@ } void -sx_init(struct sx *sx, const char *description) +sx_init_flags(struct sx *sx, const char *description, int opts) { + struct lock_object *lock; + int flags; - sx->sx_lock = mtx_pool_find(mtxpool_lockbuilder, sx); - sx->sx_cnt = 0; - cv_init(&sx->sx_shrd_cv, description); - sx->sx_shrd_wcnt = 0; - cv_init(&sx->sx_excl_cv, description); - sx->sx_excl_wcnt = 0; - sx->sx_xholder = NULL; - lock_init(&sx->sx_object, &lock_class_sx, description, NULL, - LO_WITNESS | LO_RECURSABLE | LO_SLEEPABLE | LO_UPGRADABLE); + MPASS((opts & ~(SX_QUIET | SX_RECURSE | SX_NOWITNESS | SX_DUPOK | + SX_NOPROFILE | SX_ADAPTIVESPIN)) == 0); + + bzero(sx, sizeof(*sx)); + + flags = LO_RECURSABLE | LO_SLEEPABLE | LO_UPGRADABLE; + if (opts & SX_DUPOK) + flags |= LO_DUPOK; + if (!(opts & SX_NOWITNESS)) + flags |= LO_WITNESS; + if (opts & SX_QUIET) + flags |= LO_QUIET; + + flags |= opts & (SX_ADAPTIVESPIN | SX_RECURSE); + sx->sx_lock = SX_LOCK_UNLOCKED; + sx->sx_recurse = 0; + lock = &sx->lock_object; + lock->lo_class = &lock_class_sx; + lock->lo_flags = flags; + lock->lo_name = lock->lo_type = description; + LOCK_LOG_INIT(lock, opts); + WITNESS_INIT(lock); } void sx_destroy(struct sx *sx) { + LOCK_LOG_DESTROY(&sx->lock_object, 0); - KASSERT((sx->sx_cnt == 0 && sx->sx_shrd_wcnt == 0 && sx->sx_excl_wcnt == - 0), ("%s (%s): holders or waiters\n", __func__, - sx->sx_object.lo_name)); - - sx->sx_lock = NULL; - cv_destroy(&sx->sx_shrd_cv); - cv_destroy(&sx->sx_excl_cv); - - lock_destroy(&sx->sx_object); + KASSERT(sx->sx_lock == SX_LOCK_UNLOCKED, ("sx lock still held")); + KASSERT(sx->sx_recurse == 0, ("sx lock still recursed")); + sx->sx_lock = SX_LOCK_DESTROYED; + WITNESS_DESTROY(&sx->lock_object); } -void -_sx_slock(struct sx *sx, const char *file, int line) +int +_sx_slock(struct sx *sx, int opts, const char *file, int line) { + int error = 0; - mtx_lock(sx->sx_lock); - KASSERT(sx->sx_xholder != curthread, - ("%s (%s): slock while xlock is held @ %s:%d\n", __func__, - sx->sx_object.lo_name, file, line)); - WITNESS_CHECKORDER(&sx->sx_object, LOP_NEWORDER, file, line); - - /* - * Loop in case we lose the race for lock acquisition. - */ - while (sx->sx_cnt < 0) { - sx->sx_shrd_wcnt++; - cv_wait(&sx->sx_shrd_cv, sx->sx_lock); - sx->sx_shrd_wcnt--; + MPASS(curthread != NULL); + KASSERT(sx->sx_lock != SX_LOCK_DESTROYED, + ("sx_slock() of destroyed sx @ %s:%d", file, line)); + WITNESS_CHECKORDER(&sx->lock_object, LOP_NEWORDER, file, line); + error = __sx_slock(sx, opts, file, line); + if (!error) { + LOCK_LOG_LOCK("SLOCK", &sx->lock_object, 0, 0, file, line); + WITNESS_LOCK(&sx->lock_object, 0, file, line); + curthread->td_locks++; } - /* Acquire a shared lock. 
*/ - sx->sx_cnt++; - - LOCK_LOG_LOCK("SLOCK", &sx->sx_object, 0, 0, file, line); - WITNESS_LOCK(&sx->sx_object, 0, file, line); - curthread->td_locks++; - - mtx_unlock(sx->sx_lock); + return (error); } int _sx_try_slock(struct sx *sx, const char *file, int line) { + uintptr_t x; - mtx_lock(sx->sx_lock); - if (sx->sx_cnt >= 0) { - sx->sx_cnt++; - LOCK_LOG_TRY("SLOCK", &sx->sx_object, 0, 1, file, line); - WITNESS_LOCK(&sx->sx_object, LOP_TRYLOCK, file, line); + x = sx->sx_lock; + KASSERT(x != SX_LOCK_DESTROYED, + ("sx_try_slock() of destroyed sx @ %s:%d", file, line)); + if ((x & SX_LOCK_SHARED) && atomic_cmpset_acq_ptr(&sx->sx_lock, x, + x + SX_ONE_SHARER)) { + LOCK_LOG_TRY("SLOCK", &sx->lock_object, 0, 1, file, line); + WITNESS_LOCK(&sx->lock_object, LOP_TRYLOCK, file, line); curthread->td_locks++; - mtx_unlock(sx->sx_lock); return (1); - } else { - LOCK_LOG_TRY("SLOCK", &sx->sx_object, 0, 0, file, line); - mtx_unlock(sx->sx_lock); - return (0); } + + LOCK_LOG_TRY("SLOCK", &sx->lock_object, 0, 0, file, line); + return (0); } -void -_sx_xlock(struct sx *sx, const char *file, int line) +int +_sx_xlock(struct sx *sx, int opts, const char *file, int line) { + int error = 0; - mtx_lock(sx->sx_lock); - - /* - * With sx locks, we're absolutely not permitted to recurse on - * xlocks, as it is fatal (deadlock). Normally, recursion is handled - * by WITNESS, but as it is not semantically correct to hold the - * xlock while in here, we consider it API abuse and put it under - * INVARIANTS. - */ - KASSERT(sx->sx_xholder != curthread, - ("%s (%s): xlock already held @ %s:%d", __func__, - sx->sx_object.lo_name, file, line)); - WITNESS_CHECKORDER(&sx->sx_object, LOP_NEWORDER | LOP_EXCLUSIVE, file, + MPASS(curthread != NULL); + KASSERT(sx->sx_lock != SX_LOCK_DESTROYED, + ("sx_xlock() of destroyed sx @ %s:%d", file, line)); + WITNESS_CHECKORDER(&sx->lock_object, LOP_NEWORDER | LOP_EXCLUSIVE, file, line); - - /* Loop in case we lose the race for lock acquisition. */ - while (sx->sx_cnt != 0) { - sx->sx_excl_wcnt++; - cv_wait(&sx->sx_excl_cv, sx->sx_lock); - sx->sx_excl_wcnt--; + error = __sx_xlock(sx, curthread, opts, file, line); + if (!error) { + LOCK_LOG_LOCK("XLOCK", &sx->lock_object, 0, sx->sx_recurse, + file, line); + WITNESS_LOCK(&sx->lock_object, LOP_EXCLUSIVE, file, line); + curthread->td_locks++; } - MPASS(sx->sx_cnt == 0); - - /* Acquire an exclusive lock. 
*/ - sx->sx_cnt--; - sx->sx_xholder = curthread; - - LOCK_LOG_LOCK("XLOCK", &sx->sx_object, 0, 0, file, line); - WITNESS_LOCK(&sx->sx_object, LOP_EXCLUSIVE, file, line); - curthread->td_locks++; - - mtx_unlock(sx->sx_lock); + return (error); } int _sx_try_xlock(struct sx *sx, const char *file, int line) { + int rval; - mtx_lock(sx->sx_lock); - if (sx->sx_cnt == 0) { - sx->sx_cnt--; - sx->sx_xholder = curthread; - LOCK_LOG_TRY("XLOCK", &sx->sx_object, 0, 1, file, line); - WITNESS_LOCK(&sx->sx_object, LOP_EXCLUSIVE | LOP_TRYLOCK, file, - line); + MPASS(curthread != NULL); + KASSERT(sx->sx_lock != SX_LOCK_DESTROYED, + ("sx_try_xlock() of destroyed sx @ %s:%d", file, line)); + + if (sx_xlocked(sx) && (sx->lock_object.lo_flags & SX_RECURSE) != 0) { + sx->sx_recurse++; + atomic_set_ptr(&sx->sx_lock, SX_LOCK_RECURSED); + rval = 1; + } else + rval = atomic_cmpset_acq_ptr(&sx->sx_lock, SX_LOCK_UNLOCKED, + (uintptr_t)curthread); + LOCK_LOG_TRY("XLOCK", &sx->lock_object, 0, rval, file, line); + if (rval) { + WITNESS_LOCK(&sx->lock_object, LOP_EXCLUSIVE | LOP_TRYLOCK, + file, line); curthread->td_locks++; - mtx_unlock(sx->sx_lock); - return (1); - } else { - LOCK_LOG_TRY("XLOCK", &sx->sx_object, 0, 0, file, line); - mtx_unlock(sx->sx_lock); - return (0); } + + return (rval); } void _sx_sunlock(struct sx *sx, const char *file, int line) { - _sx_assert(sx, SX_SLOCKED, file, line); - mtx_lock(sx->sx_lock); + MPASS(curthread != NULL); + KASSERT(sx->sx_lock != SX_LOCK_DESTROYED, + ("sx_sunlock() of destroyed sx @ %s:%d", file, line)); + _sx_assert(sx, SA_SLOCKED, file, line); + curthread->td_locks--; + WITNESS_UNLOCK(&sx->lock_object, 0, file, line); + LOCK_LOG_LOCK("SUNLOCK", &sx->lock_object, 0, 0, file, line); +#ifdef LOCK_PROFILING_SHARED + if (SX_SHARERS(sx->sx_lock) == 1) + lock_profile_release_lock(&sx->lock_object); +#endif + __sx_sunlock(sx, file, line); +} + +void +_sx_xunlock(struct sx *sx, const char *file, int line) +{ + MPASS(curthread != NULL); + KASSERT(sx->sx_lock != SX_LOCK_DESTROYED, + ("sx_xunlock() of destroyed sx @ %s:%d", file, line)); + _sx_assert(sx, SA_XLOCKED, file, line); curthread->td_locks--; - WITNESS_UNLOCK(&sx->sx_object, 0, file, line); + WITNESS_UNLOCK(&sx->lock_object, LOP_EXCLUSIVE, file, line); + LOCK_LOG_LOCK("XUNLOCK", &sx->lock_object, 0, sx->sx_recurse, file, + line); + if (!sx_recursed(sx)) + lock_profile_release_lock(&sx->lock_object); + __sx_xunlock(sx, curthread, file, line); +} - /* Release. */ - sx->sx_cnt--; +/* + * Try to do a non-blocking upgrade from a shared lock to an exclusive lock. + * This will only succeed if this thread holds a single shared lock. + * Return 1 if if the upgrade succeed, 0 otherwise. + */ +int +_sx_try_upgrade(struct sx *sx, const char *file, int line) +{ + uintptr_t x; + int success; + + KASSERT(sx->sx_lock != SX_LOCK_DESTROYED, + ("sx_try_upgrade() of destroyed sx @ %s:%d", file, line)); + _sx_assert(sx, SA_SLOCKED, file, line); /* - * If we just released the last shared lock, wake any waiters up, giving - * exclusive lockers precedence. In order to make sure that exclusive - * lockers won't be blocked forever, don't wake shared lock waiters if - * there are exclusive lock waiters. + * Try to switch from one shared lock to an exclusive lock. We need + * to maintain the SX_LOCK_EXCLUSIVE_WAITERS flag if set so that + * we will wake up the exclusive waiters when we drop the lock. 
*/ - if (sx->sx_excl_wcnt > 0) { - if (sx->sx_cnt == 0) - cv_signal(&sx->sx_excl_cv); - } else if (sx->sx_shrd_wcnt > 0) - cv_broadcast(&sx->sx_shrd_cv); - - LOCK_LOG_LOCK("SUNLOCK", &sx->sx_object, 0, 0, file, line); - - mtx_unlock(sx->sx_lock); + x = sx->sx_lock & SX_LOCK_EXCLUSIVE_WAITERS; + success = atomic_cmpset_ptr(&sx->sx_lock, SX_SHARERS_LOCK(1) | x, + (uintptr_t)curthread | x); + LOCK_LOG_TRY("XUPGRADE", &sx->lock_object, 0, success, file, line); + if (success) + WITNESS_UPGRADE(&sx->lock_object, LOP_EXCLUSIVE | LOP_TRYLOCK, + file, line); + return (success); } +/* + * Downgrade an unrecursed exclusive lock into a single shared lock. + */ void -_sx_xunlock(struct sx *sx, const char *file, int line) +_sx_downgrade(struct sx *sx, const char *file, int line) { + uintptr_t x; - _sx_assert(sx, SX_XLOCKED, file, line); - mtx_lock(sx->sx_lock); - MPASS(sx->sx_cnt == -1); + KASSERT(sx->sx_lock != SX_LOCK_DESTROYED, + ("sx_downgrade() of destroyed sx @ %s:%d", file, line)); + _sx_assert(sx, SA_XLOCKED | SA_NOTRECURSED, file, line); +#ifndef INVARIANTS + if (sx_recursed(sx)) + panic("downgrade of a recursed lock"); +#endif - curthread->td_locks--; - WITNESS_UNLOCK(&sx->sx_object, LOP_EXCLUSIVE, file, line); + WITNESS_DOWNGRADE(&sx->lock_object, 0, file, line); - /* Release. */ - sx->sx_cnt++; - sx->sx_xholder = NULL; + /* + * Try to switch from an exclusive lock with no shared waiters + * to one sharer with no shared waiters. If there are + * exclusive waiters, we don't need to lock the sleep queue so + * long as we preserve the flag. We do one quick try and if + * that fails we grab the sleepq lock to keep the flags from + * changing and do it the slow way. + * + * We have to lock the sleep queue if there are shared waiters + * so we can wake them up. + */ + x = sx->sx_lock; + if (!(x & SX_LOCK_SHARED_WAITERS) && + atomic_cmpset_rel_ptr(&sx->sx_lock, x, SX_SHARERS_LOCK(1) | + (x & SX_LOCK_EXCLUSIVE_WAITERS))) { + LOCK_LOG_LOCK("XDOWNGRADE", &sx->lock_object, 0, 0, file, line); + return; + } /* - * Wake up waiters if there are any. Give precedence to slock waiters. + * Lock the sleep queue so we can read the waiters bits + * without any races and wakeup any shared waiters. */ - if (sx->sx_shrd_wcnt > 0) - cv_broadcast(&sx->sx_shrd_cv); - else if (sx->sx_excl_wcnt > 0) - cv_signal(&sx->sx_excl_cv); + sleepq_lock(&sx->lock_object); - LOCK_LOG_LOCK("XUNLOCK", &sx->sx_object, 0, 0, file, line); + /* + * Preserve SX_LOCK_EXCLUSIVE_WAITERS while downgraded to a single + * shared lock. If there are any shared waiters, wake them up. + */ + x = sx->sx_lock; + atomic_store_rel_ptr(&sx->sx_lock, SX_SHARERS_LOCK(1) | + (x & SX_LOCK_EXCLUSIVE_WAITERS)); + if (x & SX_LOCK_SHARED_WAITERS) + sleepq_broadcast_queue(&sx->lock_object, SLEEPQ_SX, -1, + SQ_SHARED_QUEUE); + else + sleepq_release(&sx->lock_object); - mtx_unlock(sx->sx_lock); + LOCK_LOG_LOCK("XDOWNGRADE", &sx->lock_object, 0, 0, file, line); } +/* + * This function represents the so-called 'hard case' for sx_xlock + * operation. All 'easy case' failures are redirected to this. Note + * that ideally this would be a static function, but it needs to be + * accessible from at least sx.h. 
+ */ int -_sx_try_upgrade(struct sx *sx, const char *file, int line) +_sx_xlock_hard(struct sx *sx, uintptr_t tid, int opts, const char *file, + int line) { + GIANT_DECLARE; +#ifdef ADAPTIVE_SX + volatile struct thread *owner; +#endif + /* uint64_t waittime = 0; */ + uintptr_t x; + int /* contested = 0, */error = 0; + + /* If we already hold an exclusive lock, then recurse. */ + if (sx_xlocked(sx)) { + KASSERT((sx->lock_object.lo_flags & SX_RECURSE) != 0, + ("_sx_xlock_hard: recursed on non-recursive sx %s @ %s:%d\n", + sx->lock_object.lo_name, file, line)); + sx->sx_recurse++; + atomic_set_ptr(&sx->sx_lock, SX_LOCK_RECURSED); + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR2(KTR_LOCK, "%s: %p recursing", __func__, sx); + return (0); + } - _sx_assert(sx, SX_SLOCKED, file, line); - mtx_lock(sx->sx_lock); + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR5(KTR_LOCK, "%s: %s contested (lock=%p) at %s:%d", __func__, + sx->lock_object.lo_name, (void *)sx->sx_lock, file, line); - if (sx->sx_cnt == 1) { - sx->sx_cnt = -1; - sx->sx_xholder = curthread; + while (!atomic_cmpset_acq_ptr(&sx->sx_lock, SX_LOCK_UNLOCKED, tid)) { +#ifdef ADAPTIVE_SX + /* + * If the lock is write locked and the owner is + * running on another CPU, spin until the owner stops + * running or the state of the lock changes. + */ + x = sx->sx_lock; + if (!(x & SX_LOCK_SHARED) && + (sx->lock_object.lo_flags & SX_ADAPTIVESPIN)) { + x = SX_OWNER(x); + owner = (struct thread *)x; + if (TD_IS_RUNNING(owner)) { + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR3(KTR_LOCK, + "%s: spinning on %p held by %p", + __func__, sx, owner); + GIANT_SAVE(); + lock_profile_obtain_lock_failed( + &sx->lock_object, &contested, &waittime); + while (SX_OWNER(sx->sx_lock) == x && + TD_IS_RUNNING(owner)) + cpu_spinwait(); + continue; + } + } +#endif - LOCK_LOG_TRY("XUPGRADE", &sx->sx_object, 0, 1, file, line); - WITNESS_UPGRADE(&sx->sx_object, LOP_EXCLUSIVE | LOP_TRYLOCK, - file, line); + sleepq_lock(&sx->lock_object); + x = sx->sx_lock; - mtx_unlock(sx->sx_lock); - return (1); - } else { - LOCK_LOG_TRY("XUPGRADE", &sx->sx_object, 0, 0, file, line); - mtx_unlock(sx->sx_lock); - return (0); + /* + * If the lock was released while spinning on the + * sleep queue chain lock, try again. + */ + if (x == SX_LOCK_UNLOCKED) { + sleepq_release(&sx->lock_object); + continue; + } + +#ifdef ADAPTIVE_SX + /* + * The current lock owner might have started executing + * on another CPU (or the lock could have changed + * owners) while we were waiting on the sleep queue + * chain lock. If so, drop the sleep queue lock and try + * again. + */ + if (!(x & SX_LOCK_SHARED) && + (sx->lock_object.lo_flags & SX_ADAPTIVESPIN)) { + owner = (struct thread *)SX_OWNER(x); + if (TD_IS_RUNNING(owner)) { + sleepq_release(&sx->lock_object); + continue; + } + } +#endif + + /* + * If an exclusive lock was released with both shared + * and exclusive waiters and a shared waiter hasn't + * woken up and acquired the lock yet, sx_lock will be + * set to SX_LOCK_UNLOCKED | SX_LOCK_EXCLUSIVE_WAITERS. + * If we see that value, try to acquire it once. Note + * that we have to preserve SX_LOCK_EXCLUSIVE_WAITERS + * as there are other exclusive waiters still. If we + * fail, restart the loop. 
+ */ + if (x == (SX_LOCK_UNLOCKED | SX_LOCK_EXCLUSIVE_WAITERS)) { + if (atomic_cmpset_acq_ptr(&sx->sx_lock, + SX_LOCK_UNLOCKED | SX_LOCK_EXCLUSIVE_WAITERS, + tid | SX_LOCK_EXCLUSIVE_WAITERS)) { + sleepq_release(&sx->lock_object); + CTR2(KTR_LOCK, "%s: %p claimed by new writer", + __func__, sx); + break; + } + sleepq_release(&sx->lock_object); + continue; + } + + /* + * Try to set the SX_LOCK_EXCLUSIVE_WAITERS. If we fail, + * than loop back and retry. + */ + if (!(x & SX_LOCK_EXCLUSIVE_WAITERS)) { + if (!atomic_cmpset_ptr(&sx->sx_lock, x, + x | SX_LOCK_EXCLUSIVE_WAITERS)) { + sleepq_release(&sx->lock_object); + continue; + } + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR2(KTR_LOCK, "%s: %p set excl waiters flag", + __func__, sx); + } + + /* + * Since we have been unable to acquire the exclusive + * lock and the exclusive waiters flag is set, we have + * to sleep. + */ +#if 0 + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR2(KTR_LOCK, "%s: %p blocking on sleep queue", + __func__, sx); +#endif + + GIANT_SAVE(); + lock_profile_obtain_lock_failed(&sx->lock_object, &contested, + &waittime); + sleepq_add_queue(&sx->lock_object, NULL, sx->lock_object.lo_name, + SLEEPQ_SX | ((opts & SX_INTERRUPTIBLE) ? + SLEEPQ_INTERRUPTIBLE : 0), SQ_EXCLUSIVE_QUEUE); + if (!(opts & SX_INTERRUPTIBLE)) + sleepq_wait(&sx->lock_object); + else + error = sleepq_wait_sig(&sx->lock_object); + + if (error) { + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR2(KTR_LOCK, + "%s: interruptible sleep by %p suspended by signal", + __func__, sx); + break; + } + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR2(KTR_LOCK, "%s: %p resuming from sleep queue", + __func__, sx); } + + GIANT_RESTORE(); + if (!error) + lock_profile_obtain_lock_success(&sx->lock_object, contested, + waittime, file, line); + return (error); } +/* + * This function represents the so-called 'hard case' for sx_xunlock + * operation. All 'easy case' failures are redirected to this. Note + * that ideally this would be a static function, but it needs to be + * accessible from at least sx.h. + */ void -_sx_downgrade(struct sx *sx, const char *file, int line) +_sx_xunlock_hard(struct sx *sx, uintptr_t tid, const char *file, int line) +{ + uintptr_t x; + int queue; + + MPASS(!(sx->sx_lock & SX_LOCK_SHARED)); + + /* If the lock is recursed, then unrecurse one level. */ + if (sx_xlocked(sx) && sx_recursed(sx)) { + if ((--sx->sx_recurse) == 0) + atomic_clear_ptr(&sx->sx_lock, SX_LOCK_RECURSED); + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR2(KTR_LOCK, "%s: %p unrecursing", __func__, sx); + return; + } + MPASS(sx->sx_lock & (SX_LOCK_SHARED_WAITERS | + SX_LOCK_EXCLUSIVE_WAITERS)); + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR2(KTR_LOCK, "%s: %p contested", __func__, sx); + + sleepq_lock(&sx->lock_object); + x = SX_LOCK_UNLOCKED; + + /* + * The wake up algorithm here is quite simple and probably not + * ideal. It gives precedence to shared waiters if they are + * present. For this condition, we have to preserve the + * state of the exclusive waiters flag. + */ + if (sx->sx_lock & SX_LOCK_SHARED_WAITERS) { + queue = SQ_SHARED_QUEUE; + x |= (sx->sx_lock & SX_LOCK_EXCLUSIVE_WAITERS); + } else + queue = SQ_EXCLUSIVE_QUEUE; + + /* Wake up all the waiters for the specific queue. */ + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR3(KTR_LOCK, "%s: %p waking up all threads on %s queue", + __func__, sx, queue == SQ_SHARED_QUEUE ? 
"shared" : + "exclusive"); + atomic_store_rel_ptr(&sx->sx_lock, x); + sleepq_broadcast_queue(&sx->lock_object, SLEEPQ_SX, -1, queue); +} + +/* + * This function represents the so-called 'hard case' for sx_slock + * operation. All 'easy case' failures are redirected to this. Note + * that ideally this would be a static function, but it needs to be + * accessible from at least sx.h. + */ +int +_sx_slock_hard(struct sx *sx, int opts, const char *file, int line) +{ + GIANT_DECLARE; +#ifdef ADAPTIVE_SX + volatile struct thread *owner; +#endif +#ifdef LOCK_PROFILING_SHARED + uint64_t waittime = 0; + int contested = 0; +#endif + uintptr_t x; + int error = 0; + + /* + * As with rwlocks, we don't make any attempt to try to block + * shared locks once there is an exclusive waiter. + */ + for (;;) { + x = sx->sx_lock; + + /* + * If no other thread has an exclusive lock then try to bump up + * the count of sharers. Since we have to preserve the state + * of SX_LOCK_EXCLUSIVE_WAITERS, if we fail to acquire the + * shared lock loop back and retry. + */ + if (x & SX_LOCK_SHARED) { + MPASS(!(x & SX_LOCK_SHARED_WAITERS)); + if (atomic_cmpset_acq_ptr(&sx->sx_lock, x, + x + SX_ONE_SHARER)) { +#ifdef LOCK_PROFILING_SHARED + if (SX_SHARERS(x) == 0) + lock_profile_obtain_lock_success( + &sx->lock_object, contested, + waittime, file, line); +#endif + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR4(KTR_LOCK, + "%s: %p succeed %p -> %p", __func__, + sx, (void *)x, + (void *)(x + SX_ONE_SHARER)); + break; + } + continue; + } + +#ifdef ADAPTIVE_SX + /* + * If the owner is running on another CPU, spin until + * the owner stops running or the state of the lock + * changes. + */ + else if (sx->lock_object.lo_flags & SX_ADAPTIVESPIN) { + x = SX_OWNER(x); + owner = (struct thread *)x; + if (TD_IS_RUNNING(owner)) { + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR3(KTR_LOCK, + "%s: spinning on %p held by %p", + __func__, sx, owner); + GIANT_SAVE(); +#ifdef LOCK_PROFILING_SHARED + lock_profile_obtain_lock_failed( + &sx->lock_object, &contested, &waittime); +#endif + while (SX_OWNER(sx->sx_lock) == x && + TD_IS_RUNNING(owner)) + cpu_spinwait(); + continue; + } + } +#endif + + /* + * Some other thread already has an exclusive lock, so + * start the process of blocking. + */ + sleepq_lock(&sx->lock_object); + x = sx->sx_lock; + + /* + * The lock could have been released while we spun. + * In this case loop back and retry. + */ + if (x & SX_LOCK_SHARED) { + sleepq_release(&sx->lock_object); + continue; + } + +#ifdef ADAPTIVE_SX + /* + * If the owner is running on another CPU, spin until + * the owner stops running or the state of the lock + * changes. + */ + if (!(x & SX_LOCK_SHARED) && + (sx->lock_object.lo_flags & SX_ADAPTIVESPIN)) { + owner = (struct thread *)SX_OWNER(x); + if (TD_IS_RUNNING(owner)) { + sleepq_release(&sx->lock_object); + continue; + } + } +#endif + + /* + * Try to set the SX_LOCK_SHARED_WAITERS flag. If we + * fail to set it drop the sleep queue lock and loop + * back. + */ + if (!(x & SX_LOCK_SHARED_WAITERS)) { + if (!atomic_cmpset_ptr(&sx->sx_lock, x, + x | SX_LOCK_SHARED_WAITERS)) { + sleepq_release(&sx->lock_object); + continue; + } + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR2(KTR_LOCK, "%s: %p set shared waiters flag", + __func__, sx); + } + + /* + * Since we have been unable to acquire the shared lock, + * we have to sleep. 
+ */ + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR2(KTR_LOCK, "%s: %p blocking on sleep queue", + __func__, sx); + + GIANT_SAVE(); +#ifdef LOCK_PROFILING_SHARED + lock_profile_obtain_lock_failed(&sx->lock_object, &contested, + &waittime); +#endif + sleepq_add_queue(&sx->lock_object, NULL, sx->lock_object.lo_name, + SLEEPQ_SX | ((opts & SX_INTERRUPTIBLE) ? + SLEEPQ_INTERRUPTIBLE : 0), SQ_SHARED_QUEUE); + if (!(opts & SX_INTERRUPTIBLE)) + sleepq_wait(&sx->lock_object); + else + error = sleepq_wait_sig(&sx->lock_object); + + if (error) { + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR2(KTR_LOCK, + "%s: interruptible sleep by %p suspended by signal", + __func__, sx); + break; + } + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR2(KTR_LOCK, "%s: %p resuming from sleep queue", + __func__, sx); + } + + GIANT_RESTORE(); + return (error); +} + +/* + * This function represents the so-called 'hard case' for sx_sunlock + * operation. All 'easy case' failures are redirected to this. Note + * that ideally this would be a static function, but it needs to be + * accessible from at least sx.h. + */ +void +_sx_sunlock_hard(struct sx *sx, const char *file, int line) { + uintptr_t x; + + for (;;) { + x = sx->sx_lock; + + /* + * We should never have sharers while at least one thread + * holds a shared lock. + */ + KASSERT(!(x & SX_LOCK_SHARED_WAITERS), + ("%s: waiting sharers", __func__)); - _sx_assert(sx, SX_XLOCKED, file, line); - mtx_lock(sx->sx_lock); - MPASS(sx->sx_cnt == -1); + /* + * See if there is more than one shared lock held. If + * so, just drop one and return. + */ + if (SX_SHARERS(x) > 1) { + if (atomic_cmpset_ptr(&sx->sx_lock, x, + x - SX_ONE_SHARER)) { + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR4(KTR_LOCK, + "%s: %p succeeded %p -> %p", + __func__, sx, (void *)x, + (void *)(x - SX_ONE_SHARER)); + break; + } + continue; + } - WITNESS_DOWNGRADE(&sx->sx_object, 0, file, line); + /* + * If there aren't any waiters for an exclusive lock, + * then try to drop it quickly. + */ + if (!(x & SX_LOCK_EXCLUSIVE_WAITERS)) { + MPASS(x == SX_SHARERS_LOCK(1)); + if (atomic_cmpset_ptr(&sx->sx_lock, SX_SHARERS_LOCK(1), + SX_LOCK_UNLOCKED)) { + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR2(KTR_LOCK, "%s: %p last succeeded", + __func__, sx); + break; + } + continue; + } - sx->sx_cnt = 1; - sx->sx_xholder = NULL; - if (sx->sx_shrd_wcnt > 0) - cv_broadcast(&sx->sx_shrd_cv); + /* + * At this point, there should just be one sharer with + * exclusive waiters. + */ + MPASS(x == (SX_SHARERS_LOCK(1) | SX_LOCK_EXCLUSIVE_WAITERS)); - LOCK_LOG_LOCK("XDOWNGRADE", &sx->sx_object, 0, 0, file, line); + sleepq_lock(&sx->lock_object); - mtx_unlock(sx->sx_lock); + /* + * Wake up semantic here is quite simple: + * Just wake up all the exclusive waiters. + * Note that the state of the lock could have changed, + * so if it fails loop back and retry. 
+ */ + if (!atomic_cmpset_ptr(&sx->sx_lock, + SX_SHARERS_LOCK(1) | SX_LOCK_EXCLUSIVE_WAITERS, + SX_LOCK_UNLOCKED)) { + sleepq_release(&sx->lock_object); + continue; + } + if (LOCK_LOG_TEST(&sx->lock_object, 0)) + CTR2(KTR_LOCK, "%s: %p waking up all thread on" + "exclusive queue", __func__, sx); + sleepq_broadcast_queue(&sx->lock_object, SLEEPQ_SX, -1, + SQ_EXCLUSIVE_QUEUE); + break; + } } #ifdef INVARIANT_SUPPORT @@ -327,44 +861,76 @@ void _sx_assert(struct sx *sx, int what, const char *file, int line) { +#ifndef WITNESS + int slocked = 0; +#endif if (panicstr != NULL) return; switch (what) { - case SX_LOCKED: - case SX_SLOCKED: + case SA_SLOCKED: + case SA_SLOCKED | SA_NOTRECURSED: + case SA_SLOCKED | SA_RECURSED: +#ifndef WITNESS + slocked = 1; + /* FALLTHROUGH */ +#endif + case SA_LOCKED: + case SA_LOCKED | SA_NOTRECURSED: + case SA_LOCKED | SA_RECURSED: #ifdef WITNESS - witness_assert(&sx->sx_object, what, file, line); + witness_assert(&sx->lock_object, what, file, line); #else - mtx_lock(sx->sx_lock); - if (sx->sx_cnt <= 0 && - (what == SX_SLOCKED || sx->sx_xholder != curthread)) + /* + * If some other thread has an exclusive lock or we + * have one and are asserting a shared lock, fail. + * Also, if no one has a lock at all, fail. + */ + if (sx->sx_lock == SX_LOCK_UNLOCKED || + (!(sx->sx_lock & SX_LOCK_SHARED) && (slocked || + sx_xholder(sx) != curthread))) panic("Lock %s not %slocked @ %s:%d\n", - sx->sx_object.lo_name, (what == SX_SLOCKED) ? - "share " : "", file, line); - mtx_unlock(sx->sx_lock); + sx->lock_object.lo_name, slocked ? "share " : "", + file, line); + + if (!(sx->sx_lock & SX_LOCK_SHARED)) { + if (sx_recursed(sx)) { + if (what & SA_NOTRECURSED) + panic("Lock %s recursed @ %s:%d\n", + sx->lock_object.lo_name, file, + line); + } else if (what & SA_RECURSED) + panic("Lock %s not recursed @ %s:%d\n", + sx->lock_object.lo_name, file, line); + } #endif break; - case SX_XLOCKED: - mtx_lock(sx->sx_lock); - if (sx->sx_xholder != curthread) + case SA_XLOCKED: + case SA_XLOCKED | SA_NOTRECURSED: + case SA_XLOCKED | SA_RECURSED: + if (sx_xholder(sx) != curthread) panic("Lock %s not exclusively locked @ %s:%d\n", - sx->sx_object.lo_name, file, line); - mtx_unlock(sx->sx_lock); + sx->lock_object.lo_name, file, line); + if (sx_recursed(sx)) { + if (what & SA_NOTRECURSED) + panic("Lock %s recursed @ %s:%d\n", + sx->lock_object.lo_name, file, line); + } else if (what & SA_RECURSED) + panic("Lock %s not recursed @ %s:%d\n", + sx->lock_object.lo_name, file, line); break; - case SX_UNLOCKED: + case SA_UNLOCKED: #ifdef WITNESS - witness_assert(&sx->sx_object, what, file, line); + witness_assert(&sx->lock_object, what, file, line); #else /* - * We are able to check only exclusive lock here, - * we cannot assert that *this* thread owns slock. + * If we hold an exclusve lock fail. We can't + * reliably check to see if we hold a shared lock or + * not. 
*/ - mtx_lock(sx->sx_lock); - if (sx->sx_xholder == curthread) + if (sx_xholder(sx) == curthread) panic("Lock %s exclusively locked @ %s:%d\n", - sx->sx_object.lo_name, file, line); - mtx_unlock(sx->sx_lock); + sx->lock_object.lo_name, file, line); #endif break; default: @@ -375,7 +941,7 @@ #endif /* INVARIANT_SUPPORT */ #ifdef DDB -void +static void db_show_sx(struct lock_object *lock) { struct thread *td; @@ -384,16 +950,36 @@ sx = (struct sx *)lock; db_printf(" state: "); - if (sx->sx_cnt < 0) { - td = sx->sx_xholder; + if (sx->sx_lock == SX_LOCK_UNLOCKED) + db_printf("UNLOCKED\n"); + else if (sx->sx_lock == SX_LOCK_DESTROYED) { + db_printf("DESTROYED\n"); + return; + } else if (sx->sx_lock & SX_LOCK_SHARED) + db_printf("SLOCK: %ju\n", (uintmax_t)SX_SHARERS(sx->sx_lock)); + else { + td = sx_xholder(sx); db_printf("XLOCK: %p (tid %d, pid %d, \"%s\")\n", td, td->td_tid, td->td_proc->p_pid, td->td_proc->p_comm); - } else if (sx->sx_cnt > 0) - db_printf("SLOCK: %d locks\n", sx->sx_cnt); - else - db_printf("UNLOCKED\n"); - db_printf(" waiters: %d shared, %d exclusive\n", sx->sx_shrd_wcnt, - sx->sx_excl_wcnt); + if (sx_recursed(sx)) + db_printf(" recursed: %d\n", sx->sx_recurse); + } + + db_printf(" waiters: "); + switch(sx->sx_lock & + (SX_LOCK_SHARED_WAITERS | SX_LOCK_EXCLUSIVE_WAITERS)) { + case SX_LOCK_SHARED_WAITERS: + db_printf("shared\n"); + break; + case SX_LOCK_EXCLUSIVE_WAITERS: + db_printf("exclusive\n"); + break; + case SX_LOCK_SHARED_WAITERS | SX_LOCK_EXCLUSIVE_WAITERS: + db_printf("exclusive and shared\n"); + break; + default: + db_printf("none\n"); + } } /* @@ -405,47 +991,26 @@ sx_chain(struct thread *td, struct thread **ownerp) { struct sx *sx; - struct cv *cv; /* - * First, see if it looks like td is blocked on a condition - * variable. + * Check to see if this thread is blocked on an sx lock. + * First, we check the lock class. If that is ok, then we + * compare the lock name against the wait message. */ - cv = td->td_wchan; - if (cv->cv_description != td->td_wmesg) +#define LOCK_CLASS(lo) (lo)->lo_class + sx = td->td_wchan; + if (LOCK_CLASS(&sx->lock_object) != &lock_class_sx || + sx->lock_object.lo_name != td->td_wmesg) return (0); - /* - * Ok, see if it looks like td is blocked on the exclusive - * condition variable. - */ - sx = (struct sx *)((char *)cv - offsetof(struct sx, sx_excl_cv)); - if (LOCK_CLASS(&sx->sx_object) == &lock_class_sx && - sx->sx_excl_wcnt > 0) - goto ok; - - /* - * Second, see if it looks like td is blocked on the shared - * condition variable. - */ - sx = (struct sx *)((char *)cv - offsetof(struct sx, sx_shrd_cv)); - if (LOCK_CLASS(&sx->sx_object) == &lock_class_sx && - sx->sx_shrd_wcnt > 0) - goto ok; - - /* Doesn't seem to be an sx lock. */ - return (0); - -ok: /* We think we have an sx lock, so output some details. 
*/ db_printf("blocked on sx \"%s\" ", td->td_wmesg); - if (sx->sx_cnt >= 0) { - db_printf("SLOCK (count %d)\n", sx->sx_cnt); - *ownerp = NULL; - } else { + *ownerp = sx_xholder(sx); + if (sx->sx_lock & SX_LOCK_SHARED) + db_printf("SLOCK (count %ju)\n", + (uintmax_t)SX_SHARERS(sx->sx_lock)); + else db_printf("XLOCK\n"); - *ownerp = sx->sx_xholder; - } return (1); } #endif Index: kern/subr_sleepqueue.c =================================================================== RCS file: /cvs/ncvs/src/sys/kern/subr_sleepqueue.c,v retrieving revision 1.18.2.4 diff -u -r1.18.2.4 subr_sleepqueue.c --- kern/subr_sleepqueue.c 17 Aug 2006 19:53:06 -0000 1.18.2.4 +++ kern/subr_sleepqueue.c 31 Aug 2007 00:39:59 -0000 @@ -82,6 +82,12 @@ #include #endif +#include + +#ifdef DDB +#include +#endif + /* * Constants for the hash table of sleep queue chains. These constants are * the same ones that 4BSD (and possibly earlier versions of BSD) used. @@ -94,7 +100,7 @@ #define SC_SHIFT 8 #define SC_HASH(wc) (((uintptr_t)(wc) >> SC_SHIFT) & SC_MASK) #define SC_LOOKUP(wc) &sleepq_chains[SC_HASH(wc)] - +#define NR_SLEEPQS 2 /* * There two different lists of sleep queues. Both lists are connected * via the sq_hash entries. The first list is the sleep queue chain list @@ -114,13 +120,13 @@ * c - sleep queue chain lock */ struct sleepqueue { - TAILQ_HEAD(, thread) sq_blocked; /* (c) Blocked threads. */ + TAILQ_HEAD(, thread) sq_blocked[NR_SLEEPQS]; /* (c) Blocked threads. */ LIST_ENTRY(sleepqueue) sq_hash; /* (c) Chain and free list. */ LIST_HEAD(, sleepqueue) sq_free; /* (c) Free queues. */ void *sq_wchan; /* (c) Wait channel. */ #ifdef INVARIANTS int sq_type; /* (c) Queue type. */ - struct mtx *sq_lock; /* (c) Associated lock. */ + struct mtx *sq_lock; /* (c) Associated lock. */ #endif }; @@ -142,16 +148,22 @@ 0, "maxmimum depth achieved of a single chain"); #endif static struct sleepqueue_chain sleepq_chains[SC_TABLESIZE]; - -static MALLOC_DEFINE(M_SLEEPQUEUE, "sleep queues", "sleep queues"); +static uma_zone_t sleepq_zone; /* * Prototypes for non-exported routines. */ +static int sleepq_catch_signals(void *wchan); +static int sleepq_check_signals(void); static int sleepq_check_timeout(void); +#ifdef INVARIANTS +static void sleepq_dtor(void *mem, int size, void *arg); +#endif +static int sleepq_init(void *mem, int size, int flags); +static void sleepq_resume_thread(struct sleepqueue *sq, struct thread *td, + int pri); static void sleepq_switch(void *wchan); static void sleepq_timeout(void *arg); -static void sleepq_resume_thread(struct sleepqueue *sq, struct thread *td, int pri); /* * Early initialization of sleep queues that is called from the sleepinit() @@ -182,21 +194,24 @@ NULL); #endif } + sleepq_zone = uma_zcreate("SLEEPQUEUE", sizeof(struct sleepqueue), +#ifdef INVARIANTS + NULL, sleepq_dtor, sleepq_init, NULL, UMA_ALIGN_CACHE, 0); +#else + NULL, NULL, sleepq_init, NULL, UMA_ALIGN_CACHE, 0); +#endif + thread0.td_sleepqueue = sleepq_alloc(); } /* - * Malloc and initialize a new sleep queue for a new thread. + * Get a sleep queue for a new thread. */ struct sleepqueue * sleepq_alloc(void) { - struct sleepqueue *sq; - sq = malloc(sizeof(struct sleepqueue), M_SLEEPQUEUE, M_WAITOK | M_ZERO); - TAILQ_INIT(&sq->sq_blocked); - LIST_INIT(&sq->sq_free); - return (sq); + return (uma_zalloc(sleepq_zone, M_WAITOK)); } /* @@ -206,9 +221,7 @@ sleepq_free(struct sleepqueue *sq) { - MPASS(sq != NULL); - MPASS(TAILQ_EMPTY(&sq->sq_blocked)); - free(sq, M_SLEEPQUEUE); + uma_zfree(sleepq_zone, sq); } /* @@ -262,7 +275,8 @@ * woken up. 
*/ void -sleepq_add(void *wchan, struct mtx *lock, const char *wmesg, int flags) +sleepq_add_queue(void *wchan, struct mtx *lock, const char *wmesg, int flags, + int queue) { struct sleepqueue_chain *sc; struct sleepqueue *sq; @@ -273,10 +287,11 @@ mtx_assert(&sc->sc_lock, MA_OWNED); MPASS(td->td_sleepqueue != NULL); MPASS(wchan != NULL); + MPASS((queue >= 0) && (queue < NR_SLEEPQS)); /* If this thread is not allowed to sleep, die a horrible death. */ KASSERT(!(td->td_pflags & TDP_NOSLEEPING), - ("trying to sleep while sleeping is prohibited")); + ("Trying sleep, but thread marked as sleeping prohibited")); /* Look up the sleep queue associated with the wait channel 'wchan'. */ sq = sleepq_lookup(wchan); @@ -287,6 +302,19 @@ * into the sleep queue already in use by this wait channel. */ if (sq == NULL) { +#ifdef INVARIANTS + int i; + + sq = td->td_sleepqueue; + for (i = 0; i < NR_SLEEPQS; i++) + KASSERT(TAILQ_EMPTY(&sq->sq_blocked[i]), + ("thread's sleep queue %d is not empty", i)); + KASSERT(LIST_EMPTY(&sq->sq_free), + ("thread's sleep queue has a non-empty free list")); + KASSERT(sq->sq_wchan == NULL, ("stale sq_wchan pointer")); + sq->sq_lock = lock; + sq->sq_type = flags & SLEEPQ_TYPE; +#endif #ifdef SLEEPQUEUE_PROFILING sc->sc_depth++; if (sc->sc_depth > sc->sc_max_depth) { @@ -297,25 +325,17 @@ #endif sq = td->td_sleepqueue; LIST_INSERT_HEAD(&sc->sc_queues, sq, sq_hash); - KASSERT(TAILQ_EMPTY(&sq->sq_blocked), - ("thread's sleep queue has a non-empty queue")); - KASSERT(LIST_EMPTY(&sq->sq_free), - ("thread's sleep queue has a non-empty free list")); - KASSERT(sq->sq_wchan == NULL, ("stale sq_wchan pointer")); sq->sq_wchan = wchan; -#ifdef INVARIANTS - sq->sq_lock = lock; - sq->sq_type = flags & SLEEPQ_TYPE; -#endif } else { MPASS(wchan == sq->sq_wchan); MPASS(lock == sq->sq_lock); MPASS((flags & SLEEPQ_TYPE) == sq->sq_type); LIST_INSERT_HEAD(&sq->sq_free, td->td_sleepqueue, sq_hash); } - TAILQ_INSERT_TAIL(&sq->sq_blocked, td, td_slpq); + TAILQ_INSERT_TAIL(&sq->sq_blocked[queue], td, td_slpq); td->td_sleepqueue = NULL; mtx_lock_spin(&sched_lock); + td->td_sqqueue = queue; td->td_wchan = wchan; td->td_wmesg = wmesg; if (flags & SLEEPQ_INTERRUPTIBLE) { @@ -606,12 +626,13 @@ MPASS(td != NULL); MPASS(sq->sq_wchan != NULL); MPASS(td->td_wchan == sq->sq_wchan); + MPASS(td->td_sqqueue < NR_SLEEPQS && td->td_sqqueue >= 0); sc = SC_LOOKUP(sq->sq_wchan); mtx_assert(&sc->sc_lock, MA_OWNED); mtx_assert(&sched_lock, MA_OWNED); /* Remove the thread from the queue. */ - TAILQ_REMOVE(&sq->sq_blocked, td, td_slpq); + TAILQ_REMOVE(&sq->sq_blocked[td->td_sqqueue], td, td_slpq); /* * Get a sleep queue for this thread. If this is the last waiter, @@ -652,17 +673,51 @@ setrunnable(td); } +#ifdef INVARIANTS +/* + * UMA zone item deallocator. + */ +static void +sleepq_dtor(void *mem, int size, void *arg) +{ + struct sleepqueue *sq; + int i; + + sq = mem; + for (i = 0; i < NR_SLEEPQS; i++) + MPASS(TAILQ_EMPTY(&sq->sq_blocked[i])); +} +#endif + +/* + * UMA zone item initializer. + */ +static int +sleepq_init(void *mem, int size, int flags) +{ + struct sleepqueue *sq; + int i; + + bzero(mem, size); + sq = mem; + for (i = 0; i < NR_SLEEPQS; i++) + TAILQ_INIT(&sq->sq_blocked[i]); + LIST_INIT(&sq->sq_free); + return (0); +} + /* * Find the highest priority thread sleeping on a wait channel and resume it. 
*/ void -sleepq_signal(void *wchan, int flags, int pri) +sleepq_signal_queue(void *wchan, int flags, int pri, int queue) { struct sleepqueue *sq; struct thread *td, *besttd; CTR2(KTR_PROC, "sleepq_signal(%p, %d)", wchan, flags); KASSERT(wchan != NULL, ("%s: invalid NULL wait channel", __func__)); + MPASS((queue >= 0) && (queue < NR_SLEEPQS)); sq = sleepq_lookup(wchan); if (sq == NULL) { sleepq_release(wchan); @@ -678,7 +733,7 @@ * the tail of sleep queues. */ besttd = NULL; - TAILQ_FOREACH(td, &sq->sq_blocked, td_slpq) { + TAILQ_FOREACH(td, &sq->sq_blocked[queue], td_slpq) { if (besttd == NULL || td->td_priority < besttd->td_priority) besttd = td; } @@ -693,12 +748,13 @@ * Resume all threads sleeping on a specified wait channel. */ void -sleepq_broadcast(void *wchan, int flags, int pri) +sleepq_broadcast_queue(void *wchan, int flags, int pri, int queue) { struct sleepqueue *sq; CTR2(KTR_PROC, "sleepq_broadcast(%p, %d)", wchan, flags); KASSERT(wchan != NULL, ("%s: invalid NULL wait channel", __func__)); + MPASS((queue >= 0) && (queue < NR_SLEEPQS)); sq = sleepq_lookup(wchan); if (sq == NULL) { sleepq_release(wchan); @@ -709,8 +765,9 @@ /* Resume all blocked threads on the sleep queue. */ mtx_lock_spin(&sched_lock); - while (!TAILQ_EMPTY(&sq->sq_blocked)) - sleepq_resume_thread(sq, TAILQ_FIRST(&sq->sq_blocked), pri); + while (!TAILQ_EMPTY(&sq->sq_blocked[queue])) + sleepq_resume_thread(sq, TAILQ_FIRST(&sq->sq_blocked[queue]), + pri); mtx_unlock_spin(&sched_lock); sleepq_release(wchan); } @@ -859,6 +916,76 @@ struct sleepqueue_chain *sc; struct sleepqueue *sq; #ifdef INVARIANTS + struct mtx *lock; +#endif + struct thread *td; + void *wchan; + int i; + + if (!have_addr) + return; + + /* + * First, see if there is an active sleep queue for the wait channel + * indicated by the address. + */ + wchan = (void *)addr; + sc = SC_LOOKUP(wchan); + LIST_FOREACH(sq, &sc->sc_queues, sq_hash) + if (sq->sq_wchan == wchan) + goto found; + + /* + * Second, see if there is an active sleep queue at the address + * indicated. + */ + for (i = 0; i < SC_TABLESIZE; i++) + LIST_FOREACH(sq, &sleepq_chains[i].sc_queues, sq_hash) { + if (sq == (struct sleepqueue *)addr) + goto found; + } + + db_printf("Unable to locate a sleep queue via %p\n", (void *)addr); + return; +found: + db_printf("Wait channel: %p\n", sq->sq_wchan); +#ifdef INVARIANTS + db_printf("Queue type: %d\n", sq->sq_type); + if (sq->sq_lock) { + lock = sq->sq_lock; +#define LOCK_CLASS(lock) (lock)->mtx_object.lo_class + db_printf("Associated Interlock: %p - (%s) %s\n", lock, + LOCK_CLASS(lock)->lc_name, lock->mtx_object.lo_name); + } +#endif + db_printf("Blocked threads:\n"); + for (i = 0; i < NR_SLEEPQS; i++) { + db_printf("\nQueue[%d]:\n", i); + if (TAILQ_EMPTY(&sq->sq_blocked[i])) + db_printf("\tempty\n"); + else + TAILQ_FOREACH(td, &sq->sq_blocked[0], + td_slpq) { + db_printf("\t%p (tid %d, pid %d, \"%s\")\n", td, + td->td_tid, td->td_proc->p_pid, + /*td->td_name[i] != '\0' ? td->td_name :*/ + td->td_proc->p_comm); + } + } +} + +#if 0 +/* Alias 'show sleepqueue' to 'show sleepq'. 
*/ +DB_SET(sleepqueue, db_show_sleepqueue, db_show_cmd_set, 0, NULL); +#endif +#endif + +#ifdef DDB +DB_SHOW_COMMAND(sleepq, db_show_sleepqueue) +{ + struct sleepqueue_chain *sc; + struct sleepqueue *sq; +#ifdef INVARIANTS struct lock_object *lock; #endif struct thread *td; Index: kern/subr_turnstile.c =================================================================== RCS file: /cvs/ncvs/src/sys/kern/subr_turnstile.c,v retrieving revision 1.152.2.5 diff -u -r1.152.2.5 subr_turnstile.c --- kern/subr_turnstile.c 23 Jan 2007 22:16:33 -0000 1.152.2.5 +++ kern/subr_turnstile.c 31 Aug 2007 02:15:23 -0000 @@ -114,7 +114,8 @@ * q - td_contested lock */ struct turnstile { - TAILQ_HEAD(, thread) ts_blocked; /* (c + q) Blocked threads. */ + /* struct mtx ts_lock; */ /* Spin lock for self. */ + TAILQ_HEAD(, thread) ts_blocked[2]; /* (c + q) Blocked threads. */ TAILQ_HEAD(, thread) ts_pending; /* (c) Pending threads. */ LIST_ENTRY(turnstile) ts_hash; /* (c) Chain and free list. */ LIST_ENTRY(turnstile) ts_link; /* (q) Contested locks. */ @@ -143,6 +144,12 @@ static struct mtx td_contested_lock; static struct turnstile_chain turnstile_chains[TC_TABLESIZE]; +/* XXX: stats, remove me */ +static u_int turnstile_nullowners; +SYSCTL_UINT(_debug, OID_AUTO, turnstile_nullowners, CTLFLAG_RD, + &turnstile_nullowners, 0, "called with null owner on a shared queue"); + + static MALLOC_DEFINE(M_TURNSTILE, "turnstiles", "turnstiles"); /* @@ -267,6 +274,7 @@ { struct turnstile_chain *tc; struct thread *td1, *td2; + int queue; mtx_assert(&sched_lock, MA_OWNED); MPASS(TD_ON_LOCK(td)); @@ -300,16 +308,18 @@ * Remove thread from blocked chain and determine where * it should be moved to. */ + queue = td->td_tsqueue; + MPASS(queue == TS_EXCLUSIVE_QUEUE || queue == TS_SHARED_QUEUE); mtx_lock_spin(&td_contested_lock); - TAILQ_REMOVE(&ts->ts_blocked, td, td_lockq); - TAILQ_FOREACH(td1, &ts->ts_blocked, td_lockq) { + TAILQ_REMOVE(&ts->ts_blocked[queue], td, td_lockq); + TAILQ_FOREACH(td1, &ts->ts_blocked[queue], td_lockq) { MPASS(td1->td_proc->p_magic == P_MAGIC); if (td1->td_priority > td->td_priority) break; } if (td1 == NULL) - TAILQ_INSERT_TAIL(&ts->ts_blocked, td, td_lockq); + TAILQ_INSERT_TAIL(&ts->ts_blocked[queue], td, td_lockq); else TAILQ_INSERT_BEFORE(td1, td, td_lockq); mtx_unlock_spin(&td_contested_lock); @@ -412,7 +422,10 @@ * Note that we currently don't try to revoke lent priorities * when our priority goes up. 
*/ - if (td == TAILQ_FIRST(&ts->ts_blocked) && td->td_priority < oldpri) { + MPASS(td->td_tsqueue == TS_EXCLUSIVE_QUEUE || + td->td_tsqueue == TS_SHARED_QUEUE); + if (td == TAILQ_FIRST(&ts->ts_blocked[td->td_tsqueue]) && + td->td_priority < oldpri) { mtx_unlock_spin(&tc->tc_lock); critical_enter(); propagate_priority(td); @@ -429,8 +442,11 @@ { mtx_assert(&td_contested_lock, MA_OWNED); - MPASS(owner->td_proc->p_magic == P_MAGIC); MPASS(ts->ts_owner == NULL); + if (owner == NULL) + return; + + MPASS(owner->td_proc->p_magic == P_MAGIC); ts->ts_owner = owner; LIST_INSERT_HEAD(&owner->td_contested, ts, ts_link); } @@ -444,7 +460,8 @@ struct turnstile *ts; ts = malloc(sizeof(struct turnstile), M_TURNSTILE, M_WAITOK | M_ZERO); - TAILQ_INIT(&ts->ts_blocked); + TAILQ_INIT(&ts->ts_blocked[TS_EXCLUSIVE_QUEUE]); + TAILQ_INIT(&ts->ts_blocked[TS_SHARED_QUEUE]); TAILQ_INIT(&ts->ts_pending); LIST_INIT(&ts->ts_free); return (ts); @@ -458,7 +475,8 @@ { MPASS(ts != NULL); - MPASS(TAILQ_EMPTY(&ts->ts_blocked)); + MPASS(TAILQ_EMPTY(&ts->ts_blocked[TS_EXCLUSIVE_QUEUE])); + MPASS(TAILQ_EMPTY(&ts->ts_blocked[TS_SHARED_QUEUE])); MPASS(TAILQ_EMPTY(&ts->ts_pending)); free(ts, M_TURNSTILE); } @@ -507,6 +525,22 @@ } /* + * Return a pointer to the thread waiting on this turnstile with the + * most important priority or NULL if the turnstile has no waiters. + */ +static struct thread * +turnstile_first_waiter(struct turnstile *ts) +{ + struct thread *std, *xtd; + + std = TAILQ_FIRST(&ts->ts_blocked[TS_SHARED_QUEUE]); + xtd = TAILQ_FIRST(&ts->ts_blocked[TS_EXCLUSIVE_QUEUE]); + if (xtd == NULL || (std != NULL && std->td_priority < xtd->td_priority)) + return (std); + return (xtd); +} + +/* * Take ownership of a turnstile and adjust the priority of the new * owner appropriately. */ @@ -527,7 +561,7 @@ turnstile_setowner(ts, owner); mtx_unlock_spin(&td_contested_lock); - td = TAILQ_FIRST(&ts->ts_blocked); + td = turnstile_first_waiter(ts); MPASS(td != NULL); MPASS(td->td_proc->p_magic == P_MAGIC); mtx_unlock_spin(&tc->tc_lock); @@ -548,7 +582,7 @@ * turnstile chain locked and will return with it unlocked. */ void -turnstile_wait(struct lock_object *lock, struct thread *owner) +turnstile_wait_queue(struct lock_object *lock, struct thread *owner, int queue) { struct turnstile_chain *tc; struct turnstile *ts; @@ -558,8 +592,13 @@ tc = TC_LOOKUP(lock); mtx_assert(&tc->tc_lock, MA_OWNED); MPASS(td->td_turnstile != NULL); - MPASS(owner != NULL); - MPASS(owner->td_proc->p_magic == P_MAGIC); + if (owner) + MPASS(owner->td_proc->p_magic == P_MAGIC); + /* XXX: stats, remove me */ + if (!owner && queue == TS_SHARED_QUEUE) { + turnstile_nullowners++; + } + MPASS(queue == TS_SHARED_QUEUE || queue == TS_EXCLUSIVE_QUEUE); /* Look up the turnstile associated with the lock 'lock'. 
*/ ts = turnstile_lookup(lock); @@ -582,25 +621,27 @@ LIST_INSERT_HEAD(&tc->tc_turnstiles, ts, ts_hash); KASSERT(TAILQ_EMPTY(&ts->ts_pending), ("thread's turnstile has pending threads")); - KASSERT(TAILQ_EMPTY(&ts->ts_blocked), - ("thread's turnstile has a non-empty queue")); + KASSERT(TAILQ_EMPTY(&ts->ts_blocked[TS_EXCLUSIVE_QUEUE]), + ("thread's turnstile has exclusive waiters")); + KASSERT(TAILQ_EMPTY(&ts->ts_blocked[TS_SHARED_QUEUE]), + ("thread's turnstile has shared waiters")); KASSERT(LIST_EMPTY(&ts->ts_free), ("thread's turnstile has a non-empty free list")); KASSERT(ts->ts_lockobj == NULL, ("stale ts_lockobj pointer")); ts->ts_lockobj = lock; mtx_lock_spin(&td_contested_lock); - TAILQ_INSERT_TAIL(&ts->ts_blocked, td, td_lockq); + TAILQ_INSERT_TAIL(&ts->ts_blocked[queue], td, td_lockq); turnstile_setowner(ts, owner); mtx_unlock_spin(&td_contested_lock); } else { - TAILQ_FOREACH(td1, &ts->ts_blocked, td_lockq) + TAILQ_FOREACH(td1, &ts->ts_blocked[queue], td_lockq) if (td1->td_priority > td->td_priority) break; mtx_lock_spin(&td_contested_lock); if (td1 != NULL) TAILQ_INSERT_BEFORE(td1, td, td_lockq); else - TAILQ_INSERT_TAIL(&ts->ts_blocked, td, td_lockq); + TAILQ_INSERT_TAIL(&ts->ts_blocked[queue], td, td_lockq); mtx_unlock_spin(&td_contested_lock); MPASS(td->td_turnstile != NULL); LIST_INSERT_HEAD(&ts->ts_free, td->td_turnstile, ts_hash); @@ -664,7 +705,7 @@ * pending list. This must be called with the turnstile chain locked. */ int -turnstile_signal(struct turnstile *ts) +turnstile_signal_queue(struct turnstile *ts, int queue) { struct turnstile_chain *tc; struct thread *td; @@ -675,15 +716,18 @@ MPASS(ts->ts_owner == curthread); tc = TC_LOOKUP(ts->ts_lockobj); mtx_assert(&tc->tc_lock, MA_OWNED); + MPASS(ts->ts_owner == curthread || + (queue == TS_EXCLUSIVE_QUEUE && ts->ts_owner == NULL)); + MPASS(queue == TS_SHARED_QUEUE || queue == TS_EXCLUSIVE_QUEUE); /* * Pick the highest priority thread blocked on this lock and * move it to the pending list. */ - td = TAILQ_FIRST(&ts->ts_blocked); + td = TAILQ_FIRST(&ts->ts_blocked[queue]); MPASS(td->td_proc->p_magic == P_MAGIC); mtx_lock_spin(&td_contested_lock); - TAILQ_REMOVE(&ts->ts_blocked, td, td_lockq); + TAILQ_REMOVE(&ts->ts_blocked[queue], td, td_lockq); mtx_unlock_spin(&td_contested_lock); TAILQ_INSERT_TAIL(&ts->ts_pending, td, td_lockq); @@ -692,7 +736,8 @@ * give it to the about-to-be-woken thread. Otherwise take a * turnstile from the free list and give it to the thread. */ - empty = TAILQ_EMPTY(&ts->ts_blocked); + empty = TAILQ_EMPTY(&ts->ts_blocked[TS_EXCLUSIVE_QUEUE]) && + TAILQ_EMPTY(&ts->ts_blocked[TS_SHARED_QUEUE]); if (empty) { MPASS(LIST_EMPTY(&ts->ts_free)); #ifdef TURNSTILE_PROFILING @@ -712,7 +757,7 @@ * the turnstile chain locked. */ void -turnstile_broadcast(struct turnstile *ts) +turnstile_broadcast_queue(struct turnstile *ts, int queue) { struct turnstile_chain *tc; struct turnstile *ts1; @@ -720,15 +765,17 @@ MPASS(ts != NULL); MPASS(curthread->td_proc->p_magic == P_MAGIC); - MPASS(ts->ts_owner == curthread); + MPASS(ts->ts_owner == curthread || + (queue == TS_EXCLUSIVE_QUEUE && ts->ts_owner == NULL)); tc = TC_LOOKUP(ts->ts_lockobj); mtx_assert(&tc->tc_lock, MA_OWNED); + MPASS(queue == TS_SHARED_QUEUE || queue == TS_EXCLUSIVE_QUEUE); /* * Transfer the blocked list to the pending list. 
*/ mtx_lock_spin(&td_contested_lock); - TAILQ_CONCAT(&ts->ts_pending, &ts->ts_blocked, td_lockq); + TAILQ_CONCAT(&ts->ts_pending, &ts->ts_blocked[queue], td_lockq); mtx_unlock_spin(&td_contested_lock); /* @@ -756,15 +803,17 @@ * chain locked. */ void -turnstile_unpend(struct turnstile *ts) +turnstile_unpend_queue(struct turnstile *ts, int owner_type) { TAILQ_HEAD( ,thread) pending_threads; struct turnstile_chain *tc; + struct turnstile *nts; struct thread *td; u_char cp, pri; MPASS(ts != NULL); - MPASS(ts->ts_owner == curthread); + MPASS(ts->ts_owner == curthread || + (owner_type == TS_SHARED_LOCK && ts->ts_owner == NULL)); tc = TC_LOOKUP(ts->ts_lockobj); mtx_assert(&tc->tc_lock, MA_OWNED); MPASS(!TAILQ_EMPTY(&ts->ts_pending)); @@ -776,7 +825,8 @@ TAILQ_INIT(&pending_threads); TAILQ_CONCAT(&pending_threads, &ts->ts_pending, td_lockq); #ifdef INVARIANTS - if (TAILQ_EMPTY(&ts->ts_blocked)) + if (TAILQ_EMPTY(&ts->ts_blocked[TS_EXCLUSIVE_QUEUE]) && + TAILQ_EMPTY(&ts->ts_blocked[TS_SHARED_QUEUE])) ts->ts_lockobj = NULL; #endif @@ -802,8 +852,8 @@ pri = PRI_MAX; mtx_lock_spin(&sched_lock); mtx_lock_spin(&td_contested_lock); - LIST_FOREACH(ts, &td->td_contested, ts_link) { - cp = TAILQ_FIRST(&ts->ts_blocked)->td_priority; + LIST_FOREACH(nts, &td->td_contested, ts_link) { + cp = turnstile_first_waiter(nts)->td_priority; if (cp < pri) pri = cp; } @@ -837,19 +887,69 @@ } /* + * Give up ownership of a turnstile. This must be called with the + * turnstile chain locked. + */ +void +turnstile_disown(struct turnstile *ts) +{ + struct turnstile_chain *tc; + struct thread *td; + u_char cp, pri; + + MPASS(ts != NULL); + MPASS(ts->ts_owner == curthread); + tc = TC_LOOKUP(ts->ts_lockobj); + mtx_assert(&tc->tc_lock, MA_OWNED); + MPASS(TAILQ_EMPTY(&ts->ts_pending)); + MPASS(!TAILQ_EMPTY(&ts->ts_blocked[TS_EXCLUSIVE_QUEUE]) || + !TAILQ_EMPTY(&ts->ts_blocked[TS_SHARED_QUEUE])); + + /* + * Remove the turnstile from this thread's list of contested locks + * since this thread doesn't own it anymore. New threads will + * not be blocking on the turnstile until it is claimed by a new + * owner. + */ + mtx_lock_spin(&td_contested_lock); + ts->ts_owner = NULL; + LIST_REMOVE(ts, ts_link); + mtx_unlock_spin(&td_contested_lock); + + /* + * Adjust the priority of curthread based on other contested + * locks it owns. Don't lower the priority below the base + * priority however. + */ + td = curthread; + pri = PRI_MAX; + mtx_lock_spin(&sched_lock); + mtx_lock_spin(&td_contested_lock); + LIST_FOREACH(ts, &td->td_contested, ts_link) { + cp = turnstile_first_waiter(ts)->td_priority; + if (cp < pri) + pri = cp; + } + mtx_unlock_spin(&td_contested_lock); + sched_unlend_prio(td, pri); + mtx_unlock_spin(&sched_lock); +} + +/* * Return the first thread in a turnstile. */ struct thread * -turnstile_head(struct turnstile *ts) +turnstile_head_queue(struct turnstile *ts, int queue) { #ifdef INVARIANTS struct turnstile_chain *tc; MPASS(ts != NULL); tc = TC_LOOKUP(ts->ts_lockobj); + MPASS(queue == TS_SHARED_QUEUE || queue == TS_EXCLUSIVE_QUEUE); mtx_assert(&tc->tc_lock, MA_OWNED); #endif - return (TAILQ_FIRST(&ts->ts_blocked)); + return (TAILQ_FIRST(&ts->ts_blocked[queue])); } #ifdef DDB @@ -1146,7 +1246,7 @@ * Returns true if a turnstile is empty. 
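Both turnstile_unpend_queue() and the new turnstile_disown() recompute the priority lent to the (former) owner as the minimum first-waiter priority across the turnstiles it still owns, starting from PRI_MAX. A hedged userland sketch of just that computation; recompute_lent_priority() is a hypothetical stand-in, and PRI_MAX is assumed to be 255 as on FreeBSD.

#include <stdio.h>

#define PRI_MAX 255                     /* least important priority (assumed) */

/*
 * Each array element stands for the priority of the best waiter on one
 * turnstile this thread still owns.
 */
static int
recompute_lent_priority(const int *first_waiter_pri, int nowned)
{
        int cp, pri, i;

        pri = PRI_MAX;
        for (i = 0; i < nowned; i++) {
                cp = first_waiter_pri[i];
                if (cp < pri)
                        pri = cp;
        }
        return (pri);   /* the caller would hand this to sched_unlend_prio() */
}

int
main(void)
{
        int pris[] = { 120, 96, 140 };

        printf("new lent priority: %d\n",
            recompute_lent_priority(pris, 3));                  /* 96 */
        return (0);
}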
*/ int -turnstile_empty(struct turnstile *ts) +turnstile_empty_queue(struct turnstile *ts, int queue) { #ifdef INVARIANTS struct turnstile_chain *tc; @@ -1154,6 +1254,7 @@ MPASS(ts != NULL); tc = TC_LOOKUP(ts->ts_lockobj); mtx_assert(&tc->tc_lock, MA_OWNED); + MPASS(queue == TS_SHARED_QUEUE || queue == TS_EXCLUSIVE_QUEUE); #endif - return (TAILQ_EMPTY(&ts->ts_blocked)); + return (TAILQ_EMPTY(&ts->ts_blocked[queue])); } Index: netinet6/in6_src.c =================================================================== RCS file: /cvs/ncvs/src/sys/netinet6/in6_src.c,v retrieving revision 1.30.2.4 diff -u -r1.30.2.4 in6_src.c --- netinet6/in6_src.c 25 Dec 2005 14:03:37 -0000 1.30.2.4 +++ netinet6/in6_src.c 31 Aug 2007 01:23:38 -0000 @@ -76,6 +76,7 @@ #include #include #include +#include #include #include cvs diff: sys/_rwlock.h is a new entry, no comparison available cvs diff: sys/_sx.h is a new entry, no comparison available cvs diff: sys/lock_profile.h is a new entry, no comparison available Index: sys/proc.h =================================================================== RCS file: /cvs/ncvs/src/sys/sys/proc.h,v retrieving revision 1.432.2.10 diff -u -r1.432.2.10 proc.h --- sys/proc.h 11 Jun 2007 11:27:04 -0000 1.432.2.10 +++ sys/proc.h 31 Aug 2007 00:51:44 -0000 @@ -261,12 +261,14 @@ int td_inhibitors; /* (j) Why can not run. */ int td_pflags; /* (k) Private thread (TDP_*) flags. */ int td_dupfd; /* (k) Ret value from fdopen. XXX */ + int td_sqqueue; /* (t) Sleepqueue queue blocked on. */ void *td_wchan; /* (j) Sleep address. */ const char *td_wmesg; /* (j) Reason for sleep. */ u_char td_lastcpu; /* (j) Last cpu we were on. */ u_char td_oncpu; /* (j) Which cpu we are on. */ volatile u_char td_owepreempt; /* (k*) Preempt on last critical_exit */ short td_locks; /* (k) Count of non-spin locks. */ + u_char td_tsqueue; /* (t) Turnstile queue blocked on. */ struct turnstile *td_blocked; /* (j) Lock process is blocked on. */ void *td_ithd; /* (n) Unused, kept to preserve ABI. */ const char *td_lockname; /* (j) Name of lock blocked on. */ @@ -324,7 +326,19 @@ struct mdthread td_md; /* (k) Any machine-dependent fields. */ struct td_sched *td_sched; /* (*) Scheduler-specific data. */ struct kaudit_record *td_ar; /* (k) Active audit record, if any. */ -}; +} __attribute__ ((aligned (16))); /* see comment below */ + +/* + * Comment about aligned attribute on struct thread. + * + * We must force at least 16 byte alignment for "struct thread" + * because the rwlocks and sxlocks expect to use the bottom bits + * of the pointer for bookkeeping information. + * + * This causes problems for the thread0 data structure because it + * may not be properly aligned otherwise. + */ + /* * Flags kept in td_flags: cvs diff: sys/rwlock.h is a new entry, no comparison available Index: sys/sleepqueue.h =================================================================== RCS file: /cvs/ncvs/src/sys/sys/sleepqueue.h,v retrieving revision 1.6.2.1 diff -u -r1.6.2.1 sleepqueue.h --- sys/sleepqueue.h 27 Feb 2006 00:19:39 -0000 1.6.2.1 +++ sys/sleepqueue.h 31 Aug 2007 01:47:52 -0000 @@ -26,15 +26,16 @@ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * - * $FreeBSD: src/sys/sys/sleepqueue.h,v 1.6.2.1 2006/02/27 00:19:39 davidxu Exp $ + * $FreeBSD: src/sys/sys/sleepqueue.h,v 1.12 2007/03/31 23:23:42 jhb Exp $ */ #ifndef _SYS_SLEEPQUEUE_H_ #define _SYS_SLEEPQUEUE_H_ /* - * Sleep queue interface. 
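The proc.h comment above explains why struct thread must be 16-byte aligned: sx locks and rwlocks keep flag bits in the low bits of the owning thread pointer. A small compilable illustration of that low-bit packing (not part of the diff), using the same GCC-style aligned attribute as the patch; fake_thread and the LK_* names are hypothetical.

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Low 4 bits of the owner pointer are reserved for flag bits. */
#define LK_FLAGMASK     ((uintptr_t)0x0f)
#define LK_WAITERS      ((uintptr_t)0x04)       /* example flag */

struct fake_thread {
        int     dummy;
} __attribute__((aligned(16)));                 /* as in the proc.h change */

int
main(void)
{
        struct fake_thread td;
        uintptr_t word;

        /* The forced alignment is what frees the low bits. */
        assert(((uintptr_t)&td & LK_FLAGMASK) == 0);

        word = (uintptr_t)&td | LK_WAITERS;     /* pack owner + flag */
        printf("owner %p, waiters %d\n",
            (void *)(word & ~LK_FLAGMASK), (int)((word & LK_WAITERS) != 0));
        return (0);
}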
Sleep/wakeup and condition variables use a sleep - * queue for the queue of threads blocked on a sleep channel. + * Sleep queue interface. Sleep/wakeup, condition variables, and sx + * locks use a sleep queue for the queue of threads blocked on a sleep + * channel. * * A thread calls sleepq_lock() to lock the sleep queue chain associated * with a given wait channel. A thread can then call call sleepq_add() to @@ -84,24 +85,33 @@ #define SLEEPQ_TYPE 0x0ff /* Mask of sleep queue types. */ #define SLEEPQ_MSLEEP 0x00 /* Used by msleep/wakeup. */ #define SLEEPQ_CONDVAR 0x01 /* Used for a cv. */ +#define SLEEPQ_SX 0x03 /* Used by an sx lock. */ #define SLEEPQ_INTERRUPTIBLE 0x100 /* Sleep is interruptible. */ void init_sleepqueues(void); void sleepq_abort(struct thread *td, int intrval); -void sleepq_add(void *, struct mtx *, const char *, int); +void sleepq_add_queue(void *, struct mtx *, const char *, int, int); struct sleepqueue *sleepq_alloc(void); -void sleepq_broadcast(void *, int, int); +void sleepq_broadcast_queue(void *, int, int, int); void sleepq_free(struct sleepqueue *); void sleepq_lock(void *); struct sleepqueue *sleepq_lookup(void *); void sleepq_release(void *); void sleepq_remove(struct thread *, void *); -void sleepq_signal(void *, int, int); +void sleepq_signal_queue(void *, int, int, int); void sleepq_set_timeout(void *wchan, int timo); int sleepq_timedwait(void *wchan); int sleepq_timedwait_sig(void *wchan); void sleepq_wait(void *); int sleepq_wait_sig(void *wchan); +/* Preserve source compat with 6.x */ +#define sleepq_add(wchan, lock, wmesg, flags) \ + sleepq_add_queue(wchan, lock, wmesg, flags, 0) +#define sleepq_broadcast(wchan, flags, pri) \ + sleepq_broadcast_queue(wchan, flags, pri, 0) +#define sleepq_signal(wchan, flags, pri) \ + sleepq_signal_queue(wchan, flags, pri, 0) + #endif /* _KERNEL */ #endif /* !_SYS_SLEEPQUEUE_H_ */ Index: sys/sx.h =================================================================== RCS file: /cvs/ncvs/src/sys/sys/sx.h,v retrieving revision 1.21.2.5 diff -u -r1.21.2.5 sx.h --- sys/sx.h 27 Aug 2007 13:45:35 -0000 1.21.2.5 +++ sys/sx.h 31 Aug 2007 01:02:47 -0000 @@ -1,5 +1,7 @@ /*- - * Copyright (C) 2001 Jason Evans . All rights reserved. + * Copyright (c) 2007 Attilio Rao + * Copyright (c) 2001 Jason Evans + * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -24,39 +26,93 @@ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH * DAMAGE. * - * $FreeBSD: src/sys/sys/sx.h,v 1.21.2.5 2007/08/27 13:45:35 jhb Exp $ + * $FreeBSD: src/sys/sys/sx.h,v 1.37 2007/07/06 13:20:44 attilio Exp $ */ #ifndef _SYS_SX_H_ #define _SYS_SX_H_ -#include #include -#include /* XXX */ +#include +#include -struct sx { - struct lock_object sx_object; /* Common lock properties. */ - struct mtx *sx_lock; /* General protection lock. */ - int sx_cnt; /* -1: xlock, > 0: slock count. */ - struct cv sx_shrd_cv; /* slock waiters. */ - int sx_shrd_wcnt; /* Number of slock waiters. */ - struct cv sx_excl_cv; /* xlock waiters. */ - int sx_excl_wcnt; /* Number of xlock waiters. */ - struct thread *sx_xholder; /* Thread presently holding xlock. */ -}; +#ifdef _KERNEL +#include +#endif + +/* + * In general, the sx locks and rwlocks use very similar algorithms. + * The main difference in the implementations is how threads are + * blocked when a lock is unavailable. 
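The sleepq_* macros above preserve source compatibility by forwarding old 6.x call sites to the new *_queue functions with a default queue of 0. The same pattern in miniature (not part of the diff), with made-up blockq_* names standing in for the sleep queue API.

#include <stdio.h>

/* New-style interface: the queue is now an explicit argument. */
static void
blockq_signal_queue(const void *chan, int pri, int queue)
{
        printf("signal %p, pri %d, queue %d\n", chan, pri, queue);
}

/* Compat wrapper: old call sites keep compiling, queue defaults to 0. */
#define blockq_signal(chan, pri)        blockq_signal_queue((chan), (pri), 0)

int
main(void)
{
        int chan;

        blockq_signal(&chan, 0);                /* existing-style call */
        blockq_signal_queue(&chan, 0, 1);       /* new call naming a queue */
        return (0);
}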
For this, sx locks use sleep + * queues which do not support priority propagation, and rwlocks use + * turnstiles which do. + * + * The sx_lock field consists of several fields. The low bit + * indicates if the lock is locked with a shared or exclusive lock. A + * value of 0 indicates an exclusive lock, and a value of 1 indicates + * a shared lock. Bit 1 is a boolean indicating if there are any + * threads waiting for a shared lock. Bit 2 is a boolean indicating + * if there are any threads waiting for an exclusive lock. Bit 3 is a + * boolean indicating if an exclusive lock is recursively held. The + * rest of the variable's definition is dependent on the value of the + * first bit. For an exclusive lock, it is a pointer to the thread + * holding the lock, similar to the mtx_lock field of mutexes. For + * shared locks, it is a count of read locks that are held. + * + * When the lock is not locked by any thread, it is encoded as a + * shared lock with zero waiters. + * + * A note about memory barriers. Exclusive locks need to use the same + * memory barriers as mutexes: _acq when acquiring an exclusive lock + * and _rel when releasing an exclusive lock. On the other side, + * shared lock needs to use an _acq barrier when acquiring the lock + * but, since they don't update any locked data, no memory barrier is + * needed when releasing a shared lock. + */ + +#define SX_LOCK_SHARED 0x01 +#define SX_LOCK_SHARED_WAITERS 0x02 +#define SX_LOCK_EXCLUSIVE_WAITERS 0x04 +#define SX_LOCK_RECURSED 0x08 +#define SX_LOCK_FLAGMASK \ + (SX_LOCK_SHARED | SX_LOCK_SHARED_WAITERS | \ + SX_LOCK_EXCLUSIVE_WAITERS | SX_LOCK_RECURSED) + +#define SX_OWNER(x) ((x) & ~SX_LOCK_FLAGMASK) +#define SX_SHARERS_SHIFT 4 +#define SX_SHARERS(x) (SX_OWNER(x) >> SX_SHARERS_SHIFT) +#define SX_SHARERS_LOCK(x) \ + ((x) << SX_SHARERS_SHIFT | SX_LOCK_SHARED) +#define SX_ONE_SHARER (1 << SX_SHARERS_SHIFT) + +#define SX_LOCK_UNLOCKED SX_SHARERS_LOCK(0) +#define SX_LOCK_DESTROYED \ + (SX_LOCK_SHARED_WAITERS | SX_LOCK_EXCLUSIVE_WAITERS) #ifdef _KERNEL + +/* + * Function prototipes. Routines that start with an underscore are not part + * of the public interface and are wrappered with a macro. 
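To make the encoding described above concrete, the program below copies the relevant SX_* macros from the hunk above and decodes a couple of lock-word values; only main() is editorial illustration, not part of the diff.

#include <stdint.h>
#include <stdio.h>

#define SX_LOCK_SHARED                  0x01
#define SX_LOCK_SHARED_WAITERS          0x02
#define SX_LOCK_EXCLUSIVE_WAITERS       0x04
#define SX_LOCK_RECURSED                0x08
#define SX_LOCK_FLAGMASK                                                \
        (SX_LOCK_SHARED | SX_LOCK_SHARED_WAITERS |                      \
        SX_LOCK_EXCLUSIVE_WAITERS | SX_LOCK_RECURSED)
#define SX_OWNER(x)             ((x) & ~SX_LOCK_FLAGMASK)
#define SX_SHARERS_SHIFT        4
#define SX_SHARERS(x)           (SX_OWNER(x) >> SX_SHARERS_SHIFT)
#define SX_SHARERS_LOCK(x)      ((x) << SX_SHARERS_SHIFT | SX_LOCK_SHARED)
#define SX_LOCK_UNLOCKED        SX_SHARERS_LOCK(0)

int
main(void)
{
        uintptr_t x;

        /* "Unlocked" is just a shared lock with zero holders. */
        x = SX_LOCK_UNLOCKED;
        printf("unlocked: shared=%d sharers=%d\n",
            (int)(x & SX_LOCK_SHARED), (int)SX_SHARERS(x));

        /* Three readers, with a writer queued behind them. */
        x = SX_SHARERS_LOCK(3) | SX_LOCK_EXCLUSIVE_WAITERS;
        printf("read-locked: sharers=%d xwaiters=%d\n",
            (int)SX_SHARERS(x), (int)((x & SX_LOCK_EXCLUSIVE_WAITERS) != 0));
        return (0);
}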
+ */ void sx_sysinit(void *arg); -void sx_init(struct sx *sx, const char *description); +#define sx_init(sx, desc) sx_init_flags((sx), (desc), 0) +void sx_init_flags(struct sx *sx, const char *description, int opts); void sx_destroy(struct sx *sx); -void _sx_slock(struct sx *sx, const char *file, int line); -void _sx_xlock(struct sx *sx, const char *file, int line); +int _sx_slock(struct sx *sx, int opts, const char *file, int line); +int _sx_xlock(struct sx *sx, int opts, const char *file, int line); int _sx_try_slock(struct sx *sx, const char *file, int line); int _sx_try_xlock(struct sx *sx, const char *file, int line); void _sx_sunlock(struct sx *sx, const char *file, int line); void _sx_xunlock(struct sx *sx, const char *file, int line); int _sx_try_upgrade(struct sx *sx, const char *file, int line); void _sx_downgrade(struct sx *sx, const char *file, int line); +int _sx_xlock_hard(struct sx *sx, uintptr_t tid, int opts, + const char *file, int line); +int _sx_slock_hard(struct sx *sx, int opts, const char *file, int line); +void _sx_xunlock_hard(struct sx *sx, uintptr_t tid, const char *file, int + line); +void _sx_sunlock_hard(struct sx *sx, const char *file, int line); #if defined(INVARIANTS) || defined(INVARIANT_SUPPORT) void _sx_assert(struct sx *sx, int what, const char *file, int line); #endif @@ -79,15 +135,119 @@ SYSUNINIT(name##_sx_sysuninit, SI_SUB_LOCK, SI_ORDER_MIDDLE, \ sx_destroy, (sxa)) -#define sx_xlocked(sx) ((sx)->sx_cnt < 0 && (sx)->sx_xholder == curthread) -#define sx_slock(sx) _sx_slock((sx), LOCK_FILE, LOCK_LINE) -#define sx_xlock(sx) _sx_xlock((sx), LOCK_FILE, LOCK_LINE) +/* + * Full lock operations that are suitable to be inlined in non-debug kernels. + * If the lock can't be acquired or released trivially then the work is + * deferred to 'tougher' functions. + */ + +/* Acquire an exclusive lock. */ +static __inline int +__sx_xlock(struct sx *sx, struct thread *td, int opts, const char *file, + int line) +{ + uintptr_t tid = (uintptr_t)td; + int error = 0; + + if (!atomic_cmpset_acq_ptr(&sx->sx_lock, SX_LOCK_UNLOCKED, tid)) + error = _sx_xlock_hard(sx, tid, opts, file, line); + else + lock_profile_obtain_lock_success(&sx->lock_object, 0, 0, file, + line); + + return (error); +} + +/* Release an exclusive lock. */ +static __inline void +__sx_xunlock(struct sx *sx, struct thread *td, const char *file, int line) +{ + uintptr_t tid = (uintptr_t)td; + + if (!atomic_cmpset_rel_ptr(&sx->sx_lock, tid, SX_LOCK_UNLOCKED)) + _sx_xunlock_hard(sx, tid, file, line); +} + +/* Acquire a shared lock. */ +static __inline int +__sx_slock(struct sx *sx, int opts, const char *file, int line) +{ + uintptr_t x = sx->sx_lock; + int error = 0; + + if (!(x & SX_LOCK_SHARED) || + !atomic_cmpset_acq_ptr(&sx->sx_lock, x, x + SX_ONE_SHARER)) + error = _sx_slock_hard(sx, opts, file, line); +#ifdef LOCK_PROFILING_SHARED + else if (SX_SHARERS(x) == 0) + lock_profile_obtain_lock_success(&sx->lock_object, 0, 0, file, + line); +#endif + + return (error); +} + +/* + * Release a shared lock. We can just drop a single shared lock so + * long as we aren't trying to drop the last shared lock when other + * threads are waiting for an exclusive lock. This takes advantage of + * the fact that an unlocked lock is encoded as a shared lock with a + * count of 0. 
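The inline fast paths declared above try a single compare-and-set and fall back to the *_hard() functions on failure. Here is a rough userland analogue (not part of the diff) using C11 <stdatomic.h> in place of the kernel's atomic_cmpset_acq_ptr()/atomic_cmpset_rel_ptr(); fake_sx, xlock(), xlock_hard() and xunlock() are hypothetical stand-ins and the slow path is only stubbed out.

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#define LOCK_UNLOCKED   ((uintptr_t)0x01)       /* shared, zero holders */

struct fake_sx {
        _Atomic uintptr_t lock;
};

/* Stand-in for _sx_xlock_hard(): here it only reports contention. */
static void
xlock_hard(struct fake_sx *sx, uintptr_t tid)
{
        printf("slow path: lock word 0x%lx, want owner 0x%lx\n",
            (unsigned long)atomic_load(&sx->lock), (unsigned long)tid);
}

/* Fast path: one acquire compare-and-set from "unlocked" to "owned by tid". */
static void
xlock(struct fake_sx *sx, uintptr_t tid)
{
        uintptr_t expected = LOCK_UNLOCKED;

        if (!atomic_compare_exchange_strong_explicit(&sx->lock, &expected,
            tid, memory_order_acquire, memory_order_relaxed))
                xlock_hard(sx, tid);
}

/* Fast path release: one release compare-and-set back to "unlocked". */
static void
xunlock(struct fake_sx *sx, uintptr_t tid)
{
        uintptr_t expected = tid;

        /* A failed CAS here would take _sx_xunlock_hard() in the real code. */
        atomic_compare_exchange_strong_explicit(&sx->lock, &expected,
            LOCK_UNLOCKED, memory_order_release, memory_order_relaxed);
}

int
main(void)
{
        struct fake_sx sx = { LOCK_UNLOCKED };
        uintptr_t me = 0x1000;                  /* pretend thread pointer */

        xlock(&sx, me);
        xlock(&sx, 0x2000);                     /* contended: takes slow path */
        xunlock(&sx, me);
        return (0);
}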
+ */ +static __inline void +__sx_sunlock(struct sx *sx, const char *file, int line) +{ + uintptr_t x = sx->sx_lock; + + if (x == (SX_SHARERS_LOCK(1) | SX_LOCK_EXCLUSIVE_WAITERS) || + !atomic_cmpset_ptr(&sx->sx_lock, x, x - SX_ONE_SHARER)) + _sx_sunlock_hard(sx, file, line); +} + +/* + * Public interface for lock operations. + */ +#ifndef LOCK_DEBUG +#error "LOCK_DEBUG not defined, include before " +#endif +#if (LOCK_DEBUG > 0) || defined(SX_NOINLINE) +#define sx_xlock(sx) (void)_sx_xlock((sx), 0, LOCK_FILE, LOCK_LINE) +#define sx_xlock_sig(sx) \ + _sx_xlock((sx), SX_INTERRUPTIBLE, LOCK_FILE, LOCK_LINE) +#define sx_xunlock(sx) _sx_xunlock((sx), LOCK_FILE, LOCK_LINE) +#define sx_slock(sx) (void)_sx_slock((sx), 0, LOCK_FILE, LOCK_LINE) +#define sx_slock_sig(sx) \ + _sx_slock((sx), SX_INTERRUPTIBLE, LOCK_FILE, LOCK_LINE) +#define sx_sunlock(sx) _sx_sunlock((sx), LOCK_FILE, LOCK_LINE) +#else +#define sx_xlock(sx) \ + (void)__sx_xlock((sx), curthread, 0, LOCK_FILE, LOCK_LINE) +#define sx_xlock_sig(sx) \ + __sx_xlock((sx), curthread, SX_INTERRUPTIBLE, LOCK_FILE, LOCK_LINE) +#define sx_xunlock(sx) \ + __sx_xunlock((sx), curthread, LOCK_FILE, LOCK_LINE) +#define sx_slock(sx) (void)__sx_slock((sx), 0, LOCK_FILE, LOCK_LINE) +#define sx_slock_sig(sx) \ + __sx_slock((sx), SX_INTERRUPTIBLE, LOCK_FILE, LOCK_LINE) +#define sx_sunlock(sx) __sx_sunlock((sx), LOCK_FILE, LOCK_LINE) +#endif /* LOCK_DEBUG > 0 || SX_NOINLINE */ #define sx_try_slock(sx) _sx_try_slock((sx), LOCK_FILE, LOCK_LINE) #define sx_try_xlock(sx) _sx_try_xlock((sx), LOCK_FILE, LOCK_LINE) -#define sx_sunlock(sx) _sx_sunlock((sx), LOCK_FILE, LOCK_LINE) -#define sx_xunlock(sx) _sx_xunlock((sx), LOCK_FILE, LOCK_LINE) #define sx_try_upgrade(sx) _sx_try_upgrade((sx), LOCK_FILE, LOCK_LINE) #define sx_downgrade(sx) _sx_downgrade((sx), LOCK_FILE, LOCK_LINE) + +/* + * Return a pointer to the owning thread if the lock is exclusively + * locked. + */ +#define sx_xholder(sx) \ + ((sx)->sx_lock & SX_LOCK_SHARED ? NULL : \ + (struct thread *)SX_OWNER((sx)->sx_lock)) + +#define sx_xlocked(sx) \ + (((sx)->sx_lock & ~(SX_LOCK_FLAGMASK & ~SX_LOCK_SHARED)) == \ + (uintptr_t)curthread) + #define sx_unlock(sx) do { \ if (sx_xlocked(sx)) \ sx_xunlock(sx); \ @@ -95,17 +255,39 @@ sx_sunlock(sx); \ } while (0) +#define sx_sleep(chan, sx, pri, wmesg, timo) \ + _sleep((chan), &(sx)->lock_object, (pri), (wmesg), (timo)) + +/* + * Options passed to sx_init_flags(). + */ +#define SX_DUPOK 0x01 +#define SX_NOPROFILE 0x02 +#define SX_NOWITNESS 0x04 +#define SX_QUIET 0x08 +#define SX_ADAPTIVESPIN 0x10 +#define SX_RECURSE 0x20 + +/* + * Options passed to sx_*lock_hard(). + */ +#define SX_INTERRUPTIBLE 0x40 + #if defined(INVARIANTS) || defined(INVARIANT_SUPPORT) #define SA_LOCKED LA_LOCKED #define SA_SLOCKED LA_SLOCKED #define SA_XLOCKED LA_XLOCKED #define SA_UNLOCKED LA_UNLOCKED +#define SA_RECURSED LA_RECURSED +#define SA_NOTRECURSED LA_NOTRECURSED /* Backwards compatability. 
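The sx_xholder() and sx_xlocked() macros defined further down recover the owning thread from the same lock word: if the shared bit is clear, the word with its flag bits masked off is the owner pointer; otherwise there is no exclusive holder. A minimal userland model of that decode (not part of the diff); xholder() and the LK_* names are hypothetical.

#include <stdint.h>
#include <stdio.h>

#define LK_SHARED       ((uintptr_t)0x01)
#define LK_FLAGMASK     ((uintptr_t)0x0f)

/* Owner pointer if exclusively held, NULL otherwise (cf. sx_xholder()). */
static void *
xholder(uintptr_t lockword)
{

        if (lockword & LK_SHARED)
                return (NULL);
        return ((void *)(lockword & ~LK_FLAGMASK));
}

int
main(void)
{
        static _Alignas(16) int me;             /* pretend curthread */
        uintptr_t word;

        word = (uintptr_t)&me;                  /* exclusively owned by "me" */
        printf("xlocked by me: %d\n", xholder(word) == (void *)&me);

        word = (3 << 4) | LK_SHARED;            /* three shared holders */
        printf("xholder: %p\n", xholder(word)); /* NULL */
        return (0);
}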
*/ #define SX_LOCKED LA_LOCKED #define SX_SLOCKED LA_SLOCKED #define SX_XLOCKED LA_XLOCKED #define SX_UNLOCKED LA_UNLOCKED +#define SX_RECURSED LA_RECURSED +#define SX_NOTRECURSED LA_NOTRECURSED #endif #ifdef INVARIANTS Index: sys/turnstile.h =================================================================== RCS file: /cvs/ncvs/src/sys/sys/turnstile.h,v retrieving revision 1.7 diff -u -r1.7 turnstile.h --- sys/turnstile.h 7 Jan 2005 02:29:24 -0000 1.7 +++ sys/turnstile.h 31 Aug 2007 00:39:59 -0000 @@ -73,20 +73,43 @@ #ifdef _KERNEL +/* Which queue to block on or which queue to wakeup one or more threads from. */ +#define TS_EXCLUSIVE_QUEUE 0 +#define TS_SHARED_QUEUE 1 + +/* The type of lock currently held. */ +#define TS_EXCLUSIVE_LOCK TS_EXCLUSIVE_QUEUE +#define TS_SHARED_LOCK TS_SHARED_QUEUE + void init_turnstiles(void); void turnstile_adjust(struct thread *, u_char); struct turnstile *turnstile_alloc(void); -void turnstile_broadcast(struct turnstile *); +#define turnstile_wakeup(turnstile) turnstile_broadcast(turnstile) +#define turnstile_broadcast(turnstile) \ + turnstile_broadcast_queue(turnstile, TS_EXCLUSIVE_QUEUE) +void turnstile_broadcast_queue(struct turnstile *, int); void turnstile_claim(struct lock_object *); -int turnstile_empty(struct turnstile *); +void turnstile_disown(struct turnstile *); +#define turnstile_empty(turnstile) \ + turnstile_empty_queue(turnstile, TS_EXCLUSIVE_QUEUE); +int turnstile_empty_queue(struct turnstile *, int); void turnstile_free(struct turnstile *); -struct thread *turnstile_head(struct turnstile *); +#define turnstile_head(turnstile) \ + turnstile_head_queue(turnstile, TS_EXCLUSIVE_QUEUE) +struct thread *turnstile_head_queue(struct turnstile *, int); void turnstile_lock(struct lock_object *); struct turnstile *turnstile_lookup(struct lock_object *); void turnstile_release(struct lock_object *); -int turnstile_signal(struct turnstile *); -void turnstile_unpend(struct turnstile *); -void turnstile_wait(struct lock_object *, struct thread *); +#define turnstile_signal(turnstile) \ + turnstile_signal_queue(turnstile, TS_EXCLUSIVE_QUEUE) +int turnstile_signal_queue(struct turnstile *, int); +struct turnstile *turnstile_trywait(struct lock_object *); +#define turnstile_unpend(turnstile) \ + turnstile_unpend_queue(turnstile, TS_EXCLUSIVE_QUEUE); +void turnstile_unpend_queue(struct turnstile *, int); +#define turnstile_wait(lock_object, thread) \ + turnstile_wait_queue(lock_object, thread, TS_EXCLUSIVE_QUEUE) +void turnstile_wait_queue(struct lock_object *, struct thread *, int); #endif /* _KERNEL */ #endif /* _SYS_TURNSTILE_H_ */ Index: vm/vm_map.c =================================================================== RCS file: /cvs/ncvs/src/sys/vm/vm_map.c,v retrieving revision 1.366.2.4 diff -u -r1.366.2.4 vm_map.c --- vm/vm_map.c 30 Aug 2007 02:32:04 -0000 1.366.2.4 +++ vm/vm_map.c 31 Aug 2007 03:12:00 -0000 @@ -429,7 +429,7 @@ if (map->system_map) _mtx_lock_flags(&map->system_mtx, 0, file, line); else - _sx_xlock(&map->lock, file, line); + (void) _sx_xlock(&map->lock, 0, file, line); map->timestamp++; } @@ -450,7 +450,7 @@ if (map->system_map) _mtx_lock_flags(&map->system_mtx, 0, file, line); else - _sx_xlock(&map->lock, file, line); + (void) _sx_xlock(&map->lock, 0, file, line); } void --PPYy/fEw/8QCHSq3--