From: Chris Torek <torek@elf.torek.net>
Date: Thu, 13 Apr 2017 05:18:11 -0700 (PDT)
To: ablacktshirt@gmail.com, imp@bsdimp.com
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, kostikbel@gmail.com, rysto32@gmail.com
Subject: Re: Understanding the FreeBSD locking mechanism
Message-Id: <201704131218.v3DCIBJg093207@elf.torek.net>
In-Reply-To: <06a30d21-acff-efb2-ff58-9aa66793e929@gmail.com>

>I discover that in the current implementation in FreeBSD, spinlock
>does not disable interrupt entirely:

[extra-snipped here]

> 610         /* Give interrupts a chance while we spin. */
> 611         spinlock_exit();
> 612         while (m->mtx_lock != MTX_UNOWNED) {

[more snip]

>This is `_mtx_lock_spin_cookie(...)` in kern/kern_mutex.c, which
>implements the core logic of spinning. However, as you can see, while
>spinning, it would enable interrupt "occasionally" and disable it
>again... What is the rationale for that?

This code snippet is slightly misleading.  The full code path runs
from mtx_lock_spin() through __mtx_lock_spin(), which first invokes
spinlock_enter() and then, in the *contested* case (only), calls
_mtx_lock_spin_cookie().

spinlock_enter() is:

	td = curthread;
	if (td->td_md.md_spinlock_count == 0) {
		flags = intr_disable();
		td->td_md.md_spinlock_count = 1;
		td->td_md.md_saved_flags = flags;
	} else
		td->td_md.md_spinlock_count++;
	critical_enter();

so it actually disables interrupts *only* on the transition from
td->td_md.md_spinlock_count = 0 to td->td_md.md_spinlock_count = 1,
i.e., the first time we take a spin lock in this thread, whether this
is a borrowed thread or not.

It's possible that interrupts are actually disabled at this point.
If so, td->td_md.md_saved_flags has interrupts disabled as well.
This is all just an optimization to use a thread-local variable so
as to avoid touching hardware.  The details vary widely, but
typically, touching the actual hardware controls requires flushing
the CPU's instruction pipeline.

If the compare-and-swap (the initial attempt in __mtx_lock_spin() to
grab the lock in one step) fails, we enter _mtx_lock_spin_cookie()
and loop waiting to see if we can obtain the spin lock in time.
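Before going on to what happens on that contested path: the counting
trick above is easy to see in isolation.  Here is a stand-alone
userland analogy (this is *not* the kernel code -- the fake_intr_*
functions and the _Thread_local variables merely stand in for
intr_disable()/intr_restore() and the td->td_md fields, and
critical_enter()/critical_exit() are left out entirely):

	#include <stdio.h>

	/* Stand-ins for the expensive hardware interrupt controls. */
	static int
	fake_intr_disable(void)
	{
		puts("  [hardware: interrupts disabled]");
		return (1);		/* pretend they were enabled before */
	}

	static void
	fake_intr_restore(int flags)
	{
		if (flags)
			puts("  [hardware: interrupts restored]");
	}

	/* Per-thread state, like md_spinlock_count / md_saved_flags. */
	static _Thread_local int spinlock_count;
	static _Thread_local int saved_flags;

	static void
	sketch_spinlock_enter(void)
	{
		if (spinlock_count == 0) {
			saved_flags = fake_intr_disable(); /* only on 0 -> 1 */
			spinlock_count = 1;
		} else
			spinlock_count++;	/* nested: counter only */
	}

	static void
	sketch_spinlock_exit(void)
	{
		spinlock_count--;
		if (spinlock_count == 0)
			fake_intr_restore(saved_flags);	/* only on 1 -> 0 */
	}

	int
	main(void)
	{
		sketch_spinlock_enter();	/* touches "hardware" */
		sketch_spinlock_enter();	/* nested: cheap */
		sketch_spinlock_exit();		/* still cheap */
		sketch_spinlock_exit();		/* touches "hardware" again */
		return (0);
	}

Only the outermost enter/exit pair pays for the real hardware
operation; every nested pair is just a counter update, which is the
whole point of keeping the count in the thread.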
In that (contested) case, we don't actually *hold* this particular
spin lock itself yet, so we can call spinlock_exit() to undo the
effect of the outermost spinlock_enter() (in __mtx_lock_spin).  That
decrements the counter.  *If* it goes to zero, that also calls
intr_restore(td->td_md.md_saved_flags).  Hence, if we have failed to
obtain our first spin lock, we restore the interrupt setting to
whatever we saved.

If interrupts were already locked out (as in a filter-type interrupt
handler) this is a potentially-somewhat-expensive no-op.  If
interrupts were enabled previously, this is a somewhat expensive
re-enable of interrupts -- but that's OK, and maybe good, because we
have no spin locks of our own yet.  That means we can take hardware
interrupts now, and let them borrow our current thread if they are
that kind of interrupt, or schedule another thread to run if
appropriate.  That might even preempt us, since we do not yet hold
any spin locks.  (But it won't preempt us if we have done a
critical_enter() before this point.)

(In fact, the spinlock exit/enter calls that you see inside
_mtx_lock_spin_cookie() wrap a loop that does not use
compare-and-swap operations at all, but rather ordinary memory
reads.  These are cheaper than CAS operations on a lot of CPUs, but
they may produce wrong answers when two CPUs are racing to write the
same location; only a CAS produces a guaranteed answer, which might
still be "you lost the race".  The inner loop you are looking at
occurs after losing a CAS race.  Once we think we might *win* a
future CAS race, _mtx_lock_spin_cookie() calls spinlock_enter()
again and tries the actual CAS operation, _mtx_obtain_lock_fetch(),
with interrupts disabled.  Note also the calls to cpu_spinwait() --
the Linux equivalent macro is cpu_relax() -- which translates to a
"pause" instruction on amd64.)

Chris
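P.S.  If you want to see the "spin on plain reads, then CAS only when
the lock looks free" pattern outside the kernel, here is a small
stand-alone illustration using C11 atomics and pthreads.  It is not
FreeBSD code -- there are no interrupts to mask in userland, and the
names (spin_lock(), worker(), and so on) are made up for the example
-- but the shape of the loop is the same idea:

	/* cc -O2 -pthread spin_demo.c -o spin_demo */
	#include <pthread.h>
	#include <stdatomic.h>
	#include <stdint.h>
	#include <stdio.h>

	#define	UNOWNED	((uintptr_t)0)

	static atomic_uintptr_t lock_word;	/* like m->mtx_lock */
	static long counter;			/* protected by the lock */

	static void
	spin_lock(uintptr_t tid)
	{
		uintptr_t old = UNOWNED;

		/* Fast path: a single CAS attempt to take the lock. */
		while (!atomic_compare_exchange_weak(&lock_word, &old, tid)) {
			/*
			 * Contested: wait with cheap ordinary reads until
			 * the lock looks free.  This is where the kernel
			 * re-enables interrupts and uses cpu_spinwait().
			 */
			while (atomic_load_explicit(&lock_word,
			    memory_order_relaxed) != UNOWNED)
				;	/* cpu_spinwait() would go here */
			old = UNOWNED;	/* now try the CAS again */
		}
	}

	static void
	spin_unlock(void)
	{
		atomic_store_explicit(&lock_word, UNOWNED,
		    memory_order_release);
	}

	static void *
	worker(void *arg)
	{
		for (int i = 0; i < 100000; i++) {
			spin_lock((uintptr_t)arg);
			counter++;
			spin_unlock();
		}
		return (NULL);
	}

	int
	main(void)
	{
		pthread_t t1, t2;

		pthread_create(&t1, NULL, worker, (void *)1);
		pthread_create(&t2, NULL, worker, (void *)2);
		pthread_join(t1, NULL);
		pthread_join(t2, NULL);
		printf("counter = %ld (expect 200000)\n", counter);
		return (0);
	}

The point is only the structure: the expensive
guaranteed-answer CAS is attempted just when the cheap read-only
loop suggests we might actually win it.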