From owner-freebsd-hackers@freebsd.org Thu Apr 13 13:46:28 2017
Subject: Re: Understanding the FreeBSD locking mechanism
To: Chris Torek , imp@bsdimp.com
References: <201704131218.v3DCIBJg093207@elf.torek.net>
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, kostikbel@gmail.com, rysto32@gmail.com
From: Yubin Ruan
Date: Thu, 13 Apr 2017 21:46:26 +0800
In-Reply-To: <201704131218.v3DCIBJg093207@elf.torek.net>

On 2017/04/13 20:18, Chris Torek wrote:
>> I discovered that in the current FreeBSD implementation, a spinlock
>> does not disable interrupts entirely:
> [extra-snipped here]
>> 610         /* Give interrupts a chance while we spin. */
>> 611         spinlock_exit();
>> 612         while (m->mtx_lock != MTX_UNOWNED) {
> [more snip]
>
>> This is `_mtx_lock_spin_cookie(...)` in kern/kern_mutex.c, which
>> implements the core logic of spinning. However, as you can see, while
>> spinning it enables interrupts "occasionally" and then disables them
>> again. What is the rationale for that?
>
> This code snippet is slightly misleading.
> The full code path runs
> from mtx_lock_spin() through __mtx_lock_spin(), which first
> invokes spinlock_enter() and then, in the *contested* case (only),
> calls _mtx_lock_spin_cookie().
>
> spinlock_enter() is:
>
>     td = curthread;
>     if (td->td_md.md_spinlock_count == 0) {
>         flags = intr_disable();
>         td->td_md.md_spinlock_count = 1;
>         td->td_md.md_saved_flags = flags;
>     } else
>         td->td_md.md_spinlock_count++;
>     critical_enter();
>
> so it actually disables interrupts *only* on the transition from
> td->td_md.md_spinlock_count = 0 to td->td_md.md_spinlock_count = 1,
> i.e., the first time this thread takes a spin lock, whether the
> thread is a borrowed one or not. It is possible that interrupts
> are already disabled at this point; if so, td->td_md.md_saved_flags
> has interrupts disabled as well. This is all just an optimization
> that uses a thread-local counter to avoid touching the hardware.
> The details vary widely, but typically, touching the actual hardware
> controls requires flushing the CPU's instruction pipeline.
>
> If the compare-and-swap fails, we enter _mtx_lock_spin_cookie()
> and loop, waiting to see if we can obtain the spin lock in time.
> In that case, we don't actually *hold* this particular spin lock
> itself yet, so we can call spinlock_exit() to undo the effect
> of the outermost spinlock_enter() (in __mtx_lock_spin). That
> decrements the counter. *If* it goes to zero, that also calls
> intr_restore(td->td_md.md_saved_flags).
>
> Hence, if we have failed to obtain our first spin lock, we restore
> the interrupt setting to whatever we saved. If interrupts were
> already locked out (as in a filter-type interrupt handler) this is
> a potentially-somewhat-expensive no-op. If interrupts were
> enabled previously, this is a somewhat expensive re-enable of
> interrupts -- but that's OK, and maybe good, because we have no
> spin locks of our own yet.
> That means we can take hardware
> interrupts now, and let them borrow our current thread if they are
> that kind of interrupt, or schedule another thread to run if
> appropriate. That might even preempt us, since we do not yet hold
> any spin locks. (But it won't preempt us if we have done a
> critical_enter() before this point.)

Good explanation -- I had just missed that "local" interrupt point.

> (In fact, the spinlock exit/enter calls that you see inside
> _mtx_lock_spin_cookie() wrap a loop that does not use compare-and-
> swap operations at all, but rather ordinary memory reads. These
> are cheaper than CAS operations on a lot of CPUs, but they may
> produce wrong answers when two CPUs are racing to write the same

Why would that produce a wrong answer? I think all the inner loop
wants to do is spin for a while before trying again to acquire the
spinlock, so there is no race here.

> location; only a CAS produces a guaranteed answer, which might
> still be "you lost the race". The inner loop you are looking at
> occurs after losing a CAS race. Once we think we might *win* a
> future CAS race, _mtx_lock_spin_cookie() calls spinlock_enter()
> again and tries the actual CAS operation, _mtx_obtain_lock_fetch(),
> with interrupts disabled. Note also the calls to cpu_spinwait()
> -- the Linux equivalent macro is cpu_relax() -- which translates
> to a "pause" instruction on amd64.)

Regards,
Yubin Ruan
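P.S. To check my own understanding, here is a small userland sketch of the
pattern you describe: a test-and-test-and-set spin lock plus a per-thread
nesting count standing in for td_md.md_spinlock_count. All names here are
invented for illustration and the interrupt enable/disable calls are stubs;
this is a toy model, not the kernel code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* 0 = unowned, 1 = owned; models m->mtx_lock. */
static atomic_int lock_word;

/* Per-thread state, modeling md_spinlock_count / md_saved_flags. */
static _Thread_local int  spin_count;
static _Thread_local bool saved_intr;

/* Stubs: pretend interrupts were enabled before we took the lock. */
static bool intr_disable_stub(void) { return true; }
static void intr_restore_stub(bool f) { (void)f; }

static void toy_spinlock_enter(void)
{
    if (spin_count == 0)
        saved_intr = intr_disable_stub();   /* disable only on 0 -> 1 */
    spin_count++;
}

static void toy_spinlock_exit(void)
{
    spin_count--;
    if (spin_count == 0)
        intr_restore_stub(saved_intr);      /* restore only on 1 -> 0 */
}

static void toy_lock(void)
{
    toy_spinlock_enter();
    int expected = 0;
    while (!atomic_compare_exchange_strong(&lock_word, &expected, 1)) {
        /* Contested: give interrupts a chance while we spin ... */
        toy_spinlock_exit();
        /* ... and wait with plain reads, which are cheaper than CAS;
         * this is the read-only inner loop in _mtx_lock_spin_cookie(). */
        while (atomic_load_explicit(&lock_word, memory_order_relaxed) != 0)
            ;   /* the kernel would call cpu_spinwait() here */
        /* Lock looks free: re-disable and try the CAS again. */
        toy_spinlock_enter();
        expected = 0;
    }
}

static void toy_unlock(void)
{
    atomic_store(&lock_word, 0);
    toy_spinlock_exit();
}
```

The point the inner while loop makes is the one you raise: the plain reads
only *suggest* the lock is free, and only the CAS decides who actually wins
it, possibly with the answer "you lost the race".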